tools for document interoperability


This is the web diary for nfoWorks and realization of the Harmony Principles. Pursuing Harmony tracks nfoWorks research, analysis, specification, and implementation of tools for document interoperability. There is commentary on related activities that address conformance, interoperability, and harmonization of document formats.







Republishing before Silence

The nfoCentrale blogs, including Pursuing Harmony, were published through Blogger via FTP transfer to my web sites. That service is ending.

Then there will be silence as Blogger is unhooked, although the pages will remain.

No new posts or comments will work until I have updated the web site to use its own blog engine. Once that migration is completed, posting will resume here, with details about the transition and any breakage that remains to be repaired.

Meanwhile, if you are curious to watch how this works out, check on Spanner Wingnut’s Muddleware Lab. It may be in various stages of disrepair, but that blog will come under new custodianship first.




The Real Challenge of Achieving and Sustaining Interoperability

I recently encountered a new blog by Moritz Berger of Microsoft.  Moritz’s first two posts reveal the amount of goo and disruption there is beneath the surface of any attempt to collaborate when using standard office-productivity formats among different desktop software products.

I take Moritz’s posts as anecdotal demonstrations of the challenges that we face in attempting interchange and collaboration on documents using different products that claim support of the same standard.  What we expect to be matter-of-fact and effortless is simply not so.  The thornier problems of converting among formats place easy-going collaboration even farther out of reach. 

In that spirit, I recommend Moritz’s latest offering: How To Enhance (Or Destroy) Collaboration.

I am not advocating abandonment of open standards development.  But it is easy to see how there is a natural tendency for communities of adopters to settle on specific products that they all agree to use in order to avoid pitfalls and surprises beyond their control.

If the promise of product substitution and effective multi-platform support is to be made real, we have to become much better at knowing what it means when a product supports a format, how that support changes with new releases, and how this all sorts out when we throw different product implementations into the mix.  We also need some way of actually knowing and safeguarding against use of product functionalities for which interoperability with other products/versions is at best haphazard when not simply hopeless.

It looks like an uphill struggle.  We must do better.


Let’s Try This for a While

Here is my effort to find a smoother appearance for this blog.

After practicing with this for a while and making any further adjustments, I will republish the entire blog in this format for consistency and to have the Creative Commons notice appear on all pages of the blog.

[update 2010-04-16T04:00Z: I should point out that previous posts that discuss the format of the blog are in reference to a format that no longer exists on the main page and new posts.  While there remain some color-scheme issues with the updates just completed, that’s nothing like the earlier problems.  I will republish the entire blog once there is calm.  That will make earlier posts about the blog format even more peculiar.  I’m looking into ways to deal with that.]



Hooptedoodle: Blog Aversion and Standards Ignorance

This is a maintenance post.  I am twiddling with the template for this blog to see if I can like the result enough to start writing some pent up posts that I have been holding back.

The holding back is, apart from my usual procrastination, because I don’t like the design of this blog page.  I just don’t like it.

[Update 2010-02-03T00:44Z: It seems that Blogger is solving this problem for me.  My love-hate relationship with Google Blogger is going to end with my long-overdue disintermediation from Blogger and graduation to self-publishing as well as the current self-hosting on my own domains and hosting-service arrangements.]

The New Is the Enemy of the What Already Works?

When I started the blog, I decided to use one of the newer ready-made Blogger templates that is all CSS’s and prettified and presumably standards-compliant in some elevated way.  My original blog and its kin have templates from back in the day when HTML tables ruled.  I understand those templates pretty well.  For this blog, I thought I’d modern up.

As I grew to despise the new layout, I turned to my usual solution: hand tweaking the template, something that can be done by code-and-fix clueless manipulation of the template:

  • I keep a copy of the progressive changes to the original Blogger-furnished template under source control and I can always revert to an earlier version if I mess up really badly.  There’s even a backup on the site, though it usually lags behind what I have locally and what Blogger is using at the moment.
  • Blogger allows preview of a new template without changing the still-in-effect template.

Both of these arrangements allow me to muck about without too much risk of completely cratering the blog.

So far, so good, right?

Maybe not. 

At Sea In More “Standard” Than I Need

I haven’t figured out how to tweak the CSS and get the result I want.  And I don’t know when modifications I make might derail the bits and pieces that Blogger automatically inserts into these pages, following the guidance of specially-coded division classes and magical HTML elements with names like <$BlogDateHeaderDate$>.

I also don’t have the experience to discern whether the original CSS is very good and what the mound of CSS declarations in the <head> element of every page is required for.  I would like to discard everything not actually being used and then simplify what is left.  I’m not sure how to do that safely.  And I don’t want to make a career out of CSS-crafting, either.  I just want my blog pages to work.
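One cautious way to approach the "discard everything not actually being used" step is to list the class names that the stylesheet declares but that never show up in a published page.  What follows is only a rough sketch, not a CSS parser: the function names are made up for illustration, it looks at class selectors only (not ids or element selectors), and it assumes the check is run against the HTML that Blogger actually publishes rather than the raw template, since the template tags may expand into markup that uses classes the template text never mentions.

```python
import re
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Collects every class name found in a class="..." attribute."""

    def __init__(self):
        super().__init__()
        self.used = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                self.used.update(value.split())


def declared_classes(css_text):
    """Return class names that appear in selectors (the text before each '{')."""
    css_text = re.sub(r"/\*.*?\*/", "", css_text, flags=re.S)  # drop comments
    names = set()
    for selector in re.findall(r"([^{}]+)\{", css_text):
        names.update(re.findall(r"\.(-?[A-Za-z_][\w-]*)", selector))
    return names


def unused_classes(css_text, html_text):
    """Class names declared in the CSS but never used in the HTML."""
    collector = ClassCollector()
    collector.feed(html_text)
    return sorted(declared_classes(css_text) - collector.used)


# Hypothetical sample input, not taken from the real template.
css = """
.post { color: #333; }
.sidebar h2 { margin: 0; }
.unused-box { border: 1px solid; }
"""
html = '<div class="post"><h2 class="sidebar title">Recent Items</h2></div>'
print(unused_classes(css, html))
```

A name reported here is only a candidate for removal.  It would have to be checked against every kind of page the blog produces (main page, archive pages, individual post pages) before the rule is deleted, and the source-controlled copy of the template remains the safety net.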

So there is the wonderful preferred “standard” for correctness in web-page operation.   But I can’t decode it enough to make my simple page layout work.

Not Backsliding Just Yet

If all else fails, I will bring over one of my old templates and turn it into one for this blog, rather than attempt to achieve my goal by hacking and hewing on the current CSS-purified design.

Unfortunately, that makes things work with, shudder, the dreaded and feared <table> elements. 

I’m not ready to do that, because one difference in the current format is that it appears to be mobile-ready.  Now, I don’t care all that much whether you can read this post on your telephone.  Still, why lose it if I’ve got it?

A greater concern is the still missing support for accessibility.  While someone may claim that giving up tables for CSS is good for accessibility, it doesn’t actually do anything for accessibility of this site.

I will keep mucking about and we will see where things end up.  It is not promising.  I’m not likely to dig out the CSS1/2 specifications to see how this all really works.  If a little trial-and-error doesn’t cut it, I’ll just struggle along anyhow.

The Old Dog’s Old Trick

I didn’t mind learning HTML.  I didn’t mind learning enough of HTML 4.01 transitional to get along.  Why am I avoiding the latest and greatest or even the recent and still breathing approaches?

I think the difference is that there is no novelty any longer, after acquiring what I needed that was good enough at the time.  What’s next is simply different but, for what I do, not noticeably better or more interesting.  The old dog doesn’t want the new shiny thing because the old shiny thing was working just fine.  It’s not a new trick, it’s a different trick, and novel only for those for whom it is their first trick.

And I haven’t given up just yet.  Not in a rush about it either.

Meanwhile, this is a test post to exercise the blog template du jour.




ODF Implementation-Support Toolkits and Libraries

I have no appraisal of the relative maturity and quality of the various toolkits that are emerging on the ODF scene (and likewise with regard to OOXML).  However, it is important to have a cataloging of what there is.  This is a random start.  I will add to this post and build an nfoWorks catalog page later:

  • lpOD: languages & platforms OpenDocument Project (also Français). 
    Definition of a Free Software API implementing the ISO/IEC 26300 standard.
    Development, for higher level use cases, in Python, Perl and Ruby languages,
    of a top-down oriented API.  Licensing is under Free Software Foundation (FSF) versions.

My interest

An important resource for ways to harmonize document formats involves attention to the libraries and models employed for constructing document-centric software and their applications.  This applies for the development of testing and conformance tools as well as for implementation of format-supporting software products.  Indeed, one might reasonably expect that such tools would be a companion demonstration of implementation-support quality.

In the interesting case of OpenDocument Format, the availability of open-source code bases for implementations is both a risk (in that deviations or omissions in support for the standards are perpetuated through code mimicry) and an opportunity for faster tooling and testing.  Of course, closed-source implementations (and related toolkits) have their own dangers in this regard, while denying public inspection of the code.  I suspect that implementation notes are required in all cases to ensure understanding of intentions and interpretations as well as limitations and the different ways that discretionary matters are handled.

For ODF, the continuing work on toolkits and on independent open-source implementations is providing important diversity.  This can inform the search for a harmonious profile and perhaps suggest adaptations that encourage harmonious implementations.  Diversity across platforms and programming models may also help in the recognition and abstraction of essentials away from implementation incidentals.  That can also be valuable in ensuring that harmonization is on essentials and not accidents of implementation.

I will be reviewing available toolkits, libraries, and APIs as I define my own around interface contracts for abstracted levels of document models and processing support.  I expect some cross-fertilization while adhering to a model that is concentrated on harmony.




Adding Pursuing Harmony to Technorati



Just a little housekeeping.  This is a little secret message between me and technorati. 




Office Shots for Confirmed ODF Interchange Fidelity

The new service received a fair amount of attention at the recent ODF Interoperability Plugfest.  Taking a page from the “test your site with all browsers” tools that are available, Office Shots will take an uploaded ODF document and show how it renders in different ODF-supporting products.  To deal with the problem of confirming appearance of the document back to the submitter, the rendering by each application is captured in PDF.

This is a fledgling service, currently in limited beta.  It is sponsored by the same Dutch organizations that sponsored the ODF Plugfest.

The power of the service is its user-relevant confirmation of the fidelity with which a document of interest is rendered by different ODF-supporting software/platform combinations.  It is an easy way for evaluators to verify whether their important documents are rendered successfully in interchange among ODF products.  It also allows the subjective determination of success to be left in the hands of the users who know what qualifies as acceptable fidelity in each particular case.

One of the most-difficult situations in interchange of documents is when the receiver is seeing something materially different than what the sender (1) had in mind and (2) expects has been communicated.  For the parties to communicate about a suspected difficulty, they need to use a “channel” that differs from the one that has apparently failed.  Screen shots serve that purpose.  PDF is also valuable in the case where a PDF can be extracted that accurately-enough reflects what is intended and/or what is being seen.

Office Shots provides a way to check proactively, either because a problem is suspected with a local rendition or to ensure that a document and the choice of implementation-supported features are treated consistently by a variety of other implementations/platforms.

One can imagine that, over time, we could see Office Shots support links for troubleshooting specific discrepancies, finding practices for avoiding many of them, and easy reporting of problems to development teams.

Office Shots promises to provide a terrific reality-based approach to confirming the interoperability of ODF implementations as far as presentation fidelity is concerned.  This is also a first-line check on confirming difficulties with round-trip inter-product fidelity preservation.  (Of course, if the goal is solely presentation fidelity, PDF and other final-form formats may be preferable, especially when long-term preservation is also a consideration.)

I look forward to the impetus that Office Shots will provide to user recognition of practical ODF interoperability considerations.  I also think it will provide important stimulus and confirmation for developers who want to improve the interoperable use of their ODF-supporting software.

Besides the site, there are other discussions of the project and its potential:

  • Glyn Moody: ODF and the Art of Interoperability.  Open Enterprise (blog), ComputerworldUK, 2009-06-19.
  • Sander Marechal: Easily testing ODF compatibility (odp, pdf).  Presentation to the ODF Plugfest, 2009-06-15.  [In this case, the PDF renders more poorly than the ODP on my computer.  I assume the problem is in the production of the PDF via the ODP implementation, yet another Officeshots interoperability case.]
  • Sander Marechal: Product submission, 2009-02-06.


Construction Structure (Hard Hat Area)
You are navigating nfoWorks.
This work is licensed under a
Creative Commons Attribution 2.5 License.

template created 2008-08-13-18:06 -0700 (pdt)
$$Author: Orcmid $
$$Date: 13-11-11 19:13 $
$$Revision: 5 $