2008-08-18
Interoperable ODF: Finding Ground Truth
Jesper Lund Stocholm has found his files from the Microsoft Document Interoperability Initiative ODF Workshop. His post, "DII ODF Workshop - the good stuff", shares the nitty-gritty on-the-ground experience of transferring ODF documents from OpenOffice.org to Microsoft's pre-beta Office 2007 SP2 implementation and back again. There's a download of eleven test files, each in two forms, along with PDFs of how they render: an OpenOffice.org version of each document, and the Microsoft Office 2007 SP2 pre-beta ODF saving of the same document. This is enough to discern how the two applications handle application-specific features from other applications and express application-specific features of their own.
There are some great lessons becoming available with regard to interoperable use of document formats. Here's what I see in terms of the Microsoft Office and OpenOffice.org implementations of ODF:
- Being standard is not the same as being interoperable.
Lund Stocholm points out, "The result of the validation is that all files generated by Microsoft Office 2007 SP2 are valid ODF 1.1-files." The validation is essentially syntactical, and a syntactic check cannot deal with all of the tolerated implementation variability, the semantic bugs, and the need for out-of-band agreements where the specification is (purposely and perhaps valuably) left wishy-washy.
- There's a tremendous amount of binary information packaged in OO.o 2.4 and Office 2007 ODF document implementations.
This information is carried in outside-of-ODF namespaces and MIME types for which there is no mutual agreement. This can be reconciled among the different implementations, and we might expect more harmony before Office 2007 SP2 ships, assuming there are no intellectual-property difficulties not covered by existing non-assertion covenants. This is a tricky area with socio-political and competition-law ramifications (illustrated by how no one seems to be bothered by the amount of binary material used in OO.o's implementation of ODF).
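One way to see this for yourself is to read the package manifest. An ODF document is a ZIP package whose META-INF/manifest.xml lists each entry with a media type. The following is only a rough sketch, not a conformance tool: the function names are mine, the rule for what counts as "non-XML" is an assumption, and the demo package (including its made-up media type) is fabricated for illustration rather than taken from the workshop files.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the ODF package manifest.
MANIFEST_NS = "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"


def non_xml_entries(odf_bytes):
    """Return (path, media_type) pairs for entries not typed as XML or ODF.

    Assumption: anything other than text/xml or an
    application/vnd.oasis.opendocument.* type is worth a closer look.
    """
    with zipfile.ZipFile(io.BytesIO(odf_bytes)) as package:
        manifest = ET.fromstring(package.read("META-INF/manifest.xml"))
    flagged = []
    for entry in manifest.iter(f"{{{MANIFEST_NS}}}file-entry"):
        path = entry.get(f"{{{MANIFEST_NS}}}full-path") or ""
        media_type = entry.get(f"{{{MANIFEST_NS}}}media-type") or ""
        if path.endswith("/"):
            continue  # directory entries, including the package root "/"
        if media_type == "text/xml":
            continue
        if media_type.startswith("application/vnd.oasis.opendocument."):
            continue
        flagged.append((path, media_type))
    return flagged


def build_demo_package():
    """Build a tiny, fabricated ODF-like package in memory (illustration only)."""
    manifest = (
        '<manifest:manifest xmlns:manifest="%s">'
        '<manifest:file-entry manifest:full-path="/"'
        ' manifest:media-type="application/vnd.oasis.opendocument.text"/>'
        '<manifest:file-entry manifest:full-path="content.xml"'
        ' manifest:media-type="text/xml"/>'
        '<manifest:file-entry manifest:full-path="Object 1"'
        ' manifest:media-type="application/x-hypothetical-binary"/>'
        "</manifest:manifest>" % MANIFEST_NS
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("META-INF/manifest.xml", manifest)
        package.writestr("content.xml", "<placeholder/>")
        package.writestr("Object 1", b"\x00\x01")
    return buffer.getvalue()


if __name__ == "__main__":
    for path, media_type in non_xml_entries(build_demo_package()):
        print(path, media_type)
```

Running the same function over the two forms of each test file should give a quick inventory of what each application packages alongside the ODF XML.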
- ODF-specification versioning is going to bother us for years, if not forever.
Version churn is going to be a serious problem until those able to insist on demonstrable interoperability among applications compel some rational process for dealing with specification and implementation incompatibilities and defects. The stakes are now raised for achieving useful up- and down-level accommodation of specification and (deviating but widespread) implementation versions. Although I can see no way the ODF spreadsheet-formula problem could have been avoided, in particular, we must face two painful situations:
- XML namespaces for ODF are not dealt with as contracted interfaces with explicit discrimination of additions and changes between versions of the specifications.
- Requiring private agreement on spreadsheet formulas through at least ODF 1.1 is going to force dealing with at least three versions in the future, something like
- a Microsoft Excel formula namespace (better: an ECMA-376 or IS 29500 one),
xmlns:msoxl="http://schemas.microsoft.com/office/excel/formula"
- an OpenOffice.org formula namespace,
xmlns:oooc="http://openoffice.org/2004/calc"
- the default ODF OpenFormula namespace when finally introduced into ODF
- versions of the above with their individual defects and incompatible implementations
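To make the formula-namespace situation concrete, here is a rough sketch of how a consumer might tally which formula dialects a spreadsheet's content.xml uses. It assumes the common convention that a table:formula value starts with a namespace prefix followed by ":=", and it assumes prefixes are not redeclared with a different meaning deeper in the document. The function name and the sample XML are mine, made up for illustration; only the two namespace URIs come from the list above.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter

TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"
FORMULA_ATTR = f"{{{TABLE_NS}}}formula"


def formula_namespaces(content_xml):
    """Count table:formula values by the namespace URI of their prefix.

    Assumption: prefixes are declared once and never rebound, so a single
    document-wide prefix map is good enough for this sketch.
    """
    prefixes = {}
    counts = Counter()
    events = ET.iterparse(io.BytesIO(content_xml), events=("start-ns", "start"))
    for event, payload in events:
        if event == "start-ns":
            prefix, uri = payload
            prefixes[prefix] = uri
            continue
        formula = payload.get(FORMULA_ATTR)
        if formula is None:
            continue
        head, sep, rest = formula.partition(":")
        if sep and rest.startswith("="):
            counts[prefixes.get(head, "(undeclared prefix %s)" % head)] += 1
        else:
            counts["(no prefix)"] += 1
    return counts


# Fabricated sample, not taken from the workshop test files.
SAMPLE = b"""<office:document-content
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0"
    xmlns:oooc="http://openoffice.org/2004/calc"
    xmlns:msoxl="http://schemas.microsoft.com/office/excel/formula">
  <table:table-cell table:formula="oooc:=SUM([.A1:.A3])"/>
  <table:table-cell table:formula="msoxl:=SUM(A1:A3)"/>
  <table:table-cell table:formula="=A1+A2"/>
</office:document-content>"""

if __name__ == "__main__":
    for uri, count in formula_namespaces(SAMPLE).items():
        print(count, uri)
```

Knowing which namespace a formula claims is only the first step. Each namespace still implies its own grammar, function set, and defects, which is exactly the private-agreement burden described above.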
- It's the application [stupid?]
People don't deal with formats and the nuances of format versions, allowed options, and private agreements. People deal with software and the quality (and fidelity) of the electronic document that the software provides. Expecting individual users to be self-consciously attentive to limitations on conformance and interoperability is even more hopeless than demanding meticulous adherence to security policies and practices in ordinary office work. What people do want is for their interoperability case (however articulated) to just work. In reality, even "Save as ..." is asking too much.
- The first part of this lesson is going to involve recognition of the degree to which end-users are going to address interoperability by choosing specific software and believing interoperability is achieved, the ever-popular solution.
- The second part of this lesson is recognition of the distance between the current state and one with broader interoperability and confident substitution of alternative software choices. The differences among major ODF implementations will reveal how easy it is to lose interoperability while conforming to the current specifications.
- Ultimately, we may have to accept that we are unwilling to pay the price for significant interoperability assurance except under extraordinary circumstances. The "cost of interoperability" debate is ahead of us.
I don't foresee the Harmony Principles alleviating this situation in any way. At best, I expect them to help us appreciate the cost of interoperability and its improvement over time.
Labels: interoperability, ODF, versioning