tools for document interoperability
This is the web diary for nfoWorks and realization of the Harmony Principles. Pursuing Harmony tracks nfoWorks research, analysis, specification, and implementation of tools for document interoperability. There is commentary on related activities that address conformance, interoperability, and harmonization of document formats.
Technorati Tags: open source, open formats, interoperability, open development, OpenDocument Format, ODF-OOXML Harmonization, ODF
I have no appraisal of the relative maturity and quality of the various toolkits that are emerging on the ODF scene (and likewise with regard to OOXML). However, it is important to have a cataloging of what there is. This is a random start. I will add to this post and build an nfoWorks catalog page later:
An important part of finding ways to harmonize document formats is attention to the libraries and models employed for constructing document-centric software and their applications. This applies to the development of testing and conformance tools as well as to the implementation of format-supporting software products. Indeed, one might reasonably expect that such tools would be a companion demonstration of implementation-support quality.
In the interesting case of OpenDocument Format, the availability of open-source code bases for implementations is both a risk (in that deviations or omissions in support for the standards
For ODF, the continuing work on toolkits and on independent open-source implementations is providing important diversity. This can inform the search for a harmonious profile and perhaps suggest adaptations that encourage harmonious implementations. Diversity across platforms and programming models may also help in the recognition and abstraction of essentials away from implementation incidentals. That can also be valuable in ensuring that harmonization is on essentials and not accidents of implementation.
I will be reviewing available toolkits, libraries, and APIs as I define my own around interface contracts for abstracted levels of document models and processing support. I expect some cross-fertilization while adhering to a model that is concentrated on harmony.
Technorati Tags: Officeshots.org, ODF, interoperability, document interchange, document fidelity, confirmable experience
The new Officeshots.org service received a fair amount of attention at the recent ODF Interoperability Plugfest. Taking a page from the “test your site with all browsers” tools that are available, Office Shots will take an uploaded ODF document and show how it renders in different ODF-supporting products. To deal with the problem of confirming appearance of the document back to the submitter, the rendering by each application is captured in PDF.
This is a fledgling service, currently in limited beta. It is sponsored by the same Dutch organizations that sponsored the ODF Plugfest.
The power of the service is its user-relevant confirmation of the fidelity with which a document of interest is rendered by different ODF-supporting software/platform combinations. It is an easy way for evaluators to verify whether their important documents are rendered successfully in interchange among ODF products. It also allows the subjective determination of success to be left in the hands of the users who know what qualifies as acceptable fidelity in each particular case.
One of the most-difficult situations in interchange of documents is when the receiver is seeing something materially different than what the sender (1) had in mind and (2) expects has been communicated. For the parties to communicate about a suspected difficulty, they need to use a “channel” that differs from the one that has apparently failed. Screen shots serve that purpose. PDF is also valuable in the case where a PDF can be extracted that accurately-enough reflects what is intended and/or what is being seen.
Office Shots provides a way to check proactively, either because a problem is suspected with a local rendition or to ensure that a document and its choice of implementation-supported features are treated consistently by a variety of other implementations and platforms.
One can imagine that, over time, we could see Office Shots support links for troubleshooting specific discrepancies, finding practices for avoiding many of them, and easy reporting of problems to development teams.
Office Shots promises to provide a terrific reality-based approach to confirming the interoperability of ODF implementations as far as presentation fidelity is concerned. This is also a first-line check on confirming difficulties with round-trip inter-product fidelity preservation. (Of course, if the goal is solely presentation fidelity, PDF and other final-form formats may be preferable, especially when long-term preservation is also a consideration.)
I look forward to the impetus that Office Shots will provide to user recognition of practical ODF interoperability considerations. I also think it will provide important stimulus and confirmation for developers who want to improve the interoperable use of their ODF-supporting software.
Besides the Officeshots.org site, there are other discussions of the project and its potential:
Technorati Tags: ODF-OOXML Harmonization, OpenDocument Format, interoperability, conformance, testing, validation, verification, open standards
It is sponsored by a neutral (ODF-supporting) organization. It is attended by major implementers of ODF-supporting products, including IBM, Microsoft, and Sun Microsystems.
In short, all of the right people are in the same room, some for the first time, and I am so envious that I am not among them. There should be a great deal of creative tension.
I will be watching for materials and progress reports. There is already Doug Mahugh’s useful pre-event post on how Microsoft tested the ODF implementation in Office 2007 SP2 to ensure that it only produced standard-conforming documents and failed in ways that did not introduce security exploits against the Office System or documents of its users.
I have been meaning to post more about my involvement with ODF and how it is fueled by my interest in the harmonious level at which we can start and expand interoperability based around standard, open formats for office-productivity applications. I will do that separately. For now, I just want to register my excitement for the positive stage that participation at this meeting represents.
[Update 2009-06-16-18:56Z There are little odds and ends available from the ODF Plugfest so far, and I will compile some links here for safe-keeping. I am sure there will be additional blog posts and reports by more attendees after they have had some time for reflection]
[Update 2009-06-17-17:11Z with a few more straggling in]
[Update 2009-06-18-17:51Z as other posts show up]
[Update 2009-06-23-14:55Z with some stragglers]
[Update 2009-06-24-18:55Z and one more interesting appraisal]
[Update 2009-06-27-21:40Z and the hits keep on coming …]
[Update 2009-07-01-15:25Z wrapping up, with anything more on plugfests in future posts]
Here are some apple-orange notions that have come to my attention in an oddly-convergent way.
New OASIS Technical Committee IPR Mode
OASIS has just announced the pending addition of a 4th IPR Mode to the set that technical committees can use as the way intellectual property (mainly essential claims of patents) will be made available to adopters of a TC-produced specification:
The ODF TC operates under the RF on Limited Terms Mode, the most-generous mode available until now. As stated under the OASIS IPR Policy, a TC may not change its IPR Mode without closing and submitting a new charter. I don’t expect such a shut-down and restart to happen, especially before ODF 1.2 becomes a ratified OASIS Standard.
Many will welcome this new mode. I know that my willingness to participate in OASIS Technical Committee activities increases exponentially as we move down the list. The RF on Limited Terms and the new Non-Assertion modes are the only ones that I have no hesitation about.
The Non-Assertion Mode is comparable to everyone obligated by the IPR mode having automatically made an equivalent of the Microsoft Open-Specification Promise with regard to the specifications produced by the TC during their participation.
Of course contributors, participants, and anyone else can provide non-assertion covenants with regard to any specification, as Sun Microsystems did for ODF in September, 2005.
Implementation License Models and Interoperability
The licenses under OASIS IPR modes apply to implementations of the applicable specifications, such as ODF.
I have recently been dealing with provisions of the ODF specification that do not seem to be understandable on their own, not even by consulting referenced source materials. In that case, there is no way to ensure interoperability without consulting an implementation or two. In complex cases (such as figuring out how to decrypt an ODF document that is encrypted using the approach sketched in the ODF specification), it is actually necessary to inspect code to determine what the missing but essential details might be. (It would be better to find implementation descriptions that explain how the specification is being satisfied, but too often the code is the only reliable implementation description.)
When the code is available in an open-source implementation, it may be possible to reverse-engineer an implementation-independent interoperable interpretation. That is what I would look for, assuming that I could master such code well enough to resolve questions the specification leaves open.
Consulting code works for detective work around clarification and hole-filling of the specification. If I want to make an implementation based on that interpretation, I must be especially careful about the license on that code. For example, LGPL and GPL code and other reciprocal-license open-source software is not useful to me in producing software under a license that I prefer (Open BSD, Apache, etc.). I am cautious about digging around in voluminous code anyhow, but I am particularly wary about risking that I might copy GPL code.
In this case, I am reluctant to rely too strongly on an abstracted interpretation unless the specification itself is updated and issued with an interpretation I can then safely rely on.
In effect, specifications that are sufficient for implementation-independent achievement of interoperability, along with royalty-free licenses or covenants, provide the ultimate clean-room support for achievement of unencumbered independent implementations.
That’s what I’m after.
Technorati Tags: Jesper Lund Stocholm, ODF, OpenDocument Format, Microsoft Office 2007, Document Interoperability Initiative
Jesper Lund Stocholm has found his files from the Microsoft Document Interoperability Initiative ODF Workshop. His post, "DII ODF Workshop - the good stuff", shares the nitty-gritty on-the-ground experience of transferring ODF documents from OpenOffice.org to Microsoft's pre-beta Office 2007 SP2 implementation and back again. There's a download of eleven test files, each in two forms, along with PDFs of how they render. There's an OpenOffice.org version of each document. Then there's the Microsoft Office 2007 SP2 pre-beta ODF saving of the same document. This is enough to discern how the two applications handle application-specific features from other applications and express application-specific features of their own.
There are some great lessons becoming available with regard to interoperable use of document formats. Here's what I see in terms of the Microsoft Office and OpenOffice.org implementations of ODF:
I don't foresee the Harmony Principles alleviating this situation in any way. At best, I expect it to help us appreciate the cost of interoperability and its improvement over time.
created 2008-08-13-18:06 -0700 (pdt)