Welcome to the Hunter-Gatherer stage of nfoWorks,
where the necessary resources are identified and collected.
This is a prelude to organization of software tools, examining
documents from available implementations, and creating test
utilities.
We're also announcing a document
interoperability initiative to ensure that the
documents that are created by users are fully
exchangeable, regardless of the tools that they are
using.
--
Bob Muglia, Senior Vice President, Server and
Tools Business,
Microsoft Corporation, February 21, 2008 [1]
nfoWorks explores just how well documents can
be made fully exchangeable when using a mix of different
OpenDocument and Office Open XML implementations.
The first question: What prerequisites and constraints must be
satisfied to ensure that documents are fully exchangeable and
users can be confident that this is the case?
The next question: Is there enough harmonization for
users to willingly create, collaborate, and preserve their work
using only harmonious features of documents?
1. Harmony Principles
(0.1 beta)
First sketched on February 7, 2008 [2],
the Harmony Principles govern software tools and products that
accept harmonious versions of standard document formats and
faithfully produce harmonious versions in any of those formats.
1.1 Conditions of Satisfaction
1.1.1 Users can be confident their creations honor the
format standards and depend only on their harmonious features.
The Harmony Principles are honored by default.
1.1.2 Users can be confident that documents confined to a particular
profile of harmonious features can be interchanged among, and
processed by, any software programs that honor the Harmony Principles
for the class.
1.1.3 In the event that a document relies on features beyond
the harmonious level supported by a software product,
profile-allowed limitation to supported features is explicit,
automatic, and user-understandable.
1.2 The Principles
1.2.1 Interoperability Classes: Applicability. The
Harmony Principles apply where there are standard formats intended
to carry electronic-document information of essentially the same
nature and having compatible needs for fidelity. These are
natural categories for interoperability.
For example, the word processing documents of
Office Open XML format (OOXML) are in an interoperability class with the
text documents of OpenDocument format (ODF). One would not
expect to find Rich Text Format (RTF, a format under a private
authority) or SGML (an open standard for a different approach to
documents) in this class. It is conceivable that there may be
future additions to a class (e.g., DocBook, a specific type of
SGML/XML document that might be amenable to harmonization profiling
and translation) or, more likely, provision of import and export
translators to formats outside of the class. Spreadsheet documents would be in a different interoperability class.
1.2.2 Standards Compliant. Only open-standard formats are supported
in an interoperability class.
1.2.3 Always Harmonious. No features of a
standard format are supported that are not perfectly and
recognizably represented in all of the formats of the class.
Harmonious features are accurately expressible in any of the
formats. Unsupported features encountered in input documents
are suppressed in a graceful way; an understandable account is
provided.
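As a rough illustration of 1.2.3, the harmonious features of a class can be pictured as the intersection of what each format in the class can represent, with everything else suppressed and reported. The following is a minimal sketch only: the feature names and the two inventories are hypothetical placeholders, not taken from the ODF or OOXML specifications, and real harmonization would need far finer-grained criteria than set membership.

```python
# Hypothetical feature inventories; the names are illustrative only and are
# not taken from the ODF or OOXML specifications.
CLASS_FEATURES = {
    "ODF text": {"paragraph", "bold", "italic", "table", "change-tracking"},
    "OOXML wordprocessing": {"paragraph", "bold", "italic", "table", "custom-xml"},
}


def harmonious_features(class_features):
    """Return the features that every format in the class can represent."""
    feature_sets = list(class_features.values())
    if not feature_sets:
        return set()
    return set.intersection(*feature_sets)


def suppress_unsupported(document_features, class_features):
    """Split a document's features into kept ones and an account of dropped ones."""
    allowed = harmonious_features(class_features)
    kept = sorted(f for f in document_features if f in allowed)
    dropped = sorted(f for f in document_features if f not in allowed)
    account = [
        f"Feature '{f}' is not harmonious in this class and was suppressed."
        for f in dropped
    ]
    return kept, account


kept, account = suppress_unsupported({"paragraph", "bold", "custom-xml"}, CLASS_FEATURES)
print(kept)
for line in account:
    print(line)
```

The account is returned as plain sentences so that it could be shown to a user, in keeping with the requirement that suppression be understandable.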
1.2.4 Specifically Interoperable. Standards and their harmonious features evolve; so do individual software
implementations. Programs and their users can establish
profiles that limit the (versions of) harmonious features relied on
in particular documents. Programs will conform their
employment of features to the requirements of such profiles.
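One way to picture 1.2.4 is to treat a profile as a named subset of the harmonious features and to check documents against it. This is a sketch under stated assumptions: the `Profile` type, the feature names, and both helper functions are hypothetical and are not part of any nfoWorks deliverable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """A hypothetical profile: a named subset of the harmonious features."""
    name: str
    features: frozenset


def validate_profile(profile, harmonious):
    """A profile may only rely on features that are harmonious in the class."""
    extras = profile.features - harmonious
    if extras:
        raise ValueError(
            f"Profile {profile.name!r} uses non-harmonious features: {sorted(extras)}"
        )


def violations(document_features, profile):
    """Return the document features that fall outside the profile, sorted."""
    return sorted(set(document_features) - profile.features)


# Illustrative feature names only.
HARMONIOUS = frozenset({"paragraph", "bold", "italic", "table"})
basic = Profile("basic-text", frozenset({"paragraph", "bold", "italic"}))

validate_profile(basic, HARMONIOUS)
print(violations({"paragraph", "table"}, basic))  # features the profile excludes
```

A program honoring the profile would either avoid producing the listed features or degrade them gracefully, as 1.1.3 requires.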
1.3 Becoming Definite
These principles are abstract and indefinite in many ways.
Further development will provide definite characterizations,
especially for the various qualities to be satisfied by harmonious
documents.
1.3.1 Identification of Interoperability Classes. The
initial interoperability classes will be more-sharply defined.
1.3.2 Identification of Open-Standard Formats.
The qualification of public, open standards will need to be
tightened. The preference is for free availability of an ISO
International Standard specification (including ISO/IEC ones).
Other cases will be by exception. Specific criteria will be
required.
1.3.3 Specified, Measurable Fidelity and Interoperability.
There are different kinds and degrees of faithfulness and
suitability in a given context. The notions of fidelity and
harmonization will need to be made definite. The simply-stated
goal is exact matching of content, interpretation, and presentation.
Context, qualifications, and explicit measures are required.
1.3.4 Profile Verification.
There must be definite ways to verify the conformance of an
electronic document to the standard for its format and also to a
prescribed profile of harmonious features.
1.3.5 Degradation of
Excluded Features. Graceful degradation of excluded
features must be defined, accounting for how users are to understand
and to influence what happens.
1.3.6 How Much Harmony Is Enough?
We do not know what the threshold (or thresholds) might be under
important usage conditions. Exploration starts
with seeing how few harmonious features are enough to be useful for
anything,
expanding until there's an acceptable minimal set for one or
more categories of usage. The value
of additional levels needs to be assessed and the levels
well-defined. The impact of look and feel and user experience
will need to be understood and addressed if it is found to be a
critical factor. There are other external conditions that may
also have to be considered.
2. Deliverables
All deliverables for nfoWorks are provided under
open-source license and made freely available for download and use.
Source code and related development materials will be published and
tracked on a public open-source development site. The initial
research activity will identify the opportunities to reuse and
contribute to existing work. Unique nfoWorks
deliverables will address the specific achievement of the Harmony
Principles.
2.1 Software Libraries, Utilities, Tests, and Reference Implementations
2.1.1 Existing Freely-Available Materials. Existing
materials will be identified, along with guidance for obtaining them
from authoritative sources. Some materials will be mirrored on
the nfoWare site as a convenience and for reference.
It is not intended that nfoWare provide general
redistribution of existing material.
2.1.2 Building Libraries. Libraries will be founded on
native code (in C and C++ source code) that can be ported between
different operating-system and hardware platforms. The Library
APIs and interfaces will be amenable to integration into
higher-level libraries that deliver frameworks for use in Java,
.NET, Python and other programming and integration models that
support native-layer "interop." All libraries will be
constructed with and useable with freely-available tools and
compiler systems.
2.1.3 Utilities, Test Programs, and Tests. Primarily
designed for command-line and batch operation, utilities and test
programs will be developed in appropriate higher-level languages
when possible. Test data, sample documents, and stress cases
will be provided as they are developed.
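A small utility of the kind described in 2.1.3 might begin by identifying which package family a document belongs to. The sketch below relies on two packaging conventions: ODF packages normally carry a `mimetype` entry whose content begins with `application/vnd.oasis.opendocument.`, and OOXML packages contain a `[Content_Types].xml` entry. The function names and return labels are hypothetical, and this is a quick heuristic, not a conformance check.

```python
import io
import zipfile

ODF_MIMETYPE_PREFIX = "application/vnd.oasis.opendocument."


def identify_package(data):
    """Guess the package family from ZIP contents: "odf", "ooxml", or "unknown"."""
    if not zipfile.is_zipfile(io.BytesIO(data)):
        return "unknown"
    with zipfile.ZipFile(io.BytesIO(data)) as package:
        names = set(package.namelist())
        if "mimetype" in names:
            mimetype = package.read("mimetype").decode("ascii", "replace").strip()
            if mimetype.startswith(ODF_MIMETYPE_PREFIX):
                return "odf"
        if "[Content_Types].xml" in names:
            return "ooxml"
    return "unknown"


def make_package(entries):
    """Build an in-memory ZIP from a {name: bytes} mapping (for demonstration)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        for name, content in entries.items():
            package.writestr(name, content)
    return buffer.getvalue()


sample = make_package({"mimetype": b"application/vnd.oasis.opendocument.text"})
print(identify_package(sample))
```

For batch operation, the same function could be applied to each file named on the command line, with one result printed per file.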
2.1.4 Samples and Reference Implementations. Samples
and skeletal applications will be developed and provided to
demonstrate use and employment of the Harmony Principles and
nfoWorks libraries. A reference implementation for a
document-processing desktop application may be considered.
Performance, usability, and general fit and finish are not required
to be at a level of quality or support required of end-user
productivity software.
2.1.5 Reusability, Not Product, Ambitions. The
ambition for nfoWorks is to provide libraries and
utilities of a quality level that has them be valuable and
attractive for use in closed-source and open-source products where
realization of Harmony Principles is important. The
nfoWorks deliverables are meant to be a source of
consistency for document interoperability. It is more
important to encourage and support adoption in products than
to engage in direct product delivery. Community adoption and
participation are preferred.
2.2 Analysis and Guidance
2.2.1 Determination of Harmonization. An important
output of nfoWorks activity is documented analysis of
the selected specifications and the way that features are partially
or entirely harmonizable or excluded from harmonization (whether as
a temporary expedient or until standards-development provides better
resolution).
2.2.2 Harmonization Guidance. The nfoWorks
experience will lead to guidance on how to safely navigate the
feature sets of different standards and their major implementations.
The analysis and guidance will be used in recommendations concerning
profile agreements and in suggestions to standards-development
projects conducting maintenance on standards.
2.2.3 Harmonization of Harmonization. Wherever
possible, nfoWorks will be aligned with other harmonization
activities. Producing collaborative results, whether at
nfoWorks or at another readily-accessible location, is the primary aim.
2.3 Documentation and Specifications
2.3.1 Specified Protocols, Interfaces, and Behavior.
The nfoWorks libraries and profile guidance will be
supported by careful specifications that can be confirmed by
inspection and tests.
2.3.2 Documentation for Usage and Adaptation.
Documentation sets will provide information and examples of usage of
the libraries and utilities. There will also be documentation
to support adaptation of nfoWorks software for
customization and extension.
3. Incremental Development
To provide steady progress and definite results, an incremental
approach will be applied. This involves iterations of
additions and expanded functionality to a software base that is
always working, whether or not considered particularly useful.
3.1 The Least that Can Possibly Work
3.1.1 Starting with the first iteration, software
deliverables can be built and deployed. The software will
perform a complete, end-to-end process, no matter how rudimentary
the features may be. The idea is to demonstrate a simple
harmonization case and then expand the set of
demonstrably-harmonized features.
3.1.2 At every iteration, the least that is needed to
provide some minimal feature set or feature expansion will be
introduced and tested.
3.1.3 Intermediate results may be incorporated in
specialized document-processing software and utilities as further
demonstration of usability. The purpose is to build confidence
in the operation of the software and encourage its adaptation to
practical purposes. The primary effort will be toward
increasing the set of harmonious feature implementations and
translations.
3.2 Availability of Tools, Test, Software and Experience
3.2.1 The material and the results of each iteration are
made available in development folders of this site.
3.2.2 The code base, all changes, and downloadable results
are maintained as open-source software and kept available and
current on an open-source project site with full source-code control
and bug-tracking support.
3.2.3 Everything needed to reproduce the construction and
confirmation of software and tests is provided. Those wanting
to contribute to the tools or make specialized versions of their own
can begin by replicating the construction of the appropriate
version.
4. Start-Up Activities
Initial activities focus on gathering information and
resources that are needed for the commencement of analysis and
experiments. When a starter set of initial materials has
accumulated, new activities can start in parallel. Research
and collection of information and resources will continue.
The results of these activities will be observable in the growth
of the nfoWorks Notes
Catalog.
4.1 Gathering of Specifications, Analyses and Sources
4.1.1 nfoWorks notes will provide references
to the sources and the specifications of relevant standards.
There will also be caches of the material for preservation and
reference in nfoWorks activities. Only
freely-redistributable materials will be accessible on nfoWorks.
Instructions will be provided for independently obtaining any of
those materials that remain publicly available, whether free or for
sale.
4.1.2 Related analysis efforts will also be catalogued and
tracked in nfoWorks notes. The notes will
provide information on participating in the related work and
on obtaining available materials. Some materials may be cached on
nfoWorks as well. The efforts associated with
standards development are explored first, followed by relevant
privately-conducted but public efforts.
4.2 Collection of Usable Software, Documentation, and Examples
4.2.1 We are interested in freely-available software
provided for working with standard formats, including software for
translating to and from other (standard) formats.
4.2.2 The initial effort consists of cataloging what is
available and identifying how it is obtained. Software that is
free to use and redistribute without limitation is preferred.
Source code that has no limitation on derivative works and their
licensing is desired when that code is usefully adaptable for
nfoWorks use.
4.3 Collection of Supporting Tools and Utilities
4.3.1 Tools and utilities for building nfoWorks
software and tests are cataloged and collected.
4.3.2 When the tools and utilities are of general use and
useable for more than nfoWorks, the collected software
and supporting documentation may be hosted elsewhere. Notes at nfoWorks will
link to the general information and relate it to the specific use in
nfoWorks projects.
4.3.3 It is required that all of the software-development
projects for nfoWorks be freely reproducible, with or
without modification. All tools and utilities used will be
ones that are freely available and have no limitation on their use.
4.4 Overlap with Other Activities
4.4.1 Some of these efforts are ongoing and will continue
beyond the commencement of analysis and experimentation efforts.
4.4.2 Other efforts can proceed once there is a "starter set"
of the initial resources.
4.4.3 There are also non-nfoWare efforts
underway, and these will impact the pace of nfoWare development.
For now, nfoWare is moving along at a
leisurely pace.
5. Related Work
5.1 Availability and License Considerations
5.1.1 Preference is given to related work that is available
to the public. Works that are freely-available are the first
choice and the ideal case consists of material under a Creative
Commons Attribution license, or equivalent. (Public domain
works with a known authorship will be treated the same.)
5.1.2 For software projects that provide source code, that
code will be relied upon only if it is furnished under a BSD
Template license or a license compatible with the BSD Template
license furnished for nfoWorks software deliverables.
5.1.3 Software tools and utilities having reciprocal
licenses (e.g., the GNU General Public License, GPL) will be used and
redistributed in binary form without modification.
5.1.4 Proprietary (e.g., Microsoft Office) and
reciprocally-licensed (e.g., OpenOffice.org) products will be relied
upon only to the extent that there are APIs and SDKs that are
compatible with introduction of plug-ins and extensions using
nfoWorks deliverables. Of course, many such products
will be operated as part of testing and determination of
harmonization achievement. Those products will not be
redistributed via nfoWorks.
5.2 Standards Development
5.2.1 DIN NIA Working Group on Translation 29500-26300. This
working group of the DIN (German Standards National Body) mirror of
ISO/IEC JTC1 SC34 proposes to identify the differences in IS 29500
(OOXML) and IS 26300 (OpenDocument) that need to be understood to
accomplish harmonization and interoperability. An initial
working paper is available. This work will also track the
existing translation projects.
5.2.2 ISO/IEC JTC1 SC34 Subcommittee on Document Description and
Processing Languages. SC34 proposes to create three working groups:
one each for work on IS 29500 and IS 26300, and another for
harmonization (Resolution 4 of the March 2008 plenary meeting).
The DIN NIA Working Group has presented its approach for
consideration in the harmonization work (1.7MB PDF file).
There is also an important effort to capture
all comments and known defects in IS 29500 so they are preserved for
the maintenance activity. The next actions for establishment
of IS 29500 maintenance and harmonization studies are expected at
the SC34 Plenary meeting in Korea at the end of September, 2008.
5.2.3 OASIS Open Document Format for Office Applications (OpenDocument) TC
5.2.4 Ecma TC45 - Office Open XML Formats
5.3 Translation Projects
Some of these projects are dormant; the extent of completion and
usable material has not been determined. There are additional
projects that remain to be identified.
5.3.1 OpenXML/ODF Translator Add-in for Office
5.3.2 Binary (doc, xls, ppt) to OpenXML Translator
5.3.3 Open XML to DAISY XML Translator M3 Beta
5.3.4 OpenOffice Filter to Microsoft Word XML
5.3.5 OpenOffice.org Writer Pre-Export Filter
5.3.6 UOF Converter for OpenOffice.org
5.4 Industry Initiatives
It appears to be quite easy to locate interoperability
initiatives in which Microsoft is a participant or the sponsoring
agent. Industry initiatives are distinguished from
advocacy efforts that do not involve pro-active achievement of
interoperability arrangements, conformance testing, and other
efforts to establish document interoperability (whether exclusively
or as an in-scope focus). Non-Microsoft-centric industry
initiatives are of interest too.
5.4.1 Interop Vendor Alliance
5.4.2 Document Interoperability Initiative
5.4.3 Interoperability Forum
5.5 Product SDKs and Import-Export Functions
Product SDKs provide opportunities for rapid construction of
fixtures that work with a product-supported format.
Provisions for, and examples of, import-export functions provide
additional insight into ways for working with formats and
potentially introducing harmonization via import-export or
more-tightly integrated document processing.
5.6 Other Efforts
There are efforts under governmental and academic institutions.
Their relevance to (and requirements for) harmonization efforts will
be assessed. There are also individual contributors with
information on blogs, web sites, and wikis.
6. References and Resources
- [1] Microsoft. Steve Ballmer, Ray Ozzie, Bob Muglia, Brad Smith:
Press Conference Call on Microsoft's Strategic Changes in Technology
and Business Practices to Expand Interoperability
(transcript), PressPass -- Information for Journalists,
microsoft.com, February 21, 2008. Available at <http://www.microsoft.com/presspass/press/2008/feb08/02-21ConCallTranscript.mspx>,
accessed 2008-04-09-14:33 -0700.
- [2] Dennis E. Hamilton. ODF-OOXML: nfoWorks for Harmony? (web log
post), Professor von Clueless in the Blunder Dome,
orcmid.com, 2008-02-07. Available at <http://orcmid.com/BlunderDome/clueless/2008/02/odf-ooxml-nfoworks-for-harmony.asp>,
accessed 2008-04-09-14:41 -0700.
- nfoWorks Activity: The nfoWorks Diary
is the place to watch for reports on nfoWorks activity and links to
relevant new material.
- nfoWorks Notes and Resources: Although the notes are organized
chronologically, there is always a current catalog of the available
notes and resource material found on this site. With
completion of the basic "bootstrapping" process, the gathering of
technical resources will now commence.