nfoWorks: tools for document interoperability

d140502 nfoWorks devNote
 Annotating XML Documents
Embellishing HTML Replicas


0.00 2017-06-14 20:22

{Ed.Note: This is boilerplate as a placeholder, allowing this page to be linked to before its content is perfected.

  The following topics are discussed:

   1. Variations in the derivation of the Baseline HTML, such as the use of classes or in-line styling of various kinds.  How the embellishments are very much specific to the nature of the XML document and the intended use of the annotated replica.
   2. Focus on schemas and RNG schemas in this particular case.
   3. Links in external URLs.
   4. Placing permalinks on definitions, observing that normal styles are used.
   5. Adding cross-references to definitions.
   6. Element-name permalinks with emphasis.
   7. Attribute definitions with emphasis as well and taking advantage of line numbers to prevent duplications.
   8. The grouped Attribute case.
   9. Treatment of datatypes and external references.
 10. Include references to the editor used and the definitions of the regular expressions and replacement rules.}

The derivation of an HTML rendition of an XML document is complicated by the fact that the XML document contains characters that have special significance in HTML.  In addition, HTML does not preserve whitespace as written in the XML document.  It is necessary to make adjustments that preserve spaces and line breaks, and also to "escape" the XML document characters "&", "<", and ">" so that they appear literally rather than being recognized as HTML markup.
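As a rough illustration of the escaping step only, the following Python sketch uses the standard-library html.escape function.  The function name escape_xml_line and the sample line are assumptions made for this example; they are not part of the procedure described on this page.

```python
import html


def escape_xml_line(line: str) -> str:
    """Escape &, <, and > so an XML source line displays literally in HTML."""
    # quote=False leaves " and ' untouched, so only the three characters
    # named above are replaced.  html.escape handles "&" first, so the
    # entities it produces are not escaped a second time.
    return html.escape(line, quote=False)


if __name__ == "__main__":
    # Hypothetical sample line, for illustration only.
    print(escape_xml_line('<entry path="a&b">'))
```

Whitespace is deliberately left alone here; preserving spaces and line breaks is a separate adjustment.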

It is also somewhat complicated to convert the plaintext line-number fields to HTML fields that render as permalinks having the line number as text.
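One possible way to automate the line-number conversion is sketched below in Python.  It assumes that each plaintext line begins with optional padding followed by a decimal line number, and that an anchor whose id is the letter "L" plus the number is an acceptable permalink target.  Both the field layout and the id scheme are assumptions for illustration, not the conventions of the actual replicas.

```python
import re

# Assumed layout: optional leading padding, then the decimal line number.
LINE_NUMBER_FIELD = re.compile(r"^(\s*)(\d+)")


def add_permalink(line: str, prefix: str = "L") -> str:
    """Turn a leading line-number field into an anchor that links to itself."""

    def to_anchor(match: re.Match) -> str:
        padding, number = match.groups()
        target = f"{prefix}{number}"  # hypothetical id scheme, e.g. "L12"
        return f'{padding}<a id="{target}" href="#{target}">{number}</a>'

    # count=1 limits the substitution to the field at the start of the line;
    # lines without such a field are returned unchanged.
    return LINE_NUMBER_FIELD.sub(to_anchor, line, count=1)
```

A step like this would have to run after the "&", "<", and ">" characters of the XML text have been escaped; otherwise the anchor markup itself would be escaped along with the document content.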

The entire process can be accomplished with scripts and small programs, as is also the case for creation of a baseline plaintext replica.  All of the steps can be automated.  Here, a manual procedure is used to demonstrate "by hand" the critical steps that any automation must carry out.

The procedure given here is a continuation of the manual procedure already provided for creating baseline plaintext replicas.  The same XML document is used by way of example [n140504b3].  The examples begin with the Microsoft Office 2013 Excel spreadsheet that is a by-product of the plaintext replica procedure [n140504d2].

The objective is simple: Provide an HTML text that renders the same as the plaintext rendition (e.g., [n140504d1]), but with the line-number fields replaced by permalinks to those very lines.   Any further annotation is accomplished by editing the baseline HTML replica using appropriate tools.

This procedure illustrates two variations for HTML.  The simplest is a version that is preformatted and must be included in HTML <pre> elements in order to present with the original whitespace, including line breaks, as illustrated by [n140504d5].  The second is a version that is ordinary HTML for inclusion in HTML <p> (and equivalent) elements, with space and line breaks preserved using special HTML markup, as illustrated by [n140504d6].
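The difference between the two variations can be sketched in Python as follows.  The function names are hypothetical, and replacing every space with &nbsp; is a simplification assumed for this example; the actual replicas may treat spaces more selectively.  The sketch covers document text only and does not add line-number permalinks.

```python
import html


def to_pre_html(lines):
    """Preformatted variant: escape the text and let <pre> keep whitespace."""
    body = "\n".join(html.escape(line, quote=False) for line in lines)
    return f"<pre>\n{body}\n</pre>"


def to_ordinary_html(lines):
    """Ordinary variant: spaces become &nbsp;, line ends become <br />."""
    converted = []
    for line in lines:
        # Escape first, so the "&" in each &nbsp; entity added afterwards
        # is not itself escaped.
        escaped = html.escape(line, quote=False)
        converted.append(escaped.replace(" ", "&nbsp;"))
    return "<p>" + "<br />\n".join(converted) + "</p>"
```

In the preformatted variant the whitespace survives untouched inside the <pre> element, while in the ordinary variant it has to be spelled out in markup.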

1. Prerequisites
2. The Starting Point

9. References

1. Prerequisites

{Ed.Note: Update to the basic starting point being the baseline HTML.  In this particular case, the ordinary HTML version is used.  The basic difference is the handling of &nbsp; versus \x20.  This depends on having an editor that provides good regular expression search and replace machinery.  Link to the editor used and the regular expression definitions too.}

2. The Starting Point

{Ed.Note: Update to the basic starting point being the baseline HTML.  In this particular case, the ordinary HTML version is used.  The basic difference is the handling of &nbsp; versus \x20.  Also, having a practice copy and verifying a search-replace process alongside the draft derived HTML is important.}

There is an existing baseline plaintext replica of the original XML document, in accordance with the procedure for that level of replica.  The document was re-indented as necessary and individual line numbers added to the beginning of each line.  The file is also named as .txt so that it is not confused with the XML document it replicates (e.g., [n140504d1]).

{Ed.Note: For the visuals, I want before and after illustrations, where possible.  It might be illustrated with snippets or with screen captures, with the transformation step in the middle.}

9. References

{Ed.Note: These will be pruned and the actual variant referenced, along with the resources used and where their specifications/documentation is found.}

Hamilton, Dennis E.  OpenDocument v1.2 Manifest Schema baseline ordinary HTML replica.  Derived from [n140504d4] on 2014-05-14.  This version results from all of the steps in the procedure here, with the original whitespace preserved in ordinary HTML by means of &nbsp; character entities and <br /> line-break elements.

Hamilton, Dennis E.
Annotating XML Documents: Embellishing HTML Replicas.  nfoWorks devNote page d140502f 0.00, June 7, 2014.  Accessed at <>.
Revision History:
0.00 2014-06-07-08:36 Initial Placeholder
Provide initial placeholder content for illustrated embellishing that is carried out in the case of a specific schema. 

You are navigating nfoWorks.
This work is licensed under a Creative Commons Attribution 2.5 License.

created 2014-06-07-08:36 -0700 (pdt)
$$Author: Orcmid $
$$Date: 17-06-14 20:22 $
$$Revision: 32 $