1.8 uDoc Element Types

uDoc provides three major kinds of elements, all easily modified: block, text, and inline.

The rules for how these three kinds nest solve a problem that dates back to SGML, which Norman Walsh, the creator of DocBook, calls “pernicious mixed content”: an element that contains both block elements and text. The issue is whitespace (spaces, tabs, and returns). Block elements are often surrounded by whitespace, to prevent overlong lines and to show nesting level with indentation. So when they are mixed with text, should that whitespace be preserved in outputs, as text requires, or discarded? There is no clear answer. Ensuring that text and block elements are never mixed in the same wrapper element avoids the question entirely.

Note: Authors are not required to use text elements explicitly if the processor can determine from the content where they should start and end. Processors should therefore surround any text found directly in a block element with <p>...</p> tags, so that the block element contains a text element rather than bare text. This makes whitespace handling simple: start the <p> after any leading whitespace, and end it before any trailing whitespace.
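The wrapping rule in the note can be sketched in a few lines of Python. This is an illustration under stated assumptions, not uDoc processor code: the function name wrap_bare_text is hypothetical, whitespace is taken to mean spaces, tabs, and returns as defined above, and the input is assumed to be a single run of text found directly inside a block element.

```python
# Whitespace as defined in this section: spaces, tabs, and returns.
WHITESPACE = " \t\r\n"


def wrap_bare_text(run: str) -> str:
    """Wrap one run of bare text in <p>...</p> (hypothetical helper).

    Leading and trailing whitespace is left outside the new element,
    so indentation and line breaks around it are never part of the text.
    """
    body = run.strip(WHITESPACE)
    if not body:
        # Whitespace only: there is no text to wrap.
        return run
    lead = run[: len(run) - len(run.lstrip(WHITESPACE))]
    trail = run[len(run.rstrip(WHITESPACE)):]
    return f"{lead}<p>{body}</p>{trail}"


if __name__ == "__main__":
    # repr() makes the preserved whitespace visible.
    print(repr(wrap_bare_text("\n    Some bare text.\n")))
```

Whitespace inside the run is left untouched; only the position of the <p> boundaries is decided here.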

Each block element belongs to one of six subtypes: root, group, structure, data, reference, and definition.

Root elements determine the kind of document they contain: doc, map, or lib.

Group elements include <div>, <branch>, <sect>, <comment>, and <doctext>.

Structure elements include <table>, <fig>, lists, and <note>.

Data elements contain metadata that is not intended for display as is, such as publication information (for example, an ISBN) and milestone dates for a project. A processor should make the information in data elements available for use in assembled content such as a title page. The data elements include <data>, <author>, <copyright>, <publisher>, <date>, <alias>, <code>, <start>, and <end>.

Reference elements point to content defined elsewhere that is to be incorporated. They nest, and thereby contribute to the content hierarchy in maps. They include <ref>, <textref>, <coderef>, <docref>, <mapref>, <defref>, <elemref>, <condref>, <varref>, <keyref>, <relref>, and <doclist>.

Definition elements define reusable content such as external resources. They include <def>, <cond>, <element>, <variable>, <key>, <output>, <genlist>, <glossdef>, and <tset>.

Text elements can appear in a block element, and each is one of several types: <p>, <pre>, <title>, <usage>, <quote>, <cite>, <desc>, <alt>, and <area>. They can contain text and inline elements, but not block elements or other text elements.

Inline elements can be used only within text elements or within each other.
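The containment rules for the three kinds can be summarized as a small checker. The following Python sketch is illustrative only and makes several assumptions that are not part of uDoc: the Node class, kind(), and check() are hypothetical names; root tags are assumed to be named after the document kinds; list tags are omitted because this section does not name them; and any tag not listed as block or text is treated as inline. It is meant for a tree in which a processor has already wrapped bare text, so text found directly in a block element is reported.

```python
from dataclasses import dataclass, field

# Block element names by subtype, as listed in this section.
BLOCK_SUBTYPES = {
    "root": {"doc", "map", "lib"},  # assumption: tags named after document kinds
    "group": {"div", "branch", "sect", "comment", "doctext"},
    "structure": {"table", "fig", "note"},  # list tags are not named here
    "data": {"data", "author", "copyright", "publisher", "date",
             "alias", "code", "start", "end"},
    "reference": {"ref", "textref", "coderef", "docref", "mapref", "defref",
                  "elemref", "condref", "varref", "keyref", "relref", "doclist"},
    "definition": {"def", "cond", "element", "variable", "key", "output",
                   "genlist", "glossdef", "tset"},
}
BLOCK = set().union(*BLOCK_SUBTYPES.values())
TEXT = {"p", "pre", "title", "usage", "quote", "cite", "desc", "alt", "area"}


def kind(name: str) -> str:
    """Classify a tag name; unlisted names count as inline (assumption)."""
    if name in BLOCK:
        return "block"
    if name in TEXT:
        return "text"
    return "inline"


@dataclass
class Node:
    name: str
    # Children are Node objects or plain strings (character data).
    children: list = field(default_factory=list)


def check(node: Node) -> list:
    """Return messages for every containment rule broken under node."""
    errors = []
    parent = kind(node.name)
    for child in node.children:
        if isinstance(child, str):
            # Non-whitespace text directly in a block is mixed content.
            if parent == "block" and child.strip():
                errors.append(f"text directly in <{node.name}>; wrap it in <p>")
            continue
        child_kind = kind(child.name)
        if parent in ("text", "inline") and child_kind != "inline":
            # Text and inline elements may hold only text and inline elements.
            errors.append(
                f"<{child.name}> ({child_kind}) not allowed in "
                f"<{node.name}> ({parent})"
            )
        elif parent == "block" and child_kind == "inline":
            # Inline elements belong inside text elements or each other.
            errors.append(
                f"<{child.name}> (inline) needs a text element, "
                f"not <{node.name}> directly"
            )
        errors.extend(check(child))
    return errors


if __name__ == "__main__":
    tree = Node("sect", [
        Node("title", ["Intro"]),
        Node("p", ["Use ", Node("ph", ["uDoc"]), " here."]),
    ])
    print(check(tree))
```

A tree that follows the rules yields an empty list, while a <table> placed inside a <p>, for example, yields one message naming both elements.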

Typographic elements are inline elements that modify the presentation of the contained text. They should be used sparingly; a <ph> with @class, or perhaps a new semantic element, is preferred.
