1.8 uDoc Element Types
uDoc provides three major kinds of elements, all easily
modified: block, text, and inline.
- Block elements provide structure. They can contain
other block elements and text elements without restriction, but cannot
directly contain plain text or inline elements.
- Text elements provide content. They can contain
plain text and inline elements, but not block elements or other text
elements.
- Inline elements provide fine-grained control of
text content. They can contain only plain text and other inline elements.
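The three containment rules above can be written as a small lookup table. This is only a sketch for illustration: the kind names ("block", "text", "inline", "plain") and the function may_contain are hypothetical, not part of any uDoc processor API.

```python
# Hypothetical rule table: which kinds of content each kind of element
# may contain directly. "plain" stands for plain text.
ALLOWED_CHILDREN = {
    "block": {"block", "text"},
    "text": {"plain", "inline"},
    "inline": {"plain", "inline"},
}


def may_contain(parent_kind: str, child_kind: str) -> bool:
    """Return True if a parent of parent_kind may directly contain child_kind."""
    return child_kind in ALLOWED_CHILDREN[parent_kind]
```

For example, may_contain("block", "plain") is False, which is exactly the mixed-content case the rules exclude.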
These simple rules solve a problem that dates back
to SGML, which Norman Walsh, the creator of DocBook, calls “pernicious mixed content”: an element
that contains both block elements and text. The issue is whitespace (spaces, tabs, and returns). Block elements
are often surrounded by whitespace, to prevent overlong lines and to
show nesting level with indentation. When they are mixed with text,
should that whitespace be preserved in outputs, as text requires, or discarded?
There is no clear answer. Ensuring that text and block elements are never
mixed in the same wrapper element avoids the question entirely.
Note: Authors are not required
to use text elements explicitly if the processor can determine from the
content where they should start and end. Processors should therefore surround
“pernicious” text in block elements with <p>...</p> tags, so that
the block element contains a text element, not text directly. This makes
whitespace handling simple:
start the <p> after any leading
whitespace, and end it before any trailing whitespace.
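A minimal sketch of that wrapping step, assuming the processor has already isolated a run of bare text found directly inside a block element. The function name wrap_bare_text is hypothetical, and whitespace is taken to mean spaces, tabs, and returns as described above.

```python
WHITESPACE = " \t\r\n"  # spaces, tabs, and returns


def wrap_bare_text(text: str) -> str:
    """Wrap bare text in <p>...</p>, keeping leading and trailing
    whitespace outside the new tags."""
    core = text.strip(WHITESPACE)
    if not core:
        return text  # whitespace only: there is nothing to wrap
    lead_len = len(text) - len(text.lstrip(WHITESPACE))
    lead = text[:lead_len]
    tail = text[lead_len + len(core):]
    return lead + "<p>" + core + "</p>" + tail
```

The indentation and line breaks around the text are left untouched, so the block element ends up containing only a text element and whitespace.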
Each block element belongs to one of six subtypes:
root, group, structure, data, reference, and definition.
Root elements determine the kind of document
they contain (doc, map, or lib):
- <doc> is a unit
of content like a DITA topic, the basic building block, and is not nestable
- <map> organizes
a set of docs into a project, containing refs to docs and other maps,
and project metadata
- <lib> is a library,
a place to store text content and defs for re-use in docs and maps
Group elements include <div>, <branch>, <sect>, <comment>, and
<doctext>.
Structure elements include <table>, <fig>, lists, and
<note>:
- <table> contains
col, row, and cell
- <fig> contains
image, imagemap, object, and param
- list: <ul> (unordered),
<ol> (ordered),
<sl> (simple),
or <pl> (pairs), contains
<li> items
- <note>, for warnings and the like, has a <title> and one or
more paragraphs, and can have an image
Data elements contain metadata, not intended for display as is, such as publication
information like ISBN and milestone dates for a project. A processor
should make the data element info available for use in assembled content
such as a title page. The data elements include <data>, <author>, <copyright>, <publisher>, <date>, <alias>, <code>, <start>, and <end>:
- <data> contains
info that is not to be rendered
- <author> has names
and other related info
- <copyright> (or
copyleft) identifies restrictions on distribution
- <publisher> has
name and contact info
- <date> info is
in yyyy-mm-dd format
- <alias> is used
for CSH (Context-Sensitive Help) addressing
- <code> suggests
code to use for its parent, usually a def
- <start> and <end> define a
range that can cross element boundaries, effectively providing overlapping
virtual elements
Reference elements point to content defined
elsewhere that is to be incorporated. They nest, and thereby
contribute to the content hierarchy in maps. They include <ref>, <textref>, <coderef>, <docref>, <mapref>, <defref>, <elemref>, <condref>, <varref>, <keyref>, <relref>, and <doclist>:
- <ref> is for general-purpose
use
- <defref> applies
a def or set of defs to the current doc or project
- <dirref> brings
in files matching @query in the directory named in @src or @key
- <dbref> applies
@query to the contents of the database named in @src or @key
- <webref> sends
@query to the search engine named in @src or @key
- These are used in docs:
- <textref> provides
block text transclusion (like DITA conrefs)
- <coderef> handles
preformatted code transclusion (supports RFC 5147)
- <relref> identifies
the subject groups the doc belongs to
- These are used only in maps:
- <docref> brings
in a <doc> file, either
via @src or @key
- <mapref> brings
in another <map> file, either
via @src or @key
- <fileref> brings
in a file in final format, either via @src or @key
- <codedocref> brings
in a file as plain text, with the file name as the <title>
- <elemref> adds
new <element> defs
to the current project
- <condref> applies
a set of <conditions>, like
DITA ditavals
- <varref> specifies
a library to check for <variable> definitions
- <keyref> specifies
a library to check for <key> definitions
- <glossref> specifies
a library to check for <glossary> definitions
- <doclist> identifies
a generated list to use. Predefined doclists include <contents>, <figures>, <tables>, <index>, and <glossary>.
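RFC 5147, mentioned for <coderef> above, defines fragment identifiers for plain text, such as line=10,20, where the numbers are positions between lines counted from 0 (so line=10,20 selects lines 11 through 20). How uDoc applies them is not described here; the sketch below only illustrates the line= range form, using the hypothetical function name select_lines, and ignores the char= scheme and integrity checks.

```python
import re

# A line is any run of non-newline characters ending in CRLF, CR, or LF,
# or a final run with no line ending.
_LINE = re.compile(r"[^\r\n]*(?:\r\n|\r|\n)|[^\r\n]+")


def select_lines(text: str, fragment: str) -> str:
    """Return the part of text selected by an RFC 5147 'line=' fragment."""
    scheme, _, spec = fragment.partition("=")
    if scheme != "line":
        raise ValueError("only the line= scheme is handled in this sketch")
    lines = _LINE.findall(text)
    start_s, comma, end_s = spec.partition(",")
    start = int(start_s) if start_s else 0  # omitted start: beginning of text
    if not comma:
        end = start  # a single position selects nothing
    else:
        end = int(end_s) if end_s else len(lines)  # omitted end: end of text
    return "".join(lines[start:end])
```

With the text "one\ntwo\nthree\nfour\n", the fragment line=1,3 selects the second and third lines.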
Definition elements define reusable content
such as external resources. They include <def>, <cond>, <element>, <variable>, <key>, <output>, <genlist>, <glossdef>, and
<tset>:
- <def> is for general-purpose
use
- <key> has @keys and @src, and can also specify
another @key, possibly in a different
@project
- <tset> defines
tab stops for use within the contained text elements
- These are used only in maps:
- <conditions> defines
a set of conditions for the current scope, such as <map> or <branch>
- <output> contains
definitions (or references to them) specific to an output type
- These are used only in libraries:
- <element> has @name and @props, and contains
<usage> and <attr>s
- <variable> has
@id and one or more text
elements
- <genlist> defines
the @sort and items for a
generated list such as an LOF or LOT
- <glossdef> defines
a <glossary> item,
used when generating a <glossary>
Text elements appear inside block
elements and are of several types: <p>, <pre>, <title>, <usage>, <quote>, <cite>, <desc>, <alt>, and <area>. They can
contain text and inline elements, but not block or other text elements:
- <p> is a paragraph,
the primary text component.
Each run of whitespace within it is normalized to a single space.
A single space is retained before and after any inline element that
provides text, if one was present; leading and trailing whitespace is trimmed off
- <pre> is also a
paragraph, but it retains all whitespace.
It is for preformatted text such as code samples. If
there is whitespace before the <pre> tag itself, that sets the left margin
for the following text, so that the <pre> element can
be indented to show nesting like any other block or text element
- <title> is a paragraph
used to provide the content for refs to its containing block
- <usage> is a short
paragraph used in defs for a readable description of purpose
- <quote> is a paragraph
used to represent one or more full paragraphs of quoted content
- <cite> identifies
the source of a <quote> or <comment>
- <desc> is a paragraph
description for an <object>, <image>, or <table> (for mouseover)
- <alt> contains
text to show in place of an <image> that is
not available
- <area> contains
text for mouseover display, and links, in an <imagemap>
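The whitespace handling described above for <p> and <pre> can be sketched as two small functions. Both names are hypothetical. normalize_p covers plain text only (not the spacing around inline elements), and dedent_pre is one possible reading of the left-margin rule: the indentation in front of the <pre> tag is removed from the start of each following line.

```python
import re


def normalize_p(text: str) -> str:
    """Collapse each run of whitespace to one space and trim both ends."""
    return re.sub(r"[ \t\r\n]+", " ", text).strip(" ")


def dedent_pre(tag_indent: str, body: str) -> str:
    """Remove up to len(tag_indent) leading whitespace characters from
    each line of a <pre> body; all other whitespace is retained."""
    width = len(tag_indent)
    out = []
    for line in body.split("\n"):
        prefix = line[:width]
        # Count only whitespace within the margin, so content is never cut.
        strip_len = len(prefix) - len(prefix.lstrip(" \t"))
        out.append(line[strip_len:])
    return "\n".join(out)
```

With a two-space margin, dedent_pre removes two spaces from each line and leaves any deeper indentation inside the code sample intact.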
Inline elements can be used only
within text elements or within each other.
Typographic elements are inline elements that
modify the presentation of the contained text. They should be used sparingly;
a <ph> with @class, or perhaps a
new semantic element, is preferred:
- <b> is Bold
- <i> is Italic
- <u> is Underline
- <du> is Double
Underline
- <o> is Overline
- <s> is Strikeout
- <sup> is SUPerscript
- <sub> is SUBscript
- <tt> is TeleType,
rendered in a monospaced font like Courier
- <q> is Quote, rendered
with language-specific quotation marks