Standard Generalized Markup Language (SGML) is the ISO standard 8879:1986 that describes a syntax for marking up documents with tags that describe the purpose of the text rather than the appearance on the page. This form of markup facilitates document interchange between different platforms and applications. SGML allows the management of information as data objects rather than text on a page.
In an SGML document the main structural components are called elements. The organization and structure of a document and the meaning of elements are described in the Document Type Definition (DTD). Elements are the tags that identify the content. Element names may be descriptive of the content for ease of use, for example <para> for paragraphs. Elements can have attributes, which modify or refine the properties or characteristics of the element. Within the DTD a valid context for each element is defined, and a framework is provided for the types of elements that constitute a compliant document.
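As a rough illustration of how a DTD defines elements, attributes, and valid context, the sketch below declares a hypothetical <note> element. The names note and role are invented for this example and are not taken from SolBook or DocBook.

```sgml
<!-- In the DTD: a note must contain one or more para elements -->
<!ELEMENT note - - (para+)>
<!ELEMENT para - - (#PCDATA)>
<!-- The role attribute refines the element; "tip" is the default -->
<!ATTLIST note role (tip | warning) "tip">

<!-- In the document: an instance that satisfies the declarations -->
<note role="warning">
<para>Back up the file before editing it.</para>
</note>
```

A validating parser would reject a <note> that contains bare text or no <para>, because the content model allows only one or more <para> elements.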
Another component of the DTD is entities. An entity is a collection of characters that can be referenced as a unit. Entities are similar to constants in a programming language such as C: they can be defined and referenced. An entity can represent one character or symbol that does not appear on a standard keyboard, a word or group of words, or an entire separate SGML marked-up file. Entities allow reuse of standard text.
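A minimal sketch of entity declaration and reference follows. The entity names release and options, and the file name options.sgml, are hypothetical and chosen only for illustration.

```sgml
<!-- Declarations: a text entity and an external file entity -->
<!ENTITY release "SunOS x.x">
<!ENTITY options SYSTEM "options.sgml">

<!-- References: the parser substitutes the entity content -->
<para>This command is part of the &release; release.</para>
&options;
```

Changing the declaration of release in one place would change every page that references it, which is the reuse the paragraph above describes.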
There is no single standard DTD, but the de facto standard for the computer industry is the DocBook DTD, developed and maintained by the Davenport Group. Within Sun, the SolBook DTD, which is a proper subset of the DocBook DTD, is used when writing reference manual pages. The SolBook DTD contains a number of tags that are designed for the unique needs of the reference pages.
Elements are defined in a hierarchy that gives the document its structure. The following is a description of some of the elements from the SolBook DTD that are used for reference pages.
The <!DOCTYPE declaration is the first line in an SGML file. It identifies the location of the DTD that is used to define the document. The <!DOCTYPE string is what the SGML-aware man(1) command uses to identify that a file is formatted in SGML rather than nroff(1).
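The overall shape of such a file might look like the sketch below. The public identifier shown is a placeholder invented for this example, not the real SolBook identifier; use the one supplied with the DTD in use.

```sgml
<!DOCTYPE refentry PUBLIC "-//Example//DTD Hypothetical SolBook Subset//EN" [
<!-- local entity declarations, if any, go here -->
]>
<refentry>
<!-- all other tags and text of the reference page go here -->
</refentry>
```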
The top layer element that contains a reference page is <refentry>. All of the text and other tags must be contained within this tag.
The next tag in a reference page is <refmeta>, which is a container for several other tags. They are:
<refentrytitle> is the title of the reference page. It matches the reference page's file name, without the section number extension.
<manvolnum> is the section number that the reference page resides in. The contents may be a text entity reference.
There are one or more <refmiscinfo> tags, which contain meta information, that is, information about the reference page. The <refmiscinfo> tag has the class attribute. The following classes are routinely used.
The date class is the date that the file was last modified. By consensus this date is changed only when the technical information on the page changes and not simply for an editorial change.
The sectdesc class is the section title of the reference page, for example User Commands. The value of this attribute may be a text entity reference.
The software class is the name of the software product that the topic discussed on the reference page belongs to. For example, UNIX commands are part of the SunOS x.x release. The value of this attribute may be a text entity reference.
The arch class is the architectural platform limitation of the subject discussed on the reference page. If there are no limitations the value used is generic. Other values are sparc and x86.
The copyright class contains the Sun Microsystems copyright. Any other copyrights that may pertain to the individual reference page file should be entered as separate <refmiscinfo> entries. The value of this attribute may be a text entity reference.
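Put together, a <refmeta> block might look like the following sketch. The tag names <refentrytitle> and <manvolnum> and the class values date, sectdesc, software, arch, and copyright are assumptions drawn from common DocBook and SolBook usage, and the element contents are placeholders; check both against the DTD in use.

```sgml
<refmeta>
<refentrytitle>example</refentrytitle>
<manvolnum>1</manvolnum>
<refmiscinfo class="date">date of last technical change</refmiscinfo>
<refmiscinfo class="sectdesc">User Commands</refmiscinfo>
<refmiscinfo class="software">SunOS x.x</refmiscinfo>
<refmiscinfo class="arch">generic</refmiscinfo>
<refmiscinfo class="copyright">copyright notice text</refmiscinfo>
</refmeta>
```

Here the page would live in a file named for the hypothetical title example with a section 1 extension.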
The <refnamediv> tag contains the equivalent information to the .TH macro line in an nroff(1) reference page. <refnamediv> contains three tags. These tags contain the text that is before and after the `–' (dash) on the NAME line.
The <refname> tags contain the names of the topics that are discussed in the file. There may be more than one <refname> for a page. The first <refname> must match the name of the file and the <refentrytitle>. If there is more than one <refname> tag, each is separated by a `,' (comma). The comma is generated by the publisher of SGML files, so it should not be typed. This is referred to as auto-generated text.
The <refpurpose> tag contains the text after the dash on the NAME line. This is a short summary of what the object or objects described on the reference page do or are used for. The dash is also auto-generated and should not be typed in.
In some cases the <refentrytitle> is a general topic descriptor of a group of related objects that are discussed on the same page. In this case the first tag within the <refnamediv> is a <refdescriptor>. The <refname> tags follow. Only one <refdescriptor> is allowed, and it should match the <refentrytitle>.
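A sketch of a simple NAME section with two names follows. The names example and example_alias are hypothetical, and <refpurpose> is assumed here as the DocBook tag that holds the summary text.

```sgml
<refnamediv>
<!-- the first refname matches the file name and the refentrytitle -->
<refname>example</refname>
<refname>example_alias</refname>
<!-- no comma or dash is typed; the publisher generates them -->
<refpurpose>demonstrate the layout of a NAME line</refpurpose>
</refnamediv>
```

Following the rules above, a formatter would be expected to supply the comma between the two names and the dash before the summary.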
The <refsynopsisdiv> tag contains the SYNOPSIS line of the reference page. There is a <title> that usually contains an entity reference. The text is the word SYNOPSIS. There are several tags within <refsynopsisdiv> that are designed specifically for the type of synopsis that is used in the different reference page sections. The three types are:
<cmdsynopsis> is used for commands and utilities pages.
<funcsynopsis> is used for programming interface pages.
<synopsis> is used for pages that do not fall into the other two categories.
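For a command page, the synopsis might be marked up roughly as below. This sketch assumes the standard DocBook <cmdsynopsis>, <arg>, and <replaceable> tags and their choice and rep attributes are available in the SolBook subset; the command example and its -v option are hypothetical.

```sgml
<refsynopsisdiv>
<title>SYNOPSIS</title>
<cmdsynopsis>
<command>example</command>
<arg choice="opt">-v</arg>
<arg choice="plain" rep="repeat"><replaceable>file</replaceable></arg>
</cmdsynopsis>
</refsynopsisdiv>
```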
The <refsect1> tag is equivalent to the .SH nroff macro. It contains a <title> element that is the title of the reference page section. Section names are the standard names such as DESCRIPTION, OPTIONS, PARAMETERS, SEE ALSO, and others. The contents of the <title> may be a text entity reference.
The <refsect2> tag is equivalent to the .SS nroff macro. It contains a <title> element that contains the text of the sub-section heading. <refsect2> tags may also be used within a <refsynopsisdiv> as a sub-section heading for the SYNOPSIS section.
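The nesting of a section and a sub-section can be sketched as follows, assuming the DocBook names <refsect1> and <refsect2>. The section text and the sub-section title are placeholders.

```sgml
<refsect1>
<title>DESCRIPTION</title>
<para>The example utility reads each file operand in turn.</para>
<refsect2>
<title>Output Format</title>
<para>One line is written for each file operand.</para>
</refsect2>
</refsect1>
```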
There are a number of block elements that are used for grouping text. This is a list of some of these elements.
<para> contains a paragraph of text.
<variablelist> creates two-column lists, for example descriptions of command options, where the first column lists the option and the second column describes it.
<orderedlist> is a list of items in a specific order.
<itemizedlist> is a list of items that are marked with a character such as a bullet or a dash.
<screen> holds formatted program output as produced by a program or command. This tag is a container for lines set off from the main text in which line breaks, tabs, and leading white space are significant.
<programlisting> holds a segment of program code. Line breaks and leading white space are significant.
<table> contains the layout and content for tabular formatting of information. <table> has a required <title>.
<informaltable> is the same as the <table> tag except that the <title> is not required.
<example> contains examples of source code or usage of commands. It contains a required <title>.
<informalexample> is the same as the <example> tag except that the <title> is not required.
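A short sketch combining several block elements follows. It assumes the standard DocBook names <variablelist>, <varlistentry>, <term>, <listitem>, <informalexample>, and <screen>; the option, prompt, and output shown are invented for illustration.

```sgml
<variablelist>
<varlistentry>
<term><literal>-v</literal></term>
<listitem><para>Writes verbose output.</para></listitem>
</varlistentry>
</variablelist>
<informalexample>
<screen>example% example -v file
file: processed</screen>
</informalexample>
```

The term becomes the first column of the list and the list item the second. Inside <screen> the line break is kept as typed.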
The inline elements are used for tagging text.
<command> marks an executable program or the entry a user makes to execute a command.
<function> marks a subroutine in a program or external library.
<literal> contains any literal string.
<parameter> marks an argument passed to a computer program by a function or routine.
<inlineequation> marks an untitled mathematical equation occurring in-line.
<link> is a hypertext link to text within a book. In the reference manual it is used to cross reference another reference page.
<olink> is a hypertext link used to create cross references to books other than the reference manual.
<xref> is a cross reference to another part of the same reference page.
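Inline tagging within a paragraph might look like the sketch below. It assumes the DocBook names <command>, <function>, <parameter>, <literal>, and <xref>, with <xref> written as an empty element that takes a linkend attribute. The identifier example-options and all names in the text are hypothetical.

```sgml
<para>The <command>example</command> utility calls
<function>example_open</function> with the <parameter>path</parameter>
argument and writes <literal>ok</literal> on success. See
<xref linkend="example-options"> for the supported options.</para>
```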