3 Templates

This chapter provides a description of HTML Export templates and how they are used.

Much of the power, flexibility and complexity of Export products is realized through their use of templates to drive the export process. Templates give the developer (or the developer's customer) flexibility in the visual and navigational properties of the resulting output. Templates also isolate the HTML Export code from the ever-changing face of HTML and its associated plug-ins, components and scripting languages.

The template macros and the elements they reference are so tightly intertwined that discussing one without the other is almost impossible. Before reading either in depth, skim The Included Sample Templates and Macro Reference.

3.1 What Is a Template?

A template is simply an HTML file that may include a special macro language. This language allows the template writer to insert, repeat through, condition on, and link to various elements in the source document.

The following is the code for a very simple template:

{## unit}{## header}
<html>
<body>
{## /header}
<p>Here is the document you requested.
{## insert element=property.title} by 
{## insert element=property.author}</p>
<p>Below is the document itself</p>
{## insert element=body}

{## footer}

</body>
</html>
{## /footer}{## /unit}

The {## unit}, {## /unit}, {## header}, {## /header}, {## footer} and {## /footer} macros can be ignored for the moment. Their purpose is described in Units - Breaking Documents by Content Size.

The remainder of the file is regular HTML, with the exception of three macros of the form {## insert element=xxx}. HTML Export uses this template plus the source file to create its output. To accomplish this, HTML Export reads through the template file, writing it byte for byte to the output file unless character mapping is performed on the template (for an explanation of template character mapping, see Unicode Templates). This continues until HTML Export encounters a properly formatted macro in the template. HTML Export reads the macro and executes its command. Usually this means inserting an HTML version of some element from the source file into the output file. HTML Export then continues reading the template and executing macros until the end of the template file is reached.

In the previous example, the first {## insert} macro uses the element syntax (described in Macro Reference) to insert the title of the document. The second macro inserts the author of the document, and the third macro inserts the entire body of the document. The resulting HTML might look like this (HTML that is the result of a macro is in bold):

<html>
<body>
<p>Here is the document you requested.</p>
<p>A Poem by Phil Boutros</p>
<p>Below is the document itself</p>
<p>Roses are red</p>
<p>Violets are blue</p>
<p>I'm a programmer</p>
<p>and so are you</p>
</body>
</html>

3.2 The Included Sample Templates

By default, the templates included with HTML Export convert files of type PR into images that are always 640 pixels wide. Users who wish to change this setting will need to edit the templates to remove the {## option} macro that sets this limit.

When you install HTML Export, a template directory is created that contains sample templates. These templates (with the exception of those in the tutorial directory) are tailored for publishing and indexing applications, and they are completely brandable. To brand a template, you can alter its .CSS file so that the template's color scheme matches your company's color scheme. You can also overwrite the existing logo.gif file with your company's logo. Some of the template directories contain readme.txt files that contain more information about modifying those templates.

The following is a list of templates contained in this directory:

  • \template\HTML Export\standard: The standard template features convenient navigation elements, including a table of contents and a preview window, to help users quickly access a document's information.

  • \template\HTML Export\navigation: The navigation template has many of the same features as the standard template, such as convenient navigation elements, and adds a drop-down table of contents.

  • \template\HTML Export\newsletter: The newsletter template supports all document types except archives. It displays the content in a style similar to a news web site. The table of contents contains each top level heading (the "Heading 1" style). When a user clicks these hyperlinks, the corresponding section's content fills the main window.

  • \template\HTML Export\noframes: The noframes template displays an entire document in a single frame, with table-of-contents style navigation. It is ideal for use in the most straightforward publishing applications.

  • \template\HTML Export\tableofcontents: The tableofcontents template is simpler than the standard or the navigation templates, and contains fewer navigation elements. It shows a table of contents on the left side of the screen, and the selected document content on the right.

  • \template\HTML Export\textonly: The textonly template is designed for use by developers wishing to convert documents for inclusion in an index for a search engine. It should not be used in publishing applications. All of the document's elements, including properties, headers and footers, are converted.

  • \template\HTML Export\tutorial: This is a directory of templates containing comment text intended to help users interested in more thoroughly understanding the HTML Export template language.

3.3 The Document Tree and Its Elements

HTML Export uses the concept of a document tree to make various pieces and attributes of the source file individually addressable from within a template. The nodes of the document tree are used to generate a path to a specific element in the tree. A period is used to separate the nodes in this path. For example, the path of the author property of a document is property.author. There are two types of elements: leaf elements and repeatable elements.

Figure 3-1 The Document Tree

This shows the Document Tree.

3.3.1 Leaf Elements

Leaf elements are single identifiable pieces of the source file, like the author property (property.author) or the preface of the document (body.contents.preface). This type of element is a valid target for inserting, testing and linking using the {## insert}, {## if} and {## link} macros. The last node in this type of path must be a valid leaf node in the document tree. Valid leaf nodes are shown in italics.
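
As a minimal sketch of testing and inserting a leaf element, the fragment below prints an author line only when the source file has an author property. It assumes that a conditional with only an element attribute is true when the element exists, and that the conditional is closed with a /if macro, as in the pragma.cssfile example under Element Definitions. The surrounding HTML is purely illustrative.

```html
<!-- Emit the author line only when the author property exists (assumed default test). -->
{## if element=property.author}
<p>Written by {## insert element=property.author}</p>
{## /if}
```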

3.3.2 Repeatable Elements

Repeatable elements have multiple instances associated with them, like the footnotes in a document (sections.1.footnotes). This type of element may not be directly inserted, tested or linked to but its instances may be looped through using the {## repeat} macro. The last node in this type of path must be a valid repeatable node in the document tree. Valid repeatable nodes are shown in bold.

Some templates use {## repeat} loops to generate one output file per repeatable element. For example, a template may render a presentation file as a group of output files, with one output file for each slide. When an input file contains an exceptionally large number of sections, it is possible for an operating system to run out of file handles. See your operating system's documentation or system administrator to find out how many open file handles are allowed. To avoid this extremely rare problem, set a value for the maxreps attribute of the {## repeat} macro or configure the operating system to allow more file handles.
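
The following sketch shows a loop over a repeatable element that is capped with maxreps. It assumes that {## repeat} takes an element= attribute naming the repeatable element, that maxreps accepts a plain integer, that the loop is closed with {## /repeat}, and that current refers to the instance being processed. The limit of 500 is an arbitrary illustration, not a recommended value.

```html
<!-- Loop over the sections of the source file, stopping after at most 500 instances (arbitrary cap). -->
{## repeat element=sections maxreps=500}
<h2>{## insert element=sections.current.title}</h2>
{## insert element=sections.current.bodyorimage}
{## /repeat}
```

This loop writes every section into the current output file. In a template that generates one output file per section, the same cap would bound how many files are produced.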

3.3.3 Element Definitions

The following is a list of all elements and a short description of each (for a description of valid values for x, see Indexes and Structure-Based Breaking):

  • sheets

    Type: Repeatable

    Description: See sections later in this list.

  • slides

    Type: Repeatable

    Description: See sections later in this list.

  • sections

    Type: Repeatable

    Description: Sections are used to represent the highest level of abstraction within the source file. In general, word processor documents will have only one section, the document itself. Spreadsheets have one section for each sheet or chart. Presentations have one section for each slide. Archives have one section for each item in the archive. Graphics generally have one section but may have more as in a multi-page TIFF. For convenience and readability, sheets and slides are synonymous with sections.

  • sections.x.body

    Type: Leaf

    Description: This element represents the main textual area of the source file.

    For word processing documents, it includes the entire document excluding footnotes, endnotes, headers, footers and annotations. (Footnote/endnote references are always included automatically in the body. If the template includes footnotes/endnotes, then these references provide a link to the note. Annotation references are not placed in the body unless the template includes annotations, in which case they provide links to the annotations.)

    For emails, this is the message itself.

    For spreadsheets, it includes the entire sheet.

    For graphics, it includes any text that actually appears as text in the file format.

    For multimedia files, the body does not exist at this time.

    For archive formats, the meaning is arctype-specific. When arctype is file, this is the summation (as needed):

    sections.x.path +
    the directory separator character being used +
    sections.x.basename

    Note that sections only exist for entries in the archive file that have files associated with them. In particular, entries in the archive file that are for directories are ignored.

    Also note that directory separators are OS-dependent. For example, Windows uses backslashes (\) and allows forward slashes (/), UNIX uses the forward slash, and Macs use a colon (:). The directory separator being used depends on how the directory separator is coded in the archive file.

    When arctype is message, cal, task or journal, this is the subject of the file. When arctype is contact, this is the name of the contact. When arctype is note, this is the contents of the note. When arctype is attach, this is the filename of the attachment or a link to the extracted and converted attachment. When arctype is fieldsfile, this is the list of fields.

    This element is empty when the input file is a multimedia file.

  • sections.x.to

    Type: Leaf

    Description: "To" addresses from an email or email archive.

  • sections.x.from

    Type: Leaf

    Description: "From" addresses from an email or email archive.

  • sections.x.cc

    Type: Leaf

    Description: "CC" addresses from an email or email archive.

  • sections.x.content

    Type: Leaf

    Description: Same as sections.x.decompressedfile. For archive files, the meaning is arctype-specific. When arctype is file, the file in the archive is extracted and converted. For all other arctypes, this is the contents of the item.

    Note that this element may not be inserted into a document. If it is used with the {## insert} template macro, a template error will be returned.

  • sections.x.image

    Type: Leaf

    Description: This element represents a graphic image of the content of the section. It is valid only for bitmap, drawing, chart and presentation sections.

  • sections.x.bodyorimage

    Type: Leaf

    Description: This element exists to make it easy to build templates that handle a range of section types. In word processor documents, spreadsheet and database sections, and archive elements, bodyorimage is synonymous with body. In bitmap, drawing, multimedia, chart and presentation sections, bodyorimage is synonymous with image. For multimedia files, bodyorimage does not exist at this time.

  • sections.x.type

    Type: Leaf

    Description: This element is normally used only for query purposes, but it may be inserted as well. For further details on how to use this in an {## if} macro, see Conditional: {## if}, {## elseif}, and {## else}.

  • sections.x.arctype

    Type: Leaf

    Description: For archive formats, this describes what kind of archive. Currently defined archive types include:

    file

    message

    contact

    cal

    note

    task

    journal

    attach

    fieldsfile

  • sections.x.fullname

    Type: Leaf

    Description: This is the full name (including path, if applicable) of a file in an archive if the arctype for the archive is file. For archive formats, this is synonymous with body. For all other formats, it is not defined.

  • sections.x.basename

    Type: Leaf

    Description: For archive formats where the arctype is file, this is the file name for the item in the archive without any path info. This element is undefined for all other input file types.

  • sections.x.title

    Type: Leaf

    Description: Same as sections.x.body.title. For word processor documents, this element is the text marked with the title style. This may be different than the property.title. For archive files, this is the same as sections.x.body. For all other types, this element will be the "name" of the section. For example, if the source file is a spreadsheet, this element will be the name of the sheet as it appears on the spreadsheet application's navigation tabs.

    For archive formats, this is synonymous with body.

    For email and email archive sections, this is the contents of the subject field of the email.

    For multimedia files, this does not exist at this time.

  • sections.x.path

    Type: Leaf

    Description: For archive files where the arctype is file, this is any path information provided for the current archive item. It does not include a trailing directory separator character. This element may be the empty string (""). This element is undefined for all other input file types.

  • sections.x.itemnum

    Type: Leaf

    Description: For archive formats, this is the (unsorted) entry number of the current file in the archive. The first entry is itemnum one ("1"), not zero ("0"). All entries in archive files have an associated itemnum. However, not all entries in archive files have an associated section number. This is because archive entries for directories are skipped when sections are generated by HTML Export. Therefore, inserting this element is not functionally equivalent to {## insert number=sections.x.value}. This element is undefined for all other input file types.

  • sections.x.reflink

    Type: Leaf

    Description: For archive formats, this is a URL composed of

    the input file name

    +

    the subdocument spec for the archive entry

    The intent of this element is to provide a string that can be passed to DAOpenDocument in a future export to a specific entry in the archive file currently being exported. The target of the reflink is not necessarily converted into HTML. In this usage scenario:

    1. The original export is run producing the reflink.

    2. The user clicks on the reflink in the output document

    3. The OEM's program interprets the reflink and passes it to a DAOpenDocument. It then runs HTML Export and serves the output back to the user.

    Users of redirected IO should also note that they must handle the IOGetInfo call for IOGETINFO_PATHNAME. It must return a path name for the archive file that HTML Export can use to build the reflink. In addition, the calling program will need to be able to correctly interpret the resultant reflink to be sure it can subsequently be passed to a future call to DAOpenDocument.

    This element is undefined for all other input file types.

  • sections.x.decompressedfile

    Type: Leaf

    Description: For archive formats, this extracts the file in the archive and converts it. Note that this element may not be inserted into a document. If it is used with {## insert}, a template error will be returned.

    This element is undefined for non-archive input file types.

    For archive formats, this is arctype-specific. When arctype is file, the file is converted to the designated output format. When arctype is message, this is the contents of the email. When arctype is contact, this is the contents of the contact info. When arctype is cal, this is the contents of the calendar entry. When arctype is note, this is the contents of the note. When arctype is task, this is the contents of the task. When arctype is journal, this is the contents of the journal entry. When arctype is attach, this is the contents of the attachment. When arctype is fieldsfile, this is the list of fields.

  • sections.x.size

    Type: Leaf

    Description: This applies to all archive types except those of type fieldsfile.

    This is the uncompressed file size of the entry in the archive.

    This element is undefined for all other input file types.

  • sections.x.date

    Type: Leaf

    Description: For archive formats, this is arctype-specific. When arctype is file, this is the file modification time stamp for this entry in the archive. When arctype is message, this is the time the message was last modified. When arctype is cal, this is the start time/date of the event. When arctype is task, this is the due date for the task. When arctype is journal, this is the start time. When arctype is attach, this is the date of the attachment. This value is undefined for the contact and note arctypes.

    For email sections, this is the submitted time field from the email.

    This element is undefined for archives of type fieldsfile.

  • sections.x.mailfields

    Type: Repeatable

    Description: For email sections, this is used to iterate through the complete set of fields available in emails. This includes all of the named fields (like sections.x.date) as well as fields that are not explicitly named (like "bcc"). This is undefined for all other section types.

  • sections.x.mailfields.x.body

    Type: Leaf

    Description: For email sections, this element is the value of a field from the email. This is undefined for all other section types.

  • sections.x.mailfields.x.name

    Type: Leaf

    Description: For email sections, this element is the name of a field from the email. This is undefined for all other section types.

  • sections.x.body.title

    Type: Leaf

    Description: For word processor documents, this element is the text marked with the title style. This may be different than the property.title.

    For archive formats, this is synonymous with body.

    For multimedia formats, this does not exist at this time.

    For all other document types, this element will be the "name" of the section. For example, if the source file is a spreadsheet, this element will be the name of the sheet as it appears on the spreadsheet application's navigation tabs.

  • sections.x.body.contents

    Type: Leaf

    Description: For word processor documents, this is the same as sections.x.body. This is to maintain backwards compatibility with templates written before sections.x.body.title was legal for word processor documents, a feature added in the 7.0 release.

    For multimedia files, this does not exist at this time.

    For all other document types, this is the same as the body minus the title, if a title exists.

  • sections.x.body.contents.preface

    Type: Leaf

    Description: Text between the top of the body and the first heading.

  • sections.x.body.contents.headings

    Type: Repeatable

    Description: Headings are labels in a word processor document inserted by the author to give a document structure (for further details of headings, see Breaking Documents by Structure). HTML Export reads this structure and, through the use of the headings element, allows the developer to access it.

  • sections.x.body.contents.headings.x.body…

    Type: Leaf with Leafs and Repeatables below

    Description: Under each heading, the structure of a complete document from body down is repeated. For more information on how these elements map to parts of a document, see Breaking Documents by Structure.

  • sections.x.body.contents.headings.x.footnotes…

    Type: Repeatable with Leafs below

    Description: Only footnotes contained in this heading.

  • sections.x.body.contents.headings.x.endnotes…

    Type: Repeatable with Leafs below

    Description: Only endnotes contained in this heading.

  • sections.x.body.contents.headings.x.annotations…

    Type: Repeatable with Leafs below

    Description: Only annotations contained in this heading.

  • sections.x.grids

    Type: Repeatable

    Description: Only valid for spreadsheet and database formats. This permits access to the "grids" inside a section or sheet of a spreadsheet or database file.

  • sections.x.grids.x.body

    Type: Repeatable

    Description: Only valid for spreadsheet and database formats. This permits access to the "grids" inside a section or sheet of a spreadsheet or database file.

  • sections.x.arcfields

    Type: Repeatable

    Description: All of the supported fields in the archive including the named fields such as sections.x.date and sections.x.basename. Each arcfield is a name/value pair.

  • sections.x.arcfields.x.body

    Type: Leaf

    Description: Value of the data for a given field in an archive file. Not defined for non-archive files.

  • sections.x.arcfields.x.name

    Type: Leaf

    Description: Name of the data field from an archive file. Not defined for non-archive files.

  • sections.x.footnotes

    Type: Repeatable

    Description: All footnotes.

  • sections.x.footnotes.x.body

    Type: Leaf

    Description: The complete footnote reference and content text.

  • sections.x.footnotes.x.reference

    Type: Leaf

    Description: The reference number for the footnote.

  • sections.x.footnotes.x.content

    Type: Leaf

    Description: The content text for the footnote.

  • sections.x.endnotes…

    Type: Repeatable with Leafs below

    Description: Same definitions as footnotes.

  • sections.x.annotations

    Type: Repeatable

    Description: All annotations. In templates, the term "annotations" refers to annotations made inside an authoring application (for example, "comments" in a Microsoft Word document) and does not refer to the annotations created via the Export Annotation API.

  • sections.x.annotations.x.body

    Type: Leaf

    Description: The complete annotation reference and content text.

  • sections.x.annotations.x.reference

    Type: Leaf

    Description: The reference text for the annotation.

  • sections.x.annotations.x.content

    Type: Leaf

    Description: The content text for the annotation.

  • sections.x.slidenotes

    Type: Repeatable

    Description: All slide notes.

    It should be noted that exporting the slide notes will slow down the conversion process for PowerPoint files.

  • sections.x.slidenotes.x.body

    Type: Leaf

    Description: The notes for the current slide.

    Developers are encouraged to write slide notes at the end of the output file for performance reasons (PowerPoint files keep slide notes at the end of the file, not next to each slide). Not doing so will slow conversion, as the technology will be forced to perform excessive seeking in the input file.

  • sections.x.slidenotes.x.reference

    Type: Leaf

    Description: The reference text for the slide note.

  • sections.x.slidenotes.x.content

    Type: Leaf

    Description: The content text for the slide note.

  • sections.x.headers

    Type: Repeatable

    Description: All headers.

  • sections.x.headers.x.body

    Type: Leaf

    Description: Text of the header.

  • sections.x.footers

    Type: Repeatable

    Description: All footers.

  • sections.x.footers.x.body

    Type: Leaf

    Description: Text of the footer.

  • property.all

    Type: Repeatable

    Description: This permits access to all properties including those specifically accessible through property elements described in this table, and includes both the "name" and the "body" of the property. The properties supported depend on file format. See the Outside In Content Access Developer Guide for a list of possible predefined properties. Some file formats also allow for additional user-definable properties.

    At this time, only properties may be extracted from multimedia files.

  • property.all.x.name

    Type: Leaf

    Description: Descriptive name for the property.

  • property.all.x.body

    Type: Leaf

    Description: Text of the property.

  • property.album

    Type: Leaf

    Description: Album property of the source file. Valid only for multimedia files.

  • property.artist

    Type: Leaf

    Description: Artist property of the source file. Valid only for multimedia files.

  • property.author

    Type: Leaf

    Description: Author property of the source file.

  • property.title

    Type: Leaf

    Description: Title property of the source file.

  • property.subject

    Type: Leaf

    Description: Subject property of the source file.

  • property.keywords

    Type: Leaf

    Description: Keywords property of the source file.

  • property.comment

    Type: Leaf

    Description: Comment property of the source file.

  • property.others

    Type: Repeatable

    Description: This permits access to all properties not specifically accessible through property elements described in this table, and includes both the "name" and the "body" of the property. The other properties supported depend on file format. See the Outside In Content Access Developer Guide for a list of possible predefined properties. Some file formats also allow for additional user-definable properties.

    At this time, only properties may be extracted from multimedia files.

  • property.others.x.name

    Type: Leaf

    Description: Descriptive name for the property.

  • property.others.x.body

    Type: Leaf

    Description: Text of the property.

  • pragma.charset

    Type: Leaf

    Description: The text string associated with the character set of the characters that HTML Export is generating. In order for HTML Export to correctly code the character set into the output it generates, all templates should include a <meta> tag that uses the {## insert} macro as follows:

    <meta HTTP-EQUIV="Content-Type" CONTENT="text/html; charset={## insert element=pragma.charset}" />
    

    If the template does not include this line, the user will have to manually select the correct character set in their browser.

  • pragma.cssfile

    Type: Leaf

    Description: This element is used to insert the name of the Cascading Style Sheet (CSS) file into HTML documents. This name is typically used in conjunction with an HTML <link> tag to reference styles contained in the CSS file generated by HTML Export.

    When used with the {## insert} macro, this pragma will generate the URL of the CSS file that is created. This macro must be used with {## insert} inside every template file that inserts contents of the source file when the selected HTML flavor supports CSS. The CSS file will only be created if the selected HTML flavor supports CSS.

    When used with the {## if} macro, the conditional will be true if the selected HTML flavor supports CSS.

    NOTE: If CSS is required for the output, the following code must be used:

    {## if element=pragma.embeddedcss}

    or

    {## if element=pragma.cssfile}

    However, HTML Export does not differentiate between the two, as the choice of using embedded CSS vs. external CSS is the template author's decision and the author may even wish to mix the two in the output.

    An example of how to use this pragma that works when exporting either CSS or non-CSS flavors of HTML would be as follows:

    {## if element=pragma.cssfile}
    <link rel="stylesheet" href="{## insert element=pragma.cssfile}"></link>
    {## /if}

  • pragma.embeddedcss

    Type: Leaf

    Description: This element is used to insert CSS style definitions in a single block in the <head> of the document.

    When used with the {## insert} macro, this pragma will insert the block of CSS style definitions needed for use later in the file. This macro must be used inside every output HTML file where {## insert} is used to insert document content.

    When used with the {## if} macro, the conditional will be true if the selected HTML flavor supports CSS.

    NOTE: If CSS is required for the output, the following code must be used:

    {## if element=pragma.embeddedcss}

    or

    {## if element=pragma.cssfile}

    However, HTML Export does not differentiate between the two, as the choice of using embedded CSS vs. external CSS is the template author's decision and the author may even wish to mix the two in the output.

    If a style is used anywhere in the input document, that style will show up in the embedded CSS generated for all the output HTML files generated for the input file. Consider a template that splits its output into multiple HTML files. In this example, the input file contains the "MyStyle" style. It does not matter if during the conversion only one output HTML file actually references the "MyStyle" style. The "MyStyle" style definition will still show up in the embedded CSS for all the output files, including those files that never reference this style.

  • pragma.jsfile

    Type: Leaf

    Description: This element is used to insert the name of the JavaScript file into HTML documents. This name is typically used in conjunction with an HTML <script> tag to reference JavaScript contained in the .js file generated by HTML Export.

    When used with the {## insert} macro, this pragma will generate the URL of the JavaScript file that is created. This macro must be used with {## insert} inside every template file that inserts contents of the source file when:

    The selected HTML flavor supports JavaScript.

    The javaScriptTabs option has been set to true.

    The JavaScript file will only be created if the selected HTML flavor supports JavaScript.

    When used with the {## if} macro, the conditional will depend upon whether the selected HTML flavor supports JavaScript or not.

  • pragma.sourcefilename

    Type: Leaf

    Description: The name of the source document being exported. Note that this does not include the path name. When exporting documents inside of archive files, this is the name of the file inside the archive. For example, if the first file inside of archive.zip is myfile.doc, then exporting archive.zip?item.1 would use myfile.doc as the pragma.sourcefilename.
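
To show how several of the elements above might be combined, the sketch below declares the character set, links the CSS file when one is available, lists every property of the source file as a name/value table, and then inserts the body. It assumes that {## repeat} takes an element= attribute and is closed with {## /repeat}, and that current refers to the instance being processed. The table markup is illustrative only.

```html
<html>
<head>
<meta HTTP-EQUIV="Content-Type" CONTENT="text/html; charset={## insert element=pragma.charset}" />
{## if element=pragma.cssfile}
<link rel="stylesheet" href="{## insert element=pragma.cssfile}">
{## /if}
<title>{## insert element=pragma.sourcefilename}</title>
</head>
<body>
<!-- One table row per property: descriptive name on the left, property text on the right. -->
<table>
{## repeat element=property.all}
<tr><td>{## insert element=property.all.current.name}</td>
<td>{## insert element=property.all.current.body}</td></tr>
{## /repeat}
</table>
{## insert element=body}
</body>
</html>
```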

3.3.4 Default Nodes

For convenience, certain nodes in an element path may be skipped because they represent the obvious default behavior. These nodes include the sections node (sections.current.body.title is equivalent to body.title), and the body and contents nodes (body.contents.headings.1.body is equivalent to headings.1.body). Please note that these nodes may not be skipped if they are the last node in the path (headings.1.body is not equivalent to headings.1). For further examples, see Breaking Documents by Structure.
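
The equivalences above can be written out as template fragments. In this sketch, each pair of lines should insert the same content under the default-node rules just described. Only element paths named in this section are used, and a real template would keep just one line from each pair.

```html
<!-- Both lines insert the title of the current section; the second skips the sections node. -->
{## insert element=sections.current.body.title}
{## insert element=body.title}

<!-- Both lines insert the body under the first heading; the second skips the body and contents nodes. -->
{## insert element=body.contents.headings.1.body}
{## insert element=headings.1.body}
```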

3.4 Macro Reference

Macros are commands to HTML Export within the template. Despite their casual similarity to HTML tags, they are not bound by any of the rules tags would usually follow inside an HTML file. Macros may appear anywhere in the template file, except inside another macro.

In the documentation and examples, the pieces of a macro are always shown delimited by spaces; however, semicolons may also delimit them. This option was added to accommodate certain editors. In these editors, URLs entered into dialog boxes may not have non-quoted spaces. This makes it difficult or impossible to use the {## link} macro in these situations.

For example:

{## insert element=sections.1.body}

may also be written

{##;insert;element=sections.1.body}
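
The equivalence of the two delimiters can be illustrated with a minimal Python sketch. This is not HTML Export's parser: split_macro is a hypothetical helper, it handles only unquoted pieces, and the exact tokenizing rules are an assumption.

```python
import re

def split_macro(macro):
    """Split a {## ...} macro into its pieces (hypothetical helper).

    Assumption: pieces are separated by runs of spaces or semicolons and
    none of them is quoted; quoted values are not handled in this sketch.
    """
    text = macro.strip()
    if not (text.startswith("{##") and text.endswith("}")):
        raise ValueError("not a template macro: " + macro)
    inner = text[3:-1]  # drop the leading "{##" and the trailing "}"
    return [piece for piece in re.split(r"[ ;]+", inner) if piece]

spaced = split_macro("{## insert element=sections.1.body}")
semicolons = split_macro("{##;insert;element=sections.1.body}")
print(spaced == semicolons)
```

Both forms yield the same two pieces, the macro name and its element attribute.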

Note that template macro string parameters and options support sprintf style escaped characters. This means that characters such as \x22, \r and %% are supported. Also note that most template attribute values may be quoted. The exception is template element strings, which may not be quoted at this time.

For example:

{## anchor aref="next"
format="<a href=\"%url\">Next</a><br/>\r\n"}

3.4.1 Units: {## unit}, {## header}, and {## footer}

If a template file is going to make use of the {## unit} macro at all, a {## unit} macro must be the first macro in the template file. It delimits the beginning and end of each unit. Unit boundaries are used when determining where to break the document when breaking based on content size.

A unit consists of a header, a footer (both of which are optional), and a body (which may be empty). To ensure that the header is the first item in the template and the footer is the last item, text between the {## unit} tag and the {## header} tag will be ignored, as will text between the {## /footer} tag and the {## /unit} tag, including whitespace. The header and footer of a unit will be output in every page containing that unit, enclosing that portion of the unit's body that is able to fit in a particular page. The entire template is a unit that may contain additional units.

An overview of using units in templates with examples is provided in Units - Breaking Documents by Content Size.

Syntax

{## unit [BREAK]}
   [{## header}
      any HTML
   {## /header}]

      any HTML

   [{## footer}
      any HTML
   {## /footer}]
{## /unit}

Attributes

BREAK

This optional attribute of the unit macro will force page actions in HTML Export and non-page actions in other export products. It forces a break (page break in HTML Export) before inserting the unit contents unless doing so would cause the body of the first page to be empty. One situation where this attribute would be useful would be to force a page break between each section of a document, perhaps to get one presentation slide per page.

The {## unit} macro and its BREAK attribute are ignored when SCCOPT_EX_PAGESIZE or pagesize (Transformation Server) is set to zero.

It is sometimes important to make sure that a break does not occur in the midst of text that is intended to be on the same page. To prevent breaks like this from occurring, enclose the text that should be kept on the same page inside a nested {## unit}{## header} pair. For example, to prevent a page break from occurring while a link is being created, the template author might write something like the following:

{## unit}{## header}
<a href="{## link element=sections.current.body}">Link</a>
{## /header}{## /unit}
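
The way a unit's header and footer enclose each page's share of the body can be modeled with a short Python sketch. It is illustrative only: paginate_unit is a hypothetical function, and it assumes that the page size counts body characters only and that each body item is indivisible, neither of which is stated by this chapter.

```python
def paginate_unit(header, body_items, footer, page_size):
    """Model a unit being broken by content size (hypothetical helper).

    Every page is header + part of the body + footer. Assumptions: the
    page size is measured in body characters, and a body item is never
    split across pages.
    """
    pages, current, used = [], [], 0
    for item in body_items:
        # Start a new page when the next item would overflow a non-empty page.
        if current and used + len(item) > page_size:
            pages.append(header + "".join(current) + footer)
            current, used = [], 0
        current.append(item)
        used += len(item)
    # Emit the last page; a unit with an empty body still yields one page.
    if current or not pages:
        pages.append(header + "".join(current) + footer)
    return pages

pages = paginate_unit("<html><body>", ["aaaa", "bbbb", "cc"], "</body></html>", 8)
for page in pages:
    print(page)
```

With a page size of 8, the first two items share a page and the third lands on a second page, each wrapped in the same header and footer.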

3.4.2 Insert Element: {## insert}

This macro inserts an element of the source document into the output file at the current location.

Syntax

{## insert [ELEMENT=element [WIDTH=width] [HEIGHT=height] 
[SUPPRESS=suppress] [TRUNCATE=truncate]] | [NUMBER=number] 
[URLENCODE]}

Attributes

ELEMENT

This attribute describes which part of the source document should be placed in the output file at the location of the macro. For the possible values for this attribute, see The Document Tree and Its Elements.

Note the name of the element being inserted may not be enclosed in quotes.

Example:

{## insert element=sections.1.body}

WIDTH

This optional attribute defines the width in pixels of the element being inserted. It is currently only valid for the image element. If the WIDTH attribute is not present but the HEIGHT attribute is, the width of the image will be calculated automatically based on the shape of the element. If neither the WIDTH nor the HEIGHT attribute is present, the image's original dimensions are used. If the image's original dimensions are unknown, both HEIGHT and WIDTH default to 200.

Example:

{## insert element=slides.1.image width=400}

HEIGHT

This optional attribute defines the height in pixels of the element being inserted. It is currently only valid for the image element. If the HEIGHT attribute is not present, but the WIDTH attribute is, the height of the image will be calculated automatically based on the shape of the element.

Example:

{## insert element=slides.1.image height=400}
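
The WIDTH and HEIGHT rules can be summarized in a small Python sketch. It is a model of the rules as described here, not the product's code: resolve_size is a hypothetical function, "shape" is taken to mean the original aspect ratio, and the rounding is an assumption.

```python
def resolve_size(width=None, height=None, original=None):
    """Return the (width, height) an inserted image would get (hypothetical).

    original is the image's own (width, height), or None when unknown, in
    which case the documented default of 200 by 200 is used. Assumption:
    a missing dimension is derived from the original aspect ratio and
    rounded to the nearest pixel.
    """
    orig_w, orig_h = original if original else (200, 200)
    if width is None and height is None:
        return orig_w, orig_h
    if width is None:
        return round(height * orig_w / orig_h), height
    if height is None:
        return width, round(width * orig_h / orig_w)
    return width, height

print(resolve_size(width=400, original=(800, 600)))
print(resolve_size(height=400, original=(800, 600)))
print(resolve_size())
```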

SUPPRESS

This optional attribute allows certain things to be suppressed from the output. This is very useful if elements need to be inserted in contexts where HTML is not appropriate, such as passing information to Java applets, ActiveX controls or populating parts of a form.

Possible values are as follows:

  • TAGS: All HTML tags will be suppressed from the output of the element. However, the text may still contain HTML character codes such as &quot; or &#123;.

    For embedded graphics such as those found in word processing sections and spreadsheets, both the URL and the <img> tag will be suppressed. Because there would be no way to access the resulting converted embedded graphic, conversion of the graphic is not performed.

    Example:

    <form method="POST">
    <input type="text" size="20" name="author" value="{## insert 
        element=property.author suppress=tags}">
    </form>
    
  • BOOKMARKS: Turns off all bookmarks in the inserted section. Bookmarks automatically precede many inserted elements so that other template elements may link to them. suppress=bookmarks is provided to prevent problems with nested <a> tags. Note that this represents a subset of the suppression behavior provided by suppress=tags.

  • INVALIDXMLTAGCHARS: Drops from the output all characters that are not allowed in XML tag names. This is designed to allow template authors to {## insert} custom document property names inside angle brackets ("<" and ">") to create XML tags. Most characters in Unicode and its subset character sets may be used as part of XML tag names. Illegal tag characters include "control" characters such as line feed and carriage return. Additionally, there are special rules for what characters can be the first character in a tag name. See the XML specification for a description of legal tag name characters.

    Example:

    {## repeat element=property.others}
    <{## insert element=property.others.current.name
       suppress=invalidxmltagchars}>
    {## insert element=property.others.current.body}
    </{## insert element=property.others.current.name 
       suppress=invalidxmltagchars}>
    {## /repeat}
    

    produces something similar to the following:

    <MyProperty>PropertyValue</MyProperty>
    

TRUNCATE

When set, this attribute forces a maximum length in characters for the inserted element. This allows elements to be truncated rather than broken across pages when the page size option is in use. Truncated elements end with the truncation identifier, which is "..." (three periods). Every element that has a truncate value will be no more than the specified number of characters long, including the length of the truncation identifier. The value of this attribute must be greater than or equal to 5 characters. In HTML Export, elements are inserted in their entirety if no truncation size is specified. In other products, elements are simply specified.

An example of a situation where element truncation is useful is to limit the size of entries when building a table of contents.
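
The length rule can be made concrete with a minimal Python sketch. It assumes the truncation identifier is three ASCII periods and that truncation is a plain character cut; truncate_element is a hypothetical name, and the sketch does not model tag suppression or grids.

```python
def truncate_element(text, limit):
    """Truncate text to at most limit characters (hypothetical helper).

    The three-period truncation identifier counts toward the limit, and
    the limit must be at least 5, as the TRUNCATE attribute requires.
    """
    if limit < 5:
        raise ValueError("truncate value must be greater than or equal to 5")
    if len(text) <= limit:
        return text
    return text[:limit - 3] + "..."

print(truncate_element("Introduction to Templates", 10))
print(truncate_element("Short", 10))
```

A table of contents entry limited to 10 characters would therefore show 7 characters of the heading followed by the identifier.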

The TRUNCATE attribute implies suppression of tags for the insert. It also automatically applies the no source formatting option for the insert.

Note that the TRUNCATE attribute cannot be used with custom elements, because the custom element definition precludes the existence of any other attributes to {## insert}.

The TRUNCATE attribute has three special aspects to its behavior when grids are being inserted:

  • When truncation is in effect, the truncation size refers to the number of characters of content in each cell - not the number of characters in the grid as a whole.

  • While truncation normally causes all markup tags to be suppressed, when grids are in use, the table tags are retained (assuming that the output flavor supports tables).

  • Users are reminded that only one grid size may be selected for each spreadsheet sheet or database inserted. The size of the grid will be based in part on the TRUNCATE value if one or both of the grid dimensions are not specified and the SCCOPT_EX_PAGESIZE or pageSize option (Transformation Server) is in use. In this situation, if a grid from a single sheet is inserted in more than one place in the template, and there are differing TRUNCATE values, then the grid dimensions will be based on the largest TRUNCATE value specified.

NUMBER

This attribute allows the developer to retrieve the total instance count or the current index value of any repeatable element. This can be very useful for writing JavaScript, BasicScript, etc. Four special keywords ("count", "countb0", "value" and "valueb0") don't appear in the document tree but can be used as nodes in the following special cases:

  • count / countb0: When appended to a repeating element and used with the NUMBER attribute, these nodes allow the developer to insert a text representation of the number of instances of the given repeatable element. count gives the count assuming the first index is 1 and countb0 gives it assuming the first index is 0. For example, if a presentation has three slides, the following template fragment:

    <p>{## insert number=slides.count}</p>
    <p>{## insert number=slides.countb0}</p>
    

    will produce the following text:

    <p>3</p>
    <p>2</p>
    
  • value / valueb0: When appended to a repeating element and used with the NUMBER attribute, these nodes allow the developer to insert a text representation of the current value of the index of the given repeatable element. value gives the index assuming the first index is 1 and valueb0 gives it assuming the first index is 0. For example, if the current value of the index on slides is 2, the following template fragment:

    <p>{## insert number=slides.current.value}</p>
    <p>{## insert number=slides.current.valueb0}</p>
    

    will produce the following text:

    <p>2</p>
    <p>1</p>
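
The relationship between the four keywords is simple arithmetic, shown here as a Python sketch that follows the two examples above. number_nodes is a hypothetical function used only for illustration.

```python
def number_nodes(total, current):
    """Return the four NUMBER keyword values for a repeatable element.

    total is the number of instances and current is the 1-based index of
    the current instance. The b0 variants are one less, matching the
    examples in this section.
    """
    return {
        "count": total,
        "countb0": total - 1,
        "value": current,
        "valueb0": current - 1,
    }

# A presentation with three slides, currently on slide 2.
print(number_nodes(total=3, current=2))
```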
    

URLENCODE

This optional attribute causes the inserted element to be URL encoded. As such, it is ignored unless it is specified as part of an insert that contains a file name. The following elements may be URL encoded:

  • pragma.sourcefilename

  • pragma.cssfile

  • pragma.embeddedcss

  • pragma.jsfile (HTML Export only)

In addition, the following elements will be URL encoded when the section type is "Archive" or "AR":

  • sections.x.fullname

  • sections.x.basename

  • sections.x.body

  • sections.x.title

  • sections.x.reflink

For all other {## insert}s, this attribute is ignored. As such, OEMs should note that HTML Export does not modify any URLs coming out of the input documents being converted. These URLs continue to be passed through as is. This attribute is also ignored if the URL was created using the EX_CALLBACK_ID_CREATENEWFILE callback. Such URLs are assumed to already be URL encoded.
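
As a rough illustration of what URL encoding does to a file name, the following Python sketch uses the standard library's percent-encoding. Which characters HTML Export itself escapes is not stated in this chapter, so the output shows general URL encoding rather than the product's exact result; the file name is made up for the example.

```python
from urllib.parse import quote

# Hypothetical source file name containing characters that are unsafe in URLs.
source_filename = "my file #1.doc"

# Percent-encode everything except unreserved characters and "/".
encoded = quote(source_filename)
print(encoded)
```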

A Note on Inserting Properties

Because of the special ways that properties are used in documents, property strings are inserted into the output files a little differently than other {## insert} macros. First, the property is always inserted as if the SCCOPT_EX_NOSOURCEFORMATTING or noSourceFormatting (Transformation Server) option were set. This prevents formatting characters such as newlines from interfering with the property strings. Second, the property is always inserted as if the template specified suppress=tags. This provides the template writer with maximum control over how property strings are presented.

3.4.3 Conditional: {## if}, {## elseif}, and {## else}

These macros allow areas of the template to be used or ignored based on information about an element of the source file.

Syntax

{## if ELEMENT=element [CONDITION=Exists|NotExists]
[VALUE=value]}
   any HTML
{## /if}

or

{## if ELEMENT=element [[CONDITION=Exists|NotExists] |
[VALUE=value]]}
   any HTML
{## else}
   any HTML
{## /if}

or

{## if ELEMENT=element [[CONDITION=Exists|NotExists] |
[VALUE=value]]}
   any HTML
{## elseif ELEMENT=element [[CONDITION=Exists|NotExists] |
[VALUE=value]]}
   any HTML
{## else}
   any HTML
{## /if}

Note that multiple instances of {## elseif} may be used after {## if}. In addition, {## else} is not required when using {## elseif}.

Attributes

ELEMENT

This attribute describes which part of the source file should be tested. For the possible values for this attribute, see The Document Tree and Its Elements. If neither the CONDITION nor VALUE attribute exists, the element is tested for existence.

CONDITION

Defines the condition the element is tested for. Possible values are Exists and NotExists.

VALUE

Defines the values the element should be tested against. The VALUE attribute is currently valid only for the sections.x.type element for testing of the type of a section of the source file. Possible values include:

  • ar: Archive

  • bm: Bitmap

  • ch: Chart

  • db: Database

  • dr: Drawing

  • em: Email

  • mm: Multimedia

  • pr: Presentation

  • ss: Spreadsheet

  • wp: Word processor document

Example 1:

{## if element=property.comment}
   <p><b>Comment property exists</b></p>
{## else}
   <p><i>Comment property does not exist</i></p>
{## /if}

{## if element=sections.1.type value=wp}
   <p><b>The source file is a word processor file</b></p>
{## /if}

{## if element=sections.1.type value=ss}
   <p>Spreadsheet</p>
{## elseif element=sections.1.type value=ar}
   <p>Archive</p>
{## elseif element=sections.1.type value=ch}
   <p>Chart</p>
{## else}
   <p>Not ss, ar, or ch</p>
{## /if}

Example 2:

{## if element=sections.current.type value=pr
   condition=notexists}
   <p>We can do something here for all document types
   other than presentations.</p>
{## else}
   <p>This is used only for presentations.</p>
{## /if}
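
The way the examples combine ELEMENT, VALUE, and CONDITION can be expressed as a small truth function. The Python sketch below is an interpretation of Examples 1 and 2, not a specification: if_matches is a hypothetical name, and treating CONDITION=NotExists as the negation of the VALUE test is an assumption drawn from Example 2.

```python
def if_matches(actual, value=None, condition="exists"):
    """Decide whether an {## if} block would be used (hypothetical model).

    actual is the element's value, or None when the element does not
    exist. Without value, the element is tested for existence. With
    value, the element must equal it; comparison is case insensitive.
    Assumption: condition=notexists inverts whichever test applies.
    """
    if value is None:
        matched = actual is not None
    else:
        matched = actual is not None and actual.lower() == value.lower()
    return matched if condition.lower() == "exists" else not matched

# Example 2: value=pr with condition=notexists.
print(if_matches("wp", value="pr", condition="notexists"))  # non-presentation
print(if_matches("pr", value="pr", condition="notexists"))  # presentation
```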

3.4.4 Loop: {## repeat}

This macro allows an area of the template to be repeated, once for each occurrence of an element.

Syntax

{## repeat ELEMENT=element [MAXREPS=maxreps] [SORT=sort]}
   any HTML
{## /repeat}

Attributes

ELEMENT

This attribute describes which part of the source file should be repeated on. It must be a repeatable element. For the possible values for this attribute, see The Document Tree and Its Elements.

When using HTML Export, any HTML may be defined between the {## repeat} macro and its closing {## /repeat} macro. This HTML will be repeated once for each instance for the element specified. In addition, the index variable current may be used in any other {##} macro as the element-index of the element being repeated. For instance, the following HTML in the template will produce a list of the footnotes in a document:

<html>
<body>
<p>Here are the footnotes</p>
{## repeat element=footnotes}
   {## insert element=footnotes.current.body}
{## /repeat}
<p>No more footnotes</p>
</body>
</html>

Similarly, the following HTML in the template will insert the names of all the items in an archive:

{## repeat element=sections}
   {## insert element=sections.current.fullname}
{## /repeat}

MAXREPS

This attribute limits the total number of loops the repeat statement may make to the value specified. It is useful for preventing exceptionally large documents from producing an unwieldy amount of output.

SORT

This optional attribute defines whether to sort the output. It is ignored if the input file is not an archive file. All sorts are based on the character encoding of the values in the input file and are currently case insensitive. Valid values of the sort attribute are:

  • fullname: Sort by sections.current.fullname

  • basename: Sort by sections.current.basename

  • none: No sorting is done. This is the default.
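
The combined effect of SORT and MAXREPS on an archive can be sketched in Python. This is a model under stated assumptions: repeat_items is a hypothetical function, sorting is approximated with a lowercase comparison, and applying the sort before the MAXREPS limit is an assumption, since the order of the two is not documented here.

```python
def repeat_items(items, maxreps=None, sort="none"):
    """Return the instances a {## repeat} loop would visit (hypothetical).

    items is a list of dicts with "fullname" and "basename" keys.
    Assumptions: sorting is case insensitive and happens before the
    MAXREPS limit is applied.
    """
    if sort not in ("none", "fullname", "basename"):
        raise ValueError("unknown sort value: " + sort)
    ordered = list(items)
    if sort != "none":
        ordered.sort(key=lambda item: item[sort].lower())
    return ordered if maxreps is None else ordered[:maxreps]

archive = [
    {"fullname": "docs/Zebra.doc", "basename": "Zebra.doc"},
    {"fullname": "docs/apple.doc", "basename": "apple.doc"},
    {"fullname": "a/Mango.doc", "basename": "Mango.doc"},
]
for item in repeat_items(archive, maxreps=2, sort="basename"):
    print(item["basename"])
```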

3.4.5 Linking with Structured Breaking: {## link}

This macro generates a relative URL to a piece of the document produced by HTML Export. Normally this URL would then be encapsulated by the template with HTML anchor tags to create a link. {## link} is particularly powerful when used within a {## repeat} loop.

Syntax

{## link ELEMENT=element [TOP]}

or

{## link TEMPLATE=template}

or

{## link ELEMENT=element TEMPLATE=template [TOP]}

Attributes

ELEMENT

Defines the element that is the target for the link. The URL that the {## link} macro generates will point to the first instance of this element in the output file. If this attribute is not present, the resulting URL will link to any output file that was produced with the specified template. If such a file does not exist, the specified template will be used to generate a file.

Remember that each element has one or more index values, some of which may be variables. An example of this type of index variable is the "current" in sections.current.body. Use of {## link} affects the value of those index variables, which may cause subtle side effects in the behavior of the linked template file. For a description of how {## link} affects the index of inserted elements, see Indexes and Structure-Based Breaking.

TEMPLATE

The name of a template file which must exist in the same directory as the original template file. If this attribute is not present, the current template will be used. If an element was specified in the {## link}, then the template must contain a {## insert} statement using that element.

It is important to note that while the template language is normally case insensitive, the case of the template file names specified here is important. The file name specified for the template is passed as is to the operating system. On operating systems such as UNIX, if the wrong case is given for the template file name, the template file will not be found and an error will be returned.

TOP

This attribute is only meaningful if an element is specified in the {## link} command. When this attribute exists, the generated URL will not contain a bookmark, and therefore the resulting link will always jump to the top of the HTML file (HTML Export) or file containing the specified element. This is useful if the top of the template has navigation or other information that the developer would like the user to see.

{## link} Usage Scenarios

Using the first syntax shown at the beginning of this section, a URL for the element bookmark is inserted in the document. Normally this syntax is used to create intradocument links to aid navigation. An example would be creating a link to the next section of the document.

In the second syntax, a URL is created to an output file generated by the specified template. This template is run on the same source document, but may extract different parts of the document. Normally, in this syntax, the "main" template contains a link to a second HTML file. This second file is generated using the template specified by the {## link} command and contains other document elements. As an example, the "main" template could produce a file containing the body of the document and a link to the second HTML file, which contains the footnotes and endnotes.

The third and most powerful syntax also produces the URL of a file generated by the specified template. This template is then expected to contain an insertion of the specified element. Normally this syntax is used with repeatable elements. It allows the author to generate multiple output files with sequential pieces of the document. As such it provides a way to break large documents up into smaller, more readable pieces. An example of where this syntax would be used is a template that generates a "table of contents" in one HTML file (perhaps a separate HTML frame). The entries in the table are then links to other HTML files generated by different templates.

Note that a {## link} statement which specifies a template does not always result in a new file being created. New files are only created if the target of the link does not exist yet. So if for example two {## link} statements specify the same element and template, only one HTML file is produced and the same URL will be used by both {## link} statements.

{## link} Archive File Example

The following template generates a list of links to all the extracted and converted files from the source archive file (represented by decompressedFile in the following example):

{## repeat element=sections}
   <p><a href="{## link
   element=sections.current.decompressedFile}">
   {## insert Element=sections.current.fullname}</a></p>
{## /repeat}

{## link} Presentation File Example

The following example (template.htm) uses the first syntax to generate a set of HTML files, one for each slide in a presentation. Each output file includes links to the previous and next slides. Note the use of {## if} macros so that the first slide has no Previous link and the last slide has no Next link:

template.htm

   <html>
   <body>
   {## insert element=slides.current.image width=300}
   <hr />
   {## if element=slides.previous.image}
      <p><a href="{## link element=slides.previous.image}">Previous</a></p>
   {## /if}
   {## if element=slides.next.image}
      <p><a href="{## link element=slides.next.image}">Next</a></p>
   {## /if}
   </body>
   </html>

Due to the side effects of {## link} using the element attribute, there can be some confusion over what values "current", "previous" and "next" have when each {## link} is processed. To better illustrate how this template works, consider running it on a presentation that contains three slides:

First Output File

Because no template is specified in the {## link} statements, template.htm is (re)used as the template for all {## link} statements. For the first slide, nothing interesting happens until slides.next is encountered. Because slides.current is 1 in this case, slides.next refers to slides.2 and the {## link} is performed on slides.2.image. This {## link} fills in the anchor tag with the URL for the output file containing the second slide. Because no file containing slides.2 exists, {## link} opens a new file.

Second Output File

For the second slide the template is rerun. slides.current now refers to slides.2, slides.previous refers to slides.1 and slides.next refers to slides.3. The {## insert} statement will insert the second slide.

The {## if} statement referring to slides.previous succeeds. Because the file containing slides.1 already exists, no additional file is created. The anchor tag will be filled in with the URL for the first output file.

The {## if} statement referring to slides.next also succeeds and the anchor tag will be filled in with the URL for the output file containing the third slide. Because no file containing slides.3 exists, {## link} opens a new file.

Third Output File

For the third slide the template is rerun. slides.current now refers to slides.3 and slides.previous refers to slides.2. slides.next refers to slides.4, which does not exist. The {## insert} statement will insert the third slide.

The {## if} statement referring to slides.previous succeeds. Because the file containing slides.2 already exists, no additional file is created. The anchor tag will be filled in with the URL for the second output file.

The {## if} statement referring to slides.next fails. At this point processing is essentially complete.
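
The walkthrough above can be replayed with a short Python simulation. It models only the bookkeeping described in the three output files (which links each file gets and when a new file is opened); run_template is a hypothetical function and does not reflect how HTML Export is implemented.

```python
from collections import deque

def run_template(slide_count):
    """Simulate template.htm on a presentation (hypothetical model).

    Returns the slide indexes in the order their output files are
    created, and the links written into each file.
    """
    created = [1]          # the first output file holds slide 1
    pending = deque([1])   # files whose template run is still to be done
    links = {}
    while pending:
        current = pending.popleft()
        page = {}
        for name, target in (("previous", current - 1), ("next", current + 1)):
            if 1 <= target <= slide_count:   # the {## if} test succeeds
                if target not in created:    # no file yet: {## link} opens one
                    created.append(target)
                    pending.append(target)
                page[name] = target
        links[current] = page
    return created, links

created, links = run_template(3)
print(created)
print(links)
```

For three slides, the first file links only forward, the second links both ways, and the third links only back, and no file is created twice.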

3.4.6 Linking with Content Size Breaking: {## anchor}

This macro generates a relative URL to a piece of the document produced by HTML Export when doing document breaking based on content size.

Syntax

{## anchor AREF=type [STEP=stepval] FORMAT="anchorfmt" [ALTLINK="element"] [ALTTEXT="text"]}

Attributes

AREF

Indicates the relation of the target of the link to the current file. Allowable values for this attribute are:

  • InsertStart: First page of the inserted element

  • InsertEnd: Last page of the inserted element

  • Next: Next page in the inserted element

  • Prev: Previous page in the inserted element

  • FirstFile: First page created for the entire document

  • LastFile: Last page created for the entire document

STEP

This attribute is used to insert a link to "fast forward/rewind" through the output pages. This attribute may only be used if AREF is "next" or "prev". It is specified as a positive integer. For example, to insert a link to skip ahead 5 pages in a document, the following statement could be used:

{## anchor aref="next" step="5"
format="<p><a href=\"%url\">Next</a></p>"}

If not specified, the default value of "step" is one (1), which corresponds to the next/previous page. This attribute has no meaning when aref equals "insertstart", "insertend", "firstfile" or "lastfile".
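
How AREF and STEP select a target page can be sketched as follows. The Python function is a hypothetical model covering only the page-relative values (Next, Prev, InsertStart, InsertEnd); treating a step that runs past either end as an unresolved anchor is an assumption, since this section does not say what happens in that case.

```python
def anchor_target(aref, current, first, last, step=1):
    """Return the page an {## anchor} would point to (hypothetical model).

    Pages of the inserted element are numbered first..last and current
    is the page being written. Returns None when the target cannot be
    resolved. Assumption: overshooting either end is unresolved rather
    than clamped to the first or last page.
    """
    kind = aref.lower()
    if kind in ("next", "prev"):
        if step < 1:
            raise ValueError("step must be a positive integer")
        target = current + step if kind == "next" else current - step
        return target if first <= target <= last else None
    if kind == "insertstart":
        return first       # step has no meaning here
    if kind == "insertend":
        return last        # step has no meaning here
    raise ValueError("aref value not modeled in this sketch: " + aref)

print(anchor_target("next", current=2, first=1, last=10, step=5))
print(anchor_target("prev", current=2, first=1, last=10, step=5))
```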

FORMAT

This is an sprintf style format string specifying the text to output as the link. HTML Export replaces the %url format specifier in the format string with the target URL. For example:

{## anchor aref="next"
format="<a href=\"%url\">Next</a><br/>\r\n"}
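
To show what the example format string expands to, the following Python sketch substitutes %url and decodes the escapes used in this chapter. expand_format is a hypothetical function, and the set of escapes it handles (\", \\, \r, \n, \t, \xNN, and %%) is an assumption about what "sprintf style" covers.

```python
import re

_SIMPLE_ESCAPES = {"r": "\r", "n": "\n", "t": "\t", '"': '"', "\\": "\\"}
_TOKEN = re.compile(r'%url|%%|\\x[0-9A-Fa-f]{2}|\\[rnt"\\]')

def expand_format(fmt, url):
    """Expand a FORMAT string for a resolved anchor (hypothetical model)."""
    def replace(match):
        token = match.group(0)
        if token == "%url":
            return url
        if token == "%%":
            return "%"
        if token.startswith("\\x"):
            return chr(int(token[2:], 16))   # for example \x22 is a double quote
        return _SIMPLE_ESCAPES[token[1]]
    # A single pass, so text substituted for %url is not scanned again.
    return _TOKEN.sub(replace, fmt)

template_format = r'<a href=\"%url\">Next</a><br/>\r\n'
print(repr(expand_format(template_format, "page2.htm")))
```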

ALTLINK

An attribute used to specify the target of the anchor if it cannot be resolved based on the anchor type. For example, the final file of a breakable element has no "next" file, and thus would resolve to nothing. However, if the altlink attribute is specified, the anchor will be generated using a URL to the first file found containing the specified element.

Note that no EX_CALLBACK_ID_ALTLINK callback will be made if an altlink attribute is specified in the {## anchor} statement.

For example:

{## anchor aref=next format="<a href=\"%url\">Next</a>"
altlink=headings.next.body}

ALTTEXT

Text to be output if the anchor cannot be resolved. If this attribute is not specified, no text will be output if the anchor target does not exist. For example:

{## anchor aref=next format="<a href=\"%url\">Next</a>"
alttext="Next"}
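
The fallback behavior of ALTLINK and ALTTEXT can be summarized in a few lines of Python. This is a sketch of the rules as described above: render_anchor is a hypothetical function, the URLs are made up, and trying ALTLINK before ALTTEXT when both are given is an assumption.

```python
def render_anchor(fmt, url=None, altlink_url=None, alttext=""):
    """Return the text an {## anchor} would write (hypothetical model).

    url is the resolved target, or None when the anchor cannot be
    resolved (for example "next" on the last page). altlink_url is the
    URL of the first file containing the ALTLINK element, if any.
    Assumption: ALTLINK is tried before ALTTEXT.
    """
    target = url or altlink_url
    if target:
        return fmt.replace("%url", target)
    return alttext   # empty when ALTTEXT is not specified

fmt = '<a href="%url">Next</a>'
print(render_anchor(fmt, url="page3.htm"))
print(render_anchor(fmt, altlink_url="heading2.htm"))
print(render_anchor(fmt, alttext="Next"))
```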

3.4.7 Comment Put in the Output File: {## ignore}

This macro causes {##} statements in an area of the template file to be ignored by the template parser. Any text between the {## ignore} and {## /ignore} tags will be written to the output file as-is. This macro allows {##} statements in an area of the template to be commented out for debugging purposes, or to actually write out the text of another {##} macro. However, the browser will parse any HTML tags inside the ignored block and the text will be formatted accordingly. This macro can ignore all {##} macros except for an {## /ignore} macro. No escape sequence has been implemented for this purpose. As a result, {## ignore} statements cannot be nested. If they are nested, a run time template parser error will occur.

Syntax

{## ignore}
   any HTML or other {##} macros
{## /ignore}

To fully comment out a section of the template, surround the {## ignore} statements with HTML comments.

For example:

<!--{## ignore} everything between here and 
the end HTML comment will be commented out.
{## /ignore}-->

3.4.8 Comment Not Put in the Output File: {## comment}

The {## comment} macro allows the template writer to include comments in the template without including them in the final output files. {## comment} provides the functionality of {## ignore}, but the text inside the {## comment} block is not rendered to the output files and is not included in page size calculations. Like {## ignore}, {## comment} macros may not be nested.

Syntax

{## comment}
   any HTML or other {##} macros
{## /comment}
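
The difference between the two macros can be seen in a minimal Python sketch. It models only what reaches the output file: strip_blocks is a hypothetical function, it assumes the macros are written exactly as shown with single spaces, and it does not detect nesting errors.

```python
import re

_COMMENT = re.compile(r"\{## comment\}.*?\{## /comment\}", re.S)
_IGNORE = re.compile(r"\{## ignore\}(.*?)\{## /ignore\}", re.S)

def strip_blocks(template):
    """Apply {## comment} and {## ignore} to template text (hypothetical).

    Comment blocks are removed entirely. Ignore blocks lose only their
    delimiters, so the text inside is written to the output as is.
    """
    without_comments = _COMMENT.sub("", template)
    return _IGNORE.sub(lambda match: match.group(1), without_comments)

template = (
    "a{## ignore}{## insert element=body}{## /ignore}"
    "b{## comment}note to self{## /comment}c"
)
print(strip_blocks(template))
```

The macro inside the ignore block survives as literal text, while the comment block leaves no trace in the output.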

3.4.9 Including Other Templates: {## include}

This command allows other templates to be inserted into the current template. It works in a manner similar to the C/C++ #include directive.

Syntax

{## include TEMPLATE=template}

Attributes

TEMPLATE

This attribute gives the name of the template to insert.

3.4.10 Setting Options Within the Template: {## option}

This macro sets an option to a given value. All {## option} statements are executed in the order in which they are encountered. Remember when using this template macro that the {## unit} tag must be the first template macro in any template.

Options set in the template have template scope. This means that, for example, if a {## link} macro references another template, options in the referenced template are not affected by the option settings from the parent template. Similarly, when the files contained in an archive file are converted, Export recursively calls itself to perform the exports of the child documents in the archive. Each child document is converted using a copy of the parent template, and that copy does not inherit the option values from the parent template.

The strings used to specify options from inside templates correspond to the option names. See the Options documentation for more details.

Options set using {## option} in the template are not inherited by the exports performed on files within archives. Each child export receives a fresh copy of all option values as originally set with DASetOption.

Remember that setting an option in the template overrides any option value set by an application within the scope of the template.

See HTML Export C/C++ Options for a description of how to treat a hyperlink in a Word input document, using the {## option} in the template.

Syntax

{## option OPTION=value}

The supported OPTION attributes and their values are listed in the "Attributes" section that follows.

Attributes

OPTION

  • graphic_type: Allows one to set the type of graphics produced. It may have the following values: gif, jpeg, bmp, png, none, fi_gif, fi_jpegfif, fi_bmp, fi_png, fi_none.

    Note:

    Some of the Outside In Viewer Technology's import filters can be optimized to ignore certain types of graphics. To take advantage of this optimization, the option must be set before EXOpenExport is called. Setting this option via the template will happen after EXOpenExport is called, and will therefore not invoke this optimization. The only way to get the benefits of this optimization is to use DASetOption for the SCCOPT_GRAPHIC_TYPE set to FI_NONE. This is described in SCCOPT_GRAPHIC_TYPE.

  • html_graphictype: This is a deprecated version of the graphic_type template option. Note that it only supports values that start with "fi_": fi_gif, fi_jpegfif, fi_bmp, fi_png, fi_none.

  • gif_interlaced: Allows one to set the GIF interlacing. The value can be one of the following: 0, 1, true or false.

  • jpeg_quality: Allows one to set the quality of the JPEG images being created. The quality can be from 1 to 100.

  • graphic_sizemethod: Allows one to set the type of graphic smoothing performed when graphics are resized. The value can be: sccgraphic_quicksizing, sccgraphic_smoothsizing, sccgraphic_grayscalesizing. For more information, see SCCOPT_GRAPHIC_SIZEMETHOD.

  • graphic_outputdpi: Specifies the dots-per-inch, or dpi of graphics created for the output. Can be from 0 to 2400. A value of 0 for the dpi means that the dpi of the original graphic found in, or referenced by the input document is used.

  • graphic_sizelimit: Specifies the total number of pixels in the output's graphics. Can be any value from 1 to 4,294,967,295. A value of 0 means that there is no size limit. See SCCOPT_GRAPHIC_SIZELIMIT for more details.

  • graphic_widthlimit: Specifies the width, in pixels, of the output's graphics. Can be any value from 1 to 4,294,967,295. A value of 0 means that there is no width limit. See SCCOPT_GRAPHIC_WIDTHLIMIT for more details.

  • graphic_heightlimit: Specifies the height, in pixels, of the output's graphics. Can be any value from 1 to 4,294,967,295. A value of 0 means that there is no height limit. See SCCOPT_GRAPHIC_HEIGHTLIMIT for more details.

    This only sets the upper limit for the height of an image. Images with smaller heights are not increased in size to match the height limit.

  • fontflags: Allows the template to suppress various font attributes, either singly, in combination, all or none. The single flag suppression values are: suppress_size, suppress_color, suppress_face. The combination suppression values are: suppress_sizecolor, suppress_sizeface, suppress_colorface. The suppress all flags value is suppress_all, and the suppress no flags value is suppress_none. See SCCOPT_EX_FONTFLAGS for more details.

  • gridrows: Specifies the number of rows to a grid. Only applicable to spreadsheet and database files used for input. Must be a number zero or greater. See SCCOPT_EX_GRIDROWS for more details.

  • gridcols: Specifies the number of columns to a grid. Only applicable to spreadsheets and databases used as an input file. Must be a number zero or greater. See SCCOPT_EX_GRIDCOLS for more details.

  • gridadvance: Specifies how HTML Export outputs grids using input from a spreadsheet or database input file. A value of "down" causes HTML Export to output the next grid by traversing down a spreadsheet's columns. A value of "across" causes the next grid to be output by traversing across the rows. See SCCOPT_EX_GRIDADVANCE for more details.

  • gridwrap: Specifies how HTML Export reacts when reaching the edge of a spreadsheet or database. This is used in conjunction with gridadvance. When the traversal method specified by gridadvance reaches an edge and there are more cells to be traversed, then if this option is "true," HTML Export scans back to the beginning of the next set of grids to output. If this option is "false," then no more grids will be output. The value can be: 0, 1, false or true. See SCCOPT_EX_GRIDWRAP for more details.

  • EX_LINKTARGET: Support for this option is limited to Microsoft Word documents.

    Some input documents contain links. Template authors may have a preference for how the browser should select which frame or window to open those source document links in. This option allows the template author to do so by specifying a value to use for the target attribute of the links HTML Export generates in these cases. This single target value will be applied to all such links encountered in the source document. It does not affect the links generated by HTML Export for navigation generated because of template macros.

    If this option is not set, then no target attribute will be included in links from the source document.

    HTML Export expects to insert the value of the target attribute directly into the output of the conversion. Under some circumstances, however, HTML Export may need to perform character mapping from the template to the output character set:

    • Templates written in a SBCS for conversions to DBCS will pad the text to form WORD sized characters, but will not perform any character mapping. In the unlikely event that this poses a problem, users should write their templates in UTF-8 or Unicode.

    • Templates written in Unicode for conversions will do character mapping to the appropriate output character set.

    For example, consider a document that contains a link to www.outsideinsdk.com. The template author wishes to change the browser's default behavior from opening the link in the current window to opening the link in a new window. Therefore, the template writer sets this option to _blank with the following line in the template:

    {## option EX_LINKTARGET=_blank}
    

    HTML Export will then generate the following link to the Oracle web page when the document is converted (HTML related to text formatting has been removed for clarity):

    <a href="http://www.outsideinsdk.com/" target="_blank">www.outsideinsdk.com</a>
    

    The following are valid values for the target= attribute in HTML:

    • _blank: The user agent should load the designated document in a new, unnamed window.

    • _self: The user agent should load the document in the same frame as the element that refers to this target.

    • _parent: The user agent should load the document into the immediate FRAMESET parent of the current frame. This value is equivalent to _self if the current frame has no parent.

    • _top: The user agent should load the document into the full, original window (thus canceling all other frames). This value is equivalent to _self if the current frame has no parent.

    The default is for this option not to be set. In that case, no target= attribute will be generated for links from the source document.

  • ex_linktargetoverride: Link target attribute values may be specified in both the source document and in the template via the EX_LINKTARGET template-only option. This option determines how to resolve such conflicts.

    The option has two settings (neither is case-sensitive):

    • Fallback: The value specified in the EX_LINKTARGET option is a fallback to use when the source document does not specify a link target attribute value. This is the default setting for this option if it is not set.

    • Override: The value specified in the EX_LINKTARGET option will always be used, overriding any link target attribute value(s) specified by the source document.

    Sample usage:

    {## option EX_LINKTARGET="_self"}{## option EX_LINKTARGETOVERRIDE="Override"}
    

    This option is ignored if the EX_LINKTARGET option has not been set. The default for this option is to not be set. In that case, the value specified by the EX_LINKTARGET option is used as a fallback.

  • ex_toc: Output the Word document's table of contents. This can have the value of: 0, 1, false or true. This only applies to input files that are Word documents, and only if they contain a table of contents.
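The interaction between EX_LINKTARGET and ex_linktargetoverride described in the list above can be modeled in a few lines. The following Python sketch is illustrative only: resolve_link_target is a hypothetical helper, not part of HTML Export, and it covers only the case where EX_LINKTARGET has been set (ex_linktargetoverride is ignored otherwise).

```python
from typing import Optional


def resolve_link_target(source_target: Optional[str],
                        template_target: str,
                        override_mode: str = "Fallback") -> str:
    """Pick the target attribute value for a link found in the source document.

    source_target   -- target specified by the source document, or None
    template_target -- value given to the EX_LINKTARGET template option
    override_mode   -- ex_linktargetoverride setting; "Fallback" is the default
    """
    mode = override_mode.lower()  # the settings are not case-sensitive
    if mode == "override":
        # The template value always wins.
        return template_target
    if mode == "fallback":
        # The template value is used only when the document has none.
        return source_target if source_target is not None else template_target
    raise ValueError(f"unknown ex_linktargetoverride setting: {override_mode!r}")
```

For example, resolve_link_target("_top", "_blank") keeps the document's own "_top", while resolve_link_target("_top", "_blank", "override") yields "_blank".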

3.4.11 Copying Files: {## copy} (HTML Export Only)

The {## copy} macro is used to copy extra, static files into the output directory along with the output from the converted document. For example, if a template author has added a company logo that was not in the original input document, {## copy} can be used to make it a part of the converted output document. Other examples include graphics used to mimic "buttons" for navigation, outside CSS files, or a piece of Java code to be run.

Syntax

{## copy FILE=file}

Attributes

FILE

This is the name of the file to be copied. If a relative path name is specified as part of the file, then it must be relative to the directory containing the root template file.

For example:

{## copy FILE=uparrow.gif}

The {## copy} macro may occur anywhere inside a template. If the {## copy} is inside a {## if}, then the {## copy} will only be executed if the condition is TRUE. In {## repeat} loops, the {## copy} will only be performed if the loop is executed one or more times. In addition, if the {## repeat} loops more than once, HTML Export detects this and the {## copy} is executed only once.

As its name suggests, the {## copy} macro is a straight file copy. Therefore, no conversions are performed as part of the copy. For example, graphics formats are not changed and graphics are not resized. Template authors should also remember to use {## graphic} when graphics and other files are copied so that space will be created for the external graphic in the text buffer size calculations.

Because the only action HTML Export takes is to copy the requested file, it is up to the template author to make use of the copied file at another point in the template. For example, a graphic file may be copied and then the template can use an <img> tag which references the copied graphic. The following snippet of template code would do this:

{## copy FILE=Picture.JPG}
{## graphic PATH=Picture.JPG}
<img src="Picture.JPG">

The OEM should also know that if the file copy fails, HTML Export will continue and no error will be reported back to the OEM.
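Because a failed copy is silent, an application may want to verify the copied files itself after the conversion finishes. The following Python sketch shows one way to do so. It is not part of the SDK: missing_copied_files is a hypothetical helper, and it assumes each copied file lands directly in the output directory under its original file name.

```python
from pathlib import Path
from typing import Iterable, List


def missing_copied_files(output_dir: str, expected_names: Iterable[str]) -> List[str]:
    """Return the names from expected_names that are not files in output_dir.

    Assumption: every {## copy} target appears in the output directory
    under the same file name it had next to the template.
    """
    out = Path(output_dir)
    return [name for name in expected_names if not (out / name).is_file()]
```

Calling missing_copied_files with the output directory and the list of names used in the template's {## copy} macros returns an empty list when every copy succeeded.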

3.4.12 Deprecated Template Macros (HTML Export Only)

Previous releases of HTML Export used different macro syntax where template macros were expected to start with {Inso} rather than {##}. In addition some words that had been abbreviated must now be spelled out ("insert" instead of "ins"). The old syntax will continue to be supported for the foreseeable future. However, it has been deprecated. The old Inso macros and their new equivalents are as follows:

  • {insoins} is now {## insert}

  • {insoif} ... {/insoif} is now {## if} ... {## /if}

  • {insoelseif} ... {/insoelseif} is now {## elseif} ... {## /elseif}

  • {insoelse} ... {/insoelse} is now {## else} ... {## /else}

  • {insoignore} ... {/insoignore} is now {## ignore} ... {## /ignore}

  • {insolink} is now {## link}

  • {insorep} ... {/insorep} is now {## repeat} ... {## /repeat}

Templates may not mix the old Inso macro style with the new {##} style in the same template.

New and future Export features will not support the old syntax. For example, the old syntax has not been extended to support the {## unit} macros.

3.5 Breaking Documents by Structure

One of the most powerful features of the template architecture is the ability to break long word processor documents up into logical pieces and create powerful navigation aids to access them.

To understand how this is done, the developer must first understand the document tree as it relates to word processor documents. The somewhat complex graphic that follows attempts to show how the elements in the tree relate to a real-world document.

Figure 3-2 Correlation between Element Tree and Document

Correlation between Element Tree and the document.

The following are some examples of elements and the data they would produce if run against the document shown in the preceding image. Note the omission of the default nodes body and contents in the last two examples:

  • body.contents.headings.2.body.title: would produce "Present Day."

  • body.contents.headings.2.body.contents.headings.1.body.title: would produce "Commercial."

  • body.contents.preface: would produce "The History of Flight" and the text below it, up to but not including "Introduction."

  • headings.2.headings.1.headings.3.title: would produce "McDonnell-Douglas."

  • headings.2.headings.1.headings.3.contents: would produce the text below "McDonnell-Douglas" but above "Military."

Breaking documents requires that HTML Export understand the logical divisions in the structure of a document. Currently the only formats that can give HTML Export this information in an unambiguous manner are Microsoft Word 95 and higher and WordPerfect 6.0 and higher. In these formats, the breaking information is available if the author placed Table of Contents information in the document. Refer to the appropriate software manual for information on the necessary procedure for including this information. That is not to say that the document must have a TOC, only that the information to build one must be present.

It should be noted that some word processing formats, including Microsoft Word 2002 (XP), allow users to specify TOC entries in multiple ways. HTML Export supports only two of these methods. Support for each way of specifying the TOC is as follows:

  • Applied heading styles: Yes

  • Custom styles with outline levels: Yes

  • Outline level applied as a paragraph attribute: No

  • TOC entries: No

Additionally, if a heading style is applied to text inside a table in the original document, HTML Export will not break on that heading. This is because HTML Export will not break within tables.

The sample templates that ship with the HTML Export SDK use document breaking extensively and are probably the best way to understand the uses of the structure-based breaking feature.

3.5.1 Indexes and Structure-Based Breaking

All repeatable nodes have an associated index variable that at any given time in the export process has a current value. For elements that contain repeatable nodes as part of their path, the instance of the repeatable element must be specified by using a number or one of several index variable keywords. The possible values for this index variable (referred to as x in Element Definitions) are as follows:

  • A whole number (integer). HTML Export indexes begin counting with 1 (not 0).

  • current

  • next

  • previous

  • first

  • last

For numeric values, the number is simply inserted as another node in the path. For example, slides.1.image references the first slide in a presentation and footnotes.2.body references the second footnote in a document.

Elements that cannot be guaranteed to be within the document to which the template is applied should not be explicitly referenced. For example, referencing sections.4.body may result in unexpected behavior in documents that have fewer than four sections. Requesting a non-existent element won't cause an error in HTML Export; the insertion will just be ignored. However, if other HTML surrounding the insertion depends on the results of the insert, the output may be invalid HTML.

The current, next, previous, first and last keywords are fairly self-explanatory. For example, slides.current.image references the current slide and slides.next.image refers to the next slide. When the template is processed, the current, next, previous, first and last variables are replaced with the appropriate index value.
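The replacement of index keywords with numeric values can be pictured with a small model. The Python sketch below is an illustration under assumptions, not HTML Export code: resolve_index is a hypothetical name, and returning None stands in for the "non-existent element" case in which an insertion is simply ignored.

```python
from typing import Optional, Union


def resolve_index(index: Union[int, str], current: int, count: int) -> Optional[int]:
    """Map an index node to a 1-based position, or None if no such element exists.

    index   -- a whole number or one of the index variable keywords
    current -- the current value of the index variable (1-based)
    count   -- how many instances of the repeatable element the document has
    """
    if isinstance(index, int):
        position = index
    else:
        keywords = {
            "current": current,
            "next": current + 1,       # does not modify current itself
            "previous": current - 1,   # does not modify current itself
            "first": 1,                # indexes start at 1, not 0
            "last": count,
        }
        if index not in keywords:
            raise ValueError(f"unknown index keyword: {index!r}")
        position = keywords[index]
    return position if 1 <= position <= count else None
```

With three slides and current equal to 3, resolve_index("next", 3, 3) returns None, which corresponds to a {## if element=slides.next.image} test failing on the last slide.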

next and previous do not change the value of the index, as was the case in versions of HTML Export prior to the 1.2 release. As a result, the only places where the index is changed are inside of a {## repeat} loop and as the result of a {## link} statement. For more information, see Loop: {## repeat}, Linking with Structured Breaking: {## link}, and Breaking Documents by Structure.

{## repeat…}

The initial value of the index variable for any given repeatable element typically is 1. For {## repeat} loops, the index is incremented with each iteration. Termination of a {## repeat} loop resets the counter to its initial value. Actually, it is more accurate to say that the scope of the index is the repeat loop.

The following template fragment uses current in a repeat loop, which outputs all the footnotes in the source file:

{## repeat element=footnotes}
{## insert element=footnotes.current.body}
{## /repeat}

When a template containing a repeat statement is the target of a {## link} statement that specifies the element to be used as the repeat element, the initial value of the index will be determined by the {## link} processing.

{## link…}

The {## link} statement does not affect the index variable in the context of the current template. The {## link} statement can only affect index variables when both an element and a template are specified. In this case only the index variables in the target for the specified element are affected.

If the element specified in the {## link} contains a next or previous keyword, the value of current in the target file will be affected. The initial value of current in the target will be the value of (current in the source)+1 for next. Similarly, previous has the effect of decrementing the value of current.

The following example uses a single template file and the {## link} macro to create a set of HTML files, one for each slide in a presentation. The {## link} does the dual job of driving the generation of the HTML files and providing a "next" link for navigation. Notice the use of the next keyword in the {## if} macro that checks to see if there is a next slide:

{## unit}
<html>
<body>
<!-- insert the current slide -->
{## insert element=slides.current.image width=300}
<hr />
<!-- Is there a next slide? -->
{## if element=slides.next.image}
   <!-- If yes, generate a URL to an HTML file containing
        the next slide. The HTML file is generated using
        the current template (because there is no template
        attribute). While generating the new HTML file, the
        value of the index on slides will be its current
        value plus 1. Once control returns to this template,
        the value of the index on slides is unchanged. -->
   <p><a href="{## link element=
   slides.next.image}">Next</a></p>
{## else}
   <!-- If no, create a link to the HTML containing the 
       first slide. -->
   <p><a href="{## link element=
   slides.1.image}">First</a></p>
{## /if}
</body>
</html>
{## /unit}
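The chain of files this template produces can be traced with a short simulation. The Python sketch below only models the linking logic of the template above; the file names (slide1.htm and so on) are invented for illustration, since the actual names of the output files are not determined by the template.

```python
from typing import Dict, List


def simulate_slide_pages(slide_count: int) -> List[Dict[str, object]]:
    """Model one output page per slide, each with a Next or First link."""
    pages: List[Dict[str, object]] = []
    for current in range(1, slide_count + 1):
        # Mirrors {## if element=slides.next.image}: is there a next slide?
        has_next = current + 1 <= slide_count
        target = current + 1 if has_next else 1
        pages.append({
            "file": f"slide{current}.htm",      # hypothetical file name
            "slide": current,                   # slides.current in this page
            "link_text": "Next" if has_next else "First",
            "link_to": f"slide{target}.htm",    # hypothetical file name
        })
    return pages
```

For a three-slide presentation the pages link 1 to 2, 2 to 3, and 3 back to 1, with the last page's link labeled "First".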

3.6 Units - Breaking Documents by Content Size

HTML Export has a system for breaking up documents. In addition to being able to break documents according to their structure, template writers can now break documents based on the amount of content to be placed in each output file or "page." Documents can even be broken based on both their structure and content size.

To break documents by content size, two things must be done. First, the SCCOPT_EX_PAGESIZE (pageSize with Transformation Server) option must be set (see the Options documentation for details). Second, the template must use the {## unit} construct.

The basic idea behind the unit template construct is to tell Export what things should be repeated on every "page" and what pieces should only be shown once. In other words, the unit template construct provides a mechanism for grouping template text and document elements. Unit boundaries are used when determining where to break the document when spanning pages.

Here are some examples of the kinds of things the template author might want to appear on every page:

  • The <meta> tag inserting the output document character set.

  • A company copyright message.

  • Navigational elements to link the previous/next pages together.

Typical examples of things that wouldn't go on every page would be:

  • The actual content of the document.

  • Structural navigational elements like the links for a table of contents.

A unit consists of a header, a footer (both of which are optional), and a body. Items that are to be repeated at the beginning or end of every unit should be placed in the header or footer respectively.

A unit is delimited by the {## unit} template macro. Similarly, the {## header} and {## footer} template macros delimit the header and footer respectively. The body is everything that is left between the header and the footer. The {## unit} macro must be the first macro in the template. The body frequently contains nested units. The body may be empty.

To ensure that the header is the first item in the template and the footer is the last item, text between the {## unit} tag and the {## header} tag will be ignored, as will text between the {## /footer} tag and the {## /unit} tag, including whitespace. The header and footer of a unit will be output in every page containing that unit, enclosing that portion of the unit's body that is able to fit in a particular page. The entire template is a unit that may contain additional units.

3.6.1 A Sample Size Breaking Template

By way of example, let's take another look at the very simple template from What Is a Template? To make things more interesting, let's insert the character set into the template with a <meta> tag. Let's also insert some better navigation to improve movement between the pages. The modified version of the template is as follows:

{## unit}{## header}
<html><head>
<meta HTTP-EQUIV="Content-Type" CONTENT="text/html;
charset={## insert element=pragma.charset}" /></head>
<body>
{## anchor aref="prev" format="<p><a href=\"%url\">Prev</a></p>"}
{## /header}
<p>Here is the document you requested.
{## insert element=property.title} by
{## insert element=property.author}</p>

<p>Below is the document itself</p>
{## insert element=body}
{## footer}
{## anchor aref="next" format="<p><a href=\"%url\">Next</a></p>"}
</body>
</html>
{## /footer}{## /unit}

A very small value (about 20 characters) is used for the page size option. The resulting HTML might look like this (HTML that is the result of a macro is in bold):

file1.htm

<html><head>
<meta HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=us-ASCII" /></head>
<body>
<p>Here is the document you requested.</p>
<p>A Poem by Phil Boutros</p>
<p><a href="file2.htm">Next</a></p>
</body>
</html>

file2.htm

<html><head>
<meta HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=us-ASCII" /></head>
<body>
<p><a href="file1.htm">Prev</a></p>
<p>Below is the document itself</p>
<p>Roses are red</p>
<p>Violets are blue</p>
<p><a href="file3.htm">Next</a></p>
</body>
</html>

file3.htm

<html><head>
<meta HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=us-ASCII" /></head>
<body>
<p><a href="file2.htm">Prev</a></p>
<p>I'm a programmer</p>
<p>and so are you</p>
</body>
</html>

There are several things to note:

  • The page size option value does not apply to the text from the template, only the text inserted from the source document. Each page contains roughly 20 characters of visible input document text.

  • The {## insert} of the character set is part of the {## header} and therefore is inserted into all the output pages.

  • Text from the body of the unit is inserted sequentially. Thus "as is" template text such as the line "<p>Below is the document itself</p>" is only inserted once.

  • The {## anchor} tags only insert links to the previous/next page if there actually is a previous/next page. Thus the first page does not have a link to the non-existent previous page.

  • Finally, the output of the document is split according to the page breaking rules.
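The way a header and footer wrap each portion of the body can be sketched in Python. This is a simplified model, not the page breaking rules HTML Export actually applies: paginate uses a plain greedy rule, render_pages is a hypothetical helper, the file names are invented, and no HTML escaping is performed.

```python
from typing import List


def paginate(paragraphs: List[str], page_size: int) -> List[List[str]]:
    """Greedily group document paragraphs so each page holds about page_size characters.

    Only source-document text is counted, never template text.
    """
    pages: List[List[str]] = []
    current: List[str] = []
    used = 0
    for text in paragraphs:
        # Start a new page when this paragraph would overflow a non-empty page.
        if current and used + len(text) > page_size:
            pages.append(current)
            current, used = [], 0
        current.append(text)
        used += len(text)
    if current:
        pages.append(current)
    return pages


def render_pages(pages: List[List[str]], header: str, footer: str) -> List[str]:
    """Wrap every page body in the same header and footer, adding Prev/Next links."""
    files = []
    for number, body in enumerate(pages, start=1):
        lines = [header]
        if number > 1:  # like {## anchor aref="prev" ...}: only if a previous page exists
            lines.append(f'<p><a href="file{number - 1}.htm">Prev</a></p>')
        lines.extend(f"<p>{text}</p>" for text in body)
        if number < len(pages):  # like {## anchor aref="next" ...}
            lines.append(f'<p><a href="file{number + 1}.htm">Next</a></p>')
        lines.append(footer)
        files.append("\n".join(lines))
    return files
```

The header and footer strings appear in every rendered page, while each body paragraph appears exactly once, which is the distinction the {## header}, {## footer} and body sections of a unit express.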

3.6.2 Templates Without {## unit} Macros

The {## unit} macro is only required in templates that are designed to break pages based on size using the SCCOPT_EX_PAGESIZE (pageSize with Transformation Server) option. An example of a template that would not perform any size-based breaking is one that defines an HTML <frame>, but does not include any document content. Another example where size-based breaking might not be desired is a table of contents page, even though a table of contents page does contain document content.

A template that does not conform to the {## unit} format is not a size-based breaking template. Support for this type of template will continue for the indefinite future. The template will be considered to not be a size-based breaking template if the first macro tag encountered is something other than {## unit}. This means that there cannot be any {## unit}, {## header} or {## footer} macros later in the template. The value of the SCCOPT_EX_PAGESIZE (pageSize with Transformation Server) option will be ignored for this type of template.
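The "first macro tag" rule lends itself to a quick pre-flight check on a template file. The Python sketch below is a rough approximation rather than HTML Export's own parser: is_size_breaking_template is a hypothetical helper, it recognizes only the {##} syntax, and treating macro names as case-insensitive is an assumption.

```python
import re

# Matches the opening of a {## ...} macro and captures its name, e.g. "unit" or "/if".
MACRO = re.compile(r"\{##\s*(/?\w+)")


def is_size_breaking_template(template_text: str) -> bool:
    """Return True if the first macro found in the template is {## unit}."""
    first = MACRO.search(template_text)
    # Assumption: macro names are compared without regard to case.
    return first is not None and first.group(1).lower() == "unit"
```

A template for which this check returns False would be processed without size-based breaking, whatever the page size option is set to.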

3.6.3 Indexes and Size-Based Breaking

All repeatable nodes have an associated index variable. For information about using index variable keywords such as next and last, see Indexes and Structure-Based Breaking. In addition to those index variable keywords, repeatable grid elements have four additional keywords. They are:

  • up

  • down

  • left

  • right

These keywords may only appear immediately after the grids node in the document tree. For example grids.up.body is legal, but sections.left.grids.1.body is not. Use of these keywords is otherwise self-explanatory.

Note too that individual grids are only addressable relative to each other. In other words, while it is possible to specify the "up" grid, it is not possible to arbitrarily specify a grid directly (for example, "5, 7").

3.7 Using Grids to Navigate Spreadsheet and Database Files

In order to support spreadsheets (and database files, though they are not as common), a new template-based navigation concept known as a "grid" has been introduced. Grids offer a way to consistently navigate a spreadsheet or database in an intuitive fashion.

Grids can be used to present the output of large spreadsheets in smaller pieces, so that less scrolling is necessary. They can also be used to help prevent the HTML versions of large spreadsheets from overwhelming browsers, potentially causing them to lock up. Grids can also be used to halt processing of large spreadsheets before they waste too much CPU time.

To use grids, the template author should use the new grid template element (see Element Definitions). Grids may only be used in templates that have been enabled with the {## unit} template macro. It is also important to set the grid-related options (see the Options documentation for details).

The grid support has some important limitations:

  1. The output file format and flavor are expected to support tables, although this is not required.

  2. Grids are only used when converting spreadsheets and database input files. Grids are not available for word processing files at this time.

  3. Due to size constraints, grid support works best if the contents of the cells in the input file do not make use of a lot of formatting (bold, special fonts, text color, etc.).

To further explain the grid system, consider a multi-sheet spreadsheet workbook as an example. Each sheet in the spreadsheet workbook is broken into a collection of grids. Each grid has a fixed maximum size and is a rectangular portion of the spreadsheet. The size of the grid is specified as a number of spreadsheet cells. For example, consider the following 7x10 spreadsheet:

Figure 3-3 7x10 Spreadsheet

This shows a 7x10 spreadsheet.

If the OEM wanted to break it up into 3x4 grids, 9 grids would be produced as shown in the following diagrams:

Figure 3-4 3x4 Grids

This shows the spreadsheet broken into 3x4 grids.

Normally, all grids have the same number of cells. The exception is that grids at the right or bottom edge of the spreadsheet may be smaller than the normal size. Grids will never be larger than the requested size. For this reason, grids can easily be navigated by using "up", "down", "left" or "right". One thing that grids cannot do is address individual cells in a spreadsheet (except, of course, in the degenerate case of a grid whose size is 1 x 1).
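The grid count and the smaller edge grids follow from simple ceiling arithmetic. The Python sketch below works through the 7x10 example; grid_sizes is a hypothetical helper, and it assumes only that the sheet and grid dimensions are given in the same order and that both grid dimensions are positive.

```python
from math import ceil
from typing import List, Tuple


def grid_sizes(sheet: Tuple[int, int], grid: Tuple[int, int]) -> List[List[Tuple[int, int]]]:
    """Return the dimensions of every grid, arranged as the grids tile the sheet."""
    sheet_a, sheet_b = sheet
    grid_a, grid_b = grid
    if grid_a <= 0 or grid_b <= 0:
        raise ValueError("this sketch requires positive grid dimensions")
    count_a = ceil(sheet_a / grid_a)  # number of grids along the first dimension
    count_b = ceil(sheet_b / grid_b)  # number of grids along the second dimension
    return [
        [
            # Edge grids shrink to whatever cells remain; none exceeds the requested size.
            (min(grid_a, sheet_a - i * grid_a), min(grid_b, sheet_b - j * grid_b))
            for j in range(count_b)
        ]
        for i in range(count_a)
    ]
```

For a 7x10 sheet and 3x4 grids this gives 3 grids along each dimension, 9 in total, with the far-corner grid reduced to 1x2.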

HTML Export does not force deck/page breaks between each grid. Therefore, if the template writer wants to limit each deck/page to only one grid, they should force the break in the template.

3.7.1 Grid Support When Tables Are Not Available

Not all output flavors supported by HTML Export support the creation of tables. If the output flavor does not support tables, HTML Export will still support grids. However, HTML Export's normal non-table output will be what is presented in grid form. For example, if "[A1]" represents the contents of cell A1, then we would export the following for a grid of size (2x2):

If grids.1.body is:

[A1]

[A2]

[B1]

[B2]

then grids.right.body is:

[C1]

[C2]

[D1]

[D2]

and grids.down.body is:

[A3]

[A4]

[B3]

[B4]
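The relationship between grids.1, grids.right and grids.down in this example can be reproduced with a short model. The Python sketch below is illustrative only: move and grid_cells are hypothetical helpers, grid positions are zero-based block coordinates invented for the sketch, no bounds checking is done, and columns beyond Z are not handled.

```python
from string import ascii_uppercase
from typing import List, Tuple

# (column step, row step) for each relative grid keyword.
MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


def move(position: Tuple[int, int], direction: str) -> Tuple[int, int]:
    """Return the block coordinates of the neighboring grid in the given direction."""
    d_col, d_row = MOVES[direction]
    return (position[0] + d_col, position[1] + d_row)


def grid_cells(position: Tuple[int, int], grid_cols: int = 2, grid_rows: int = 2) -> List[str]:
    """List the cell labels in a grid, column by column as in the example above."""
    block_col, block_row = position
    cells = []
    for col in range(block_col * grid_cols, (block_col + 1) * grid_cols):
        for row in range(block_row * grid_rows, (block_row + 1) * grid_rows):
            cells.append(f"{ascii_uppercase[col]}{row + 1}")
    return cells
```

Starting from the first grid at (0, 0), grid_cells(move((0, 0), "right")) lists C1, C2, D1, D2 and grid_cells(move((0, 0), "down")) lists A3, A4, B3, B4, matching the output shown above.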

3.8 Choosing a Template

Through the use of templates, HTML Export users have infinite flexibility in the way they can present converted documents. Users typically use one of the following four strategies to select a template:

  1. The simplest method is to use the internal template, which is built into HTML Export. This is the template used when the SCCOPT_EX_TEMPLATE (using Transformation Server, template) option is not set. This template produces a very basic, rudimentary presentation of the input document. The template is an external approximation of this internal document.

  2. There are also sample templates shipped with HTML Export. These templates are designed to meet different needs for HTML Export users (polished navigation, simple HTML for document indexing engines, etc.).

  3. With a bit more effort, the user can modify one of the sample templates shipped with HTML Export. Simple changes, such as adding graphics or static text, should be easily accomplished by someone with a willingness to experiment with these templates.

  4. Advanced users may choose to write a template of their own design, customized specifically to their needs. Such templates can incorporate elements from a wide range of Web standards, such as Java. Needless to say, users who go this route should have strong technical skills at the outset. They should begin the process of creating templates by reading through this chapter in its entirety and looking at the template tutorial.

3.9 Unicode Templates

For non-Unicode templates, the content of the template is copied byte for byte to the output files as needed. Of particular note is the fact that no character mapping takes place on the text in the template file. However, this can create problems when the source input document overrides the requested SCCOPT_EX_OUTPUTCHARACTERSET (using Transformation Server, outputCharacterSet) option setting. To solve this problem, users may use templates written in Unicode.

In order for HTML Export to know that a template is encoded in Unicode, the template file must begin with the Unicode Byte Order Mark (BOM). All files beginning with the BOM are assumed to be encoded in Unicode. HTML Export automatically converts Unicode templates to the output character set as needed.
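A template author can check whether a file begins with a BOM before handing it to HTML Export. The Python sketch below assumes that "Unicode" here means UTF-16, which this chapter does not state explicitly; starts_with_unicode_bom is a hypothetical helper, not part of the SDK.

```python
import codecs

# Assumption: a "Unicode" template is UTF-16, in either byte order.
UTF16_BOMS = (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)


def starts_with_unicode_bom(data: bytes) -> bool:
    """Return True if the raw template bytes begin with a UTF-16 byte order mark."""
    return data.startswith(UTF16_BOMS)
```

Reading the first two bytes of the template file in binary mode and passing them to this function is enough to perform the check.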