The Output Pages section describes how to set options for general HTML output, as well as for Markup Items, Text Formats, and Page Layouts.
The following topics are covered:
The top-level Output Pages page allows you to set general options for HTML output.
This option can be set to the following values:
none - only HTML tags will be used for formatting.
embedded - CSS styles will be included in the head of each output file.
external - CSS styles will be output in a separate file, which is referenced by all output files generated during the conversion.
inline - CSS style information will be included in every paragraph. This option can drastically increase the size of the output file, but is necessary when the template is used to generate fragments.
By default, the CSS is embedded in the HTML of each output file. This reduces the total number of files generated by the conversion, as there is no need to create a separate CSS file. Note that if a style is needed anywhere in the final output of the conversion, it will appear in the style definitions of every HTML file.
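As a rough illustration of how the four values differ, the sketch below renders one bold paragraph under each mode. It is not the markup the product actually emits: the function name `render_paragraph`, the class name `p1`, and the file name `styles.css` are all hypothetical and chosen only for this example.

```python
from html import escape


def render_paragraph(text, css_mode, class_name="p1", rule="font-weight: bold"):
    """Return (head_fragment, body_fragment) for one paragraph under a CSS mode.

    Hypothetical helper: names and output shape are assumptions for illustration.
    """
    body_text = escape(text)
    if css_mode == "none":
        # No CSS at all: formatting falls back to plain HTML tags.
        return "", f"<p><b>{body_text}</b></p>"
    if css_mode == "embedded":
        # Style definitions live in the <head> of every output file.
        head = f"<style>.{class_name} {{ {rule} }}</style>"
        return head, f'<p class="{class_name}">{body_text}</p>'
    if css_mode == "external":
        # One shared stylesheet is referenced from every output file.
        head = '<link rel="stylesheet" href="styles.css">'
        return head, f'<p class="{class_name}">{body_text}</p>'
    if css_mode == "inline":
        # The style travels with the paragraph, so a fragment needs no <head>.
        return "", f'<p style="{rule}">{body_text}</p>'
    raise ValueError(f"unknown CSS mode: {css_mode}")


for mode in ("none", "embedded", "external", "inline"):
    print(mode, render_paragraph("Quarterly results", mode))
```

Only the inline mode produces a body fragment that carries its own styling with no head content, which is why it suits templates that generate fragments.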
If this option is left blank, the system will generate CSS styles based on how elements are defined under Text Element.
This option is ignored if the CSS generation option above is set to none.
The HTML standards currently limit documents to a single output character set. That character set is specified in an output file using the CONTENT attribute of the <meta> tag. This limits what the technology can do with documents that have multiple character sets. In general, documents that are a mix of a single Asian language and English characters will translate correctly (although with some possible loss of non-alphanumeric characters) if the appropriate DBCS, UTF-8, or Unicode output character set is selected. This is because most DBCS character sets include the standard 7-bit ASCII characters. Documents that contain more than one DBCS character set or a DBCS character set and a non-English character set (such as Cyrillic) may not export with all the character glyphs intact unless Unicode or UTF-8 is used.
Source documents that contain characters from many character sets will look best when this option is set to the default, Unicode (UTF-8). This is because the Unicode and UTF-8 character sets contain almost all characters for the most common languages.
While the W3C recommends using Unicode, it currently has some drawbacks. Not all systems have the fonts needed to display Unicode or UTF-8 text, and many editors do not understand these character sets. In addition, browsers differ in how they interpret the byte order of 2-byte Unicode characters (which is why both big-endian and little-endian Unicode are available settings for this option).
Note that when the CSS generation option described above has been set, there will be no unmappable characters in the HTML. Instead, the unmapped character will be written out in &#....; notation using the decimal representation of the character's Unicode value. Newer browsers support this representation and will convert it to the appropriate character if it is available in the font being used. If the character is not available in that font, the browser's unmappable character symbol (typically a rectangular box) will be seen. Also, note that there may still be unmapped characters in text rendered to graphics. This is because the graphic file is generated at conversion time rather than being rendered by the browser.
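The decimal &#....; notation described above can be reproduced with Python's standard `xmlcharrefreplace` error handler. This is a minimal sketch of the notation only, not of the product's conversion code. The helper name `to_charset` is hypothetical, and the sketch does not escape HTML special characters such as < or &.

```python
def to_charset(text, charset):
    """Encode text for an output character set, writing any character the
    set cannot represent as a decimal numeric character reference."""
    return text.encode(charset, errors="xmlcharrefreplace").decode(charset)


# "é" exists in Latin-1, so it is kept as a single character there.
print(to_charset("é", "latin-1"))
# The same character has no ASCII mapping, so it becomes a reference.
print(to_charset("é", "ascii"))
# Cyrillic "П" (U+041F, decimal 1055) has no Latin-1 mapping either.
print(to_charset("П", "latin-1"))
```

A browser that reads the resulting reference looks up the character by its Unicode value and displays it if the active font contains a glyph for it.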
GIF
JPEG (default)
PNG
None
When setting this option, remember that the JPEG file format does not support transparency. Though the GIF file format supports transparency, it is limited to using only one of its 256 available colors to represent a transparent pixel ("index transparency").
PNG supports many types of transparency. The PNG files written are created so that various levels of transparency are possible for each pixel. This is achieved through the implementation of an 8-bit "alpha channel". However, at this time the technology will ignore transparency data when metafiles with multiple layers are converted.
While this option is used to help compute table sizes, it is primarily a graphics option. Early browsers and versions of the HTML standard limit the specification of image sizes to dimensions in pixels. For images in particular, this is somewhat natural as GIF, JPEG, BMP, and PNG are bitmap formats whose sizes are defined in pixels. However, many of the source graphics and tables converted specify their size in physical units such as inches or centimeters, and there is no way to know how big a pixel is on the target device for the converted document. In fact, a single document may ultimately be viewed on many devices, each with a different number of pixels or dots per inch (DPI). If graphics are converted too small, image detail will be lost. Conversely, if the graphics are converted too large, conversion times will suffer and files will take longer to download.
Setting this option to 0 may result in the creation of extremely large images. Be aware that there may be limitations in the system running this technology that could result in undesirably large bandwidth consumption or an error message. Additionally, an out-of-memory error message will be generated if system memory is insufficient to handle a particularly large image.
Also note that setting this option to 0 will force the technology to use the DPI settings already present in raster images. For any other type of input file, the current screen resolution will be used as the DPI setting.
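The trade-off between detail and file size follows from simple arithmetic: a size in inches multiplied by the DPI gives the size in pixels. The sketch below works through that calculation. The function name `pixel_size` and the sample values (a 3 x 2 inch graphic at 96 and 300 DPI) are assumptions chosen for illustration, not product defaults.

```python
def pixel_size(width_inches, height_inches, dpi):
    """Convert a physical size to whole pixels at a given dots-per-inch value."""
    return round(width_inches * dpi), round(height_inches * dpi)


low = pixel_size(3.0, 2.0, 96)    # typical screen-oriented resolution
high = pixel_size(3.0, 2.0, 300)  # print-oriented resolution

print(low, high)
# Pixel count grows with the square of the DPI ratio.
print((high[0] * high[1]) / (low[0] * low[1]))
```

Roughly tripling the DPI here produces almost ten times as many pixels, which is why an overly high setting slows conversion and increases download size.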
quick
smooth
grayscale
Each of these options involves some degree of trade-off between the quality of the resulting image and the speed of conversion.
No setting - This means that no target attribute will be included in links from the source document.
_self - This means that the document is loaded in the same frame as the element that refers to this target (essentially the same as not specifying a target at all).
_parent - This means that the document is loaded into the immediate FRAMESET parent of the current frame. This value is equivalent to _self if the current frame has no parent.
_top - This means that the document is loaded into the full, original window (thus canceling all other frames). This value is equivalent to _self if the current frame has no parent.
_blank - This means that links are opened in a new, unnamed window.
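The effect of each value on the generated markup can be sketched as follows. The helper `render_link` is hypothetical and only illustrates where the target attribute would appear in a link; it is not the product's own output routine.

```python
from html import escape


def render_link(href, text, target=None):
    """Build an <a> element, adding a target attribute only when one is set."""
    attrs = f'href="{escape(href, quote=True)}"'
    if target:
        # "No setting" corresponds to target=None: the attribute is omitted.
        attrs += f' target="{escape(target, quote=True)}"'
    return f"<a {attrs}>{escape(text)}</a>"


print(render_link("chapter2.html", "Next chapter"))
print(render_link("chapter2.html", "Next chapter", target="_blank"))
```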
It is important to note the things that setting this option does not do:
While setting this option will make it easier for a human to read the generated markup in a text editor, it does not affect the browser's rendering of the document.
This option does not affect the contents of the .css files, since they do not contain any text from the source document.
The option does not affect spaces or newlines copied from the template, as the contents of the template are already under the control of the user.
Markup items are HTML fragments that may be inserted directly into the output HTML as part of a page layout. Each markup item is a name/value pair. The name is what will appear in the screens for editing page layouts. The value is a block of HTML that will be inserted into the output HTML wherever the markup item appears in a page layout. There is one default name in this section, break, whose value is defined as <br />.
Output text formats define text and formatting attributes of output document text. These formats will define such attributes as the font family, size, and color, standard text attributes (bold, italic, underline, etc.) and border attributes. This allows the template author to standardize the look of the output despite differing formatting styles used by the various authors of the source documents. There is one default format in this section, Default Paragraph, whose tag is p. Output Text Formats created here can then be organized according to Format Mapping Rules, which pick the formatting based on checking the type of source document text.
Note:
Users should be aware that text formats are only applied to text from word processing files. They cannot be used to change the formatting of text that is rendered as part of any graphics generated by the conversion. They are also not applied to text inside spreadsheets.
Always off - Forces the attribute to always be off when formatting the text.
Always on - Forces the attribute to always be on when formatting the text.
Inherit (default) - Takes the state of the attribute from the source document. In other words, if the source document had the text rendered with bold, then the technology will create bold text.
Do not specify - Leave the formatting unspecified. In certain cases, this will produce different HTML than Always off.
There are also three font settings: Font family, Size, and Color. These are only available when you set them to Always on. The defaults for these three settings are Arial, 12pt, and 000000 (hexadecimal for black).
Border style - The default is None, and you can select one of the allowable border styles from the drop-down box: dotted, dashed, solid, double, groove, ridge, inset, outset.
Border color - The default is 000000 (hexadecimal for black), and you can specify a color in a valid CSS format.
Border width - The default is 1pt, and you can specify a border width in a valid CSS format.
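The three border settings correspond to the standard CSS border shorthand, which takes a width, a style, and a color. The sketch below shows one plausible way to combine them. The function `border_css` is hypothetical, and the exact CSS the product writes is not documented here.

```python
def border_css(style="none", color="#000000", width="1pt"):
    """Combine the three border settings into one CSS declaration.

    Hypothetical helper: returns an empty string for the default style (None),
    on the assumption that no border rule is needed in that case.
    """
    if style.lower() == "none":
        return ""
    return f"border: {width} {style} {color};"


print(border_css())                     # default: no border declaration
print(border_css("solid"))              # default color and width
print(border_css("dashed", "#336699", "2px"))
```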
Format mapping rules allow you to specify output document formatting and the sequence in which rules are checked.
For example, a rule may be created for mapping paragraphs in the My Style style. Below that rule, another rule may exist for mapping paragraphs with outline level 1 applied. An input document may have one or more paragraphs in the My Style style that also have outline level 1 applied. In this example the technology will only apply the My Style formatting to such paragraphs and ignore the outline level 1 rule for them.
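The ordering behavior in this example amounts to a first-match-wins search over the rule list. The following sketch models it under stated assumptions: the `Paragraph` and `Rule` classes, the format names, and the fallback to Default Paragraph when no rule matches are all hypothetical and exist only to illustrate the order of evaluation.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Paragraph:
    style_name: str = ""
    outline_level: Optional[int] = None


@dataclass
class Rule:
    matches: Callable[[Paragraph], bool]
    output_format: str


def pick_format(paragraph, rules, default="Default Paragraph"):
    """Return the format of the first rule that matches; later rules are ignored."""
    for rule in rules:
        if rule.matches(paragraph):
            return rule.output_format
    return default  # assumed fallback when nothing matches


# Rules are checked top to bottom, as in the example above.
rules = [
    Rule(lambda p: p.style_name == "My Style", "My Style Format"),
    Rule(lambda p: p.outline_level == 1, "Heading 1 Format"),
]

both = Paragraph(style_name="My Style", outline_level=1)
print(pick_format(both, rules))  # the My Style rule wins because it is listed first
```

Swapping the two rules in the list would make the outline level 1 rule win for the same paragraph, so rule order is part of the template's behavior.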
Outline level - Match the outline level specified in the source document. Application-predefined "heading" styles typically have corresponding outline levels applied as part of the style definition.
Style name - Match the paragraph or character style name.
Is footnote - Match any footnote.
Is endnote - Match any endnote.
Is header - Match any document header text.
Is footer - Match any document footer text.
This example shows how to map the heading styles found in the source document to CSS styles for the HTML display.
Page layouts are used to organize how the various parts of the output are arranged. This includes such items as where to place a Table of Contents in the output document. A default layout has been provided for users who need output that is pleasing to the eye, but are not particular about the details of their output.
Users may create multiple page layouts, each optimized for a specific file type. The Document Handling page allows you to specify which page layout to use.
Section Name - Use the title for the current document section. Section titles are not available in every document format; word processing files, for example, do not have them. Two examples of where this is very useful are presentations, where this corresponds to the slide title; and spreadsheets/database files, where this corresponds to the sheet name. Using the section name works well with output layouts that place one slide/sheet in each output HTML page. In this situation, each page would have a title that matches the title of its contents. If the page layout does not break the document by sections, the name of the first section will be used as the title text.
Text Element - Use a text element already defined under Text Element. Using a text element for the title makes a good fail-safe entry at the bottom of the list, just in case all other title sources are undefined/unavailable.
Property - Use any document property already defined under Document Property.
Output Text Format - Use an output paragraph format already defined under Output Text Formats. The first non-empty instance of a paragraph in this format is used.
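Because title sources are tried in list order, with a text element serving as the fail-safe at the bottom, the selection can be pictured as taking the first source that yields a non-empty value. The sketch below assumes that behavior. The function `resolve_title` and the sample values are hypothetical.

```python
def resolve_title(sources):
    """Return the first non-empty title from an ordered list of (name, value) pairs."""
    for _name, value in sources:
        if value and value.strip():
            return value.strip()
    return ""  # nothing was available, so the title is left empty


# A word processing file: no section name, and the property is blank.
sources = [
    ("Section Name", None),
    ("Property", "   "),
    ("Text Element", "Untitled Document"),  # fail-safe entry at the bottom
]
print(resolve_title(sources))
```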
Only navigation items and markup can be inserted into the body of the navigation layout. Page navigation is not supported in the navigation layout; it is intended for use with document and section layout items.
Use of a navigation layout is optional. If one is used, you can specify markup items to be placed in either the Head and/or the Body.
For the Head item, you can define the content that will be placed in the HTML <head> of all of the output files this layout applies to. The following items are placed by default in the head:
A <meta> tag stating the character set in which the HTML file is encoded.
A <title> tag, the contents of which are defined in the Title Source page.
A <meta> tag stating that the HTML was generated.
<meta> tags for all document properties defined under Document Property that specify a Meta tag name.
If the CSS generation option was selected on Output Pages, then CSS style definitions generated by the technology are included, either in a <style> tag (for embedded CSS) or with a <link> tag to a CSS file generated by the conversion.
If the External CSS stylesheet option was selected on Output Pages, then an HTML <link> tag to the user specified CSS file is included.
Also, any markup items previously defined under Output Markup Items can be inserted here. The markup items specified on this screen will appear in the head after all of the auto-generated items listed above. By default, there is nothing listed here.
For the Body item, you can define the content that will be placed in the body of the navigation page created by this layout. The following items may be placed in the body:
Markup Item - A text and HTML markup item defined under Output Markup Items.
Navigation Element - A navigation element defined under Add Navigation Element.
Break on sections - If checked, enables page breaking on document sections. Only applicable for multi-section documents.
Break on pages - If checked, enables page breaking based on the document-specific pagination options.
A <meta> tag stating the character set in which the HTML file is encoded.
A <title> tag, the contents of which are defined in the Title Source page.
A <meta> tag stating that the HTML was generated.
<meta> tags for all document properties defined under Document Property that specify a Meta tag name.
If the CSS generation option was selected on Output Pages, then CSS style definitions generated by the technology are included.
If the External CSS stylesheet option was selected on Output Pages, then an HTML <link> tag to the user specified CSS file is included.
Also, any markup items previously defined under Output Markup Items can be inserted here. The markup items specified on this screen will appear in the head after all of the auto-generated items listed above. By default, there is nothing listed here.
Header - The header from the input document.
Navigation Item - A navigation item defined on Add Navigation Element.
Document Property - A document property defined on Document Property.
Markup Item - A text and HTML markup item defined on Output Markup Items.
Text Element - A text element defined on Text Element.
Document Property - A document property defined on Document Property.
Markup Item - A text and HTML markup item defined on Output Markup Items.
Text Element - A text element defined on Text Element.
Section Name - The name of the current section. If the name of the current section is not specified in the source document or is undefined (such as in word processing documents), then nothing will be inserted. Adding this type of item brings up a simple screen where the author selects which Output Format to use for the element.
Document Property - A document property defined on Document Property.
Markup Item - A text and HTML markup item defined on Output Markup Items.
Text Element - A text element defined on Text Element.
Header - The header from the input document.
Footer - The footer from the input document.
Document Property - A document property defined on Document Property.
Markup Item - A text and HTML markup item defined on Output Markup Items.
Text Element - A text element defined on Text Element.
Footer - The footer from the input document.
Navigation Item - A navigation item defined on Add Navigation Element.
Document Property - A document property defined on Document Property.
Markup Item - A text and HTML markup item defined on Output Markup Items.
Text Element - A text element defined on Text Element.