1 Introduction

This chapter is an introduction to HTML Export, a powerful SDK that allows an OEM to translate almost any document, spreadsheet, presentation, or graphic into high quality HTML.

Note:

For information about new functionality, see the What's New guide.

HTML Export's primary goal is producing faithful representations of source files using the HTML, GIF, JPEG and PNG formats. Using a C, Java or .NET API, the developer can set various options that affect the content and structure of the output.

This manual may refer to other Outside In Technology SDKs. To obtain complete documentation for any other Outside In product, see the Middleware documentation page and follow the Outside In Technology link.

This chapter includes the following sections:

  • Architectural Overview

  • Definition of Terms

  • Directory Structure

  • How to Use HTML Export

1.1 Architectural Overview

The basic architecture of Outside In technologies is the same across all supported platforms.

Filter/Module Description

Input Filter

The input filters form the base of the architecture. Each one reads a specific file format or set of related formats and sends the data to Outside In Technology (OIT) through a standard set of function calls. More than 150 of these filters exist, and together they read more than 600 distinct file formats. Filters are loaded on demand by the Data Access module.

Export Filter

Architecturally similar to input filters, export filters know how to write out a specific format based on information coming from the chunker module. The export filters generate HTML, GIF, JPEG, and PNG.

Chunker

The Chunker module is responsible for caching a certain amount of data from the filter and returning this data to the export filter.

Export

The Export module implements the export API and understands how to load and run individual export filters.

Data Access

The Data Access module implements a generic API for access to files. It understands how to identify and load the correct filter for all the supported file formats. The module delivers to the developer a generic handle to the requested file, which can then be used to run more specialized processes, such as the Export process.
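The flow between these modules can be pictured with a small toy model. The sketch below is purely illustrative and is not the SDK's API: every name in it (text_input_filter, chunker, html_export_filter, DataAccess) is hypothetical, and the "filter" only splits plain text into paragraphs. It shows the three ideas described above: a filter loaded on demand, a chunker that buffers filter output, and an export filter that writes the target format.

```python
from html import escape


def text_input_filter(raw):
    """Toy input filter: yields one non-empty paragraph at a time."""
    for line in raw.splitlines():
        if line.strip():
            yield line.strip()


def chunker(items, chunk_size):
    """Buffer up to chunk_size items from the filter, then hand them on."""
    chunk = []
    for item in items:
        chunk.append(item)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


def html_export_filter(chunks):
    """Toy export filter: writes each buffered paragraph as an HTML <p>."""
    parts = []
    for chunk in chunks:
        parts.extend(f"<p>{escape(paragraph)}</p>" for paragraph in chunk)
    return "\n".join(parts)


class DataAccess:
    """Toy stand-in for the Data Access module (hypothetical interface)."""

    def __init__(self, filter_factories):
        self._factories = filter_factories  # format name -> factory callable
        self.loaded = {}                    # filters are created only when needed

    def open(self, fmt, raw):
        if fmt not in self.loaded:          # load the filter on first use
            self.loaded[fmt] = self._factories[fmt]()
        return self.loaded[fmt](raw)


data_access = DataAccess({"text": lambda: text_input_filter})
html = html_export_filter(chunker(data_access.open("text", "a & b\n\nc"), 2))
```

In the real product the filters are compiled modules and the data passed between them is far richer than paragraphs of text; the model only mirrors the direction of the data flow.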

1.2 Definition of Terms

The following terms are used in this documentation.

Term Definition

Developer

Someone integrating this technology into another technology or application. Most likely this is you, the reader.

Source File

The file the developer wishes to export.

Output File

The file or files being written. Output can include HTML, CSS, JavaScript, GIF, JPEG, and PNG files.

Page

A unit of text and its associated graphics that together make up one page of output. Pages have suggested lengths, but the actual length may be greater or smaller than the suggested value. Page sizes count only the bytes of visible text in the document, not markup.

Data Access Module

The core of Outside In Data Access, in the SCCDA library.

Data Access Submodule (also referred to as "Submodule")

This refers to any of the Outside In Data Access modules, including SCCEX (Export), but excluding SCCDA (Data Access).

Note: HTML Export normally comes with only the SCCEX Submodule.

Document Handle (also referred to as "hDoc")

A Document Handle is created when a file is opened using Data Access (see Data Access Common Functions). Each Document Handle may have any number of Subhandles.

Subhandle (also referred to as "hItem")

Any of the handles created by a Submodule's Open function. Every Subhandle has a Document Handle associated with it. For example, the hExport returned by EXOpenExport is a Subhandle. The DASetOption and DAGetOption functions in the Data Access Module may be called with any Subhandle or Document Handle. The DARetrieveDocHandle function returns the Document Handle associated with any Subhandle.
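The relationship between Document Handles and Subhandles can be modeled in a few lines. This is a hypothetical data-structure sketch, not the SDK's implementation: the class and function names are invented for illustration, and real handles are opaque values managed by the Data Access Module. It also does not model how an option set on a Document Handle affects its Subhandles.

```python
class DocumentHandle:
    """Stands in for an hDoc: one opened source file."""

    def __init__(self, source_path):
        self.source_path = source_path
        self.subhandles = []  # one Document Handle may have any number of Subhandles
        self.options = {}


class Subhandle:
    """Stands in for an hItem, such as the hExport returned by EXOpenExport."""

    def __init__(self, doc, submodule):
        self.doc = doc              # every Subhandle has an associated Document Handle
        self.submodule = submodule  # for example "SCCEX"
        self.options = {}
        doc.subhandles.append(self)


def retrieve_doc_handle(subhandle):
    """Models the lookup that DARetrieveDocHandle performs."""
    return subhandle.doc


def set_option(handle, option_id, value):
    """Accepts either kind of handle, as DASetOption and DAGetOption do."""
    handle.options[option_id] = value


h_doc = DocumentHandle("report.doc")  # hypothetical file name
h_export = Subhandle(h_doc, "SCCEX")
```

Here retrieve_doc_handle(h_export) returns h_doc, mirroring the statement that the Document Handle can always be recovered from a Subhandle.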

1.3 Directory Structure

Each Outside In product has an sdk directory, under which there is a subdirectory for each platform on which the product ships (for example, hx/sdk/hx_win-x86-32_sdk). Under each of these directories are the following two subdirectories:

  • redist - Contains only the files that the customer is allowed to redistribute. These include all the compiled modules, filter support files, .xsd and .dtd files, cmmap000.bin, and third-party libraries, like freetype.

  • sdk - Contains the other subdirectories that used to be at the root level of an SDK: common, lib (Windows only), resource, samplefiles, and samplecode (previously samples). In addition, a new subdirectory, demo, holds all of the compiled sample applications and other files needed to demo the products. These are files that the customer should not redistribute (.cfg files, exportmaps, and so on).

In the root platform directory (for example, hx/sdk/hx_win-x86-32_sdk), there are two files:

  • README - Explains the contents of the sdk, and that makedemo must be run in order to use the sample applications.

  • makedemo (.bat or .sh, depending on the platform) - This script either copies (on Windows) or symlinks (on UNIX) the contents of …/redist into …/sdk/demo, so that the sample applications can be run from the demo directory.
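To make the effect of makedemo concrete, the following sketch reproduces its described behavior in Python. It is not the shipped script: the function name make_demo and its parameters are hypothetical, and the choice to skip entries that already exist in the demo directory is an assumption, since the text above does not say how name clashes are handled.

```python
import shutil
import sys
from pathlib import Path


def make_demo(platform_root, use_symlinks=None):
    """Place the contents of redist into sdk/demo and return the names placed."""
    root = Path(platform_root)
    redist = root / "redist"
    demo = root / "sdk" / "demo"
    demo.mkdir(parents=True, exist_ok=True)

    if use_symlinks is None:
        # Copy on Windows, symlink elsewhere, as described for makedemo.
        use_symlinks = not sys.platform.startswith("win")

    placed = []
    for entry in sorted(redist.iterdir()):
        target = demo / entry.name
        if target.exists() or target.is_symlink():
            continue  # assumption: leave existing demo files untouched
        if use_symlinks:
            target.symlink_to(entry.resolve(), target_is_directory=entry.is_dir())
        elif entry.is_dir():
            shutil.copytree(entry, target)
        else:
            shutil.copy2(entry, target)
        placed.append(entry.name)
    return placed
```

Calling make_demo on a platform directory (for example, one laid out like hx/sdk/hx_win-x86-32_sdk) would leave the sample applications in sdk/demo sitting next to the redistributable modules they need to load.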

1.3.1 Installing Multiple SDKs

If you install more than one OIT SDK, you must copy files from the secondary installations into the top-level OIT SDK directory as follows:

  • redist – copy all binaries into this directory.

  • sdk – this directory has several subdirectories: common, demo, lib, resource, samplecode, samplefiles. In each case, copy all of the files from the secondary installation into the top-level OIT SDK subdirectory of the same name. If the top-level OIT SDK directory lacks any directories found in the directory being copied from, just copy those directories over.
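The merge described above can be scripted. The sketch below is one possible approach, assuming Python 3.8 or later (for the dirs_exist_ok argument of shutil.copytree). The function name merge_secondary_sdk is hypothetical, and the sketch overwrites files of the same name in the top-level installation, which is an assumption; the text does not state which copy should win.

```python
import shutil
from pathlib import Path


def merge_secondary_sdk(secondary_root, primary_root):
    """Copy redist and sdk from a secondary installation into the top-level one.

    Subdirectories missing from the top-level installation are created;
    files with the same name are overwritten (an assumption of this sketch).
    Returns the list of top-level directories that were merged.
    """
    secondary = Path(secondary_root)
    primary = Path(primary_root)
    merged = []
    for name in ("redist", "sdk"):
        source = secondary / name
        if source.is_dir():
            # dirs_exist_ok=True merges into existing directories (Python 3.8+).
            shutil.copytree(source, primary / name, dirs_exist_ok=True)
            merged.append(name)
    return merged
```

Because copytree recurses, the same call covers both cases in the list: files are copied into subdirectories that already exist (common, demo, and so on), and any subdirectory that the top-level installation lacks is copied over whole.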

1.4 How to Use HTML Export

Here's a step-by-step overview of how to export a source file to HTML.

  1. Call DAInitEx to initialize the Data Access technology. This function needs to be called only once per application. If the application uses threading, pass in the appropriate ThreadOption value.
  2. Set any options that require a NULL handle type (optional). Certain options need to be set before the desired source file is opened. These options are identified by requiring a NULL handle type. They include, but aren't limited to:
    • SCCOPT_FALLBACKFORMAT

    • SCCOPT_FIFLAGS

    • SCCOPT_TEMPDIR

    • SCCOPT_EX_CALLBACKS

    • SCCOPT_EX_UNICODECALLBACKSTR

    • SCCOPT_UNMAPPABLECHAR

  3. Open the source file. DAOpenDocument is called to create a document handle that uniquely identifies the source file. This handle may be used in subsequent calls to the EXOpenExport function or the open function of any other Data Access Submodule, and will be used to close the file when access is complete. This allows the file to be accessed from multiple Data Access Submodules without reopening.
  4. Set the Options. If you require option values other than the default settings, call DASetOption to set options. Note that options listed in the Options chapter as having "Handle Types" that accept VTHEXPORT may be set any time before EXRunExport is called. For more information on options and how to set them, see DASetOption.
  5. Open a Handle to HTML Export. Using the document handle, EXOpenExport is called to obtain an export handle that identifies the file to the specific export product. This handle will be used in all subsequent calls to the specific export functions. The dwOutputId parameter of this function is used to specify that the output file type should be set to FI_HTML, FI_MHTML or FI_XHTML.
  6. Make Any Required Calls to Annotation Functions. This is the point at which any calls to annotation functions (such as EXHiliteText, EXInsertText or EXHideText) should be made.
  7. Export the File. EXRunExport is called to generate the output file(s) from the source file.
  8. Close the Handle to HTML Export. EXCloseExport is called to terminate the export process for the file. After this function is called, the export handle will no longer be valid, but the document handle may still be used.
  9. Close the Source File. DACloseDocument is called to close the source file. After calling this function, the document handle will no longer be valid.
  10. Close HTML Export. DADeInit is called to de-initialize the Data Access technology.
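The steps above nest: each "open" call is paired with a "close" call, and the closes happen in the reverse order of the opens. The sketch below models only that ordering. It does not call the SDK; the function names appear as log strings, and the helper names paired_step and export_call_order are hypothetical.

```python
from contextlib import contextmanager


@contextmanager
def paired_step(open_name, close_name, log):
    """Record an open call now and its matching close call on exit."""
    log.append(open_name)
    try:
        yield
    finally:
        log.append(close_name)


def export_call_order():
    """Return the order of calls for one export, following the steps above."""
    log = []
    with paired_step("DAInitEx", "DADeInit", log):                   # steps 1 and 10
        log.append("DASetOption (NULL handle)")                      # step 2, optional
        with paired_step("DAOpenDocument", "DACloseDocument", log):  # steps 3 and 9
            log.append("DASetOption (after open)")                   # step 4, optional
            with paired_step("EXOpenExport", "EXCloseExport", log):  # steps 5 and 8
                log.append("annotation calls")                       # step 6, optional
                log.append("EXRunExport")                            # step 7
    return log


if __name__ == "__main__":
    for call in export_call_order():
        print(call)
```

The try/finally in paired_step reflects a common design choice for handle-based APIs: if a later step fails, the handles that were already opened are still closed in the correct order. Whether and how to continue after a failure in the real API depends on the error codes each function returns.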