XML Export .NET Classes

This chapter describes the XML Export .NET classes.

The following classes are covered:

ArchiveNode Class

ArchiveNode provides information about an archive node. This is a read-only class where the technology fills in all the values.

Namespace

OutsideIn

Properties

Callback Class

Callback messages are notifications that come from Outside In during the export process, providing information and sometimes the opportunity to customize the generated output.

Namespace

OutsideIn

To access callback messages, your code must create an object that inherits from Callback and pass it through the API’s SetCallbackHandler method. Your object can implement methods that override the default behavior for whichever methods your application is interested in.

Callback has four methods: OpenFile, CreateNewFile, NewFileInfo, and CreateTempFile.

OpenFile

OpenFileResponse OpenFile(
      FileTypeValue fileType,
      string fileName
)

This callback is made any time a new file needs to be opened.

Parameters

Return Value

To take action in response to this method, return an OpenFileResponse object.

FileTypeValue Enumeration

This enumeration defines the type of file being requested to be opened. Its value may be one of the following:

OpenFileResponse Class

This is a class to define a new file or stream object in response to an OpenFile callback.

Constructor
OpenFileResponse(FileInfo file)

File: File object with full path to the new file.

OpenFileResponse(Stream file)

File: A stream to which the file data will be written.

CreateNewFile

CreateNewFileResponse CreateNewFile(
      FileFormat ParentOutputId,
      FileFormat OutputId,
      Association Association,
      string Path
)

This callback is made any time a new output file needs to be generated. This gives the developer the chance to affect where the new output file is created, how it is named, and the URL (if any) used to reference the file.

Parameters

Return Value

To take action in response to this notification, return a CreateNewFileResponse object with the new file information. If you wish to accept the defaults for the path and URL, you may return null.

CreateNewFileResponse Class

This is a class to define a new output file location in response to a CreateNewFile callback. If you do not wish to change the path to the new output file, you may use the path as received. If you do not wish to specify the URL for the new file, you may specify it as null.

Constructor
CreateNewFileResponse(FileInfo File, string URL) 
 
Association Enumeration

This enumeration defines, for a new file created by an export process, the different possible associations between the new file and the primary output file. Its value may be one of the following:

Note that some of these relationships will not be possible in all Outside In Export SDKs.

NewFileInfo

void NewFileInfo(
      FileFormat ParentOutputId,
      FileFormat OutputId,
      Association Association,
      string Path,
      string URL
)

This informational callback is made just after each new file has been created.

Parameters

CreateTempFile

CreateTempFileResponse CreateTempFile()

This callback is made any time a new temporary file needs to be generated. This gives the developer the chance to handle the reading and writing of the temporary file.

Return Value

To take action in response to this notification, return a CreateTempFileResponse object with the temporary file information.

CreateTempFileResponse Class

This is a class to define a new file or stream object in response to a CreateTempFile callback.

Constructor
CreateTempFileResponse (Stream file)

File: A stream to which the file data will be written and read from.

Exporter Interface

This section describes the properties and methods of Exporter.

All of Outside In’s Exporter functionality can be accessed through the Exporter Interface. The object returned by the OutsideIn class is an implementation of this interface. This interface derives from the Document Interface, which in turn derives from the OptionsCache Interface.

Namespace

OutsideIn

Methods

Document Interface

All of the Outside In document-related methods are accessed through the Document Interface.

Namespace

OutsideIn

Methods

OptionsCache Class

This section describes the OptionsCache class.

The options that configure the way outputs are generated are accessed through the OptionsCache class.

All of the options described in the following subsections are available through this interface. Other methods in this interface are described below.

Namespace

OutsideIn.Options

Methods

AcceptAlternateGraphics

OIT Option ID: SCCOPT_ACCEPT_ALT_GRAPHICS

This option enables an optimization in XML Export’s graphics output when exporting embedded graphics from an input document. When this option is set to true and the input document contains embedded graphics that are already in one of our supported output formats, they will be copied to output files rather than converted to the selected output format specified by the GraphicType option.

For example, if this option is set to true and the selected output graphics type is GIF, an input document’s embedded JPEG graphic will be copied to a JPEG output file rather than being converted to the GIF format. The same behavior applies to all of XML Export’s supported graphics output formats (currently GIF, JPEG, and PNG).

If this option is set to false, all graphics output will be in the format specified by the GraphicType option.

Note:

When using this option, JPEG files will be transferred directly to their output file location, without being filtered. This presents the possibility that any JPEG viruses in the file can be transferred to that location, as well.

Data Type

bool

Data
Default

false
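
The decision described above can be sketched as a small function. This is only an illustration of the documented rule, not Outside In code: the GraphicFormat enumeration and the OutputFormatFor helper are hypothetical names introduced here, and the set of pass-through formats is taken from the list given above (GIF, JPEG, and PNG).

```csharp
using System;

// Hypothetical stand-in for graphic formats; not an Outside In type.
public enum GraphicFormat { Gif, Jpeg, Png, Bmp, Tiff }

public static class AcceptAlternateGraphicsSketch
{
    // Returns the format an embedded graphic would end up in, following the
    // rule described for AcceptAlternateGraphics: copy the graphic as-is when
    // the option is true and the graphic is already GIF, JPEG, or PNG;
    // otherwise convert it to the selected output graphic type.
    public static GraphicFormat OutputFormatFor(
        GraphicFormat embedded, GraphicFormat selected, bool acceptAlternateGraphics)
    {
        bool alreadySupported = embedded == GraphicFormat.Gif
                             || embedded == GraphicFormat.Jpeg
                             || embedded == GraphicFormat.Png;
        return acceptAlternateGraphics && alreadySupported ? embedded : selected;
    }

    static void Main()
    {
        // Embedded JPEG, selected GIF, option enabled: stays JPEG.
        Console.WriteLine(OutputFormatFor(GraphicFormat.Jpeg, GraphicFormat.Gif, true));
        // Same input with the option disabled: converted to GIF.
        Console.WriteLine(OutputFormatFor(GraphicFormat.Jpeg, GraphicFormat.Gif, false));
    }
}
```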

DefaultInputCharacterSet

OIT Option ID: SCCOPT_DEFAULTINPUTCHARSET

This option is used in cases where Outside In cannot determine the character set used to encode the text of an input file. When all other means of determining the file’s character set are exhausted, Outside In will assume that an input document is encoded in the character set specified by this option. This is most often used when reading plain-text files, but may also be used when reading HTML or PDF files.

Data Type

DefaultInputCharacterSetValue

DefaultInputCharacterSetValue Enumeration

DefaultInputCharacterSetValue can be one of the following enumerations:

SystemDefault

Unicode

BigEndianUnicode

LittleEndianUnicode

Utf8

Utf7

Ascii

UnixJapanese

UnixJapaneseEuc

UnixChineseTrad1

UnixChineseEucTrad1

UnixChineseTrad2

UnixChineseEucTrad2

UnixKorean

UnixChineseSimple

Ebcdic37

Ebcdic273

Ebcdic274

Ebcdic277

Ebcdic278

Ebcdic280

Ebcdic282

Ebcdic284

Ebcdic285

Ebcdic297

Ebcdic500

Ebcdic1026

Dos437

Dos737

Dos850

Dos852

Dos855

Dos857

Dos860

Dos861

Dos863

Dos865

Dos866

Dos869

Windows874

Windows932

Windows936

Windows949

Windows950

Windows1250

Windows1251

Windows1252

Windows1253

Windows1254

Windows1255

Windows1256

Windows1257

Iso8859_1

Iso8859_2

Iso8859_3

Iso8859_4

Iso8859_5

Iso8859_6

Iso8859_7

Iso8859_8

Iso8859_9

MacRoman

MacCroatian

MacRomanian

MacTurkish

MacIcelandic

MacCyrillic

MacGreek

MacCE

MacHebrew

MacArabic

MacJapanese

HPRoman8

BiDiOldCode

BiDiPC8

BiDiE0

RussianKOI8

JapaneseX0201

Default

SystemDefault

DocumentMemoryMode

OIT Option ID: SCCOPT_DOCUMENTMEMORYMODE

This option determines the maximum amount of memory that the chunker may use to store the document’s data, from 4 MB to 1 GB. The more memory the chunker has available to it, the less often it needs to re-read data from the document.

Data
Default

SMALL: 2 - 16MB

ExtractXMPMetadata

OIT Option ID: SCCOPT_EXTRACTXMPMETADATA

Adobe’s Extensible Metadata Platform (XMP) is a labeling technology that allows you to embed data about a file, known as metadata, into the file itself. This option enables the XMP feature, which passes the XMP metadata straight through without interpreting it. This option is ignored if the ParseXMPMetadata option is enabled.

Data Type

bool

Data
Default

FallbackFormat

This option controls how files are handled when their specific application type cannot be determined. This normally affects all plain-text files, because plain-text files are generally identified by process of elimination: when a file isn’t identified as having been created by a known application, it is treated as a plain-text file. Setting this option to None is recommended, because it prevents the conversion from exporting unidentified binary files as though they were text, which could generate many pages of “garbage” output.

Data Type

FallbackFormatValue

FallbackFormatValue Enumeration
Default

Text

GraphicHeight

OIT Option ID: SCCOPT_GRAPHIC_HEIGHT

This option defines the absolute height in pixels to which exported graphics will be resized. If this option is set and the GraphicWidth option is not, the width of the image will be calculated based on the aspect ratio of the source image. The developer should be aware that very large values for this option or GraphicWidth could produce images whose size exceeds available system memory, resulting in conversion failure.

If you are exporting a non-graphic file (word processing, spreadsheet or archive) and the settings for GraphicHeight and GraphicWidth do not match the aspect ratio of the original document, the exported image will have whitespace added so that the original file’s aspect ratio is maintained.

The settings for the GraphicHeightLimit and GraphicWidthLimit options can override the setting for GraphicHeight.

Data Type

Int32
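
As a rough illustration of the aspect-ratio behavior described above, the following sketch computes the width that results when only a height is requested. WidthForHeight is a hypothetical helper written for this example, and rounding to the nearest pixel is an assumption; the text does not say how Outside In rounds.

```csharp
using System;

public static class GraphicHeightSketch
{
    // Hypothetical helper: derives the width that keeps the source aspect
    // ratio when only a target height is supplied. Rounding to the nearest
    // pixel is an assumption made for this sketch.
    public static int WidthForHeight(int sourceWidth, int sourceHeight, int targetHeight)
    {
        if (sourceWidth <= 0 || sourceHeight <= 0 || targetHeight <= 0)
            throw new ArgumentException("All dimensions must be positive.");
        return (int)Math.Round((double)sourceWidth * targetHeight / sourceHeight);
    }

    static void Main()
    {
        // An 800 x 600 source resized to a height of 300 keeps its 4:3 ratio.
        Console.WriteLine(WidthForHeight(800, 600, 300)); // prints 400
    }
}
```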

GraphicHeightLimit

OIT Option ID: SCCOPT_GRAPHIC_HEIGHTLIMIT

This option sets an upper limit on the height of exported graphics. It differs from GraphicHeight in that images taller than this limit are reduced to the limit value, while images shorter than the limit are not enlarged. Setting the height with GraphicHeight causes all output images to be reduced or enlarged to the specified height.

Data Type

Int32
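
The difference between an absolute height and a height limit can be shown with two one-line functions. Both are hypothetical helpers that restate the behavior described above; they do not call Outside In.

```csharp
using System;

public static class HeightLimitSketch
{
    // GraphicHeight as described: every image ends up at the requested
    // height, whether that means shrinking or enlarging it.
    public static int WithAbsoluteHeight(int sourceHeight, int graphicHeight)
    {
        return graphicHeight;
    }

    // GraphicHeightLimit as described: taller images are reduced to the
    // limit, shorter images are left at their original height.
    public static int WithHeightLimit(int sourceHeight, int heightLimit)
    {
        return Math.Min(sourceHeight, heightLimit);
    }

    static void Main()
    {
        Console.WriteLine(WithAbsoluteHeight(200, 500)); // prints 500 (enlarged)
        Console.WriteLine(WithHeightLimit(200, 500));    // prints 200 (unchanged)
        Console.WriteLine(WithHeightLimit(900, 500));    // prints 500 (reduced)
    }
}
```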

GraphicOutputDPI

OIT Option ID: SCCOPT_GRAPHIC_OUTPUTDPI

This option allows the user to specify the output graphics device’s resolution in DPI and only applies to images whose size is specified in physical units (in/cm). For example, consider a 1” square, 100 DPI graphic that is to be rendered on a 50 DPI device (GraphicOutputDPI is set to 50). In this case, the size of the resulting TIFF, BMP, JPEG, GIF, or PNG will be 50 x 50 pixels.

In addition, the special #define of SCCGRAPHIC_MAINTAIN_IMAGE_DPI, which is defined as 0, can be used to suppress any dimensional changes to an image. In other words, a 1” square, 100 DPI graphic will be converted to an image that is 100 x 100 pixels in size. This value indicates that the DPI of the output device is not important. It extracts the maximum resolution from the input image with the smallest exported image size.

Setting this option to SCCGRAPHIC_MAINTAIN_IMAGE_DPI may result in the creation of extremely large images. Be aware that there may be limitations in the system running this technology that could result in undesirably large bandwidth consumption or an error message. Additionally, an out of memory error message will be generated if system memory is insufficient to handle a particularly large image.

Also note that the SCCGRAPHIC_MAINTAIN_IMAGE_DPI setting will force the technology to use the DPI settings already present in raster images, but will use the current screen resolution as the DPI setting for any other type of input file.

For some output graphic types, there may be a discrepancy between the value set by this option and the DPI value reported by some graphics applications. The discrepancy occurs when the output format uses metric units (DPM, or dots per meter) instead of English units (DPI, or dots per inch). Depending on how the graphics application performs rounding on meters to inches conversions, the DPI value reported may be 1 unit more than expected. An example of a format which may exhibit this problem is PNG.

The maximum value that can be set is 2400 DPI; the default is 96 DPI.

Data Type

Int32
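
The arithmetic in this section can be checked with a short sketch. The helper names are hypothetical, and the constant MaintainImageDpi simply mirrors the value 0 given above for SCCGRAPHIC_MAINTAIN_IMAGE_DPI. The second half illustrates the metric rounding discrepancy, assuming the output format stores an integer number of dots per meter.

```csharp
using System;

public static class OutputDpiSketch
{
    // Mirrors the documented value of SCCGRAPHIC_MAINTAIN_IMAGE_DPI (0).
    public const int MaintainImageDpi = 0;

    // Pixel size of one dimension of an image whose size is given in inches.
    // When outputDpi is 0, the source image's own DPI is used instead.
    public static int PixelsForOutput(double sizeInInches, int sourceDpi, int outputDpi)
    {
        int effectiveDpi = outputDpi == MaintainImageDpi ? sourceDpi : outputDpi;
        return (int)Math.Round(sizeInInches * effectiveDpi);
    }

    // Conversion to an integer dots-per-meter value (1 inch = 0.0254 m).
    public static int DpiToDotsPerMeter(int dpi)
    {
        return (int)Math.Round(dpi / 0.0254);
    }

    // Conversion back to DPI, before any rounding by the reading application.
    public static double DotsPerMeterToDpi(int dotsPerMeter)
    {
        return dotsPerMeter * 0.0254;
    }

    static void Main()
    {
        Console.WriteLine(PixelsForOutput(1.0, 100, 50));               // prints 50
        Console.WriteLine(PixelsForOutput(1.0, 100, MaintainImageDpi)); // prints 100

        int dpm = DpiToDotsPerMeter(96);      // 3780 dots per meter
        double back = DotsPerMeterToDpi(dpm); // slightly more than 96
        // An application that rounds up reports 97; one that rounds to
        // the nearest integer reports 96.
        Console.WriteLine(Math.Ceiling(back)); // prints 97
        Console.WriteLine(Math.Round(back));   // prints 96
    }
}
```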

GraphicSizeLimit

OIT Option ID: SCCOPT_GRAPHIC_SIZELIMIT

This option is used to set the maximum size of the exported graphic in pixels. It may be used to prevent inordinately large graphics from being converted to equally cumbersome output files, thus preventing bandwidth waste.

This setting takes precedence over all other options and settings that affect the size of a converted graphic.

When creating a multi-page TIFF file, this limit is applied on a per page basis. It is not a pixel limit on the entire output file.

Data Type

Int32

GraphicSizeMethod

OIT Option ID: SCCOPT_GRAPHIC_SIZEMETHOD

This option determines the method used to size graphics. The developer can choose among three methods, each of which involves some degree of trade off between the quality of the resulting image and speed of conversion.

Using the quick sizing option results in the fastest conversion of color graphics, though the quality of the converted graphic will be somewhat degraded. The smooth sizing option results in a more accurate representation of the original graphic, as it uses antialiasing. Antialiased images may appear smoother and can be easier to read, but rendering with this option requires additional processing time. The grayscale only option also uses antialiasing, but only for grayscale graphics; it uses the quick sizing method for any color graphics.

The smooth sizing option does not work on images which have a width or height of more than 4096 pixels.

Data Type

GraphicWidth

OIT Option ID: SCCOPT_GRAPHIC_WIDTH

This option defines the absolute width in pixels to which exported graphics will be resized. If this option is set and the GraphicHeight option is not, the height of the image will be calculated based on the aspect ratio of the source image. The developer should be aware that very large values for this option or GraphicHeight could produce images whose size exceeds available system memory, resulting in conversion failure.

If you are exporting a non-graphic file (word processing, spreadsheet or archive) and the settings for GraphicHeight and GraphicWidth do not match the aspect ratio of the original document, the exported image will have whitespace added so that the original file’s aspect ratio is maintained.

The settings for the GraphicHeightLimit and GraphicWidthLimit options can override the setting for GraphicWidth.

Data Type

Int32

GraphicWidthLimit

OIT Option ID: SCCOPT_GRAPHIC_WIDTHLIMIT

This option allows a hard limit to be set for how wide in pixels an exported graphic may be. Any images wider than this limit will be resized to match the limit. Regardless of whether the GraphicHeightLimit option is set, any resized images will preserve their original aspect ratio.

Note that this option differs from the behavior of setting the width of graphics by using GraphicWidth in that it sets an upper limit on the image width. Images larger than this limit will be reduced to the limit value. However, images smaller than this width will not be enlarged when using this option. Setting the width using GraphicWidth causes all output images to be reduced or enlarged to be of the specified width.

Data Type

Int32
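
One way to picture how the width and height limits interact while preserving aspect ratio is the sketch below. ApplyLimits is a hypothetical helper; treating a limit of 0 as "not set" and rounding to the nearest pixel are assumptions made only for this example.

```csharp
using System;

public static class GraphicLimitSketch
{
    // Scales an image down so that it fits within the given limits while
    // keeping its aspect ratio. Images already within the limits are not
    // enlarged. A limit of 0 is treated as "not set" (an assumption).
    public static (int Width, int Height) ApplyLimits(
        int width, int height, int widthLimit, int heightLimit)
    {
        if (width <= 0 || height <= 0)
            throw new ArgumentException("Image dimensions must be positive.");

        double scale = 1.0;
        if (widthLimit > 0) scale = Math.Min(scale, (double)widthLimit / width);
        if (heightLimit > 0) scale = Math.Min(scale, (double)heightLimit / height);

        return ((int)Math.Round(width * scale), (int)Math.Round(height * scale));
    }

    static void Main()
    {
        Console.WriteLine(ApplyLimits(1600, 1200, 800, 0));   // prints (800, 600)
        Console.WriteLine(ApplyLimits(400, 300, 800, 0));     // prints (400, 300)
        Console.WriteLine(ApplyLimits(1600, 1200, 800, 300)); // prints (400, 300)
    }
}
```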

IECondCommentMode

OIT Option ID: SCCOPT_HTML_COND_COMMENT_MODE

Some HTML input files may include “conditional comments”, which are HTML comments that mark areas of HTML to be interpreted in specific versions of Internet Explorer, while being ignored by other browsers. This option allows you to control how the content contained within conditional comments will be interpreted by Outside In’s HTML parsing code.

Data

IgnorePassword

OIT Option ID: SCCOPT_IGNORE_PASSWORD

This option can disable the password verification of files where the contents can be processed without validation of the password. If this option is not set, the filter should prompt for a password if it handles password-protected files.

Data Type

bool

InterlacedGIFs

OIT Option ID: SCCOPT_GIF_INTERLACED

This option allows the developer to specify interlaced or non-interlaced GIF output. Interlaced GIFs are useful when graphics are to be downloaded over slow Internet connections. They allow the browser to begin to render a low-resolution view of the graphic quickly and then increase the quality of the image as it is received. There is no real penalty for using interlaced graphics.

This option is only valid if the dwOutputID parameter of the EXOpenExport function is set to FI_GIF.

Data Type

bool

ISODateTimes

OIT Option ID: SCCOPT_FORMATFLAGS

When this flag is set, all Date and Time values are converted to the ISO 8601 standard. This conversion can only be performed using dates that are stored as numeric data within the original file.

Data

bool

Default

false
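
To illustrate why only numerically stored dates can be converted, the sketch below takes a date held as a number and formats it in an ISO 8601 style. The use of an OLE Automation serial date as the numeric representation and the choice of the "s" (sortable) format are both assumptions made for illustration; the text does not specify which numeric encodings Outside In reads or which ISO 8601 profile it writes.

```csharp
using System;
using System.Globalization;

public static class IsoDateSketch
{
    // Converts a numeric serial date (days since 30 December 1899, as used
    // by OLE Automation) into an ISO 8601 style string. A date stored as
    // free text offers no such number to convert.
    public static string ToIso8601(double serialDate)
    {
        DateTime value = DateTime.FromOADate(serialDate);
        return value.ToString("s", CultureInfo.InvariantCulture);
    }

    static void Main()
    {
        // 45000.5 is a sample serial value with a half-day time component.
        Console.WriteLine(ToIso8601(45000.5));
    }
}
```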

JPEGQuality

OIT Option ID: SCCOPT_JPEG_QUALITY

This option allows the developer to specify the lossiness of JPEG compression. The option is only valid if the dwOutputID parameter of the EXOpenExport function is set to FI_JPEGFIF.

Data Type

Int32

Data

A value from 1 to 100, with 100 being the highest quality but the least compression, and 1 being the lowest quality but the most compression.

Default

100

LotusNotesDirectory

OIT Option ID: SCCOPT_LOTUSNOTESDIRECTORY

This option allows the developer to specify the location of a Lotus Notes or Domino installation for use by the NSF filter. A valid Lotus installation directory must contain the file nnotes.dll.

Data

A path to the Lotus Notes directory.

Default

If this option isn’t set, then OIT will first attempt to load the Lotus library according to the operating system’s PATH environment variable, and then attempt to find and load the Lotus library as indicated in HKEY_CLASSES_ROOT\Notes.Link.

OutputGraphicType

OIT Option ID: SCCOPT_GRAPHIC_TYPE

This option allows the developer to specify the format of the graphics produced by the technology.

There is a special optimization that HTML Export can make when this option is set to None. Some of the Outside In Viewer Technology’s import filters can be optimized to ignore certain types of graphics.

Data Type

OutputGraphicTypeValue

OutputGraphicTypeValue Enumeration

These are the possible values for OutputGraphicType:

Default

JPEG

ParseXMPMetadata

OIT Option ID: SCCOPT_PARSEXMPMETADATA

Adobe’s Extensible Metadata Platform (XMP) is a labeling technology that allows you to embed data about a file, known as metadata, into the file itself. This option enables parsing of the XMP data into normal OIT document properties. Enabling this option may cause the loss of some regular data in premium graphics filters (such as Postscript), but won’t affect most formats (such as PDF).

Data Type

bool

Data
Default

false

PDFInputMaxEmbeddedObjects

This option allows the user to limit the number of embedded objects that are produced in a PDF file.

Data Type

UInt32

Data

The maximum number of embedded objects to produce in PDF output. Setting this to 0 produces all embedded objects in the input document.

Default

0 – produce all objects.

PDFInputMaxVectorPaths

This option allows the user to limit the number of vector paths that are produced in a PDF file.

Data

The maximum number of paths to produce in PDF output. Setting this to 0 produces all vector objects in the input document.

Default

0 – produce all vector objects.

PDFReorderBiDi

OIT Option ID: SCCOPT_PDF_FILTER_REORDER_BIDI

This option controls whether or not the PDF filter will attempt to reorder bidirectional text runs so that the output is in standard logical order as used by the Unicode 2.0 and later specification. This additional processing will result in slower filter performance according to the amount of bidirectional data in the file.

PDFReorderBiDiValue Enumeration

This enumeration defines the type of bidirectional text reordering the PDF filter should perform.

PDFWordSpacingFactor

This option controls the spacing threshold in PDF input documents. Most PDF documents do not have an explicit character denoting a word break. The PDF filter calculates the distance between two characters to determine if they are part of the same word or if there should be a word break inserted. The space between characters is compared to the length of the space character in the current font multiplied by this fraction. If the space between characters is larger, then a word break character is inserted into the text stream. Otherwise, the characters are considered to be part of the same word and no word break is inserted.

Data Type

float

Data

A value representing the fraction of the width of the space character used to trigger a word break. Valid values are positive values less than 2.

Default

0.85
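
The threshold test described above can be sketched as follows. IsWordBreak and JoinGlyphs are hypothetical helpers, and the gap and space widths are made-up numbers in arbitrary units; the sketch shows only the comparison, not how the PDF filter measures glyph positions.

```csharp
using System;
using System.Text;

public static class WordSpacingSketch
{
    // A word break is inserted when the gap between two characters is larger
    // than the width of the space character multiplied by the factor.
    public static bool IsWordBreak(double gap, double spaceWidth, double factor)
    {
        return gap > spaceWidth * factor;
    }

    // Rebuilds text from glyphs, where gapsAfter[i] is the distance between
    // glyph i and glyph i + 1. The last gap, if present, is ignored.
    public static string JoinGlyphs(string glyphs, double[] gapsAfter, double spaceWidth, double factor)
    {
        if (gapsAfter.Length < glyphs.Length - 1)
            throw new ArgumentException("A gap is required between each pair of glyphs.");

        var text = new StringBuilder();
        for (int i = 0; i < glyphs.Length; i++)
        {
            text.Append(glyphs[i]);
            bool hasNext = i < glyphs.Length - 1;
            if (hasNext && IsWordBreak(gapsAfter[i], spaceWidth, factor))
                text.Append(' ');
        }
        return text.ToString();
    }

    static void Main()
    {
        // Space width 5.0 and factor 0.85 give a threshold of 4.25.
        double[] gaps = { 0.5, 4.5, 0.5, 0.5, 0.5, 0.5 };
        Console.WriteLine(JoinGlyphs("Hithere", gaps, 5.0, 0.85)); // prints "Hi there"
    }
}
```

With a larger factor, such as 1.0, the same 4.5 gap falls below the threshold of 5.0 and the two words run together, which is the trade-off this option controls.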

PerformExtendedFI

OIT Option ID: SCCOPT_FIFLAGS

This option affects how an input file’s internal format (application type) is identified when the file is first opened by the Outside In technology. When the extended test flag is in effect, and an input file is identified as being either 7-bit ASCII, EBCDIC, or Unicode, the file’s contents will be interpreted as such by the export process.

The extended test is optional because it requires extra processing and cannot guarantee complete accuracy (which would require the inspection of every single byte in a file to eliminate false positives).

Data Type

bool

Data

One of the following values:

Default

true

ProcessOLEEmbeddingMode

OIT Option ID: SCCOPT_PROCESS_OLE_EMBEDDINGS

Microsoft PowerPoint versions from 1997 through 2003 had the capability to embed OLE documents in PowerPoint files. This option controls which embeddings are processed as native (OLE) documents and which are processed using the alternate graphic.

Note:

The Microsoft PowerPoint application sometimes embeds known Microsoft OLE embeddings (such as Visio or Project) as an “Unknown” type. To process these embeddings, the ProcessOLEEmbedAll option is required. Embeddings from products released after Office 2003, such as Office 2007, also fall into this category.

Data
Default

Standard

RenderEmbeddedFonts

This option allows you to disable the use of embedded fonts in PDF input files. If the option is set to true, the embedded fonts in the PDF input are used to render text; if the option is set to false, the embedded fonts are not used and the fallback is to use fonts available to Outside In to render text.

Data Type

bool

Default

true

ShowArchiveFullPath

OIT Option ID: SCCOPT_ARCFULLPATH

This option causes the full path of a node to be returned in “GetArchiveNodeInfo” and “GetObjectInfo”.

Data Type

bool

Data
Default

false

StrictFile

When an embedded file or URL can’t be opened with the full path, Outside In will sometimes try to open the referenced file from other locations, including the current directory. When this option is set, it prevents Outside In from trying to open the file from any location other than the fully qualified path or URL.

Data Type

bool

Default

false

TimeZoneOffset

OIT Option ID: SCCOPT_TIMEZONE

This option allows the user to define an offset to GMT that will be applied during date formatting, allowing date values to be displayed in a selectable time zone. This option affects the formatting of numbers that have been defined as date values. This option will not affect dates that are stored as text. To query the operating system for the time zone set on the machine, specify TimeZoneOffset_UseNative.

Note:

Daylight savings is not supported. The sent time in msg files when viewed in Outlook can be an hour different from the time sent when an image of the msg file is created.

Data Type

Int32

Data

Integer parameter from -96 to 96, representing 15-minute offsets from GMT. To query the operating system for the time zone set on the machine, specify SCC_TIMEZONE_USENATIVE.
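
Because the value is expressed in 15-minute steps, a conversion from a conventional time zone offset may be convenient. The helper below is a hypothetical sketch that performs only the arithmetic and range check described above; it does not set the option itself.

```csharp
using System;

public static class TimeZoneOffsetSketch
{
    public const int MinUnits = -96;
    public const int MaxUnits = 96;

    // Converts an offset from GMT into the number of 15-minute units,
    // rejecting offsets that are not a whole number of units or that fall
    // outside the documented range of -96 to 96.
    public static int ToQuarterHourUnits(TimeSpan offsetFromGmt)
    {
        long quarterHourTicks = 15 * TimeSpan.TicksPerMinute;
        if (offsetFromGmt.Ticks % quarterHourTicks != 0)
            throw new ArgumentException(
                "Offset must be a whole number of 15-minute steps.", nameof(offsetFromGmt));

        long units = offsetFromGmt.Ticks / quarterHourTicks;
        if (units < MinUnits || units > MaxUnits)
            throw new ArgumentOutOfRangeException(
                nameof(offsetFromGmt), "Offset must be within 24 hours of GMT.");

        return (int)units;
    }

    static void Main()
    {
        Console.WriteLine(ToQuarterHourUnits(new TimeSpan(5, 30, 0))); // prints 22
        Console.WriteLine(ToQuarterHourUnits(TimeSpan.FromHours(-8))); // prints -32
    }
}
```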

Default

UnmappableCharacter

OIT Option ID: SCCOPT_UNMAPPABLECHAR

This option selects the character used when a character cannot be found in the output character set. This option takes the Unicode value for the replacement character. It is left to the user to make sure that the selected replacement character is available in the output character set.

Data Type

UShort

Data

The Unicode value for the character to use.
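
Since the data type is an unsigned 16-bit value, the replacement character has to be a single UTF-16 code unit. The hypothetical helper below shows how a C# char maps to such a value; it is an illustration only and does not call Outside In.

```csharp
using System;

public static class UnmappableCharacterSketch
{
    // Returns the Unicode value of a replacement character as a ushort.
    // Surrogate code units are rejected because, on their own, they do not
    // represent a complete character.
    public static ushort ToOptionValue(char replacement)
    {
        if (char.IsSurrogate(replacement))
            throw new ArgumentException(
                "The replacement must be a single, complete character.", nameof(replacement));
        return (ushort)replacement;
    }

    static void Main()
    {
        Console.WriteLine(ToOptionValue('?'));                     // prints 63
        Console.WriteLine(ToOptionValue('\u25A1').ToString("X4")); // prints 25A1
    }
}
```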

Default

XMLDefinitionReference

This option determines whether the generated output references a specified schema, a DTD, or neither.

Data Type

XMLReference

Data

An XMLReference object that defines the XML Definition Reference to be used.

Default

No reference defined

XXFormatOptions

This option is a set of flags that are specific to XML Export output files.

Data Type

XXFormatOptionValues

XXFormatOptionValues Enumeration

The following set of flags:

Default

RemoveFontGroups

DSTTimezone

This option uses the time zone of the system (computer) and calculates the time based on the system time offset.

Data Type

Boolean

Default

False

GenerateExcelRevisions

This option controls the extraction of tracked changes from Excel files.

Data Type

Boolean

Default

False

EnableAlphaBlending

This option allows the user to enable alpha-channel blending (transparency) in rendering vector images. This is primarily useful for improving fidelity when vector images are rendered with a slower graphics engine such as X-Windows, over a network where performance is not an issue.

Data

Boolean

Default

False

InternalRendering

Note:

This option is no longer relevant. Outside In no longer performs graphic rendering through X11 on Linux/Unix platforms. The internal rendering engine is available on all of these platforms. If this option is set, the results will always use the internal rendering engine regardless of the value of this option. The $GDFONTPATH environment variable must be set to specify where to reference fonts. On Windows systems, the Windows graphical rendering engine is always used.

ExportStatus Class

The ExportStatus class provides access to information about a conversion. This may include information about sub-document failures and about areas of a conversion that may not have high fidelity with the original document. When applicable, the number of pages in the output is also provided.

Namespace

OutsideIn

Properties

ExportStatusFlags Enumeration

This enumeration is the set of possible known problems that can occur during an export process.

FileFormat Class

This class defines the identifiers for file formats.

Namespace

OutsideIn

Methods

ObjectInfo Class

ObjectInfo provides all the information available about an Outside In Object (an object may be an embedded object, a linked object, or a compressed file). This is a read-only class where the technology fills in all the values.

Namespace

OutsideIn.Options

Properties

ObjectInfoFlags Enumeration

Bit fields to describe information about an object.

Option Interface

The Option Interface provides the methods and properties to retrieve information about an Outside In Option.

Package

com.oracle.outsidein.options

Accessors

Method

void set(OptionsCache exporter, Object objValue) throws OutsideInException;

This method sets the option to the exporter object and returns the exporter object itself.

Note:

If the type of objValue cannot be converted to the data type the option is expecting, an OutsideInException is thrown.

OutsideInProducts Enumeration

OutsideIn Class

This is a utility class that creates an instance of an Exporter object on request.

Namespace

OutsideIn

Methods

static Exporter NewLocalExporter()

This method creates an instance of an Exporter object. It returns a newly created Exporter object.

static Exporter NewLocalExporter(Exporter source)

This method creates and returns an instance of an Exporter object based on the source Exporter. All the options of source are copied to the new Exporter. The source and destination file information will not be copied.

OutsideInException Class

This is the exception that is thrown when an Outside In Technology error occurs.

This class derives from the Exception class. This class has no public methods or properties except those of the parent Exception class.

Namespace

OutsideIn

OutsideInCastException Class

This exception is thrown when an invalid value is provided as an option value.

This class derives from the OutsideInException class. This class has no public methods or properties except those of the parent Exception class.

Namespace

OutsideIn

XMLReference Class

The XMLReference class is a data class used to define the XML definition reference to be used.

Namespace

OutsideIn.Options

Constructors

XMLReference()

Creates an instance of an XMLReference object that uses no XML definition reference.

XMLReference(XMLReference.ReferenceMethodValue, String)

Creates an instance of an XMLReference object to provide a DTD/XSD.

ReferenceMethodValue Enumeration

This enumeration is used to set whether Export will reference a schema, a DTD, or neither when generating output.