12 XML Export Java Classes
The following classes are covered:
12.1 ArchiveNode Class
ArchiveNode provides information about an archive node. This is a read-only class where the technology fills in all the values.
Namespace
com.oracle.outsidein
Accessors
-
boolean isFolder() - A value of true indicates that the archive node is a folder.
-
int getFileSize() - File size of the archive node
-
java.util.Date getTime() - Time the archive node was created
-
int getNodeNum() - Serial number of the archive node in the archive
-
String getNodeName() - The name of the archive node
12.2 Callback Class
Callback messages are notifications that come from Outside In during the export process, providing information and sometimes the opportunity to customize the generated output.
Namespace
com.oracle.outsidein
To access callback messages, your code must create an object that inherits from Callback and pass it through the API's setCallbackHandler method. Your object can override the default behavior of whichever methods your application is interested in.
Callback has four methods that you can override: createNewFile, newFileInfo, openFile, and createTempFile.
12.2.1 createNewFile
CreateNewFileResponse createNewFile( FileFormat parentOutputId, FileFormat outputId, AssociationValue association, String path) throws IOException
This callback is made any time a new output file needs to be generated. This gives the developer the chance to affect where the new output file is created, how it is named, and the URL (if any) used to reference the file.
Parameters
-
parentOutputId: File format identifier of the parent file
-
outputId: File format identifier of the file created
-
association: An AssociationValue that describes relationship between the primary output file and the new file.
-
path: Full path of the file to be created
Return Value
To take action in response to this notification, return a CreateNewFileResponse object with the new file information. If you wish to accept the defaults for the path and URL, you may return null.
12.2.1.1 CreateNewFileResponse Class
This class defines a new output file location in response to a createNewFile callback. If you do not wish to change the path to the new output file, you may use the path as received. If you do not wish to specify the URL for the new file, you may specify it as null.
Constructor
CreateNewFileResponse(File file, String url) throws IOException
-
file: File object containing the full path to the new file
-
url: A new URL that references the newly created file. This parameter can be null.
CreateNewFileResponse(SeekableByteChannel6 redirect, String url) throws IOException
-
redirect: Object that will be written to as the destination of the transform
-
url: A new URL that references the newly created file. This parameter can be null.
AssociationValue Enumeration
This enumeration defines, for a new file created by an export process, the different possible associations between the new file and the primary output file. Its value may be one of the following:
-
ROOT - indicates the primary output file
-
CHILD - a new file linked (directly or indirectly) from the primary output file
-
SIBLING - indicates new files not linked from the primary output file
-
COPY - the file was copied as a part of a template macro operation.
-
REQUIREDNAME - not used
Note that some of these relationships will not be possible in all Outside In Export SDKs.
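As an illustration of how a createNewFile override might use the association value, the following is a minimal sketch that keeps the primary output where Outside In proposed it and moves every other file into a separate directory. The AssociationValue enum here is a stand-in that mirrors the constants listed above so the sketch compiles on its own, and NewFileTarget, assetsDir, and baseUrl are hypothetical names, not part of the SDK.

```java
import java.io.File;

// Stand-in for the SDK's AssociationValue enumeration, reproduced so this
// sketch compiles alone. A real handler would use the SDK's own type.
enum AssociationValue { ROOT, CHILD, SIBLING, COPY, REQUIREDNAME }

// Hypothetical helper: holds the file and URL that a createNewFile override
// could pass to new CreateNewFileResponse(file, url).
final class NewFileTarget {
    final File file;
    final String url; // null means "no URL specified"

    NewFileTarget(File file, String url) {
        this.file = file;
        this.url = url;
    }

    // Keeps the primary output at the proposed path and places every other
    // file in assetsDir, referenced relative to baseUrl.
    static NewFileTarget choose(AssociationValue association, String path,
                                File assetsDir, String baseUrl) {
        File proposed = new File(path);
        if (association == AssociationValue.ROOT) {
            return new NewFileTarget(proposed, null);
        }
        String name = proposed.getName();
        return new NewFileTarget(new File(assetsDir, name), baseUrl + "/" + name);
    }
}
```

Inside an actual createNewFile override, the chosen file and url would be wrapped in a CreateNewFileResponse, and returning null would accept the defaults instead.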
12.2.2 newFileInfo
void newFileInfo( FileFormat parentOutputId, FileFormat outputId, AssociationValue association, String path, String url) throws IOException
This informational callback is made just after each new file has been created.
Parameters
-
parentOutputId: File format identifier of the parent file
-
outputId: File format identifier of the file created
-
association: An AssociationValue that describes relationship between the primary output file and the new file.
-
path: Full path of the file created
-
url: URL that references the newly created file
Example
Here is a basic callback handler that notifies an application that it has received newFileInfo notifications.
public static class CallbackHandler extends Callback
{
    myApplication m_theApp;

    public CallbackHandler( myApplication app )
    {
        m_theApp = app;
    }

    public void newFileInfo(FileFormat parentOutputId, FileFormat outputId, AssociationValue association, String path, String url) throws IOException
    {
        if( association == AssociationValue.ROOT )
            m_theApp.primaryOutputIsReady(true);
        m_theApp.newOutputFile(path);
    }
}
12.2.3 openFile
OpenFileResponse openFile(FileTypeValue fileType, String fileName) throws IOException
This callback is made any time a new file needs to be opened.
Parameters
-
fileType: Type of file being requested to be opened
-
fileName: Name of the file to be opened
Return Value
To take action in response to this method, return an OpenFileResponse object.
FileTypeValue Enumeration
This enumeration defines the type of file being requested to be opened. Its value may be one of the following:
-
INPUT: File to be opened (path unknown)
-
TEMPLATE: Template file to be opened
-
PATH: Full file name of the file to be opened
-
OTHER: Not used
12.2.3.1 OpenFileResponse Class
This is a class to define a new file or redirected I/O object in response to an openFile() callback.
Constructors
OpenFileResponse(File file)
-
file: File object with full path to the new file
OpenFileResponse(SeekableByteChannel6 redirect)
-
redirect: A redirected I/O object to which the file data will be written
12.2.4 createTempFile
CreateTempFileResponse createTempFile() throws IOException
This callback is made any time a new temporary file needs to be generated. This gives the developer the chance to handle the reading and writing of the temporary file.
Return Value
To take action in response to this notification, return a CreateTempFileResponse object with the temporary file information.
12.3 Exporter Interface
This section describes the properties and methods of Exporter.
All of Outside In's Exporter functionality can be accessed through the Exporter Interface. The object returned by the OutsideIn class is an implementation of this interface. The Exporter Interface derives from the Document Interface, which in turn derives from the OptionsCache Interface.
Namespace
com.oracle.outsidein
Methods
-
getExportStatus
ExportStatus getExportStatus()
This method is used to determine whether there were conversion problems during an export. The ExportStatus object returned may contain information about sub-document failures and about areas of a conversion that may not have high fidelity with the original document. When applicable, the number of pages in the output is also provided.
-
newSubDocumentExporter
Exporter newSubDocumentExporter( int SubDocId, SubDocumentIdentifierTypeValue idType ) throws OutsideInException
Create a new Exporter for a subdocument.
SubDocId: Identifier of the subdocument
idType: Type of subdocument
SubDocumentIdentifierTypeValue: This is an enumeration for the type of subdocument being opened.
-
XMLEXPORTLOCATOR: Subdocument to be opened is based on output of XML Export. (SubDocId is the value of the object_id attribute of a locator element.)
-
ATTACHMENTLOCATOR: Subdocument to be opened is based on the locator value provided by one of the Export SDKs.
-
EMAILATTACHMENTINDEX: Subdocument to be opened is based on the index of the attachment from an email message. (SubDocId is the zero-based index of the attachment from an email message file. The first attachment presented by Outside In has the index value 0, the second has the index value 1, and so on.)
Returns: A new Exporter object for the subdocument
-
newSubObjectExporter
Exporter newSubObjectExporter( SubObjectTypeValue objType, int data1, int data2, int data3, int data4 ) throws OutsideInException
Create a new Exporter for a subobject.
objType: Type of subobject
data1: Data identifying the subobject from SearchML
data2: Data identifying the subobject from SearchML
data3: Data identifying the subobject from SearchML
data4: Data identifying the subobject from SearchML
Returns: A new Exporter object for the subobject
SubObjectTypeValue: An enumeration to describe the type of SubObject to open.
-
LinkedObject
-
EmbeddedObject
-
CompressedFile
-
Attachment
-
newArchiveNodeExporter
Exporter newArchiveNodeExporter( int dwRecordNum ) throws OutsideInException
Create a new Exporter for an archive node. You may get the number of nodes in an archive using getArchiveNodeCount. The nodes are numbered from 0 to getArchiveNodeCount - 1.
dwRecordNum: The number of the record to retrieve information about. The first node is node 0 and the total number of nodes may be obtained from getArchiveNodeCount.
Returns: A new Exporter object for the archive node
-
export
void export() throws OutsideInException
Perform the conversion.
-
setDestinationFile
OptionsCache setDestinationFile( String filename ) throws OutsideInException
Set the location of the destination file
filename: Full path to the destination file
Returns: The updated options object
-
setExportTimeout
OptionsCache setExportTimeout(int millisecondsTimeout)
This method sets the time that the export process should wait for a response from the Outside In export engine to complete the export of a document, setting an upper limit on the time that will elapse during a call to export(). If the specified length of time is reached before the export has completed, the export operation will be terminated and an OutsideInException will be thrown. If this option is not set, the default timeout is 5 minutes.
-
newLocalExporter
static Exporter newLocalExporter(Exporter source)
This method creates and returns an instance of an Exporter object based on the source Exporter. All the options of source are copied to the new Exporter. The source and destination file information will not be copied.
12.3.1 Document Interface
All of the Outside In document-related methods are accessed through the Document Interface.
Namespace
com.oracle.outsidein
Methods
-
close
void close()
Closes the currently open document.
-
getArchiveNodeCount
int getArchiveNodeCount() throws OutsideInException
Retrieves the number of nodes in an archive file.
Returns the number of nodes in the archive file or 0 if the file is not an archive file.
-
getFileId
FileFormat getFileId(FileIdInfoFlagValue dwFlags) throws OutsideInException
Gets the format of the file based on the technology's content-based file identification process.
dwFlags: Option to retrieve the file identification pre-Extended or post-Extended Test
Returns the format identifier of the file.
-
getObjectInfo
ObjectInfo getObjectInfo() throws OutsideInException
Retrieves the information about an embedded object.
Return: An ObjectInfo object with the information about the embedded object
-
getArchiveNode
ArchiveNode getArchiveNode(int nNodeNum) throws OutsideInException
Retrieves information about a record in an archive file. You may get the number of nodes in an archive using getArchiveNodeCount.
nNodeNum: The number of the record to retrieve information about. The first node is node 0.
Return Value: An ArchiveNode object with the information about the record
-
saveArchiveNode
void saveArchiveNode( int nNodeNum, File file) throws OutsideInException
Extracts a record in an archive file to disk.
nNodeNum: The number of the record to extract. The first node is node 0.
file: The destination file to which the file will be extracted.
-
saveArchiveNode with Search Export Flags
void saveArchiveNode( int flags, int params1, int params2, File file) throws OutsideInException
Extracts a record in an archive file to disk without reading the data for all nodes in the archive in a sequential order. To use this function, you must first process the archive with Search Export and save the Node data for later use in this function.
flags: Special flags value from Search Export
params1: Data1 from Search Export
params2: Data2 from Search Export
file: The destination file to which the file will be extracted
-
setSourceFile
OptionsCache setSourceFile( String filename) throws OutsideInException
Set the source document.
filename: Full path of the source document
Returns: The options cache object associated with this document
12.3.2 SeekableByteChannel6 Interface
Enables API users to handle I/O for the source and destination documents. Implement this interface to control I/O operations such as reading, writing, and seeking. This interface mimics the java.nio.channels.SeekableByteChannel interface which is only available in Java 7 and later. Note that SeekableByteChannel6 will be removed in favor of java.nio.channels.SeekableByteChannel if support for Java 6 is dropped in a future release of the Outside In Java API. Until then, this interface must be used if redirected I/O is required.
Namespace
com.oracle.outsidein
Methods
-
Get position
long position()
Returns this channel's position.
-
Set position
SeekableByteChannel6 position(long newPosition)
Sets this channel's position.
-
read
int read(java.nio.ByteBuffer dst)
Reads a sequence of bytes from this channel into the given buffer. Bytes are read starting at this channel's current position, and then the position is updated with the number of bytes actually read.
-
size
long size()
Returns the current size of the entity to which this channel is connected.
-
truncate
SeekableByteChannel6 truncate(long size)
Truncates the entity, to which this channel is connected, to the given size. Never invoked by Outside In and may be implemented by just returning this.
-
write
int write(java.nio.ByteBuffer src)
Writes a sequence of bytes to this channel from the given buffer. Bytes are written starting at this channel's current position. The entity to which the channel is connected is grown, if necessary, to accommodate the written bytes, and then the position is updated with the number of bytes actually written.
-
close
void close()
Closes this channel. If this channel is already closed then invoking this method has no effect.
-
isOpen
boolean isOpen()
Tells whether or not this channel is open.
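To make the method contract above concrete, here is a minimal sketch of an in-memory channel. The SeekableByteChannel6 interface declared in the block is a stand-in that repeats the method signatures listed in this section so the example compiles on its own; a real application would implement com.oracle.outsidein.SeekableByteChannel6, whose methods may declare exceptions not shown here. MemoryChannel and toByteArray are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Stand-in declaring the methods listed in this section.
interface SeekableByteChannel6 {
    long position();
    SeekableByteChannel6 position(long newPosition);
    int read(ByteBuffer dst);
    long size();
    SeekableByteChannel6 truncate(long size);
    int write(ByteBuffer src);
    void close();
    boolean isOpen();
}

// Hypothetical in-memory channel backed by a growable byte array.
final class MemoryChannel implements SeekableByteChannel6 {
    private byte[] data = new byte[0];
    private int size = 0;
    private int position = 0;
    private boolean open = true;

    public long position() { return position; }

    public SeekableByteChannel6 position(long newPosition) {
        if (newPosition < 0 || newPosition > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("position out of range: " + newPosition);
        }
        position = (int) newPosition;
        return this;
    }

    // Reads from the current position and advances it by the bytes read.
    public int read(ByteBuffer dst) {
        if (position >= size) {
            return -1; // end of data, following the java.nio convention
        }
        int n = Math.min(dst.remaining(), size - position);
        dst.put(data, position, n);
        position += n;
        return n;
    }

    public long size() { return size; }

    // The section notes this is never invoked by Outside In, so it is a no-op.
    public SeekableByteChannel6 truncate(long newSize) { return this; }

    // Writes at the current position, growing the backing array if needed.
    public int write(ByteBuffer src) {
        int n = src.remaining();
        int end = position + n;
        if (end > data.length) {
            data = Arrays.copyOf(data, Math.max(end, data.length * 2));
        }
        src.get(data, position, n);
        position = end;
        size = Math.max(size, end);
        return n;
    }

    public void close() { open = false; }

    public boolean isOpen() { return open; }

    // Snapshot of everything written so far.
    byte[] toByteArray() { return Arrays.copyOf(data, size); }
}
```

An object like this could be passed as the redirect argument wherever this chapter accepts a SeekableByteChannel6, for example as an export destination whose bytes are collected afterwards with toByteArray.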
12.3.3 OptionsCache Class
This section describes the OptionsCache class.
The options that configure the way outputs are generated are accessed through the OptionsCache class.
All of the options described in the following subsections are available through this interface. Other methods in this interface are described below.
Namespace
com.oracle.outsidein.options
Methods
-
OptionsCache setSourceFile(File file) throws OutsideInException
Sets the source document to be opened.
file: Full path to source file
-
OptionsCache setSourceFile(SeekableByteChannel6 redirect) throws OutsideInException
Sets an object that implements SeekableByteChannel6 to be used as the source document. Exporting a file using this method may have issues with files that require the original name of the file (examples: if the extension of the file is needed for identification purposes or if the name of a secondary file depends on the name/path of the original source file).
redirect: Object implementing SeekableByteChannel6 to be used to read the source data containing the input file
-
OptionsCache setSourceFile(SeekableByteChannel6 redirect, String filename) throws OutsideInException
Sets an object that implements SeekableByteChannel6 to be used as the source document and provides information about the filename.
redirect: Object implementing SeekableByteChannel6 to be used to read the source data containing the input file
filename: A fully qualified path or file name that may be used to derive the extension of the file or name of a secondary file that is dependent on the name/path of the source file
-
OptionsCache addSourceFile(File file) throws OutsideInException
Sets the next source document file to be exported in sequence. This allows multiple documents to be exported to the same output destination.
file: Full path to source file
-
OptionsCache addSourceFile(SeekableByteChannel6 redirect)
Set a redirected channel as the next source document to be exported to the original destination file. This method has the same limitations as the similar setSourceFile(SeekableByteChannel6 redirect) method.
-
OptionsCache addSourceFile(SeekableByteChannel6 redirect, String Filename)
Set a redirected channel as the next source document to be exported to the original destination file. The file name provided is used in the same way as in the method setSourceFile(SeekableByteChannel6 redirect, String filename).
-
OptionsCache setSourceFormat(FileFormat fileId)
Sets the source format to process the input file as, ignoring the algorithmic detection of the file type.
fileId: the format to treat the input document as.
-
OptionsCache setDestinationFile(File file) throws OutsideInException
Sets the location of the destination file.
file: Full path to the destination file
-
OptionsCache setDestinationFile(SeekableByteChannel6 redirect) throws OutsideInException
Sets an object that implements SeekableByteChannel6 to be used as the destination document. An Exporter.export() operation will write the output data to the provided SeekableByteChannel6 object.
redirect: Object implementing SeekableByteChannel6 to be used as the destination document written during an Exporter.export() operation
-
OptionsCache setCallbackHandler(Callback callback)
Sets the object to use to handle callbacks.
callback: the callback handling object.
-
OptionsCache setPasswordsList(List<String> Passwords)
Provides a list of strings to use as passwords for encrypted documents. The technology will cycle through this list until a successful password is found or the list is exhausted.
Passwords: List of strings to be used as passwords.
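The cycling behavior described for setPasswordsList can be pictured with the following sketch. It is not the SDK's implementation: tryOpen is a hypothetical stand-in for the engine's attempt to open the document with a given password, and PasswordCycle is an illustrative name.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

final class PasswordCycle {
    // Tries each candidate in list order and stops at the first one accepted.
    // Returns an empty Optional when the list is exhausted without success.
    static Optional<String> firstAccepted(List<String> passwords, Predicate<String> tryOpen) {
        for (String candidate : passwords) {
            if (tryOpen.test(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}
```

Because candidates are tried in order, placing the most likely passwords first keeps the number of attempts low.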
-
OptionsCache setLotusNotesId(String NotesIdFile)
Sets the Lotus Notes ID file location.
NotesIdFile: Full path to the Notes ID file.
-
OptionsCache setOpenForNonSequentialAccess(boolean bOpenForNonSequentialAccess)
Setting this option causes the technology to open archive files in a special mode that is only usable for non-sequential access of nodes.
bOpenForNonSequentialAccess: If set to true, the archive file is opened in the special access mode. Note that turning this flag on for a non-archive file will throw an exception at RunExport time.
12.3.3.1 AcceptAlternateGraphics
OIT Option ID: SCCOPT_ACCEPT_ALT_GRAPHICS
This option enables an optimization in XML Export's graphics output when exporting embedded graphics from an input document. When this option is set to TRUE and the input document contains embedded graphics that are already in one of our supported output formats, they will be copied to output files rather than converted to the selected output format specified by the GraphicType option.
For example, if this option is set to TRUE and the selected output graphics type is GIF, an input document's embedded JPEG graphic will be copied to a JPEG output file rather than being converted to the GIF format. The same behavior applies to all of XML Export's supported graphics output formats (currently GIF, JPEG, and PNG.)
If this option is set to FALSE, all graphics output will be in the format specified by the GraphicType option.
Note:
When using this option, JPEG files will be transferred directly to their output file location, without being filtered. This presents the possibility that any JPEG viruses in the file can be transferred to that location, as well.
Data Type
boolean
Data
-
true: FI_GIF, FI_JPEGFIF, and FI_PNG embeddings will be extracted, not converted. All other embeddings will be converted to the format specified by GraphicType. If GraphicType is set to FI_NONE, no embeddings will be extracted or converted.
-
false: All embeddings will be converted to the format specified by GraphicType. Embeddings that are already in that format will be extracted, not converted. If GraphicType is set to FI_NONE, no embeddings will be extracted or converted.
Default
false
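The rules in the Data list above can be summarized as a small decision function. This is a sketch of the documented behavior only, not SDK code: GraphicFormat, EmbeddingAction, and AlternateGraphics are hypothetical names, and BMP stands for any embedding format outside the three supported output formats.

```java
// Hypothetical stand-ins for the formats discussed above.
enum GraphicFormat { GIF, JPEG, PNG, BMP, NONE }

enum EmbeddingAction { EXTRACT, CONVERT, SKIP }

final class AlternateGraphics {
    // embedded: format of the graphic inside the input document.
    // graphicType: output format selected by the GraphicType option.
    static EmbeddingAction decide(boolean acceptAlternateGraphics,
                                  GraphicFormat embedded,
                                  GraphicFormat graphicType) {
        if (graphicType == GraphicFormat.NONE) {
            return EmbeddingAction.SKIP; // nothing is extracted or converted
        }
        if (embedded == graphicType) {
            return EmbeddingAction.EXTRACT; // already in the selected format
        }
        boolean supportedOutput = embedded == GraphicFormat.GIF
                || embedded == GraphicFormat.JPEG
                || embedded == GraphicFormat.PNG;
        if (acceptAlternateGraphics && supportedOutput) {
            return EmbeddingAction.EXTRACT; // copied as-is instead of converted
        }
        return EmbeddingAction.CONVERT;
    }
}
```

For instance, with the option enabled and GIF selected, an embedded JPEG yields EXTRACT, while the same input with the option disabled yields CONVERT.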
12.3.3.2 DefaultInputCharacterSet
OIT Option ID: SCCOPT_DEFAULTINPUTCHARSET
This option is used in cases where Outside In cannot determine the character set used to encode the text of an input file. When all other means of determining the file's character set are exhausted, Outside In will assume that an input document is encoded in the character set specified by this option. This is most often used when reading plain-text files, but may also be used when reading HTML or PDF files.
Data Type
DefaultInputCharacterSetValue
DefaultInputCharacterSetValue Enumeration
DefaultInputCharacterSetValue can be one of the following enumerations:
SYSTEMDEFAULT
UNICODE
BIGENDIANUNICODE
LITTLEENDIANUNICODE
UTF8
UTF7
ASCII
UNIXJAPANESE
UNIXJAPANESEEUC
UNIXCHINESETRAD1
UNIXCHINESEEUCTRAD1
UNIXCHINESETRAD2
UNIXCHINESEEUCTRAD2
UNIXKOREAN
UNIXCHINESESIMPLE
EBCDIC37
EBCDIC273
EBCDIC274
EBCDIC277
EBCDIC278
EBCDIC280
EBCDIC282
EBCDIC284
EBCDIC285
EBCDIC297
EBCDIC500
EBCDIC1026
DOS437
DOS737
DOS850
DOS852
DOS855
DOS857
DOS860
DOS861
DOS863
DOS865
DOS866
DOS869
WINDOWS874
WINDOWS932
WINDOWS936
WINDOWS949
WINDOWS950
WINDOWS1250
WINDOWS1251
WINDOWS1252
WINDOWS1253
WINDOWS1254
WINDOWS1255
WINDOWS1256
WINDOWS1257
ISO8859_1
ISO8859_2
ISO8859_3
ISO8859_4
ISO8859_5
ISO8859_6
ISO8859_7
ISO8859_8
ISO8859_9
MACROMAN
MACCROATIAN
MACROMANIAN
MACTURKISH
MACICELANDIC
MACCYRILLIC
MACGREEK
MACCE
MACHEBREW
MACARABIC
MACJAPANESE
HPROMAN8
BIDIOLDCODE
BIDIPC8
BIDIE0
RUSSIANKOI8
JAPANESEX0201
Default
SYSTEMDEFAULT
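The idea of a last-resort character set can be sketched in plain Java as follows. This is an analogy, not a description of Outside In's detection logic: the only "other means" modeled here is a UTF-8 byte order mark, which is an assumption made for illustration, and CharsetFallback is a hypothetical name.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetFallback {
    // Decodes bytes as UTF-8 when a UTF-8 byte order mark is present;
    // otherwise assumes the caller-supplied fallback character set.
    static String decode(byte[] bytes, Charset fallback) {
        boolean hasUtf8Bom = bytes.length >= 3
                && (bytes[0] & 0xFF) == 0xEF
                && (bytes[1] & 0xFF) == 0xBB
                && (bytes[2] & 0xFF) == 0xBF;
        if (hasUtf8Bom) {
            return new String(bytes, 3, bytes.length - 3, StandardCharsets.UTF_8);
        }
        return new String(bytes, fallback);
    }
}
```

The same byte can decode to different characters under different fallbacks, which is why the default matters most for plain-text input that carries no encoding information of its own.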
12.3.3.3 DocumentMemoryMode
OIT Option ID: SCCOPT_DOCUMENTMEMORYMODE
This option determines the maximum amount of memory that the chunker may use to store the document's data, from 4 MB to 1 GB. The more memory the chunker has available to it, the less often it needs to re-read data from the document.
Data
-
SMALLEST: 1 - 4MB
-
SMALL: 2 - 16MB
-
MEDIUM: 3 - 64MB
-
LARGE: 4 - 256MB
-
LARGEST: 5 - 1 GB
Default
SMALL: 2 - 16MB
12.3.3.4 EnableAlphaBlending
This option allows the user to enable alpha-channel blending (transparency) in rendering vector images. This is primarily useful to improve fidelity when rendering with a slower graphics engine, such as X-Windows over a network when performance is not an issue.
Data Type
boolean
Default
false
12.3.3.5 ExtractXMPMetadata
OIT Option ID: SCCOPT_EXTRACTXMPMETADATA
Adobe's Extensible Metadata Platform (XMP) is a labeling technology that allows you to embed data about a file, known as metadata, into the file itself. This option enables XMP extraction, which passes the XMP metadata straight through without interpreting it. This option is ignored if the ParseXMPMetadata option is enabled.
Data Type
boolean
Data
-
true: This setting enables XMP extraction.
-
false: This setting disables XMP extraction.
Default
-
false
12.3.3.6 FallbackFormat
This option controls how files are handled when their specific application type cannot be determined. This normally affects all plain-text files, because plain-text files are generally identified by process of elimination: when a file isn't identified as having been created by a known application, it is treated as a plain-text file. Setting this option to NONE is recommended to prevent the conversion from exporting unidentified binary files as though they were text, which could generate many pages of "garbage" output.
Data Type
FallbackFormatValue
FallbackFormatValue Enumeration
-
TEXT: Unidentified file types will be treated as text files.
-
NONE: Outside In will not attempt to process files whose type cannot be identified.
Default
TEXT
12.3.3.7 GraphicHeight
OIT Option ID: SCCOPT_GRAPHIC_HEIGHT
This option defines the absolute height in pixels to which exported graphics will be resized. If this option is set and the GraphicWidth option is not, the width of the image will be calculated based on the aspect ratio of the source image. The developer should be aware that very large values for this option or GraphicWidth could produce images whose size exceeds available system memory, resulting in conversion failure.
If you are exporting a non-graphic file (word processing, spreadsheet or archive) and the settings for GraphicHeight and GraphicWidth do not match the aspect ratio of the original document, the exported image will have whitespace added so that the original file's aspect ratio is maintained.
The settings for the GraphicHeightLimit and GraphicWidthLimit options can override the setting for GraphicHeight.
Data Type
long
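When only GraphicHeight is set, the width follows from the source aspect ratio. The arithmetic can be sketched as below; rounding to the nearest pixel is an assumption, since this section does not state how fractional results are handled, and GraphicSizing is a hypothetical name.

```java
final class GraphicSizing {
    // Width that preserves the source aspect ratio for a requested height.
    // Rounding to the nearest pixel is an assumption of this sketch.
    static long widthForHeight(long sourceWidth, long sourceHeight, long targetHeight) {
        return Math.round((double) sourceWidth * targetHeight / sourceHeight);
    }
}
```

For an 800 x 600 source and a GraphicHeight of 300, this gives a width of 400.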
12.3.3.8 GraphicHeightLimit
OIT Option ID: SCCOPT_GRAPHIC_HEIGHTLIMIT
This option sets an upper limit, in pixels, on the height of exported graphics. It differs from setting the height with GraphicHeight in that images taller than the limit are reduced to the limit value, while images shorter than the limit are not enlarged. Setting the height using GraphicHeight causes all output images to be reduced or enlarged to the specified height.
Data Type
long
12.3.3.9 GraphicOutputDPI
OIT Option ID: SCCOPT_GRAPHIC_OUTPUTDPI
This option allows the user to specify the output graphics device's resolution in DPI and only applies to images whose size is specified in physical units (in/cm). For example, consider a 1" square, 100 DPI graphic that is to be rendered on a 50 DPI device (GraphicOutputDPI is set to 50). In this case, the size of the resulting TIFF, BMP, JPEG, GIF, or PNG will be 50 x 50 pixels.
In addition, the special #define of SCCGRAPHIC_MAINTAIN_IMAGE_DPI, which is defined as 0, can be used to suppress any dimensional changes to an image. In other words, a 1" square, 100 DPI graphic will be converted to an image that is 100 x 100 pixels in size. This value indicates that the DPI of the output device is not important. It extracts the maximum resolution from the input image with the smallest exported image size.
Setting this option to SCCGRAPHIC_MAINTAIN_IMAGE_DPI may result in the creation of extremely large images. Be aware that there may be limitations in the system running this technology that could result in undesirably large bandwidth consumption or an error message. Additionally, an out of memory error message will be generated if system memory is insufficient to handle a particularly large image.
Also note that the SCCGRAPHIC_MAINTAIN_IMAGE_DPI setting will force the technology to use the DPI settings already present in raster images, but will use the current screen resolution as the DPI setting for any other type of input file.
For some output graphic types, there may be a discrepancy between the value set by this option and the DPI value reported by some graphics applications. The discrepancy occurs when the output format uses metric units (DPM, or dots per meter) instead of English units (DPI, or dots per inch). Depending on how the graphics application performs rounding on meters to inches conversions, the DPI value reported may be 1 unit more than expected. An example of a format which may exhibit this problem is PNG.
The maximum value that can be set is 2400 DPI; the default is 96 DPI.
Data Type
long
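The pixel arithmetic in this section, and the metric rounding discrepancy it mentions, can be reproduced with a few lines of Java. This is a sketch under two assumptions: that pixel counts are rounded to the nearest integer, and that a metric format stores resolution as a whole number of dots per meter. DpiMath is a hypothetical name.

```java
final class DpiMath {
    static final double METERS_PER_INCH = 0.0254;

    // Pixel extent of a dimension given in inches, rendered at outputDpi.
    static long pixels(double inches, long outputDpi) {
        return Math.round(inches * outputDpi);
    }

    // Resolution as a whole number of dots per meter (assumed storage form).
    static long toDotsPerMeter(long dpi) {
        return Math.round(dpi / METERS_PER_INCH);
    }

    // DPI recovered from dots per meter, before the reading application rounds it.
    static double toDpi(long dotsPerMeter) {
        return dotsPerMeter * METERS_PER_INCH;
    }
}
```

A 1 inch dimension at 50 DPI gives 50 pixels. Converting 96 DPI to dots per meter gives 3780, which converts back to roughly 96.012 DPI; an application that rounds this up reports 97, while one that rounds to nearest reports 96.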
12.3.3.10 GraphicSizeLimit
OIT Option ID: SCCOPT_GRAPHIC_SIZELIMIT
This option is used to set the maximum size of the exported graphic in pixels. It may be used to prevent inordinately large graphics from being converted to equally cumbersome output files, thus preventing bandwidth waste.
This setting takes precedence over all other options and settings that affect the size of a converted graphic.
When creating a multi-page TIFF file, this limit is applied on a per page basis. It is not a pixel limit on the entire output file.
Data Type
long
12.3.3.11 GraphicSizeMethod
OIT Option ID: SCCOPT_GRAPHIC_SIZEMETHOD
This option determines the method used to size graphics. The developer can choose among three methods, each of which involves some degree of trade off between the quality of the resulting image and speed of conversion.
Using the quick sizing option results in the fastest conversion of color graphics, though the quality of the converted graphic will be somewhat degraded. The smooth sizing option results in a more accurate representation of the original graphic, as it uses antialiasing. Antialiased images may appear smoother and can be easier to read, but rendering with this option set requires additional processing time. The grayscale-only option also uses antialiasing, but only for grayscale graphics; it uses the quick sizing method for any color graphics.
The smooth sizing option does not work on images which have a width or height of more than 4096 pixels.
Data
-
QUICKSIZING
-
SMOOTHSIZING
-
SMOOTHGRAYSCALESIZING
12.3.3.12 GraphicWidth
OIT Option ID: SCCOPT_GRAPHIC_WIDTH
This option defines the absolute width in pixels to which exported graphics will be resized. If this option is set and the GraphicHeight option is not, the height of the image will be calculated based on the aspect ratio of the source image. The developer should be aware that very large values for this option or GraphicHeight could produce images whose size exceeds available system memory, resulting in conversion failure.
If you are exporting a non-graphic file (word processing, spreadsheet or archive) and the settings for GraphicHeight and GraphicWidth do not match the aspect ratio of the original document, the exported image will have whitespace added so that the original file's aspect ratio is maintained.
The settings for the GraphicHeightLimit and GraphicWidthLimit options can override the setting for GraphicWidth.
Data Type
long
12.3.3.13 GraphicWidthLimit
OIT Option ID: SCCOPT_GRAPHIC_WIDTHLIMIT
This option allows a hard limit to be set for how wide in pixels an exported graphic may be. Any images wider than this limit will be resized to match the limit. Regardless of whether the GraphicHeightLimit option is set, any resized images will preserve their original aspect ratio.
Note that this option differs from the behavior of setting the width of graphics by using GraphicWidth in that it sets an upper limit on the image width. Images larger than this limit will be reduced to the limit value. However, images smaller than this width will not be enlarged when using this option. Setting the width using GraphicWidth causes all output images to be reduced or enlarged to be of the specified width.
Data Type
long
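The difference between a limit and an absolute size can be shown with a short sketch: an image is scaled down only when it exceeds the limit, and the aspect ratio is kept. Rounding the scaled height to the nearest pixel is an assumption, and WidthLimit is a hypothetical name.

```java
final class WidthLimit {
    // Returns {width, height} after applying a width limit.
    // Images at or below the limit are returned unchanged.
    static long[] apply(long width, long height, long widthLimit) {
        if (width <= widthLimit) {
            return new long[] {width, height};
        }
        long scaledHeight = Math.round((double) height * widthLimit / width);
        return new long[] {widthLimit, scaledHeight};
    }
}
```

With a limit of 500, a 2000 x 1000 image becomes 500 x 250, while a 300 x 200 image is left at its original size.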
12.3.3.14 IECondCommentMode
OIT Option ID: SCCOPT_HTML_COND_COMMENT_MODE
Some HTML input files may include "conditional comments", which are HTML comments that mark areas of HTML to be interpreted in specific versions of Internet Explorer, while being ignored by other browsers. This option allows you to control how the content contained within conditional comments will be interpreted by Outside In's HTML parsing code.
Data
-
NONE: Don't output any conditional comment
-
IE5: Include the IE5 comments
-
IE6: Include the IE6 comments
-
IE7: Include the IE7 comments
-
IE8: Include the IE8 comments
-
IE9: Include the IE9 comments
-
ALL: Include all conditional comments
12.3.3.15 IgnorePassword
OIT Option ID: SCCOPT_IGNORE_PASSWORD
This option can disable the password verification of files where the contents can be processed without validation of the password. If this option is not set, the filter should prompt for a password if it handles password-protected files.
Data Type
boolean
12.3.3.16 InterlacedGIFs
OIT Option ID: SCCOPT_GIF_INTERLACED
This option allows the developer to specify interlaced or non-interlaced GIF output. Interlaced GIFs are useful when graphics are to be downloaded over slow Internet connections. They allow the browser to begin to render a low-resolution view of the graphic quickly and then increase the quality of the image as it is received. There is no real penalty for using interlaced graphics.
This option is only valid if the dwOutputID parameter of the EXOpenExport function is set to FI_GIF.
Data Type
boolean
12.3.3.17 InternalRendering
Note:
This option is no longer relevant. Outside In no longer performs graphic rendering through X11 on Linux/Unix platforms. The internal rendering engine is available on all of these platforms. If this option is set, the results will always use the internal rendering engine regardless of the value of this option. The $GDFONTPATH environment variable must be set to specify where to reference fonts. On Windows systems, the Windows graphical rendering engine is always used.
12.3.3.18 ISODateTimes
OIT Option ID: SCCOPT_FORMATFLAGS
When this flag is set, all Date and Time values are converted to the ISO 8601 standard. This conversion can only be performed using dates that are stored as numeric data within the original file.
Data Type
boolean
Default
false
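As a reminder of what an ISO 8601 date-time value looks like, the sketch below formats a numeric date with the standard java.time classes. The class name IsoDateTimeDemo is hypothetical, and the exact ISO 8601 variant that Outside In emits is not stated here, so treat the output shape as an assumption.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of the Outside In API.
class IsoDateTimeDemo {
    // Formats a date-time as ISO 8601 local date-time text.
    static String toIso8601(LocalDateTime value) {
        return value.format(DateTimeFormatter.ISO_LOCAL_DATE_TIME);
    }

    public static void main(String[] args) {
        LocalDateTime stored = LocalDateTime.of(2024, 3, 5, 14, 30, 15);
        System.out.println(toIso8601(stored)); // prints 2024-03-05T14:30:15
    }
}
```

A date stored as text in the source file carries no numeric value to reformat, which is why the option cannot convert it.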
12.3.3.19 JPEGQuality
OIT Option ID: SCCOPT_JPEG_QUALITY
This option allows the developer to specify the lossiness of JPEG compression. The option is only valid if the dwOutputID parameter of the EXOpenExport function is set to FI_JPEGFIF.
Data Type
long
Data
A value from 1 to 100, with 100 being the highest quality but the least compression, and 1 being the lowest quality but the most compression.
Default
100
12.3.3.20 LotusNotesDirectory
OIT Option ID: SCCOPT_LOTUSNOTESDIRECTORY
This option allows the developer to specify the location of a Lotus Notes or Domino installation for use by the NSF filter. A valid Lotus installation directory must contain the file nnotes.dll.
Type (Common): String
Data
A path to the Lotus Notes directory.
Default
If this option isn't set, then OIT will first attempt to load the Lotus library according to the operating system's PATH environment variable, and then attempt to find and load the Lotus library as indicated in HKEY_CLASSES_ROOT\Notes.Link.
12.3.3.21 OutputGraphicType
OIT Option ID: SCCOPT_GRAPHIC_TYPE
This option allows the developer to specify the format of the graphics produced by the technology.
-
When setting this option, remember that the JPEG file format does not support transparency.
-
Though the GIF file format supports transparency, it is limited to using only one of its 256 available colors to represent a transparent pixel ("index transparency").
-
PNG supports many types of transparency. The PNG files written by HTML Export are created so that various levels of transparency are possible for each pixel. This is achieved through the implementation of an 8-bit "alpha channel".
There is a special optimization that HTML Export can make when this option is set to None. Some of the Outside In Viewer Technology's import filters can be optimized to ignore certain types of graphics.
Data Type
OutputGraphicTypeValue
OutputGraphicTypeValue Enumeration
These are the possible values for OutputGraphicType:
-
GIF: Create GIF images
-
JPEG: Create JPEG/JFIF images
-
PNG: Create PNG images
-
NONE: Turn off graphic conversions
Default
JPEG
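The transparency notes above can be read as a simple decision rule. The sketch below is one possible reading, not a rule from the product: the GraphicType enum is a local stand-in that only mirrors the names of OutputGraphicTypeValue, and the Transparency enum and choose method are hypothetical.

```java
// Hypothetical helper, not part of the Outside In API.
class GraphicTypeChoiceDemo {
    // Local stand-in mirroring the value names of OutputGraphicTypeValue.
    enum GraphicType { GIF, JPEG, PNG, NONE }

    // Hypothetical description of what the converted graphics need.
    enum Transparency { NOT_NEEDED, SINGLE_COLOR, PER_PIXEL_ALPHA }

    // Picks a format whose transparency support covers the requirement.
    static GraphicType choose(Transparency needed) {
        switch (needed) {
            case PER_PIXEL_ALPHA:
                return GraphicType.PNG;   // 8-bit alpha channel
            case SINGLE_COLOR:
                return GraphicType.GIF;   // index transparency only
            default:
                return GraphicType.JPEG;  // no transparency support
        }
    }

    public static void main(String[] args) {
        System.out.println(choose(Transparency.PER_PIXEL_ALPHA));
    }
}
```

PNG would also satisfy the single-color case; GIF is chosen there only to show the distinction the text draws.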
12.3.3.22 ParseXMPMetadata
OIT Option ID: SCCOPT_PARSEXMPMETADATA
Adobe's Extensible Metadata Platform (XMP) is a labeling technology that allows you to embed data about a file, known as metadata, into the file itself. This option enables parsing of the XMP data into normal OIT document properties. Enabling this option may cause the loss of some regular data in premium graphics filters (such as PostScript), but won't affect most formats (such as PDF).
Data Type
boolean
Data
-
true: This setting enables parsing XMP.
-
false: This setting disables parsing XMP.
Default
false
12.3.3.23 PDFInputMaxEmbeddedObjects
This option allows the user to limit the number of embedded objects that are produced in a PDF file.
Data Type
long
Data
The maximum number of embedded objects to produce in PDF output. Setting this to 0 produces all embedded objects in the input document.
Default
0 – produce all objects.
12.3.3.24 PDFInputMaxVectorPaths
This option allows the user to limit the number of vector paths that are produced in a PDF file.
Data Type
long
Data
The maximum number of paths to produce in PDF output. Setting this to 0 produces all vector objects in the input document.
Default
0 – produce all vector objects.
12.3.3.25 PDFReorderBiDi
OIT Option ID: SCCOPT_PDF_FILTER_REORDER_BIDI
This option controls whether or not the PDF filter will attempt to reorder bidirectional text runs so that the output is in standard logical order as used by the Unicode 2.0 and later specification. This additional processing will result in slower filter performance according to the amount of bidirectional data in the file.
PDFReorderBiDiValue Enumeration
This enumeration defines the type of bidirectional text reordering the PDF filter should perform.
-
STANDARDBIDI: Do not attempt to reorder bidirectional text runs.
-
REORDEREDBIDI: Attempt to reorder bidirectional text runs.
12.3.3.26 PDFWordSpacingFactor
This option controls the spacing threshold in PDF input documents. Most PDF documents do not have an explicit character denoting a word break. The PDF filter calculates the distance between two characters to determine if they are part of the same word or if there should be a word break inserted. The space between characters is compared to the length of the space character in the current font multiplied by this fraction. If the space between characters is larger, then a word break character is inserted into the text stream. Otherwise, the characters are considered to be part of the same word and no word break is inserted.
Data Type
float
Data
A value representing the fraction of the space character's width used as the word-break threshold. Valid values are positive and less than 2.
Default
0.85
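The comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the PDF filter's code: the class name WordBreakDemo, the parallel-array glyph layout, and the unit-less coordinates are all hypothetical.

```java
// Hypothetical helper, not part of the Outside In API.
class WordBreakDemo {
    // Joins glyphs into text, inserting a space wherever the gap between two
    // adjacent glyphs is larger than spaceWidth multiplied by factor.
    static String joinGlyphs(char[] glyphs, double[] startX, double[] endX,
                             double spaceWidth, double factor) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < glyphs.length; i++) {
            if (i > 0) {
                double gap = startX[i] - endX[i - 1];
                if (gap > spaceWidth * factor) {
                    out.append(' '); // gap exceeds the threshold: word break
                }
            }
            out.append(glyphs[i]);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        char[] glyphs = {'H', 'i', 't', 'o'};
        double[] startX = {0.0, 5.5, 10.0, 12.2};
        double[] endX = {5.0, 7.0, 12.0, 14.0};
        // Space width 3.0 and factor 0.85 give a threshold of 2.55.
        System.out.println(joinGlyphs(glyphs, startX, endX, 3.0, 0.85)); // prints Hi to
    }
}
```

In this sample only the gap of 3.0 between "i" and "t" exceeds the threshold. Raising the factor to 1.5 moves the threshold to 4.5 and the same glyphs come out as a single word.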
12.3.3.27 PerformExtendedFI
OIT Option ID: SCCOPT_FIFLAGS
This option affects how an input file's internal format (application type) is identified when the file is first opened by the Outside In technology. When the extended test flag is in effect, and an input file is identified as being either 7-bit ASCII, EBCDIC, or Unicode, the file's contents will be interpreted as such by the export process.
The extended test is optional because it requires extra processing and cannot guarantee complete accuracy (which would require the inspection of every single byte in a file to eliminate false positives).
Data Type
boolean
Data
One of the following values:
-
false: When this is set, standard file identification behavior occurs.
-
true: If set, the File Identification code will run an extended test on all files that are not identified.
Default
-
true
12.3.3.28 ProcessOLEEmbeddingMode
OIT Option ID: SCCOPT_PROCESS_OLE_EMBEDDINGS
Microsoft PowerPoint versions from 1997 through 2003 had the capability to embed OLE documents in PowerPoint files. This option controls which embeddings are to be processed as native (OLE) documents and which are processed using the alternate graphic.
Note:
The Microsoft PowerPoint application sometimes embeds known Microsoft OLE embeddings (such as Visio, Project) as an "Unknown" type. To process these embeddings, the ProcessOLEEmbedAll option is required. Embeddings from products released after Office 2003, such as Office 2007, also fall into this category.
Data
-
STANDARD: Process embeddings that are known standard embeddings. These include Office 2003 versions of Word, Excel, Visio, etc.
-
ALL: Process all embeddings in the file.
-
NONE: Process none of the embeddings in the file.
Default
STANDARD
12.3.3.29 RenderEmbeddedFonts
This option allows you to disable the use of embedded fonts in PDF input files. If the option is set to true, the embedded fonts in the PDF input are used to render text; if the option is set to false, the embedded fonts are not used and the fallback is to use fonts available to Outside In to render text.
Data Type
boolean
Default
true
12.3.3.30 ShowArchiveFullPath
OIT Option ID: SCCOPT_ARCFULLPATH
This option causes the full path of a node to be returned in "GetArchiveNodeInfo" and "GetObjectInfo".
Data Type
boolean
Data
-
true: Provide the full path.
-
false: Do not provide the path.
Default
false
12.3.3.31 StrictFile
When an embedded file or URL can't be opened with the full path, Outside In will sometimes try to open the referenced file from other locations, including the current directory. When this option is set, it prevents Outside In from trying to open the file from any location other than the fully qualified path or URL.
Data Type
boolean
Default
false
12.3.3.32 TimeZoneOffset
OIT Option ID: SCCOPT_TIMEZONE
This option allows the user to define an offset to GMT that will be applied during date formatting, allowing date values to be displayed in a selectable time zone. This option affects the formatting of numbers that have been defined as date values. This option will not affect dates that are stored as text.
Note:
Daylight savings is not supported. The sent time in msg files when viewed in Outlook can be an hour different from the time sent when an image of the msg file is created.
Data Type
long
Data
Integer parameter from -96 to 96, representing 15-minute offsets from GMT. To query the operating system for the time zone set on the machine, specify SCC_TIMEZONE_USENATIVE.
Default
-
0: GMT time
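Because the value is counted in 15-minute units, a small conversion helps when checking a setting. The sketch below is illustrative: the class name TimeZoneOffsetDemo is hypothetical, and it assumes positive values are ahead of GMT, which the text above does not state. It uses Duration because the full range of 24 hours either way is wider than java.time.ZoneOffset accepts.

```java
import java.time.Duration;

// Hypothetical helper, not part of the Outside In API.
class TimeZoneOffsetDemo {
    // Converts an offset in 15-minute units (-96 to 96) to a Duration.
    static Duration toDuration(long units) {
        if (units < -96 || units > 96) {
            throw new IllegalArgumentException("offset must be between -96 and 96");
        }
        return Duration.ofMinutes(units * 15);
    }

    // Renders the offset as text such as GMT+05:30.
    static String describe(long units) {
        long totalMinutes = toDuration(units).toMinutes();
        long abs = Math.abs(totalMinutes);
        String sign = totalMinutes < 0 ? "-" : "+";
        return String.format("GMT%s%02d:%02d", sign, abs / 60, abs % 60);
    }

    public static void main(String[] args) {
        System.out.println(describe(22));  // prints GMT+05:30
        System.out.println(describe(-20)); // prints GMT-05:00
    }
}
```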
12.3.3.33 UnmappableCharacter
OIT Option ID: SCCOPT_UNMAPPABLECHAR
This option selects the character used when a character cannot be found in the output character set. This option takes the Unicode value for the replacement character. It is left to the user to make sure that the selected replacement character is available in the output character set.
Data Type
int
Data
The Unicode value for the character to use.
Default
-
0x002a = "*"
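The effect of a replacement character can be sketched with the standard java.nio.charset classes. This is not Outside In's mapping code: the class name UnmappableCharDemo and the method are hypothetical, and the example only illustrates substituting a Unicode value for characters the output character set cannot represent.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the Outside In API.
class UnmappableCharDemo {
    // Replaces every code point the target charset cannot encode with the
    // given Unicode replacement value.
    static String replaceUnmappable(String text, Charset target, int replacement) {
        CharsetEncoder encoder = target.newEncoder();
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            String single = new String(Character.toChars(cp));
            out.appendCodePoint(encoder.canEncode(single) ? cp : replacement);
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+00E9 and U+2603 are not in US-ASCII, so both become "*".
        System.out.println(
            replaceUnmappable("caf\u00e9 \u2603", StandardCharsets.US_ASCII, 0x002a));
    }
}
```

The same check applies to the replacement itself: if the chosen value is not in the output character set, substituting it does not help, which is why the choice is left to the user.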
12.3.3.34 XMLDefinitionReference
This option determines whether the converted file references a specified schema, a specified DTD, or no definition at all.
Data Type
XMLReference
Data
An XMLReference object that defines the XML Definition Reference to be used.
Default
No reference defined
12.3.3.35 XXFormatOptions
This option is a set of flags that are specific to XML Export output files.
Data Type
EnumSet<XXFormatOptionValues>
XXFormatOptionValues Enumeration
The following set of flags:
-
DELIMITERS: Often, files have individual characters that are placed at specific draw locations. Consequently, the Flexiondoc converter produces individual draw_text characters without any indication of word boundaries. This flag forces the Flexiondoc converter to attempt to determine where words and lines end. The input filters indicate these positions by producing a WORD_DELIMITER for word endings, and a DELIMITER for line endings. These delimiters are passed along in the Flexiondoc output to assist the user in reconstructing words and lines.
-
OPTIMIZESECTIONS: Use wp.section elements to delineate column references.
-
FLATTENSTYLES: Flatten styles to eliminate the need to process the "based-on=" attribute. When this option is turned on, paragraph styles should all be fully attributed. Character styles can't be fully attributed; that is, they won't always be completely flattened.
-
PROCESSARCHIVESUBDOCUMENTS: Process all archive sub-objects and put the output in the main Flexiondoc output
-
PROCESSATTACHMENTSUBDOCUMENTS: Process all attachments and put the output in the main Flexiondoc output
-
PROCESSEMBEDDINGSUBDOCUMENTS: Process all embeddings and put the output in the main Flexiondoc output
-
REMOVEFONTGROUPS: Replace font groups with references to individual fonts.
-
INCLUDETEXTOFFSETS: Include text_offset attribute on tx.p and tx.r elements.
-
SEPARATESTYLETABLES: Enabling this flag will cause the style_tables subtree to be streamed to a separate output unit. This item is deprecated.
-
USEFULLFILEPATHS: Locators for externalized embeddings will contain full, absolute path names.
-
BITMAPASBITMAP: dr.image objects are converted to a graphic file and the resulting file is referenced by the locator child of the dr.image.
-
CHARTASBITMAP: ch.chart objects are converted to a graphic file and the resulting file is referenced by the locator child of the ch.chart.
-
PRESENTATIONASBITMAP: pr.slide objects are converted to a graphic file and the resulting file is referenced by the locator child of the pr.slide.
-
VECTORASBITMAP: dr.drawing objects are converted to a graphic file and the resulting file is referenced by the locator child of the dr.drawing.
-
GENERATESYSTEMMETADATA: When this flag is set, system metadata will be generated. This information is gathered through system calls and may adversely affect performance.
-
NOBITMAPELEMENTS: Bitmap graphics are suppressed; no dr.image content will appear in the converted document.
-
NOCHARTELEMENTS: Charts are suppressed; no ch.chart content will appear in the converted document.
-
NOPRESENTATIONELEMENTS: Presentation slides are suppressed; no pr.slide content will appear in the converted document.
-
NOVECTORELEMENTS: Vector drawings are suppressed; no dr.drawing content will appear in the converted document.
-
These next four flags are mutually exclusive:
-
DEFAULTCHARACTERMAPPING: Default behavior: All text is mapped to Unicode, in tx.text elements.
-
NOCHARARACTERMAPPING: All text is left in the original character set, in tx.utext elements.
-
MAPTEXT: Text is mapped to Unicode where possible, unmappable text is left in the original character set.
-
MAPPEDANDUNMAPPEDCHARACTERS: Both mapped and unmapped text is included as an alt element containing tx.text and tx.utext.
Default
REMOVEFONTGROUPS
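Since the four character-mapping flags are mutually exclusive, a caller may want to check a flag set before applying it. The sketch below shows one way to do that with EnumSet. The Flag enum is a local stand-in that copies a few value names from the list above so the example compiles on its own; the class FormatFlagsDemo and the method hasMappingConflict are hypothetical and not part of the API.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical helper, not part of the Outside In API.
class FormatFlagsDemo {
    // Local stand-in for a subset of XXFormatOptionValues.
    enum Flag {
        DELIMITERS, REMOVEFONTGROUPS,
        DEFAULTCHARACTERMAPPING, NOCHARARACTERMAPPING,
        MAPTEXT, MAPPEDANDUNMAPPEDCHARACTERS
    }

    // The four flags documented as mutually exclusive.
    static final Set<Flag> CHARACTER_MAPPING = EnumSet.of(
        Flag.DEFAULTCHARACTERMAPPING, Flag.NOCHARARACTERMAPPING,
        Flag.MAPTEXT, Flag.MAPPEDANDUNMAPPEDCHARACTERS);

    // Returns true if more than one character-mapping flag is present.
    static boolean hasMappingConflict(Set<Flag> flags) {
        EnumSet<Flag> chosen = EnumSet.noneOf(Flag.class);
        chosen.addAll(flags);
        chosen.retainAll(CHARACTER_MAPPING);
        return chosen.size() > 1;
    }

    public static void main(String[] args) {
        Set<Flag> ok = EnumSet.of(Flag.REMOVEFONTGROUPS, Flag.MAPTEXT);
        Set<Flag> bad = EnumSet.of(Flag.MAPTEXT, Flag.NOCHARARACTERMAPPING);
        System.out.println(hasMappingConflict(ok));  // prints false
        System.out.println(hasMappingConflict(bad)); // prints true
    }
}
```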
12.4 ExportStatus Class
The ExportStatus class provides access to information about a conversion. This may include sub-document failures and areas of the conversion that may not have high fidelity with the original document. When applicable, the number of pages in the output is also provided.
Namespace
com.oracle.outsidein
Accessors
-
long getPageCount() - A count of all of the output pages produced during an export operation.
-
EnumSet<ExportStatusFlags> getStatusFlags() - Gets the information about possible fidelity issues with the original document.
-
long getSubDocsFailed() - Number of sub documents that were not converted.
-
long getSubDocsPassed() - Number of sub documents that were successfully converted.
ExportStatusFlags Enumeration
This enumeration is the set of possible known problems that can occur during an export process.
-
NoInformationAvailable: No Information is available
-
MissingMap: A PDF text run was missing the toUnicode table
-
VerticalText: A vertical text run was present
-
TextEffects: A run that had unsupported text effects applied. One example is Word Art
-
UnsupportedCompression: A graphic had an unsupported compression
-
UnsupportedColorSpace: A graphic had an unsupported color space
-
Forms: A sub-document had forms
-
RightToLeftTables: A table had right to left columns
-
Equations: A file had equations
-
AliasedFont: The desired font was missing, but a font alias was used
-
MissingFont: The desired font wasn't present on the system
-
SubDocFailed: A sub-document was not converted
-
TypeThreeFont: A type 3 font was encountered.
-
UnsupportedShading: An unsupported shading pattern was encountered.
-
InvalidHTML: An HTML parse error, as defined by the W3C, was encountered.
12.5 FileFormat Class
This class defines the identifiers for file formats.
Namespace
com.oracle.outsidein
Methods
-
GetDescription
String GetDescription()
This method returns the description of the format.
-
GetId
int GetId()
This method returns the numeric identifier of the format.
-
ForId
FileFormat ForId(int id)
This method returns the FileFormat object for the given identifier.
id: The numeric identifier for which the corresponding FileFormat object is returned.
12.6 ObjectInfo Class
ObjectInfo provides all the information available about the OIT Object. This is a read-only class where the technology fills in all the values.
Namespace
com.oracle.outsidein.options
Accessors
-
ObjectInfo.CompressionValues getCompression() - the type of compression used to store the object, if known.
-
EnumSet<ObjectInfo.ObjectInfoFlagValues> getFlags() - flags indicating attributes of the object.
-
FileFormat getFormatId() - the format Identifier of the object.
-
String getName() - name of the object.
ObjectInfoFlags Enumeration
Bit fields to describe information about an object.
-
PARTIALFILE: Object would not normally exist outside the source document
-
PROTECTEDFILE: Object is encrypted or password protected
-
UNSUPPORTEDCOMPRESSION: Object uses an unsupported compression mechanism
-
DRMFILE: Object uses Digital Rights Management protection
-
UNIDENTIFIEDFILE: Object is extracted, but cannot be successfully identified
-
LINKTOFILE: Object links to a file; it cannot be extracted
-
ENCRYPTEDFILE: Object is encrypted and can be decrypted with the known password
12.7 Option Interface
The Option Interface provides the methods and properties to retrieve information about an Outside In Option.
Namespace
Outside In
Properties
-
Name — Name of the option
-
Description — Description of the option
-
DataType — The type of the option value
-
SupportingProducts — The list of products that support this option
Method
void Set(OptionsCache exporter, Object objValue);
This method sets the option on the exporter object.
-
exporter — The exporter option
-
objValue — Value of the option
Note:
If the type of objValue cannot be converted to the data type the option is expecting, an OutsideInCastException is thrown.
OutsideInProducts Enumeration
-
HTMLExport — Outside In HTML Export
-
ImageExport — Outside In Image Export
-
PDFExport — Outside In PDF Export
-
SearchExport — Outside In Search Export
-
WebViewExport — Outside In Web View Export
-
XMLExport — Outside In XML Export
-
AllExports — All Outside In export products
12.8 OutsideIn Class
This is a utility class that creates an instance of an Exporter object on request.
Namespace
com.oracle.outsidein
Methods
static Exporter newLocalExporter()
This method creates an instance of an Exporter object. It returns a newly created Exporter object.
static Exporter newLocalExporter(Exporter source)
This method creates and returns an instance of an Exporter object based on the source Exporter. All the options of source are copied to the new Exporter. The source and destination file information will not be copied.
12.9 OutsideInException Class
This is the exception that is thrown when an Outside In Technology error occurs.
This class derives from the Exception class. This class has no public methods or properties except those of the parent Exception class.
Namespace
com.oracle.outsidein
12.10 XMLReference Class
The XMLReference class is a data class used to define the XML definition reference to be used.
Namespace
com.oracle.outsidein.options
Constructors
XMLReference()
Creates an instance of an XMLReference object with no XML definition reference.
XMLReference(XMLReference.ReferenceMethodValue, String)
Creates an instance of an XMLReference object that references a DTD or XSD.
ReferenceMethodValue Enumeration
This enumeration specifies whether Export references a DTD, references a schema, or includes no definition reference when generating output.
-
DTD: Document Type Definition (DTD)
-
XSD: XML Schema Definition
-
NONE: No definition reference