23 Configuring Inbound Refinery

Inbound Refinery offers a variety of conversion options depending on what components are installed and enabled on Content Server and Inbound Refinery. This chapter provides an overview of the different conversion options.

At minimum, the following components must be installed and enabled for basic conversion.

Component Name | Component Description | Enabled on Server
InboundRefinery | Enables Inbound Refinery | Inbound Refinery Server
InboundRefinerySupport | Enables the Content Server to work with Inbound Refinery | Content Server


23.1 Managing Native Content Conversion

Note:

Unless Content Server is configured to work with an Inbound Refinery instance, files are all passed through to the website in their native format.

When Content Server is configured as a provider for an Inbound Refinery instance, the file formats to pass to the refinery for conversion must be specified in one of the following ways:

  • Using the File Format Wizard, accessed by choosing Refinery Administration from the Administration menu.

  • Using the File Formats option on the Configuration Manager applet to map file extensions (.doc, .txt, and so on) to file formats and then map the file formats to the conversion option on the refinery. This option provides more flexibility in mapping different file extensions to different conversion options.

  • Creating a custom component that bases the conversion on the value of specified metadata fields for the content item, including the file format or custom fields.

After the job passes from Content Server to Inbound Refinery, the refinery configuration determines how to convert and return the native file. File formats are automatically configured during installation or can be added and changed as needed.

23.1.1 Identifying MIME Types

When defining new file formats, specify the MIME (Multipurpose Internet Mail Extensions) type corresponding to the file extension (for example, the format mapped to the doc file extension is application/msword).

When a content item is checked in to the repository, the content item's format is assigned according to the format mapped to the file extension of the native file. If the native file is not converted, Content Server includes this format when delivering the content item to clients. Using the MIME type for the format assists the client in determining what type of data the file is, the associated helper applications, and so on.

Check MIME types and the list of registered MIME types at http://www.iana.org/assignments/media-types/index.html.
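As a quick cross-check when naming a new format, the following sketch looks up the MIME type that Python's standard mimetypes module associates with a file extension. It is only an illustration: the helper name suggest_format and the sample file names are assumptions, and the module's built-in table is not a substitute for the IANA registry or for the formats configured in Content Server.

```python
import mimetypes

# A fresh MimeTypes instance starts from Python's built-in defaults only,
# so the result does not depend on system-wide mime.types files.
registry = mimetypes.MimeTypes()


def suggest_format(filename):
    """Return the MIME type associated with the file's extension, or None."""
    mime_type, _encoding = registry.guess_type(filename)
    return mime_type


# Sample file names are hypothetical.
for name in ("report.doc", "manual.pdf"):
    print(name, "->", suggest_format(name))
```

A value of None means the module has no entry for that extension, in which case the registry listing is the place to look.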

23.1.2 Native Applications and Content Conversions

The native applications used to convert content must meet the following requirements.

Native applications: MS Word, MS Project, Lotus Freelance, MS Excel, Lotus 123, Corel WordPerfect, MS PowerPoint, Lotus WordPro, MS Visio, iGrafx Designer

Requirements:

  • Verify that the native application is installed if needed by Inbound Refinery for the conversion.

  • Associate the file type to a conversion process on the File Formats tab.

  • For Word and PowerPoint applications, use the Native Options tab on the Local Inbound Refinery Configuration page to specify whether to process links.

Native applications: MS Publisher, FrameMaker, PhotoShop, PageMaker

Requirements:

  • Verify that the native application is installed.

  • Configure the file path in Inbound Refinery. To check in FrameMaker books, use the Upload Multiple Files option, which must be first enabled in System Properties.

  • Associate the file type to a conversion process on the File Formats tab.

Native applications: Other

Requirements:

  • Verify that the native application is installed (if required).

  • Install the custom conversion program in Inbound Refinery.

  • Configure the file path in Inbound Refinery.

  • Associate the file type to a conversion process on the File Formats tab.


23.1.3 Associating File Types with Conversion Programs

Associating file types with conversion programs is a two-stage process.

Add the file format and associate the file extension with the format:

  1. Choose Administration then Admin Applets from the Main menu.

  2. Click Configuration Manager.

  3. On the Configuration Manager page, choose Options then File Formats from the Page menu.

  4. On the File Formats page, click Add in the File Formats pane to add a file format.

  5. On the Add/Edit File Format page, enter the necessary information:

    • Format: Usually the MIME type.

    • Conversion type: Associates the format name with a conversion program. Select from the following:

      • Passthru: Documents with extensions mapped to PASSTHRU are not converted, but are displayed on the website in their native file format. The client computer must have the native applications.

      • Legacy Custom: Documents with extensions mapped to CUSTOM execute a conversion program not included in the set of standard conversions. This option is not supported in this release.

    • Description: A brief description of the file format.

  6. Click OK.

Enter the file extension to associate with the format:

  1. Click Add in the File Extensions pane.

  2. On the Add/Edit File Extension page, enter the necessary information:

    • Extension: The three-character designation for the file format. A file with this format is converted using the conversion program specified by the Map to Format field.

    • Map to Format: A list of the available formats with specified conversions (defined in the File Formats pane). Select a format to directly relate all files with that extension to a specific conversion program.

  3. Click OK.

23.1.4 Thumbnails

Thumbnails are small preview images of content. They are used on search results pages and typically link to the web-viewable file they represent. A thumbnail provides consumers with a visual sample of a file without actually opening the file itself. This enables them to check a file before committing to downloading the larger, original file. A basic set of thumbnail creation options is available by default.

23.2 Content Server and Refinery Configuration Scenarios

Inbound Refinery can be used to refine content managed by Content Server. Inbound Refinery can be installed on the same computer as Content Server or on one or more separate computers. You must add the refinery as a provider to Content Server instances on the same or separate computers after installation. For details, see Section 23.3.1.

Note:

Oracle Inbound Refinery does not support running in a cluster environment. Inbound Refinery can do conversion work for a Content Server cluster, but cannot run in a cluster environment itself. To ensure that Inbound Refinery functions properly, Inbound Refinery creates and maintains a long-term lock on the /queue/conversion directory. If mistakenly configured as part of a cluster and a second Inbound Refinery attempts to start and lock the same directory, the second Inbound Refinery will fail to start, and the attempt is logged.
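The start-up behavior described in this note follows the general pattern of an exclusive lock. The following Python sketch illustrates that pattern only. It is not how Inbound Refinery implements its lock, and the lock file name instance.lock and the temporary directory standing in for /queue/conversion are assumptions made for the example.

```python
import os
import tempfile


def acquire_lock(directory):
    """Create a lock file exclusively; return its path, or None if one exists."""
    lock_path = os.path.join(directory, "instance.lock")  # hypothetical file name
    try:
        # O_EXCL makes creation fail if the file is already there.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return None
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))
    return lock_path


queue_dir = tempfile.mkdtemp()      # stand-in for the shared queue directory
first = acquire_lock(queue_dir)     # first instance obtains the lock
second = acquire_lock(queue_dir)    # second instance is refused
print("first instance started:", first is not None)
print("second instance started:", second is not None)
```

In this sketch the second caller is refused because the lock file already exists, which mirrors the outcome the note describes for a second refinery pointed at the same directory.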

Various configurations are possible, so keep the following general rules in mind when setting up a refinery environment:

  • If processing a large number of content items per day, do not run Inbound Refinery on the same computer as Content Server.

  • The more dedicated refinery systems that are installed, the faster the content is processed. Having more refinery systems than Content Server instances provides optimal speed. Having fewer refinery systems than Content Server instances can slow down performance when converting large numbers of files.

  • Typically, there is no reason to have multiple refineries on the same computer because refineries share the system's resources, including third-party applications used during conversion. One refinery can serve as a provider to multiple Content Servers. To improve performance, use separate computers for each refinery.

  • Some file types and large files are processed more slowly than average. If a large number of these files must be processed in addition to other file types, consider setting up a refinery on a separate system to process just these file types. This requires more than one refinery system, but it does provide optimum refining speed and performance.

The following scenarios are common. Other refinery configurations are possible in addition to the ones described in this section. Specific content management applications might require their own particular refinery setup, which does not necessarily match any scenario mentioned in this section.

  • Scenario A: One Content Server and one refinery on the same computer.

  • Scenario B: Multiple Content Servers and one refinery on the same computer.

  • Scenario C: Multiple Content Servers and one refinery on separate computers.

  • Scenario D: One refinery per Content Server on separate computers.

  • Scenario E: Multiple refineries per Content Server on separate computers.

Each of these scenarios is explained in more detail in the following sections, including the benefits of each scenario and considerations to take into account for each scenario. In the scenario images, the following symbols are used to represent a computer, the Content Server, and the Inbound Refinery:

  • Large Circle: computer

  • Small Circle: Inbound Refinery

  • Small Square: Content Server

23.2.1 Scenario A

Scenario A Diagram

This is the most basic scenario possible. It comprises one Content Server and one refinery on the same computer.

  • Benefits:

    • Least expensive and easiest to configure.

    • Only one copy of third-party applications required for refinery conversions must be purchased.

  • Considerations:

    • The number and speed of conversions are limited.

    • Not as powerful as scenarios where refineries are not deployed on the Content Server computer, because refinery processing on the Content Server computer can slow searches and access to the website, and vice versa. Each conversion can take between seconds and minutes, depending on the file type and size.

23.2.2 Scenario B

Scenario B Diagram

This scenario comprises multiple Content Servers and one refinery on the same computer.

  • Benefits: Only one copy of third-party applications required for refinery conversions must be purchased.

  • Considerations:

    • The number and speed of conversions are limited.

    • Not as powerful as scenarios where refineries are not deployed on the Content Server computer, because refinery processing on the Content Server computer can slow searches and access to the website, and vice versa. Each conversion can take between seconds and minutes, depending on the file type and size.

    • In this configuration, typically the following choices should be made when deploying the refinery:

      • The refinery is set as a provider to one of the Content Servers. After deployment, the refinery will need to be added as a provider to the other Content Servers. For details, see Section 23.3.1.

23.2.3 Scenario C

Scenario C Diagram

This scenario comprises multiple Content Servers and one refinery on separate computers.

  • Benefits:

    • Only one copy of third-party applications required for refinery conversions must be purchased.

    • Faster processing than when the refinery is deployed on the same computer as a Content Server.

    • Refinery processing does not affect Content Server searches and access to the website, and vice versa.

  • Considerations:

    • Not as powerful as scenarios where there is at least one refinery per Content Server.

    • In this configuration, typically the following choices should be made when deploying the refinery:

      • The refinery will need to be added as a provider to each Content Server. For details, see Section 23.3.1.

23.2.4 Scenario D

Scenario D Diagram

This scenario comprises one refinery per Content Server on separate computers.

  • Benefits:

    • Faster processing for high volumes of content and big file sizes.

    • Refinery processing does not affect Content Server searches and access to the website, and vice versa.

  • Considerations:

    • Each refinery computer needs a copy of all third-party applications required for conversion.

    • Each refinery will need to be added as a provider to each Content Server. For details, see Section 23.3.1.

23.2.5 Scenario E

Scenario E Diagram

This scenario comprises multiple refineries per Content Server on separate computers.

  • Benefits:

    • Fastest processing for high volumes of content and big file sizes.

    • Refinery processing does not affect Content Server searches and access to the website, and vice versa.

  • Considerations:

    • Each refinery computer needs a copy of all third-party applications required for conversion.

    • In this configuration, typically the following choices should be made when deploying the refineries:

      • Each refinery will need to be added as a provider to each Content Server. For details, see Section 23.3.1.

23.3 Configuring Content Server and Refinery Communication

23.3.1 Configuring Refinery Providers

A Content Server communicates with a refinery via a provider. A refinery can serve as a provider for one or multiple Content Servers. For more information about common configurations, see Section 23.2.

The refinery can be added as a provider to a Content Server on the same computer or added to Content Servers on separate computers after deployment.

23.3.1.1 Adding or Editing Refinery Providers

To add a refinery as a provider to a Content Server:

  1. Log into the Content Server as an administrator.

  2. Choose Administration then Providers from the Main menu.

  3. In the Create a New Provider section of the Providers page, click Add in the Action column for the outgoing provider type.

  4. On the Add/Edit Outgoing Socket Provider page, complete the following fields:

    • Provider Name (required): a name for the refinery provider.

    • Provider Description (required): a user-friendly description for the provider.

    • Provider Class (required): the name of the Java class for the provider. The default is the intradoc.provider.SocketOutgoingProvider class.

    • Connection Class: not required.

    • Configuration Class: not required.

    • Server Host Name (required): The host name of the server on which the refinery is installed.

    • HTTP Server Address: The HTTP server address for the refinery. Not required when the refinery is on the same computer as the Content Server.

    • Server Port (required): The port on which the refinery provider will communicate. This entry must match the server socket port configured on the post-installation configuration page during deployment of Inbound Refinery. For information on post-installation configuration, see the Oracle WebCenter Content Installation Guide. The default refinery port is 5555.

    • Instance Name (required): the instance name of the refinery. For example, ref2.

    • Relative Web Root (required): the relative web root of the refinery, which is /ibr/.

  5. Select the Use Connection Password check box if the refinery you are connecting to imposes authentication for the Content Server (the Content Server will share the refinery's user base). If enabled, you must specify a user name and password to be used and have the ProxyConnections component installed and configured on the refinery.

  6. Select the Handles Inbound Refinery Conversion Jobs check box. This is required.

  7. Deselect the Inbound Refinery Read Only Mode check box. Select this check box only when you do not want the Content Server to send new conversion jobs to the refinery.

  8. If necessary, change the maximum number of jobs allowed in the Content Server's pre-converted queue. The default is 1000 jobs.

  9. Click Add. The Providers page is displayed with the new refinery provider added to the Providers table.

  10. Restart the Content Server.

To edit information for an existing refinery provider, access the Providers page and click Info in the Action menu for the provider to edit. Make the required changes on the Add/Edit Outgoing Socket Provider page. Restart the Content Server when done.
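Before filling in the Add/Edit Outgoing Socket Provider page, it can help to check the planned values against the required fields listed above. The following Python sketch does that and nothing more: it does not call any Content Server API, the function name check_provider is an assumption, and the provider name, description, and host name are hypothetical sample values.

```python
# Fields marked "(required)" in the procedure above.
REQUIRED_FIELDS = (
    "Provider Name",
    "Provider Description",
    "Provider Class",
    "Server Host Name",
    "Server Port",
    "Instance Name",
    "Relative Web Root",
)


def check_provider(settings):
    """Return a list of problems found in the planned provider settings."""
    problems = [
        f"missing: {field}"
        for field in REQUIRED_FIELDS
        if not str(settings.get(field, "")).strip()
    ]
    port = str(settings.get("Server Port", "")).strip()
    if port and not (port.isdecimal() and 1 <= int(port) <= 65535):
        problems.append("Server Port must be a number from 1 to 65535")
    return problems


planned = {
    "Provider Name": "ref2provider",                   # hypothetical
    "Provider Description": "Refinery instance ref2",  # hypothetical
    "Provider Class": "intradoc.provider.SocketOutgoingProvider",
    "Server Host Name": "refinery.example.com",        # hypothetical
    "Server Port": "5555",
    "Instance Name": "ref2",
    "Relative Web Root": "/ibr/",
}
print(check_provider(planned))
```

An empty list means every required field has a value and the port is in the valid TCP range. It does not confirm that the port matches the refinery's configured server socket port, which still has to be verified on the refinery.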

23.3.1.2 Disabling/Enabling Refinery Providers

To disable or enable an existing refinery provider:

  1. Log into the Content Server as an administrator.

  2. Choose Administration then Providers from the Main menu.

  3. In the Providers table on the Providers page, click Info in the Action column for the refinery provider to disable or enable.

  4. On the Provider Information page, click Disable or Enable.

  5. Restart the Content Server.

23.3.1.3 Deleting Refinery Providers

To delete an existing refinery provider:

  1. Log into the Content Server as an administrator.

  2. Choose Administration then Providers from the Main menu.

  3. In the Providers table on the Providers page, click Info in the Action column for the refinery provider to delete.

  4. On the Provider Information page, click Delete. A confirmation message is displayed.

  5. Click OK.

23.3.2 Editing the Refinery IP Security Filter

An IP security filter is used to restrict access to a refinery. Only hosts with IP or IPv6 addresses matching the specified criteria are granted access. By default, the IP security filter is 127.0.0.1|0:0:0:0:0:0:0:1, which means the Inbound Refinery will only listen to communication from localhost. To ensure that a Content Server can communicate with all of its refineries, the IP or IPv6 address of each Content Server computer should be added to the refinery's IP security filter. This is true even if the refinery is running on the same computer as the Content Server.

To edit an IP security filter for a refinery:

  1. Access the refinery computer.

  2. Start the System Properties application:

    • Windows: choose Start then Programs. Select Oracle Content Server/Inbound Refinery, the instance_name, then Utilities and System Properties.

    • UNIX: run the SystemProperties script, which is located in the /bin subdirectory of the refinery installation directory.

  3. Select the Server tab.

  4. The IP Address Filter field must include the IP or IPv6 address of each Content Server computer (even if this is the same physical computer that is also running the refinery server). The default value of this field is 127.0.0.1|0:0:0:0:0:0:0:1 (localhost), but you can add any number of valid IP or IPv6 addresses. You can specify multiple IP addresses separated by the pipe symbol (|), and you can use wildcards (* for zero or many characters, and ? for single characters). For example:

    127.0.0.1|0:0:0:0:0:0:0:1|10.10.1.10|62.43.163.*|62.43.161.12?
    

    Important:

    Always include the localhost IP address (127.0.0.1).

  5. Click OK when you are done, and restart the refinery server.

    Tip:

    Alternatively, you can add IP addresses to the IP security filter directly in the config.cfg file located in the IntradocDir/config directory. Add the IP or IPv6 address to the SocketHostAddressSecurityFilter variable. For example: SocketHostAddressSecurityFilter=127.0.0.1|0:0:0:0:0:0:0:1|10.10.1.10|62.43.163.*
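To preview which addresses a filter value would admit, the wildcard rules described above (pipe-separated patterns, * for zero or many characters, ? for a single character) can be approximated in Python. This is a sketch under assumptions: the function name is_allowed is hypothetical, Python's fnmatch also gives special meaning to square brackets, and the refinery's own matching may differ in details such as how IPv6 addresses are written.

```python
from fnmatch import fnmatchcase


def is_allowed(address, security_filter):
    """Return True if the address matches any pipe-separated pattern."""
    patterns = [p.strip() for p in security_filter.split("|") if p.strip()]
    return any(fnmatchcase(address, pattern) for pattern in patterns)


# The example filter value from step 4.
FILTER = "127.0.0.1|0:0:0:0:0:0:0:1|10.10.1.10|62.43.163.*|62.43.161.12?"

for candidate in ("127.0.0.1", "62.43.163.77", "62.43.161.125",
                  "62.43.161.12", "192.168.0.5"):
    print(candidate, is_allowed(candidate, FILTER))
```

Note that 62.43.161.12 does not match the pattern 62.43.161.12? in this sketch, because ? stands for exactly one character.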

23.3.3 Setting Library Path for UNIX Platforms

Content Server and Inbound Refinery use Outside In Technology. Outside In Technology is dynamically linked with the GCC libraries (libgcc_s and libstdc++) on all Linux platforms as well as both Solaris platforms and HPUX ia64. Content Server must be able to access these libraries; however, Solaris and HPUX do not initially make these libraries available. If running Content Server or Inbound Refinery on either Solaris or HPUX, you need to obtain and install the GCC libraries and configure Content Server to find them. For information about configuring the library paths, see the Oracle WebCenter Content Installation Guide.

23.4 Configuring Content Servers to Send Jobs to Refineries

File extensions, file formats, and conversions are used in Content Server to define how content items should be processed by Inbound Refinery and its conversion add-ons. In addition, application developers can create custom conversions.

23.4.1 About File Formats and Conversions

File formats are generally identified by their Multipurpose Internet Mail Extension (MIME) type, and each file format is linked to a specific conversion. Each file extension is mapped to a specific file format. Therefore, based on a checked-in file's extension, the Content Server can control if and how the file is processed by refineries. The conversion settings of the refineries specify which conversions the refineries accept and control the output of the conversions.

Consider the following example: the doc file extension is mapped to the file format application/msword, which is linked to the conversion Word. This means that the Content Server attempts to send all Microsoft Word files (with the doc file extension) checked into the Content Server to a refinery for conversion.

As another example, if the xls file extension is mapped to the file format application/vnd.ms-excel, which is linked to the conversion PassThru, Microsoft Excel files are not sent to a refinery. Instead, the Content Server can be configured to place either a copy of the native file or an HCST file that points to the native vault file in the /weblayout directory. This means that users must have an application capable of opening the native file installed on their computer to view the file.
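The two examples above amount to a two-step lookup: extension to file format, then file format to conversion. The following Python sketch models that lookup with plain dictionaries. The function names are assumptions, and treating an unmapped extension as "not sent" is a simplification made for the example rather than documented Content Server behavior.

```python
# Extension -> file format, and file format -> conversion,
# using only the two mappings from the examples above.
EXTENSION_TO_FORMAT = {
    "doc": "application/msword",
    "xls": "application/vnd.ms-excel",
}
FORMAT_TO_CONVERSION = {
    "application/msword": "Word",
    "application/vnd.ms-excel": "PassThru",
}


def conversion_for(filename):
    """Return (file_format, conversion), or (None, None) if unmapped."""
    extension = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    file_format = EXTENSION_TO_FORMAT.get(extension)
    return file_format, FORMAT_TO_CONVERSION.get(file_format)


def is_sent_to_refinery(filename):
    """True only when the file maps to a conversion other than PassThru."""
    _file_format, conversion = conversion_for(filename)
    return conversion is not None and conversion != "PassThru"


# Sample file names are hypothetical.
for name in ("proposal.doc", "budget.xls"):
    print(name, conversion_for(name), is_sent_to_refinery(name))
```

Remapping an extension to a different format, or a format to a different conversion, changes the outcome for every file with that extension, which is why the mapping is maintained in two separate lists.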

Figure 23-1 Mapping File Formats to a Conversion

When a file is checked into the Content Server and its file format is mapped to a conversion, the Content Server will check to see if it has any refinery providers that accept that conversion and are available to take a conversion job. This means that:

  • Refinery providers must be set up for the Content Server. For details, see Section 23.3.1.

  • The refinery(s) need to be configured to accept the conversion. For details, see Section 23.6.2.
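The check described above can be pictured as a filter over the configured refinery providers. The following Python sketch is a simplified model, not the Content Server implementation: the RefineryProvider class, its fields, and the rule of picking the first match are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RefineryProvider:
    """Hypothetical model of a refinery provider as seen by Content Server."""
    name: str
    accepted_conversions: set = field(default_factory=set)
    enabled: bool = True
    read_only: bool = False  # models the Inbound Refinery Read Only Mode check box
    busy: bool = False       # hypothetical availability flag


def find_provider(conversion, providers):
    """Return the first provider that accepts the conversion and can take a job."""
    for provider in providers:
        if (provider.enabled and not provider.read_only and not provider.busy
                and conversion in provider.accepted_conversions):
            return provider
    return None


providers = [
    RefineryProvider("ref1", {"Word", "Excel"}, read_only=True),
    RefineryProvider("ref2", {"Word"}),
]
match = find_provider("Word", providers)
print(match.name if match else "no provider available")
```

In this model ref1 is skipped because it is in read-only mode, so the Word job goes to ref2. A conversion that no provider accepts yields no match, which corresponds to the case where no refinery has been configured to accept it.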

Conversions specify how a file format should be processed, including the conversion steps that should be completed and the conversion engine that should be used. For details, see Section 23.4.2.

Conversions available in the Content Server should match those available in the refinery. When a file format is mapped to a conversion in the Content Server, files of that format are sent for conversion upon check-in. One or more refineries must be set up to accept that conversion. For details, see Section 23.6.2.

The following default conversions are available. Additional conversions might be available when conversion add-ons are installed. For more information, see the documentation for each specific conversion add-on.

Conversion Description

PassThru

Used to prevent files from being converted. When this conversion is linked to a file format, all file extensions mapped to that file format are not sent for conversion. The Content Server can be configured to place either a copy of the native file or an HCST file that points to the native vault file in the /weblayout directory. For details, see Section 23.4.3.

Word

Used to send Microsoft Word, Microsoft Write, and rich text format (RTF) files for conversion. The files are converted according to the conversion settings for the refinery.

Excel

Used to send Microsoft Excel files for conversion. The files are converted according to the conversion settings for the refinery.

PowerPoint

Used to send Microsoft PowerPoint files for conversion. The files are converted according to the conversion settings for the refinery.

MSProject

Used to send Microsoft Project files for conversion. The files are converted according to the conversion settings for the refinery.

Distiller

Used to send PostScript files for conversion. The files are converted to PDF using the specified PostScript distiller engine.

MSPub

Used to send Microsoft Publisher files for conversion. The files are converted according to the conversion settings for the refinery.

FrameMaker

Used to send Adobe FrameMaker files for conversion. The files are converted according to the conversion settings for the refinery.

Visio

Used to send Microsoft Visio files for conversion. The files are converted according to the conversion settings for the refinery.

WordPerfect

Used to send Corel WordPerfect files for conversion. The files are converted according to the conversion settings for the refinery.

PhotoShop

Used to send Adobe Photoshop files for conversion. The files are converted according to the conversion settings for the refinery.

InDesign

Used to send Adobe InDesign, Adobe PageMaker, and QuarkXPress files for conversion. The files are converted according to the conversion settings for the refinery.

MSSnapshot

Used to send Microsoft Snapshot files for conversion. The files are converted according to the conversion settings for the refinery.

PDF Refinement

Used to send checked-in PDF files for refinement. Depending on the conversion settings for the refinery, this includes optimizing the PDF files for fast web viewing using the specified PostScript distiller engine.

Ichitaro

The Ichitaro conversion is not supported for this version of Inbound Refinery.

OpenOffice

Used to send OpenOffice and StarOffice files for conversion. The files are converted according to the conversion settings for the refinery.

ImageThumbnail

Used to send select graphics formats for creation of simple thumbnails only. This is useful if Inbound Refinery is not installed but thumbnail images of graphics formats are wanted. The returned web-viewable files are a copy of the native file and optionally a thumbnail image.

When Inbound Refinery is installed, it can be used instead of the ImageThumbnail conversion to send graphics formats for conversion, including the creation of image renditions and thumbnails.

NativeThumbnail

Used to send select file formats for creation of thumbnails from the native format rather than from an intermediate PDF conversion. Typically, this conversion is used to create thumbnails of text files (TXT), Microsoft Outlook e-mail files (EML and MSG), and Office documents without first converting to PDF. The returned web-viewable files are a copy of the native file and optionally a thumbnail rendition and/or an XML rendition. For an XML rendition to be created, XMLConverter must be installed and the XML step must be configured and enabled.

MultipageTiff

Used to send files for conversion directly to multi-page TIFF files using Outside In Image Export. When file formats are mapped to this conversion, the conversion settings for the refinery are ignored, and the files are sent directly to Image Export for conversion to a TIFF file.

OutsideIn Technology

Uses Outside In X to print supported formats to PostScript for conversion with WinNativeConverter on the refinery server.

Direct PDFExport

Used to send files for conversion directly to PDF using Outside In PDF Export.

FlexionXML

Used to send files for conversion using XML Converter.

SearchML

Used to send files for conversion using XML Converter.

XML-XSLT Transformation

Used to send files for XSLT transformation using XML Converter. XSL transformation is used to output XML data into another format.

LegacyCustom

The LegacyCustom conversion is not supported for this version of Inbound Refinery.

Digital Media Graphics

When Digital Asset Manager is installed, this is used to send digital images for conversion into multiple image renditions using Image Manager.

Digital Media Video

When Digital Asset Manager is installed, this is used to send digital videos for conversion into multiple video or audio renditions using Video Manager.

TIFFConversion

Used to send TIFF files for conversion to a PDF format that enables indexing of text in the document.

Word HTML

Used to send Microsoft Word files for conversion to HTML using the native Microsoft Word application.

PowerPoint HTML

Used to send Microsoft PowerPoint files for conversion to HTML using the native Microsoft PowerPoint application.

Excel HTML

Used to send Microsoft Excel files for conversion to HTML using the native Microsoft Excel application.

Visio HTML

Used to send Microsoft Visio files for conversion to HTML using the native Microsoft Visio application.


23.4.1.1 Passing Content Items Through the Refinery and Failed Conversions

When a file format is linked to the conversion PassThru, all file extensions mapped to that file format are not converted. When a content item with a file extension mapped to PassThru is checked into the Content Server, the file is not sent to a refinery, and web-viewable files are not created. The Content Server can be configured to place either a copy of the native file or an HCST file that points to the native file in the weblayout directory. This means that the application that was used to create the file, or an application capable of opening the file, is required on each client for the user to be able to view the file. For details, see Section 23.4.3.

If a file is sent to the refinery and the refinery notifies the Content Server that the conversion has failed, the Content Server can be configured to place a copy of the native file in the weblayout directory. In this case users must also have an application capable of opening the native file installed on their computer to view the file. For details, see Section 23.4.4.

23.4.1.2 About MIME Types

It is recommended that you name new file formats by the MIME (Multipurpose Internet Mail Extensions) type corresponding to the file extension (for example, the format mapped to the doc file extension would be application/msword).

When a content item is checked in to Content Server, the content item's format is assigned according to the format mapped to the file extension of the native file. If the native file is not converted, Content Server includes this format when delivering the content item to clients. Using the MIME type for the format assists the client in determining what type of data the file is, what helper applications should be used, and so on.

If the native file is converted, Inbound Refinery assigns the appropriate format to the web-viewable file (for example, if a refinery generates a PDF file, it would identify this file as application/pdf), and Content Server then includes this format when delivering the web-viewable file to clients (instead of the format specified for the native file).

Inbound Refinery includes an extensive list of file formats configured out of the box when installed. Check the listing in the Configuration Manager applet of the Content Server provider. New formats should only need to be added if working with rare or proprietary formats.

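There are several good resources on the Internet for identifying the correct MIME type for a file format. For example, the list of registered MIME types is available at http://www.iana.org/assignments/media-types/index.html.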

23.4.2 Managing File Types

You can manage file types and file format configuration details using the File Formats Wizard page or the Configuration Manager. The File Formats Wizard page can be used to configure conversions for most common file types; however, it does not replicate all of the Configuration Manager applet features.

Important:

The File Formats Wizard page is available only when the InboundRefinerySupport component is installed and enabled on the Content Server and at least one Inbound Refinery provider is enabled. Also, conversion option components might add file types to the File Formats Wizard page.

To use the File Formats Wizard page:

  1. Log in as an administrator.

  2. Choose Administration then Refinery Administration then File Formats Wizard from the Main menu.

  3. On the File Formats Wizard page, select the check box for each file type to be sent to a refinery for conversion. To select or deselect all check boxes, select or deselect the check box in the heading row.

    Important:

    The Ichitaro conversion is not supported for this version of Inbound Refinery.

  4. Click Reset if you want to revert to the last saved settings.

  5. Click Update. The corresponding default file extensions, file formats, and conversions are mapped automatically for the selected file types.

To use the Configuration Manager:

  1. Log in as an administrator.

  2. Choose Administration then Admin Applets from the Main menu. Choose Configuration Manager.

  3. Select Options then File Formats.

23.4.2.1 Adding or Editing File Formats

To add a file format and link it to a conversion:

  1. On the File Formats page, in the File Formats section, click Add.

  2. On the Add New/Edit File Formats page, in the Format field, enter the name of the file format. Any name can be used, but Oracle recommends that you use the MIME type associated with the corresponding file extension(s).

  3. From the Conversion drop-down list, choose the appropriate conversion.

    Important:

    The Ichitaro conversion is not supported for this version of Inbound Refinery.

  4. In the Description field, enter a description for the file format.

  5. Click OK to save the settings.

To edit a file format, select the file format and click Edit. On the Add New/Edit File Formats page, make the appropriate changes.

23.4.2.2 Adding or Editing File Extensions

To add a file extension and map it to a file format (and thus associate the file extension with a conversion):

  1. On the File Formats page, in the File Extensions section, click Add.

  2. On the Add/Edit File Extensions page, in the Extension field, enter the file extension.

  3. From the Map to Format drop-down list, choose the appropriate file format from the list of defined file formats. Selecting a file format directly assigns all files with the specified extension to the specific conversion that is linked to the file format.

  4. Click OK to save the settings.

To edit a file extension, select the file extension on the File Formats page and click Edit. Make the appropriate changes.
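The two mappings described above (file extension to file format, then file format to conversion) can be pictured as two lookup tables. The following Python sketch is a hypothetical model only: the dictionary contents, the function name `conversion_for`, and the case-insensitive extension handling are assumptions made for illustration, not the way Content Server stores or resolves these settings.

```python
# Hypothetical in-memory model of the two mappings.
# The extension/format/conversion pairs below are examples only.
EXTENSION_TO_FORMAT = {
    "zip": "application/x-zip-compressed",
    "txt": "text/plain",
}
FORMAT_TO_CONVERSION = {
    "application/x-zip-compressed": "TIFFConversion",
    "text/plain": "PassThru",
}


def conversion_for(filename):
    """Resolve a file name to a conversion through its extension and format."""
    if "." not in filename:
        return None  # no extension, so no mapping applies
    # Assumption: extensions are compared without regard to case.
    extension = filename.rsplit(".", 1)[-1].lower()
    file_format = EXTENSION_TO_FORMAT.get(extension)
    if file_format is None:
        return None  # no format is mapped to this extension
    return FORMAT_TO_CONVERSION.get(file_format)


print(conversion_for("scans.zip"))
print(conversion_for("notes.txt"))
```

Because the extension points to a format rather than directly to a conversion, changing the conversion linked to one format changes the behavior for every extension mapped to it.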

23.4.3 Configuring the Content Server for PassThru Files

When a file format is linked to the conversion PassThru, all file extensions mapped to that file format are not sent for conversion. By default, the Content Server places a copy of the native file in the weblayout directory. However, the Content Server can be configured to place an HCST file that points to the native vault file in the weblayout directory instead. This can be useful if you have large files that are not being converted, and you do not want to copy the large files to the weblayout directory.

Please note the following important considerations:

  • The contents of the HCST file are controlled by the contents of the redirectionfile_template.htm file.

  • The GET_FILE service is used to deliver the file, so no PDF highlighting or byte serving is available. This can be resolved by overriding the template and reconfiguring the web server.

  • A simple template is used; the browser's Back button might not be functional and layout differences might occur. This can be resolved by overriding the template and reconfiguring the web server.

  • There is no reduction in the number of files because there is still an HCST file in the weblayout directory. However, there can be disk space savings if the native vault file is large.

  • This setting has no effect on files that are sent to a refinery for conversion; that is, if a file is sent to a refinery for conversion, another Content Server setting controls whether web-viewable files or a copy of the native file are placed in the weblayout directory, and an HCST file cannot be used. For more information, see Section 23.4.4.

To configure the Content Server to place an HCST file in the weblayout directory instead of a copy of the native file:

  1. Using a text editor, open the Content Server config.cfg file located in the IntradocDir/config/ directory.

  2. Include the IndexVaultFile variable, and set the value to true:

    IndexVaultFile=true
    
  3. Save your changes to the config.cfg file.

  4. Restart the Content Server.
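Before restarting, it can be useful to confirm that the variable was saved as intended. The following Python sketch parses text in the simple name=value style shown above. The function names and the treatment of lines starting with # as comments are assumptions for illustration, and the check only recognizes the literal value true.

```python
def read_cfg(text):
    """Parse name=value lines, skipping blank lines and '#' comment lines."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip()
    return settings


def hcst_redirect_enabled(text):
    """Return True when IndexVaultFile is set to true (case-insensitive here)."""
    return read_cfg(text).get("IndexVaultFile", "").lower() == "true"


# Hypothetical excerpt of a config.cfg file.
sample = "# excerpt of a config.cfg file\nIndexVaultFile=true\n"
print(hcst_redirect_enabled(sample))
```

To check a real file, read its contents into a string first and pass that string to `hcst_redirect_enabled`.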

23.4.4 Configuring the Content Server Refinery Conversion Options

You can configure how a Content Server interacts with its refinery providers, including how the Content Server should handle pre-converted and post-converted jobs.

Important:

The InboundRefinerySupport component must be installed and enabled on the Content Server and at least one Inbound Refinery provider enabled to make the Inbound Refinery Conversion Options page available.

To configure how the Content Server should handle pre-converted and post-converted jobs:

  1. Log into the Content Server as an administrator.

  2. Choose Administration then Refinery Administration then Conversion Options from the Main menu.

  3. On the Refinery Conversion Options page, enter the following information:

    • Enter the number of seconds between successive transfer attempts for pre-converted jobs. The default is 10 (seconds).

    • Enter the total number of minutes allowed to transfer a single job before action is taken. The default is 30 (minutes).

    • Enter the native file compression threshold size in MB. The default threshold size is 1024 MB (1 GB). Unless the native file exceeds the threshold size, it is compressed before the Content Server transfers it to a refinery. This setting is used to avoid the overhead of compressing very large files (for example, large video files). If you do not want any native files to be compressed before transfer, set the native file compression threshold size to 0.

    • If you want the conversion to fail when the time for transferring a job expires, select the check box.

    • Determine how you want the Content Server to handle failed conversions. If a file is sent to a refinery and conversion fails, the Content Server can be configured to place a copy of the native file in the /weblayout directory ("Refinery Passthru"). To enable passthru, select the check box. To disable passthru, deselect the check box.

      Please note the following important considerations:

      • When a file is sent to the refinery for conversion, an HCST file cannot be used instead of a copy of the native file. For more information on configuring how the Content Server handles files that are not sent to the refinery, see Section 23.4.3.

      • This setting can also be overridden manually using the AllowPassthru variable in the config.cfg file located in the IntradocDir\config\ directory.

  4. Click Reset if you want to revert to the last saved settings or click Update to save the changes.

  5. Restart the Content Server.
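The native file compression threshold in step 3 amounts to a simple size comparison. The following Python sketch illustrates that rule as described. The function name is hypothetical, 1 MB is assumed to be 1024 x 1024 bytes (consistent with 1024 MB being 1 GB), and a file exactly at the threshold is assumed to be compressed because it does not exceed it.

```python
BYTES_PER_MB = 1024 * 1024  # assumption: binary megabytes


def should_compress(native_size_bytes, threshold_mb=1024):
    """Compress unless the file exceeds the threshold; 0 disables compression."""
    if threshold_mb <= 0:
        return False  # a threshold of 0 means no native file is compressed
    return native_size_bytes <= threshold_mb * BYTES_PER_MB


print(should_compress(10 * BYTES_PER_MB))        # small document, default threshold
print(should_compress(2048 * BYTES_PER_MB))      # large video, default threshold
print(should_compress(10 * BYTES_PER_MB, 0))     # compression disabled
```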

23.4.5 Overriding Conversions at Check-In

Certain file extensions might be used in multiple ways in your environment. A good example is the ZIP file extension. For example, you might be checking in:

  • Multiple TIFF files compressed into a single ZIP file that you want a refinery with Tiff Converter to convert to a single PDF file with OCR.

  • Multiple file types compressed into a single ZIP file that you do not want sent to a refinery for conversion (the ZIP file should be passed through in its native format).

When using a file extension in multiple ways, the Content Server can be configured to enable the user to choose how a file is converted when they check the file into the Content Server. This is referred to as Allow override format on checkin. To enable this Content Server functionality:

  1. Log in as an administrator.

  2. Select Administration then Admin Server from the Main menu.

  3. On the Admin Server page, click the button for the Content Server instance to configure.

  4. On the Administration page, in the navigation menu, click General Configuration.

  5. Enable the Allow override format on checkin check box.

  6. Click Save.

  7. Using the Configuration Manager, map the file extension to the conversion that is used most commonly to make it the default conversion. For example, for the ZIP file extension, you might set up the following default conversion:

    • Map the ZIP file extension to the application/x-zip-compressed file format, and the application/x-zip-compressed file format to the TIFFConversion conversion. Thus, by default it would be assumed that ZIP files contain multiple TIFF files and should be sent to a refinery with Tiff Converter for conversion to PDF with OCR.

  8. Using the Configuration Manager, set up the alternate file formats and conversions that you want to be available for selection by the user at check-in. Continuing the preceding example for the ZIP file extension, you might set up the following alternate conversions:

    • Map the application/zip-passthru file format to the PassThru conversion. This option could then be selected at check-in for a ZIP file containing a variety of files that should not be sent to a refinery for conversion. The ZIP file would then be passed through in its native format.

  9. Restart the Content Server. When a user checks in a file, the user can override the default conversion by selecting any of the conversions you have set up.

Enabling users to override conversions at check-in is often used in conjunction with multiple, dedicated refineries and/or custom conversions. Continuing the preceding example for the ZIP file extension, you might have one refinery set up with Tiff Converter, which would be used to convert ZIP files containing multiple TIFF files to PDF with OCR, and a second refinery set up to convert ZIP files containing Microsoft Office files to PDF.

23.4.5.1 Changing the Size of Thumbnails

By default, thumbnails are displayed as 80 x 80 pixels. To display at a different size:

  1. Open the config.cfg file located in the IntradocDir/config/ directory in a text editor.

  2. Change the following variables as needed to change the thumbnail height and width:

    • ThumbnailHeight=xxx (where xxx is the value in pixels)

    • ThumbnailWidth=xxx (where xxx is the value in pixels)

    Scaling is done based on whichever setting is smaller (the height setting is used if the settings are equal), preserving the aspect ratio.

  3. Save the changes.

  4. Restart the Content Server.

Note:

This updates the size of all of your thumbnails.

For more information about the ThumbnailHeight and ThumbnailWidth variables, see Oracle Fusion Middleware Configuration Reference for Oracle WebCenter Content.
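The scaling rule in step 2 can be read as follows: the smaller of the two settings determines the scale (height wins a tie), and the other dimension follows from the aspect ratio. The Python sketch below implements that one reading. The function name, the rounding, and the interpretation itself are assumptions; the actual rendering may differ in detail.

```python
def thumbnail_size(image_width, image_height, thumb_width=80, thumb_height=80):
    """Scale to the smaller configured limit, keeping the aspect ratio."""
    if thumb_height <= thumb_width:
        # Height is the smaller setting (or the settings are equal): fit the height.
        scale = thumb_height / image_height
    else:
        # Width is the smaller setting: fit the width.
        scale = thumb_width / image_width
    width = max(1, round(image_width * scale))
    height = max(1, round(image_height * scale))
    return width, height


# A 1600 x 1200 source image with the default 80 x 80 settings.
print(thumbnail_size(1600, 1200))
# The same image with ThumbnailWidth=60 and ThumbnailHeight=80.
print(thumbnail_size(1600, 1200, thumb_width=60, thumb_height=80))
```

Under this reading, the dimension that is not used for scaling can come out larger than its configured value, because the aspect ratio is preserved rather than cropped.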

23.5 Viewing Status Details

This section discusses how to view the status of conversion jobs.

23.5.1 Refinery Conversion Status

To view refinery conversion statuses, choose Administration then Refinery Administration then Conversion Options from the Main menu. You can also click the Conversion Job Status tab on the IBR Provider Status page.

Important:

The InboundRefinerySupport component must be installed and enabled and at least one Inbound Refinery provider enabled for this page to be available.

The following information appears on the Refinery Conversion status page.

Element Description

Refresh

Updates the status of the displayed jobs.

Force Job Queue Check

Forces Content Server to deliver jobs to refinery providers. This is particularly useful if a refinery has gone down, causing any pending jobs to fail. In this situation, pending jobs are periodically resubmitted to providers for conversion. This button forces the submission.

Conversion Job ID

A unique identifier assigned by Inbound Refinery to each submitted job.

Content ID

The unique Content Server identifier of the content item submitted for conversion.

Conversion Job State

Identifies where a job is in the conversion process.

Job Submitted to Provider

Identifies the provider to which a job is submitted.

Last Action At

Lists the date and time of the last change in job state.

Actions

Links to the Content Server content information page of the content item submitted for conversion.


23.5.2 IBR Provider Status

To view IBR Provider status, select Administration then Refinery Administration then IBR Provider Status from the Main menu. You can also click the IBR Provider Status tab on the Refinery Conversion Job Status page.

Important:

The InboundRefinerySupport component must be installed and enabled on the Content Server and at least one Inbound Refinery provider enabled for this page to be available.

The following information appears on the IBR Provider status page.

Element Description

Force Status Update

Refreshes the status of the displayed providers.

Provider

The name of each provider.

Available

Identifies whether a provider is accepting content for conversion.

Read Only

Identifies if a provider is read only, meaning that it can no longer accept jobs for conversion. It can only return conversions to Content Server.

Jobs Queued

Identifies the number of jobs each provider has waiting for conversion.

Last Message

Displays the last status message delivered by the provider.

Connection State

Identifies whether the provider is connected to the Content Server or not.

Last Activity Date

Lists the date and time of the last provider activity.

Actions

Displays the Provider Information page, listing information regarding the specific provider.


23.6 Configuring Refinery Conversion Settings

Before configuring refinery conversion settings, you should complete the following tasks:

  • Start your refinery.

  • Verify that your refinery has been set up as a provider to one or multiple Content Servers. For details, see Section 23.3.1.

  • Verify that the InboundRefinerySupport component is installed and enabled on each Content Server.

  • Verify that each Content Server has been configured to send files to the refinery for conversion. For details, see Section 23.4.

Refinery conversion settings control which conversions the refinery will accept and how the refinery processes each conversion. Inbound Refinery includes Outside In Image Export, which can be used for the following.

  • To create thumbnails of files. Thumbnails are small preview images of content. For details, see Section 23.6.4.

  • To convert files to multi-page TIFF files, enabling users to view the files through standard web browsers with a TIFF viewer plugin. For details, see Section 23.6.3.

In addition, several conversion options are available for use with Inbound Refinery. When a conversion option is enabled, its conversion settings are added to the refinery.

23.6.1 Calculating Timeouts

As content is processed by a refinery, it is allotted a certain amount of processing time based on the size of the file and the settings on the Timeout Settings page. The timeout value, in minutes, is calculated as follows:

timeout value [in minutes] = ([file size in bytes] x timeout factor) / 60,000

To determine which file to use, Inbound Refinery first checks whether the previous step produced a file. If so, that file is used in the timeout calculations. Otherwise, the native file is used. If the previous step produced more than one file (for example, Excel to PostScript), the sum of the file sizes is used. The content item to be processed is allotted at least the number of minutes indicated in the Minimum column, but no more minutes than indicated in the Maximum column. If the calculated timeout value is lower than the minimum value, the minimum value applies. If the calculated timeout value is larger than the maximum value, the maximum value applies.

23.6.1.1 Timeout Calculations

The following examples show how timeouts are calculated:

  • Example 1

File size = 10 MB (10485760 bytes or 10240 KB)
Minimum = 2
Maximum = 10
Factor = 3
Calculated Timeout = 10485760 * 3 / 60000 = 524.288 minutes = 8.74 hours

In this case, Inbound Refinery will wait only the maximum of 10 minutes.

  • Example 2

File size = 200 KB (204800 bytes)
Minimum = 2
Maximum = 30
Factor = 2
Calculated Timeout = 204800 * 2 / 60000 = 6.83 minutes

In this case, Inbound Refinery will wait only the calculated 6.83 minutes and not the Maximum of 30 minutes.

  • Example 3

File size = 50 KB (51200 bytes)
Minimum = 2
Maximum = 30
Factor = 2
Calculated Timeout = 51200 * 2 / 60000 = 1.71 minutes

In this case, Inbound Refinery will wait the minimum of 2 minutes and not the calculated timeout or the Maximum of 30 minutes.
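
The three examples can be reproduced with a short calculation. The following Python sketch applies the formula from Section 23.6.1 and then clamps the result to the Minimum and Maximum settings. The function name and parameter names are illustrative assumptions, and the list argument models the case where a previous step produced several files whose sizes are summed.

```python
def conversion_timeout_minutes(file_sizes_bytes, factor, minimum, maximum):
    """Apply the timeout formula, then clamp to the Minimum and Maximum settings."""
    total_bytes = sum(file_sizes_bytes)  # several output files: sizes are summed
    calculated = total_bytes * factor / 60000
    return max(minimum, min(calculated, maximum))


# Example 1: the calculated value exceeds the maximum, so the maximum applies.
print(conversion_timeout_minutes([10485760], factor=3, minimum=2, maximum=10))
# Example 2: the calculated value falls between the minimum and the maximum.
print(round(conversion_timeout_minutes([204800], factor=2, minimum=2, maximum=30), 2))
# Example 3: the calculated value is below the minimum, so the minimum applies.
print(conversion_timeout_minutes([51200], factor=2, minimum=2, maximum=30))
```

Run as written, the three calls give 10, 6.83, and 2 minutes, matching Examples 1 through 3.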

23.6.2 Setting Accepted Conversions

To set the conversions that the refinery will accept and queue maximums:

  1. Log into the refinery.

  2. Choose Conversion Settings then Conversion Listing.

  3. On the Conversion Listing page, set the total number of conversion jobs that are allowed to be queued by the refinery. The default is 0 (unlimited).

  4. Enter the maximum number of conversions allowed to wait for pick up by a Content Server before Inbound Refinery will no longer accept conversion jobs from that Content Server. The default is 1000.

  5. Enter the number of seconds that the refinery should be considered busy when the maximum number of conversions has been reached. The default is 120 (seconds). When the maximum number of conversion jobs for the refinery has been reached, Content Servers will wait this amount of time before attempting to communicate with the refinery again.

  6. Enter the maximum number of conversions that the refinery should process at the same time. The default is 5.

  7. Select the check box for each conversion that you want the refinery to accept.

    • By default, all conversions are selected and accepted.

    • To select all conversions, select the Accept check box in the column heading.

    • To deselect all conversions, deselect the Accept check box in the column heading.

      Important:

      The Ichitaro and LegacyCustom conversions are not supported for this version of Inbound Refinery.

  8. Set the maximum number of jobs (across all refinery queues) for each conversion type. The default is 0 (unlimited).

  9. Click Update to save your changes.

  10. Restart each Content Server that is an agent to the refinery so that your changes take effect in the Content Server's queuing immediately. Otherwise, the changes to the refinery's accepted conversions will not be known to the Content Server until the next time it polls the refinery.

23.6.3 Setting Multi-Page TIFF Files as the Primary Web-Viewable Rendition

Inbound Refinery includes Outside In Image Export, used to convert files to multi-page TIFF files as the primary web-viewable rendition. This enables users to view the files through standard web browsers with a TIFF viewer plugin.

Other conversion options, such as PDF Export, are used to create other types of renditions as the primary web-viewable rendition. When conversion options that can generate a web-viewable rendition are enabled, additional settings for those options become available.

To set multi-page TIFF files as the primary web-viewable rendition that the refinery will generate:

  1. Log into the refinery.

  2. Choose Conversion Settings then Primary Web Rendition.

  3. On the Primary Web-Viewable Rendition page, select Convert to multi-page Tiff using Outside In to convert files to multi-page TIFF files as the primary web-viewable rendition.

  4. Click Update to save the changes.

23.6.4 Setting Up Thumbnails

Thumbnails are small preview images of content used on search results pages and typically link to the web-viewable file they represent. A thumbnail provides consumers with a visual sample of a file without actually opening the file itself. This enables them to check a file before committing to downloading the larger, original file.

Inbound Refinery includes Outside In Image Export, which can be used to create thumbnails of files. Please note the following important considerations:

  • You must configure the file formats and conversions in each Content Server to send files to the refinery for thumbnailing. For details, see Section 23.4. The refinery must be configured to accept the conversions. For details, see Section 23.6.2.

  • The Outside In Image Export thumbnail engine cannot successfully create thumbnails of PDF files with Type 3 Fonts. If a checked in PDF file contains Type 3 Fonts, the Outside In Image Export thumbnail engine will create a thumbnail with a blank page.

  • Thumbnail files are stored as JPEG, GIF, or PNG files in the Content Server /weblayout directory with the characters @t in their filenames. For example, the file Report2001@t~2.jpg is the thumbnail that belongs to Report2001~2.pdf (which is revision 2 of a file called Report2001.xxx).

  • Thumbnails cannot be processed for any files that are encrypted or are password-protected.

  • Thumbnails can be created for EML files. If you are using Internet Explorer and have installed the April, 2003, Cumulative Patch for Outlook Express, you will receive an error if you click on the thumbnail to view an EML file. This only applies if the primary web-viewable file is an EML file (a multi-page TIFF or a PDF version of the EML file was not generated by the refinery as the primary web-viewable file, and the native EML file was copied to the weblayout directory as the primary web-viewable file).

  • Thumbnails of EML files do not exactly match the look-and-feel of the EML file as opened in Outlook Express because the thumbnail is created based on a plain-text rendition, whereas Outlook Express opens the file in its own format.

  • For details about changing the size of thumbnails displayed in the Content Server, see Section 23.4.5.1.

  • If thumbnails are turned off in Inbound Refinery, any thumbnails already created are still displayed on the search results pages. To prevent this, remove THUMBNAIL from the AllowableAdditionalRenditions entry in the config.cfg file located in the IntradocDir\config\ directory.

Thumbnails are the only additional rendition available in Inbound Refinery by default. Other conversion options and custom conversions enable you to create additional renditions.
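The thumbnail naming convention noted in the considerations above (Report2001@t~2.jpg for Report2001~2.pdf) can be sketched in a few lines of Python. The function name is hypothetical, and the handling of a file name without a ~revision suffix is an assumption extrapolated from the single documented example.

```python
import os


def thumbnail_name(web_filename, thumb_ext="jpg"):
    """Build the thumbnail file name by inserting '@t' before any '~revision' suffix."""
    stem, _ext = os.path.splitext(web_filename)
    # Split "Report2001~2" into "Report2001", "~", "2"; without "~" the last two are empty.
    base, separator, revision = stem.partition("~")
    return f"{base}@t{separator}{revision}.{thumb_ext}"


print(thumbnail_name("Report2001~2.pdf"))
```

A helper like this can be handy when searching the weblayout directory for thumbnails that belong to a given web-viewable file.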

To enable thumbnails and configure thumbnail settings:

  1. Log into the refinery.

  2. Select Conversion Settings then Additional Renditions.

  3. On the Additional Renditions page, select Create Thumbnail Images using Outside In.

  4. Click Update to save your changes.

  5. Click Options.

  6. On the Thumbnail Options page, select the necessary thumbnail options. Click Update when done.

    Note:

    When using Inbound Refinery on a SPARC system running Solaris, or any system running Linux, by default Outside In Image Export uses its internal graphics code to render fonts and graphics. You can also choose to use the operating system's native graphics subsystem instead. For details, see Section 23.6.5.

The following list describes the available options.

Element Description

Create Thumbnail Image from the Native Vault File check box

Specifies whether the thumbnail image is created from the native file or the primary web-viewable file.

Page Number of Native Vault File to Use to Create Thumbnail Image field

Specifies which page of the native file is used to create the thumbnail image. The default setting is 1, meaning that the first page of the native file is used to create the thumbnail image.

Use quick sizing radio button

Specifies the fastest conversion of color graphics but the quality of the converted graphic is somewhat degraded.

Use smooth sizing radio button

Specifies a more accurate representation of the original graphic, but requires a more complex process which slows down the conversion speed slightly. This is the default setting.

Smooth sizing for grayscale graphics radio button

Use the smooth sizing option for grayscale graphics and the quick sizing option for any color graphics.

Produce jpg thumbnails radio button

Specifies that all thumbnails be created as JPG files. This is the default thumbnail file type setting.

Produce gif thumbnails radio button

Specifies that all thumbnails be created as GIF files.

Produce png thumbnails radio button

Specifies that all thumbnails be created as PNG files.

Update button

Saves changes to settings.

Reset button

Reverts to the last saved settings.


23.6.5 Configuring Rendering Options on UNIX

When running Inbound Refinery on Linux or Solaris SPARC and creating multi-page TIFF files or thumbnails, by default Outside In uses its internal graphics code to render fonts and graphics. Therefore, access to a running X Window System display server (X Server) and the presence of either Motif (Solaris) or LessTif (Linux) are not required. The system only needs to be able to locate usable fonts. Fonts are not provided with Outside In. For information about setting the path to usable fonts, see Section 23.6.6.

To configure Inbound Refinery so that Image Export uses the operating system's native graphics subsystem to render fonts and graphics instead of its internal graphics code:

  1. Log into the Inbound Refinery computer as the Inbound Refinery user.

  2. The Inbound Refinery computer must have access to a running X Window System display server (X Server) and the presence of either Motif (Solaris) or LessTif (Linux).

  3. Ensure that the DISPLAY variable in the Inbound Refinery startup script (.profile, .login, .bashrc, and so on) points to the running X server. For example:

    DISPLAY=:0.0
    export DISPLAY
    
  4. Source the new .profile. For example, using /usr/bin/sh, run the command:

    . .profile
    
  5. Give Outside In Image Export permission to use the running X Server with the following command:

    xhost +localhost
    
  6. Lock the console, leaving the Inbound Refinery user logged in.

  7. Log into the refinery.

  8. Select Conversion Settings then Third-Party Application Settings.

  9. On the Third-Party Application Settings page, click Options under the General OutsideIn Filter Options section.

  10. Select Use native operating system's native graphics subsystem.

  11. Click Update.

23.6.6 Specifying the Font Path

For Inbound Refinery to work properly, you must specify the path to fonts used to generate font images. By default, the font path is set to the font directory in the JVM used by Inbound Refinery: java.home/lib/fonts. However, the fonts included in the default directory are limited and may cause poor renditions. Also, in some cases if a non-standard JVM is used, then the JVM font path may be different than that specified as the default. If this is the case, an error message is displayed from both Inbound Refinery and Content Server. If this occurs, ensure the font path is set to the directory containing the fonts necessary to properly render your conversions.

To configure Inbound Refinery to locate usable fonts:

  1. Log into the Inbound Refinery computer as the Inbound Refinery user.

  2. Under Conversion Settings, click Third-Party Application Settings.

  3. On the Third-Party Application Settings page, click Options under the General OutsideIn Filter Options section.

  4. Enter the path to the font directories to be used by Outside In in the text field. For example, on Linux:

    /usr/lib/X11/fonts/TTF
    

    or on Windows:

    C:\WINDOWS\Fonts
    

    If fonts are called for and cannot be found, Outside In will exit with an error. Only TrueType fonts (*.ttf or *.ttc files) are supported.

  5. Click Update.
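
Because only TrueType fonts are supported, it can help to confirm that a candidate directory actually contains .ttf or .ttc files before entering it as the font path. The following Python sketch lists such files in a single directory. The function name is hypothetical, the check does not descend into subdirectories, and it says nothing about whether the fonts are the ones your conversions need.

```python
from pathlib import Path


def truetype_fonts(font_dir):
    """Return the sorted .ttf and .ttc file names found directly in font_dir."""
    directory = Path(font_dir)
    if not directory.is_dir():
        return []  # missing or unreadable path: nothing usable
    return sorted(
        entry.name
        for entry in directory.iterdir()
        if entry.is_file() and entry.suffix.lower() in {".ttf", ".ttc"}
    )


# Example path taken from the Linux example above; adjust for your system.
fonts = truetype_fonts("/usr/lib/X11/fonts/TTF")
print(len(fonts), "TrueType font files found")
```

An empty result suggests choosing a different directory before saving the setting.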

23.6.7 Configuring Timeout Settings for Graphics Conversions

To configure timeout settings for graphics conversions:

  1. Log into the refinery.

  2. Select Conversion Settings then Timeout Settings.

  3. On the Timeout Settings page, enter the Minimum (in minutes), Maximum (in minutes), and Factor for Graphics conversions. For details, see Section 23.6.1.

  4. Click Update to save the changes.