21 Configuring Inbound Refinery

Oracle WebCenter Content: Inbound Refinery offers a variety of content conversion options depending on what components are installed and enabled on Oracle WebCenter Content Server and Inbound Refinery. This chapter provides an overview of the different conversion options and instructions on configuring Inbound Refinery.

21.1 Prerequisites for Configuring Inbound Refinery

At minimum, the following components must be installed and enabled for basic conversion functionality.

Component Name          Component Description                                     Enabled on Server
InboundRefinery         Enables Inbound Refinery                                  Inbound Refinery Server
InboundRefinerySupport  Enables the Content Server to work with Inbound Refinery  Content Server

21.2 Content Server and Refinery Configuration Scenarios

Oracle WebCenter Content: Inbound Refinery can be used to refine content managed by Content Server. The Inbound Refinery application can be installed on the same computer as Content Server or on one or more separate computers. You must add the refinery as a provider to Content Servers on the same or separate computers after installation. For details, see Configuring Refinery Providers.

Note:

Inbound Refinery does not support running in a cluster environment. Inbound Refinery can do conversion work for a Content Server cluster, but cannot run in a cluster environment itself. To ensure that Inbound Refinery functions properly, Inbound Refinery creates and maintains a long-term lock on the /queue/conversion directory. If Inbound Refinery is mistakenly configured as part of a cluster and a second Inbound Refinery attempts to start and lock the same directory, the second Inbound Refinery will fail to start, and the attempt is logged.

Various configurations are possible, so keep the following general rules in mind when setting up a refinery environment:

  • If processing a large number of content items per day, do not run Inbound Refinery on the same computer as Content Server.

  • The more dedicated refinery systems that are installed, the faster the content is processed. Having more refinery systems than Content Server instances provides optimal speed. Having fewer refinery systems than Content Server instances can slow down performance when converting large numbers of files.

  • Typically, there is no reason to have multiple refineries on the same computer. One refinery can serve as a provider to multiple Content Servers, and refineries on the same computer share the system's resources, including third‐party applications used during conversion. To improve performance, use separate computers for each refinery.

  • Some file types and large files are processed slower than average. If a large number of these types must be processed in addition to other file types, consider setting up a refinery on a separate system to process just these file types. This requires more than one refinery system, but it does provide optimum refining speed and performance.

The following scenarios are common. Other refinery configurations are possible in addition to the ones described in this section. Specific content management applications might require their own particular refinery setup, which does not necessarily match any scenario mentioned in this section.

  • Scenario A: One Content Server and one refinery on the same computer.

  • Scenario B: Multiple Content Servers and one refinery on the same computer.

  • Scenario C: Multiple Content Servers and one refinery on separate computers.

  • Scenario D: One refinery per Content Server on separate computers.

  • Scenario E: Multiple refineries per Content Server on separate computers.

Each scenario is described in more detail in its own section, including its benefits and the considerations to take into account. In the scenario images, the following symbols represent a computer, the Content Server, and the Inbound Refinery:

  • Large Circle: computer

  • Small Circle: Inbound Refinery

  • Small Square: Content Server

21.2.1 Scenario A

Scenario A diagram; described in surrounding text

This is the most basic scenario possible. It comprises one Content Server and one refinery on the same computer.

  • Benefits:

    • Least expensive and easiest to configure.

    • Only one copy of third-party applications required for refinery conversions must be purchased.

  • Considerations:

    • Number and speed of conversions is limited.

    • Not as powerful as scenarios where refineries are not deployed on the Content Server computer, because refinery processing on the Content Server computer can slow searches and access to the website, and vice versa. Each conversion can take between seconds and minutes, depending on the file type and size.

21.2.2 Scenario B

Scenario B diagram; described in surrounding text

This scenario comprises multiple Content Servers and one refinery on the same computer.

  • Benefits: Only one copy of third-party applications required for refinery conversions must be purchased.

  • Considerations:

    • Number and speed of conversions is limited.

    • Not as powerful as scenarios where refineries are not deployed on the Content Server computer, because refinery processing on the Content Server computer can slow searches and access to the website, and vice versa. Each conversion can take between seconds and minutes, depending on the file type and size.

    • In this configuration, typically the refinery is set as a provider to one of the Content Servers. After deployment, the refinery will need to be added as a provider to the other Content Servers. For details, see Configuring Refinery Providers.

21.2.3 Scenario C

Scenario C diagram; described in surrounding text

This scenario comprises multiple Content Servers and one refinery on separate computers.

  • Benefits:

    • Only one copy of third-party applications required for refinery conversions must be purchased.

    • Faster processing than when the refinery is deployed on the same computer as a Content Server.

    • Refinery processing does not affect Content Server searches and access to the website, and vice versa.

  • Considerations:

    • Not as powerful as scenarios where there is at least one refinery per Content Server.

    • In this configuration, typically the refinery will need to be added as a provider to each Content Server. For details, see Configuring Refinery Providers.

21.2.4 Scenario D

Scenario D diagram; described in surrounding text

This scenario comprises one refinery per Content Server on separate computers.

  • Benefits:

    • Faster processing for high volumes of content and big file sizes.

    • Refinery processing does not affect Content Server searches and access to the website, and vice versa.

  • Considerations:

    • Each refinery computer needs a copy of all third-party applications required for conversion.

    • Each refinery will need to be added as a provider to each Content Server. For details, see Configuring Refinery Providers.

21.2.5 Scenario E

Scenario E diagram; described in surrounding text

This scenario comprises multiple refineries per Content Server on separate computers.

  • Benefits:

    • Fastest processing for high volumes of content and big file sizes.

    • Refinery processing does not affect Content Server searches and access to the website, and vice versa.

  • Considerations:

    • Each refinery computer needs a copy of all third-party applications required for conversion.

    • In this configuration, typically each refinery will need to be added as a provider to each Content Server instance. For details, see Configuring Refinery Providers.

21.3 Configuring Content Server and Refinery Communication

21.3.1 Configuring Refinery Providers

A Content Server communicates with a refinery through a provider. A refinery can serve as a provider for one or multiple Content Server instances. For more information about common configurations, see Content Server and Refinery Configuration Scenarios.

The refinery can be added as a provider to a Content Server instance on the same computer or added to Content Server instances on separate computers after deployment.

21.3.1.1 Adding or Editing Refinery Providers

To add a refinery as a provider to a Content Server instance:

  1. Log into the Content Server as an administrator.
  2. From the main menu, choose Administration then Providers.
  3. In the Create a New Provider section of the Providers page, click Add in the Action column for the outgoing provider type.
  4. On the Add/Edit Outgoing Socket Provider page, complete the following fields:
    • Provider Name (required): A name for the refinery provider.

    • Provider Description (required): A user‐friendly description for the provider.

    • Provider Class (required): The name of the Java class for the provider. The default is the intradoc.provider.SocketOutgoingProvider class.

    • Connection Class: Not required.

    • Configuration Class: Not required.

    • Server Host Name (required): The host name of the server on which the refinery is installed.

    • HTTP Server Address: The HTTP server address for the refinery. Not required when the refinery is on the same computer as the Content Server.

    • Server Port (required): The port on which the refinery provider will communicate. This entry must match the server socket port configured on the post-installation configuration page during deployment of Inbound Refinery. For information on post-installation configuration, see Installing and Configuring Oracle WebCenter Content. The default refinery port is 5555.

    • Instance Name (required): The instance name of the refinery. For example, ref2.

    • Relative Web Root (required): The relative web root of the refinery, which is /ibr/.

  5. Select the Use Connection Password check box if the refinery you are connecting to imposes authentication for the Content Server (the Content Server will share the refinery's user base). If enabled, you must specify a user name and password to be used and have the ProxyConnections component installed and configured on the refinery.
  6. Select the Handles Inbound Refinery Conversion Jobs check box. This is required.
  7. Deselect the Inbound Refinery Read Only Mode check box. Select this check box only when you do not want the Content Server to send new conversion jobs to the refinery.
  8. If necessary, change the maximum number of jobs allowed in the Content Server's pre‐converted queue. The default is 1000 jobs.
  9. Click Add.

    The Providers page opens with the new refinery provider added to the Providers table.

  10. Restart the Content Server.

To edit information for an existing refinery provider, access the Providers page and click Info in the Action column for the provider to edit. Make the required changes on the Add/Edit Outgoing Socket Provider page. When done, restart the Content Server instance.
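The required fields on the Add/Edit Outgoing Socket Provider page can be checked before the form is submitted. The following is a minimal sketch only: the dictionary keys, the sample host name, and the `missing_required_fields` helper are hypothetical names chosen for illustration and are not Content Server identifiers.

```python
# Hypothetical keys that mirror the required fields listed in the procedure
# above. They are illustrative only, not Content Server identifiers.
REQUIRED_FIELDS = (
    "provider_name",
    "provider_description",
    "provider_class",
    "server_host_name",
    "server_port",
    "instance_name",
    "relative_web_root",
)


def missing_required_fields(settings):
    """Return the required fields that are absent or blank, in page order."""
    missing = []
    for field_name in REQUIRED_FIELDS:
        value = settings.get(field_name)
        if value is None or not str(value).strip():
            missing.append(field_name)
    return missing


# Sample values: the class, port, instance name, and web root come from the
# text above; the provider name, description, and host name are made up.
example = {
    "provider_name": "ref2provider",
    "provider_description": "Refinery on a separate computer",
    "provider_class": "intradoc.provider.SocketOutgoingProvider",
    "server_host_name": "refinery.example.com",
    "server_port": 5555,
    "instance_name": "ref2",
    "relative_web_root": "/ibr/",
}

print(missing_required_fields(example))
```

An empty list means every required field has a value. The check does not confirm that the values are correct, for example that the port matches the refinery's server socket port.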

21.3.1.2 Disabling/Enabling Refinery Providers

To disable or enable an existing refinery provider:

  1. Log into the Content Server as an administrator.
  2. From the main menu, choose Administration then Providers.
  3. In the Providers table on the Providers page, click Info in the Action column for the refinery provider to disable or enable.
  4. On the Provider Information page, click Disable or Enable.
  5. Restart the Content Server.

21.3.1.3 Deleting Refinery Providers

To delete an existing refinery provider:

  1. Log into the Content Server as an administrator.
  2. From the main menu, choose Administration then Providers.
  3. In the Providers table on the Providers page, click Info in the Action column for the refinery provider to delete.
  4. On the Provider Information page, click Delete.

    A confirmation message appears.

  5. Click OK.

21.3.2 Editing the Refinery IP Security Filter

An IP security filter is used to restrict access to a refinery. Only hosts with IP or IPv6 addresses matching the specified criteria are granted access. By default, the IP security filter is 127.0.0.1|0:0:0:0:0:0:0:1, which means the Inbound Refinery accepts communication only from localhost. To ensure that a Content Server can communicate with all of its refineries, add the IP or IPv6 address of each Content Server computer to the refinery's IP security filter. This is true even if the refinery is running on the same computer as the Content Server instance.

To edit an IP security filter for a refinery:

  1. Access the refinery computer.
  2. Start the System Properties application:
    • Windows: Choose Start, then Programs, then Oracle Content Server/Inbound Refinery, then the instance_name, then Utilities, then System Properties.

    • UNIX: Run the SystemProperties script, which is located in the /bin subdirectory of the refinery installation directory.

  3. Select the Server tab.
  4. The IP Address Filter field must include the IP or IPv6 address of each Content Server computer (even if this is the same physical computer that is also running the refinery server). The default value of this field is 127.0.0.1|0:0:0:0:0:0:0:1 (localhost), but you can add any number of valid IP or IPv6 addresses. You can specify multiple IP addresses separated by the pipe symbol (|), and you can use wildcards (* for zero or many characters, and ? for single characters). For example:
    127.0.0.1|0:0:0:0:0:0:0:1|10.10.1.10|62.43.163.*|62.43.161.12?
    

    Note:

    Always include the localhost IP address (127.0.0.1).

  5. Click OK when you are done, and restart the refinery server.

    Tip:

    Alternatively, you can add IP addresses to the IP security filter directly in the config.cfg file located in the IntradocDir/config directory. Add the IP or IPv6 address to the SocketHostAddressSecurityFilter variable. For example: SocketHostAddressSecurityFilter=127.0.0.1|0:0:0:0:0:0:0:1|10.10.1.10|62.43.163.*

    Make sure that you specify the localhost IP or IPv6 address in the SocketHostAddressSecurityFilter variable in the config.cfg file.
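To see which addresses a filter value would admit, the pipe-separated patterns and the * and ? wildcards can be approximated with Python's standard `fnmatch` module. This is a sketch under that assumption. How the refinery itself evaluates the filter is not described here, and `address_allowed` is a hypothetical helper name.

```python
from fnmatch import fnmatchcase


def address_allowed(address, security_filter):
    """Return True if address matches any pipe-separated pattern in the filter.

    Approximation: '*' matches zero or more characters and '?' matches exactly
    one character, using fnmatch semantics rather than the refinery's own code.
    """
    patterns = [p.strip() for p in security_filter.split("|") if p.strip()]
    return any(fnmatchcase(address, pattern) for pattern in patterns)


# The example filter value from step 4 above.
FILTER = "127.0.0.1|0:0:0:0:0:0:0:1|10.10.1.10|62.43.163.*|62.43.161.12?"

for candidate in ("127.0.0.1", "62.43.163.77", "62.43.161.125", "10.10.1.11"):
    print(candidate, address_allowed(candidate, FILTER))
```

Under these semantics the pattern 62.43.161.12? requires exactly one more character after 12, so 62.43.161.12 on its own does not match.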

21.3.3 Setting Library Path for UNIX Platforms

Content Server and Inbound Refinery use Outside In Technology. Outside In Technology is dynamically linked with the GCC libraries (libgcc_s and libstdc++) on all Linux platforms as well as both Solaris platforms and HPUX ia64. Content Server must be able to access these libraries; however, Solaris and HPUX do not initially make these libraries available. If you are running Content Server or Inbound Refinery on either Solaris or HPUX, you need to obtain and install the GCC libraries and configure Content Server to find them. For information about configuring the library paths, see Setting Library Paths in Environment Variables on UNIX Platforms in Installing and Configuring Oracle WebCenter Content.

21.4 Configuring Content Servers to Send Jobs to Refineries

File extensions, file formats, and conversions are used in Content Server to define how content items should be processed by Inbound Refinery and its conversion add‐ons. In addition, application developers can create custom conversions.

21.4.1 Understanding File Formats and Conversions

File formats are generally identified by their Multipurpose Internet Mail Extension (MIME) type, and each file format is linked to a specific conversion. Each file extension is mapped to a specific file format. Therefore, based on a checked-in file's extension, the Content Server can control if and how the file is processed by refineries. The conversion settings of the refineries specify which conversions the refineries accept and control the output of the conversions.

Consider the following example: the doc file extension is mapped to the file format application/msword, which is linked to the conversion Word. This means that the Content Server attempts to send all Microsoft Word files (with the doc file extension) checked into the Content Server to a refinery for conversion.

As another example, if the xls file extension is mapped to the file format application/vnd.ms-excel, which is linked to the conversion PassThru, Microsoft Excel files are not sent to a refinery. Instead, the Content Server can be configured to place either a copy of the native file or an HCST file that points to the native vault file in the /weblayout directory. This means that users must have an application capable of opening the native file installed on their computer to view the file.
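The two-step lookup in these examples (file extension to file format, then file format to conversion) can be sketched as a pair of dictionaries. The mappings below contain only the two examples from the text. The function names are hypothetical, and treating an unmapped extension as "not sent" is an assumption made for this sketch.

```python
# Only the two example mappings described above; a real system has many more.
EXTENSION_TO_FORMAT = {
    "doc": "application/msword",
    "xls": "application/vnd.ms-excel",
}
FORMAT_TO_CONVERSION = {
    "application/msword": "Word",
    "application/vnd.ms-excel": "PassThru",
}


def conversion_for(filename):
    """Resolve extension -> file format -> conversion; None if unmapped."""
    extension = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    file_format = EXTENSION_TO_FORMAT.get(extension)
    return FORMAT_TO_CONVERSION.get(file_format)


def sent_to_refinery(filename):
    """True when the file's conversion is something other than PassThru.

    Assumption: a file with no mapped conversion is not sent.
    """
    conversion = conversion_for(filename)
    return conversion is not None and conversion != "PassThru"


print(conversion_for("report.doc"), sent_to_refinery("report.doc"))
print(conversion_for("budget.xls"), sent_to_refinery("budget.xls"))
```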

Figure 21-1 Mapping File Formats to a Conversion

Illustration described in surrounding text

When a file is checked into the Content Server and its file format is mapped to a conversion, the Content Server checks to see if it has any refinery providers that accept that conversion and are available to take a conversion job. This means that:

  • Conversions specify how a file format should be processed, including the conversion steps that should be completed and the conversion engine that should be used. For details, see Managing File Types.

  • Conversions available in the Content Server should match those available in the refinery. When a file format is mapped to a conversion in the Content Server, files of that format are sent for conversion upon check-in. One or more refineries must be set up to accept that conversion. For details, see Setting Accepted Conversions.
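The provider check described above can be pictured as a filter over the configured refinery providers. This is a simplified sketch, not the Content Server's actual selection logic. The `RefineryProvider` class, its attribute names, and the provider names are hypothetical, and "available" is reduced here to "enabled and not in read-only mode".

```python
from dataclasses import dataclass, field


@dataclass
class RefineryProvider:
    """Hypothetical stand-in for a refinery provider entry."""
    name: str
    accepted_conversions: set = field(default_factory=set)
    enabled: bool = True
    read_only: bool = False  # read-only providers receive no new jobs


def eligible_providers(providers, conversion):
    """Return names of providers that accept the conversion and can take jobs."""
    return [
        p.name
        for p in providers
        if p.enabled and not p.read_only and conversion in p.accepted_conversions
    ]


providers = [
    RefineryProvider("ref1", {"Word", "Excel"}),
    RefineryProvider("ref2", {"Word"}, read_only=True),
    RefineryProvider("ref3", {"Distiller"}),
]

print(eligible_providers(providers, "Word"))
```

If the list is empty for a conversion, no configured refinery accepts it, which is the situation the matching rule above is meant to prevent.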

The following default conversions are available. Additional conversions might be available when conversion add‐ons are installed. For more information, see the documentation for each specific conversion add‐on.

Conversion Description

PassThru

Used to prevent files from being converted. When this conversion is linked to a file format, all file extensions mapped to that file format are not sent for conversion. The Content Server can be configured to place either a copy of the native file or an HCST file that points to the native vault file in the /weblayout directory. For details, see Configuring the Content Server for PassThru Files.

Word

Used to send Microsoft Word, Microsoft Write, and rich text format (RTF) files for conversion. The files are converted according to the conversion settings for the refinery.

Excel

Used to send Microsoft Excel files for conversion. The files are converted according to the conversion settings for the refinery.

PowerPoint

Used to send Microsoft PowerPoint files for conversion. The files are converted according to the conversion settings for the refinery.

MSProject

Used to send Microsoft Project files for conversion. The files are converted according to the conversion settings for the refinery.

Distiller

Used to send PostScript files for conversion. The files are converted to PDF using the specified PostScript distiller engine.

MSPub

Used to send Microsoft Publisher files for conversion. The files are converted according to the conversion settings for the refinery.

FrameMaker

Used to send Adobe FrameMaker files for conversion. The files are converted according to the conversion settings for the refinery.

Visio

Used to send Microsoft Visio files for conversion. The files are converted according to the conversion settings for the refinery.

WordPerfect

Used to send Corel WordPerfect files for conversion. The files are converted according to the conversion settings for the refinery.

PhotoShop

Used to send Adobe Photoshop files for conversion. The files are converted according to the conversion settings for the refinery.

InDesign

Used to send Adobe InDesign, Adobe PageMaker, and QuarkXPress files for conversion. The files are converted according to the conversion settings for the refinery.

MSSnapshot

Used to send Microsoft Snapshot files for conversion. The files are converted according to the conversion settings for the refinery.

PDF Refinement

Used to send checked-in PDF files for refinement. Depending on the conversion settings for the refinery, this includes optimizing the PDF files for fast web viewing using the specified PostScript distiller engine.

Ichitaro

Used to send Ichitaro files for conversion. The files are converted according to the conversion settings for the refinery.

OpenOffice

Used to send OpenOffice and StarOffice files for conversion. The files are converted according to the conversion settings for the refinery.

ImageThumbnail

Used to send select graphics formats for creation of simple thumbnails only. This is useful if Inbound Refinery is not installed but thumbnail images of graphics formats are wanted. The returned web-viewable files are a copy of the native file and optionally a thumbnail image.

When Inbound Refinery is installed, it can be used instead of the ImageThumbnail conversion to send graphics formats for conversion, including the creation of image renditions and thumbnails.

NativeThumbnail

Used to send select file formats for creation of thumbnails from the native format rather than from an intermediate PDF conversion. Typically, this conversion is used to create thumbnails of text files (TXT), Microsoft Outlook email files (EML and MSG), and Office documents without first converting to PDF. The returned web-viewable files are a copy of the native file and optionally a thumbnail rendition, an XML rendition, or both. For an XML rendition to be created, XMLConverter must be installed and the XML step must be configured and enabled.

MultipageTiff

Used to send files for conversion directly to multi-page TIFF files using Outside In Image Export. When file formats are mapped to this conversion, the conversion settings for the refinery are ignored, and the files are sent directly to Image Export for conversion to a TIFF file.

OutsideIn Technology

Uses Outside In X to print supported formats to PostScript for conversion with WinNativeConverter on the refinery server.

Direct PDFExport

Used to send files for conversion directly to PDF using Outside In PDF Export.

FlexionXML

Used to send files for conversion using XML Converter.

SearchML

Used to send files for conversion using XML Converter.

XSLT Transformation

Used to send files for XSLT transformation using XML Converter. XSL transformation is used to output XML data into another format.

Digital Media Graphics

When Digital Asset Manager is installed, this is used to send digital images for conversion into multiple image renditions using Image Manager.

Digital Media Video

When Digital Asset Manager is installed, this is used to send digital videos for conversion into multiple video or audio renditions using Video Manager.

TIFFConversion

Used to send TIFF files for conversion to a PDF format that enables indexing of text in the document.

Word HTML

Used to send Microsoft Word files for conversion to HTML using the native Microsoft Word application.

PowerPoint HTML

Used to send Microsoft PowerPoint files for conversion to HTML using the native Microsoft PowerPoint application.

Excel HTML

Used to send Microsoft Excel files for conversion to HTML using the native Microsoft Excel application.

Visio HTML

Used to send Microsoft Visio files for conversion to HTML using the native Microsoft Visio application.

21.4.1.1 Passing Content Items Through the Refinery and Failed Conversions

When a file format is linked to the conversion PassThru, files with extensions mapped to that file format are not converted. When a content item with a file extension mapped to PassThru is checked into the Content Server, the file is not sent to a refinery, and web‐viewable files are not created. The Content Server can be configured to place either a copy of the native file or an HCST file that points to the native file in the weblayout directory. This means that the application that was used to create the file, or an application capable of opening the file, is required on each client for the user to be able to view the file. For details, see Configuring the Content Server for PassThru Files.

If a file is sent to the refinery and the refinery notifies the Content Server that the conversion has failed, the Content Server can be configured to place a copy of the native file in the weblayout directory. In this case users must also have an application capable of opening the native file installed on their computer to view the file. For details, see Configuring the Content Server Refinery Conversion Options.
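The outcomes described in this section can be summarized as a small decision function. This is a sketch of the documented behavior only. The function name, parameter names, and return labels are hypothetical. The two flags correspond to the IndexVaultFile and AllowPassthru settings covered later in this chapter. Returning "none" for a failed conversion when passthru is disabled is an assumption, because that case is not described here.

```python
def weblayout_result(conversion, conversion_succeeded=False,
                     index_vault_file=False, allow_passthru=False):
    """Return a label for what is placed in the weblayout directory."""
    if conversion == "PassThru":
        # Never sent to a refinery: a native copy, or an HCST pointer file.
        return "hcst_pointer" if index_vault_file else "native_copy"
    if conversion_succeeded:
        return "web_viewable"
    # Conversion failed. "none" is an assumption for the passthru-off case.
    return "native_copy" if allow_passthru else "none"


print(weblayout_result("PassThru"))
print(weblayout_result("PassThru", index_vault_file=True))
print(weblayout_result("Word", conversion_succeeded=True))
print(weblayout_result("Word", conversion_succeeded=False, allow_passthru=True))
```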

21.4.1.2 About MIME Types

It is recommended that you name new file formats by the MIME (Multipurpose Internet Mail Extensions) type corresponding to the file extension (for example, the format mapped to the doc file extension would be application/msword).

When a content item is checked in to Content Server, the content item's format is assigned according to the format mapped to the file extension of the native file. If the native file is not converted, Content Server includes this format when delivering the content item to clients. Using the MIME type for the format assists the client in determining what type of data the file is, what helper applications should be used, and so on.

If the native file is converted, Inbound Refinery assigns the appropriate format to the web‐viewable file (for example, if a refinery generates a PDF file, it would identify this file as application/pdf), and Content Server then includes this format when delivering the web-viewable file to clients (instead of the format specified for the native file).

Inbound Refinery includes an extensive list of file formats configured out of the box when installed. Check the listing in the Configuration Manager applet of the Content Server provider. New formats should only need to be added if working with rare or proprietary formats.

There are several good resources on the Internet for identifying the correct MIME type for a file format.
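A MIME type can also be looked up locally. As one possibility, Python's standard `mimetypes` module maps common extensions to MIME types. The sketch below uses it to suggest a format name for a file, and `suggested_format_name` is a hypothetical helper. Rare or proprietary extensions may not be in the table, in which case the lookup returns None.

```python
import mimetypes

# A MimeTypes instance created without extra files starts from the
# module's built-in extension table.
_mime_db = mimetypes.MimeTypes()


def suggested_format_name(filename):
    """Return the MIME type guessed from the file extension, or None."""
    mime_type, _encoding = _mime_db.guess_type(filename)
    return mime_type


for name in ("report.doc", "scan.pdf", "drawing.unknownext"):
    print(name, suggested_format_name(name))
```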

21.4.2 Managing File Types

You can manage file types and file format configuration details using the File Formats Wizard page or the Configuration Manager. The File Formats Wizard page can be used to configure conversions for most common file types; however, it does not replicate all of the Configuration Manager applet features.

Note:

The InboundRefinerySupport component must be installed and enabled on the Content Server, and at least one Inbound Refinery provider must be enabled, for the File Formats Wizard page to be available. Also, conversion option components might add file types to the File Formats Wizard page.

To use the File Formats Wizard page:

  1. Log in as an administrator.

  2. From the main menu, choose Administration then Refinery Administration then File Formats Wizard.

  3. On the File Formats Wizard page, select the check box for each file type to be sent to a refinery for conversion. To select or deselect all check boxes, select or deselect the check box in the heading row.

  4. Click Reset if you want to revert to the last saved settings.

  5. Click Update.

    The corresponding default file extensions, file formats, and conversions are mapped automatically for the selected file types.

To use the Configuration Manager:

  1. Log in as an administrator.
  2. From the main menu, choose Administration then Admin Applets.
  3. Choose Configuration Manager.
  4. Select Options then File Formats.

21.4.2.1 Adding or Editing File Formats

To add a file format and link it to a conversion:

  1. On the File Formats page, in the File Formats section, click Add.
  2. On the Add New/Edit File Formats page, in the Format field, enter the name of the file format. Any name can be used, but Oracle recommends that you use the MIME type associated with the corresponding file extension(s).
  3. From the Conversion list, choose the appropriate conversion.
  4. In the Description field, enter a description for the file format.
  5. Click OK to save the settings.

To edit a file format, select the file format and click Edit. On the Add New/Edit File Formats page, make the appropriate changes.

21.4.2.2 Adding or Editing File Extensions

To add a file extension and map it to a file format (and thus associate the file extension with a conversion):

  1. On the File Formats page, in the File Extensions section, click Add.
  2. On the Add/Edit File Extensions page, in the Extension field, enter the file extension.
  3. From the Map to Format list, choose the appropriate file format from the list of defined file formats. Selecting a file format directly assigns all files with the specified extension to the specific conversion that is linked to the file format.
  4. Click OK to save the settings.

To edit a file extension, select the file extension on the File Formats page and click Edit. Make the appropriate changes.

21.4.3 Configuring the Content Server for PassThru Files

When a file format is linked to the conversion PassThru, all file extensions mapped to that file format are not sent for conversion. By default, the Content Server places a copy of the native file in the weblayout directory. However, the Content Server can be configured to place an HCST file that points to the native vault file in the weblayout directory instead. This can be useful if you have large files that are not being converted, and you do not want to copy the large files to the weblayout directory.

Please note the following important considerations:

  • The contents of the HCST file are controlled by the contents of the redirectionfile_template.htm file.

  • The GET_FILE service is used to deliver the file, so no PDF highlighting or byte serving is available. This can be resolved by overriding the template and reconfiguring the web server.

  • A simple template is used; the browser's Back button might not be functional and layout differences might occur. This can be resolved by overriding the template and reconfiguring the web server.

  • There is no reduction in the number of files because there is still an HCST file in the weblayout directory. However, there can be disk space savings if the native vault file is large.

  • This setting has no effect on files that are sent to a refinery for conversion; that is, if a file is sent to a refinery for conversion, another Content Server setting controls whether web‐viewable files or a copy of the native file are placed in the weblayout directory, and an HCST file cannot be used. For more information, see Configuring the Content Server Refinery Conversion Options.

To configure the Content Server to place an HCST file in the weblayout directory instead of a copy of the native file:

  1. Using a text editor, open the Content Server config.cfg file located in the IntradocDir/config/ directory.

  2. Include the IndexVaultFile variable, and set the value to true:

    IndexVaultFile=true
    
  3. Save your changes to the config.cfg file.

  4. Restart the Content Server.
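The edit in step 2 can also be scripted. The following is a minimal sketch that assumes config.cfg consists of simple name=value lines. The `set_config_value` helper and the sample file content are hypothetical. It works on text only, so reading and writing the file, backing it up, and restarting the Content Server remain separate steps.

```python
def set_config_value(cfg_text, name, value):
    """Return cfg_text with name=value set, replacing an existing entry
    for that name or appending a new line if none exists."""
    lines = cfg_text.splitlines()
    replaced = False
    for index, line in enumerate(lines):
        if "=" in line and line.split("=", 1)[0].strip() == name:
            lines[index] = f"{name}={value}"
            replaced = True
    if not replaced:
        lines.append(f"{name}={value}")
    return "\n".join(lines) + "\n"


# Made-up sample content; ExampleSetting is not a real variable.
sample = "ExampleSetting=1\nIndexVaultFile=false\n"
print(set_config_value(sample, "IndexVaultFile", "true"))
```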

21.4.4 Configuring the Content Server Refinery Conversion Options

You can configure how a Content Server interacts with its refinery providers, including how the Content Server should handle pre- and post‐converted jobs.

Note:

The InboundRefinerySupport component must be installed and enabled on the Content Server and at least one Inbound Refinery provider enabled to make the Inbound Refinery Conversion Options page available.

To configure how the Content Server should handle pre- and post‐converted jobs:

  1. Log into the Content Server as an administrator.
  2. From the main menu, choose Administration then Refinery Administration then Conversion Options.
  3. On the Refinery Conversion Options page, enter the following information:
    • Enter the number of seconds between successive transfer attempts for preconverted jobs. The default is 10 (seconds).

    • Enter the native file compression threshold size in MB. The default threshold size is 1024 MB (1 GB). Unless the native file exceeds the threshold size, it is compressed before the Content Server transfers it to a refinery. This setting is used to avoid the overhead of compressing very large files (for example, large video files). If you do not want any native files to be compressed before transfer, set the native file compression threshold size to 0.

    • If you want the conversion to fail when the time for transferring a job expires, select the check box.

    • Determine how you want the Content Server to handle failed conversions. If a file is sent to a refinery and conversion fails, the Content Server can be configured to place a copy of the native file in the /weblayout directory ("Refinery Passthru"). To enable Passthru, select the check box. To disable Passthru, deselect the check box.

      Please note the following important considerations:

      • When a file is sent to the refinery for conversion, an HCST file cannot be used instead of a copy of the native file. For more information on configuring how the Content Server handles files that are not sent to the refinery, see Configuring the Content Server for PassThru Files.

      • This setting can also be overridden manually using the AllowPassthru variable in the config.cfg file located in the IntradocDir/config/ directory.

  4. Click Reset if you want to revert to the last saved settings or click Update to save the changes.
  5. Restart the Content Server.
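The native file compression threshold described in step 3 can be read as a simple size test. The following Python sketch models that reading. It is not Content Server code: the function name is hypothetical, and treating 1 MB as 1,048,576 bytes is an assumption.

```python
BYTES_PER_MB = 1024 * 1024  # assumption: "MB" on the page means 2**20 bytes


def is_compressed_before_transfer(file_size_bytes: int, threshold_mb: int = 1024) -> bool:
    """Model of the threshold rule; the default of 1024 MB comes from the text."""
    if threshold_mb == 0:
        # A threshold of 0 means no native files are compressed.
        return False
    # Files are compressed unless they exceed the threshold.
    return file_size_bytes <= threshold_mb * BYTES_PER_MB


if __name__ == "__main__":
    print(is_compressed_before_transfer(10 * BYTES_PER_MB))    # small document
    print(is_compressed_before_transfer(2048 * BYTES_PER_MB))  # large video file
```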

21.4.5 Configuring Image Files to Bypass Preview

For common image file formats, the native user interface typically displays a preview of the native document that most web browsers can display. In contrast, the Oracle WebCenter Content user interface typically uses Outside In technology to convert native documents into page images before displaying a preview of the document on the View Documents page.

For multilayer image files, such as animated .gif files, Outside In creates one image for every layer of the multilayer file. This results in a separate page for every layer in the original image. Because most web browsers can correctly display animated .gif files and other multilayer formats, you may want Content Server to bypass Outside In so that the animation displays correctly.

To bypass Outside In and display the native file instead, an administrator lists the formats in the SimplePreviewFormatList variable in the Content Server intradoc.cfg file. Do this only for formats that the browsers accessing the Content Server support, such as standard image formats. The browser then displays the native file and correctly interprets the multiple layers of animated file formats.

To specify which image formats use a simple preview:

  1. Use a text editor to open the intradoc.cfg file located in the Content Server DomainDir/ucm/cs/bin directory.
  2. Specify the image formats that use a simple preview in a comma-separated list. For example:
    SimplePreviewFormatList=image/gif,image/png
    
  3. Save your changes to the intradoc.cfg file.
  4. Restart the Content Server.
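As a rough illustration of how a comma-separated SimplePreviewFormatList value might be interpreted, the following Python sketch checks whether a MIME type appears in the list. The function name is hypothetical, and the whitespace trimming and case-insensitive comparison are assumptions that this chapter does not confirm.

```python
def uses_simple_preview(format_list: str, mime_type: str) -> bool:
    """Return True if mime_type is listed in a comma-separated format list."""
    # Assumption: entries are trimmed and compared without regard to case.
    formats = {item.strip().lower() for item in format_list.split(",") if item.strip()}
    return mime_type.strip().lower() in formats


if __name__ == "__main__":
    configured = "image/gif,image/png"  # value from the example above
    for mime in ("image/gif", "image/tiff"):
        print(mime, uses_simple_preview(configured, mime))
```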

21.4.6 Overriding Conversions at Check-In

Certain file extensions might be used in multiple ways in your environment. A good example is the ZIP file extension. For example, you might be checking in:

  • Multiple TIFF files compressed into a single ZIP file that you want a refinery with Tiff Converter to convert to a single PDF file with OCR.

  • Multiple file types compressed into a single ZIP file that you do not want sent to a refinery for conversion (the ZIP file should be passed through in its native format).

When using a file extension in multiple ways, the Content Server can be configured to enable the user to choose how a file is converted when they check the file into the Content Server. This is referred to as Allow override format on checkin. To enable this Content Server functionality:

  1. Log in as an administrator.
  2. From the main menu, choose Administration then Admin Server then General Configuration.
  3. Enable the Allow override format on checkin check box.
  4. Click Save.
  5. Using the Configuration Manager, map the file extension to the conversion that is used most commonly to make it the default conversion. For example, for the ZIP file extension, you might set up the following default conversion:
    • Map the ZIP file extension to the application/zip file format, and the application/zip file format to the TIFFConversion conversion. By default, ZIP files are then assumed to contain multiple TIFF files and are sent to a refinery with Tiff Converter for conversion to PDF with OCR.

  6. Using the Configuration Manager, set up the alternate file formats and conversions that you want to be available for selection by the user at check-in. Continuing the preceding example for the ZIP file extension, you might set up the following alternate conversions:
    • Map the application/zip-passthru file format to the PassThru conversion. This option could then be selected at check-in for a ZIP file containing a variety of files that should not be sent to a refinery for conversion. The ZIP file would then be passed through in its native format.

  7. Restart the Content Server.

    When a user checks in a file, the user can override the default conversion by selecting any of the conversions you have set up.

Enabling users to override conversions at check-in is often used in conjunction with multiple dedicated refineries and custom conversions. Continuing the preceding example for the ZIP file extension, you might have one refinery set up with Tiff Converter, which would be used to convert ZIP files containing multiple TIFF files to PDF with OCR, and a second refinery set up to convert ZIP files containing Microsoft Office files to PDF.
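The ZIP example in this section amounts to two lookups: file extension to file format, and file format to conversion, with an optional format chosen by the user at check-in. The following Python sketch models that flow using the names from the example. The function name is hypothetical, and falling back to PassThru for unmapped files is an assumption.

```python
# Default mapping from the example: ZIP files are assumed to hold TIFF files.
EXTENSION_TO_FORMAT = {"zip": "application/zip"}

# Format-to-conversion mappings, including the alternate passthru format.
FORMAT_TO_CONVERSION = {
    "application/zip": "TIFFConversion",
    "application/zip-passthru": "PassThru",
}


def resolve_conversion(filename, override_format=None):
    """Pick a conversion; override_format models the user's check-in choice."""
    extension = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    file_format = override_format or EXTENSION_TO_FORMAT.get(extension)
    # Assumption: files with no mapping are passed through unconverted.
    return FORMAT_TO_CONVERSION.get(file_format, "PassThru")


if __name__ == "__main__":
    print(resolve_conversion("scans.zip"))
    print(resolve_conversion("mixed.zip", "application/zip-passthru"))
```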

21.4.6.1 Changing the Size of Thumbnails

By default, thumbnails are displayed as 100 x 100 pixels. To display thumbnails at a different size:

  1. Open the config.cfg file located in the IntradocDir/config/ directory in a text editor.
  2. Change the following variables as needed to change the thumbnail height and width:
    • ThumbnailHeight=xxx (where xxx is the value in pixels)

    • ThumbnailWidth=xxx (where xxx is the value in pixels)

    Scaling is done based on whichever setting is smaller (the height setting is used if the settings are equal), preserving the aspect ratio.

  3. Save the changes.
  4. Restart the Content Server.

Note:

This updates the size of all of your thumbnails.

For more information about the ThumbnailHeight and ThumbnailWidth variables, see ThumbnailHeight and ThumbnailWidth in Configuration Reference for Oracle WebCenter Content.
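The scaling rule in step 2 can be illustrated with a short calculation. The following Python sketch is one reading of that rule, not the Content Server implementation: the function name is hypothetical, and rounding to whole pixels is an assumption.

```python
def thumbnail_size(src_width, src_height, thumb_width=100, thumb_height=100):
    """Scale on the smaller setting (height wins ties), keeping the aspect ratio."""
    if thumb_height <= thumb_width:
        # The height setting is smaller or equal, so it drives the scale.
        scale = thumb_height / src_height
    else:
        # The width setting is smaller, so it drives the scale.
        scale = thumb_width / src_width
    # Assumption: results are rounded to whole pixels, with a 1-pixel minimum.
    return max(1, round(src_width * scale)), max(1, round(src_height * scale))


if __name__ == "__main__":
    print(thumbnail_size(800, 600))                                  # default settings
    print(thumbnail_size(800, 600, thumb_width=80, thumb_height=100))
```

With the default settings, the height setting is used because the two values are equal, so an 800 x 600 source image is scaled to a height of 100 pixels.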

21.4.7 Modifying Default Content Conversion Settings

After content items are checked in and before they are sent to the refinery, Dynamic Converter executes the pre_submit_to_conversion include resource. This include resource acts as a placeholder for any custom component that you create. The custom component overrides the value of the dConversion variable for the content item, which specifies the action taken by the refinery.

To modify the default conversion criteria, you can create a custom component to modify the pre_submit_to_conversion resource. For example, you can selectively include or exclude full-text indexing for content items based on their MIME types or based on the value of any metadata field, including custom metadata fields.

The custom component you create loads after the default include and effectively replaces the content with the content you provide. For information about creating custom components, see Developing with Oracle WebCenter Content.

21.4.7.1 Conversion Resource

The default script for the pre_submit_to_conversion include contains sample code enclosed as a comment. The sample code does not execute, but is provided as one example of how you could modify the dConversion variable:

[[%
 The pre_submit_to_conversion include can be used to reset a 
 document conversion type based on specified metadata field values.
 Create a custom component to override the sample content in this 
 include.   The sample include below is enclosed as a comment and does 
 not execute.
%]]
<@dynamichtml pre_submit_to_conversion@> 
[[% 
<$if strEquals(dDocTitle, "skip Conversion")$>
  <$dConversion="PassThru"$>
<$elseif strEquals(dDocType, "Image")$>
  <$dConversion="MultipageTiff"$>
<$endif$>
%]]
<@end@>

In the sample code, if the title of the content item (the value of the dDocTitle metadata field) is "skip Conversion", then the content item is not converted (dConversion is given the value "PassThru"). Otherwise, if the document type (the value of the dDocType metadata field) is "Image", then the content item is converted to a multi-page TIFF image file.

21.4.7.2 Settings for the dConversion Variable

The value of the dConversion variable determines how a checked-in content item is converted. The format for the dConversion variable is:

dConversion="<conversion_type>"

21.4.7.3 Conversion Resource Include Example

Scenario:

When a user checks in a Word document, they can choose either to have the document converted as a Word document or to pass the document through unconverted.

Solution:

To allow users to determine whether a Word document is converted, you can create a custom metadata field (for example, xPerformConversion) with a list that has two values, Yes and No, with No as the default. When a user checks in a Word document that is to be converted, they set the value of the xPerformConversion field to Yes during check-in.

To implement this solution, create a custom component whose content is as follows:

<@dynamichtml pre_submit_to_conversion@>
<$if strEquals(xPerformConversion, "No")$>
   <$dConversion="PASSTHRU"$>
<$elseif strEquals(xPerformConversion, "Yes")$>
   <$dConversion="Word"$>
<$endif$>
<@end@>
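To make the branching in this include easier to follow, the following Python sketch models the same decision outside of Idoc Script. It is a model only and does not run in Content Server. The function name and the default argument are hypothetical, and the sketch assumes that strEquals performs an exact string comparison.

```python
def choose_conversion(metadata: dict, default: str) -> str:
    """Model of the include: return the dConversion value for a content item."""
    choice = metadata.get("xPerformConversion")
    if choice == "No":
        return "PASSTHRU"
    if choice == "Yes":
        return "Word"
    # Neither branch matched, so the existing conversion is left unchanged.
    return default


if __name__ == "__main__":
    print(choose_conversion({"xPerformConversion": "No"}, default="Word"))
    print(choose_conversion({"xPerformConversion": "Yes"}, default="PASSTHRU"))
```

Note that a value other than Yes or No leaves the conversion as it was before the include ran.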

21.5 Viewing Status Details

This section discusses how to view the status of conversion jobs.

21.5.1 Viewing Refinery Conversion Status

To view refinery conversion status, use the main menu to choose Administration then Refinery Administration then Conversion Job Status. You can also click the Conversion Job Status tab on the IBR Provider Status page.

Note:

The InboundRefinerySupport component must be installed and enabled and at least one Inbound Refinery provider enabled for this page to be available.

The following conversion status information is available.

Element Description

Refresh

Updates the status of the displayed jobs.

Force Job Queue Check

Forces Content Server to deliver jobs to refinery providers. This is particularly useful if a refinery has gone down, causing any pending jobs to fail. In this situation, pending jobs are periodically resubmitted to providers for conversion. This button forces the submission.

Conversion Job ID

A unique identifier assigned by Inbound Refinery to each submitted job.

Content ID

The unique Content Server identifier of the content item submitted for conversion.

Conversion Job State

Identifies where a job is in the conversion process.

Job Submitted to Provider

Identifies the provider to which a job is submitted.

Last Action At

Lists the date and time of the last change in job state.

Actions

Links to the Content Server content information page of the content item submitted for conversion.

21.5.2 Viewing IBR Provider Status

To view IBR Provider status, use the main menu to choose Administration then Refinery Administration then IBR Provider Status. You can also click the IBR Provider Status tab on the Refinery Conversion Job Status page.

Note:

The InboundRefinerySupport component must be installed and enabled on the Content Server and at least one Inbound Refinery provider enabled for this page to be available.

The following conversion information appears on the IBR Provider Status page.

Element Description

Force Status Update

Refreshes the status of the displayed providers.

Provider

The name of each provider.

Available

Identifies whether a provider is accepting content for conversion.

Read Only

Identifies if a provider is read only, meaning that it can no longer accept jobs for conversion. It can only return conversions to Content Server.

Jobs Queued

Identifies the number of jobs each provider has waiting for conversion.

Last Message

Displays the last status message delivered by the provider.

Connection State

Identifies whether the provider is connected to the Content Server or not.

Last Activity Date

Lists the date and time of the last provider activity.

Actions

Displays the Provider Information page, listing information regarding the specific provider.

21.6 Configuring Refinery Conversion Settings

Before configuring refinery conversion settings, you should complete the following tasks:

  • Start your refinery.

  • Verify that your refinery has been set up as a provider to one or multiple Content Servers. For details, see Configuring Refinery Providers.

  • Verify that the InboundRefinerySupport component is installed and enabled on each Content Server.

  • Verify that each Content Server has been configured to send files to the refinery for conversion. For details, see Configuring Content Servers to Send Jobs to Refineries.

Refinery conversion settings control which conversions the refinery will accept and how the refinery processes each conversion. Inbound Refinery includes Outside In Image Export, which can be used for the following:

  • Create thumbnails of files. Thumbnails are small preview images of content. For details, see Setting Up Thumbnails.

  • Convert files to multi-page TIFF files, enabling users to view the files through standard web browsers with a TIFF viewer plugin. For details, see Setting Multi-Page TIFF Files as the Primary Web-Viewable Rendition.

In addition, several conversion options are available for use with Inbound Refinery. When a conversion option is enabled, its conversion settings are added to the refinery.

21.6.1 Calculating Timeouts

As content is processed by a refinery, it is allotted a certain amount of processing time based on the size of the file and the settings on the Timeout Settings page. The timeout value, in minutes, is calculated as follows:

timeout value [in minutes] = ([file size in bytes] x timeout factor) / 60,000

In order to determine what file to use, Inbound Refinery first checks if the previous step produced a file. If so, that file is used in the timeout calculations. Otherwise, the native file is used. If the previous step output more than one file (for example, Excel to PostScript), the sum of the file sizes is used. The content item to be processed is allotted at least the number of minutes indicated in the Minimum column, but no more minutes than indicated in the Maximum column. If the calculated timeout value is lower than the minimum value, the minimum value applies. If the calculated timeout value is larger than the maximum value, the maximum value applies.

21.6.1.1 Timeout Calculations

The following examples show how timeouts are calculated:

  • Example 1

    • File size = 10 MB (10485760 bytes or 10240 KB)
    • Minimum = 2
    • Maximum = 10
    • Factor = 3
    • Calculated Timeout = 10485760 * 3 / 60000 = 524.288 minutes = 8.74 hours

    In this case, Inbound Refinery will wait only the maximum of 10 minutes.

  • Example 2

    • File size = 200 KB (204800 bytes)
    • Minimum = 2
    • Maximum = 30
    • Factor = 2
    • Calculated Timeout = 204800 * 2 / 60000 = 6.83 minutes

    In this case, Inbound Refinery will wait only the calculated 6.83 minutes and not the maximum of 30 minutes.

  • Example 3

    • File size = 50 KB (51200 bytes)
    • Minimum = 2
    • Maximum = 30
    • Factor = 2
    • Calculated Timeout = 51200 * 2 / 60000 = 1.71 minutes

    In this case, Inbound Refinery will wait the minimum of 2 minutes and not the calculated timeout or the maximum of 30 minutes.
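The timeout formula and the three examples above can be reproduced with a few lines of Python. This is a sketch for checking your own settings, not refinery code, and the function name is hypothetical.

```python
def timeout_minutes(file_size_bytes: int, factor: float,
                    minimum: float, maximum: float) -> float:
    """(file size in bytes x factor) / 60,000, limited to [minimum, maximum]."""
    calculated = file_size_bytes * factor / 60_000
    # Apply the Maximum first, then make sure the Minimum is still honored.
    return max(minimum, min(calculated, maximum))


if __name__ == "__main__":
    # Example 1: the calculated value exceeds the Maximum.
    print(timeout_minutes(10_485_760, factor=3, minimum=2, maximum=10))
    # Example 2: the calculated value falls between Minimum and Maximum.
    print(round(timeout_minutes(204_800, factor=2, minimum=2, maximum=30), 2))
    # Example 3: the calculated value is below the Minimum.
    print(timeout_minutes(51_200, factor=2, minimum=2, maximum=30))
```

For multi-file output from a previous conversion step, pass the sum of the file sizes as file_size_bytes, as described in Calculating Timeouts.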

21.6.2 Setting Accepted Conversions

To set the conversions that the refinery accepts and the queue maximums:

  1. Log into the refinery.
  2. Choose Settings then Conversions.
  3. On the Conversion Listing page, set the total number of conversion jobs that are allowed to be queued by the refinery. The default is 0 (unlimited).
  4. Enter the maximum number of conversions allowed to wait for pick up by a Content Server before Inbound Refinery will no longer accept conversion jobs from that Content Server. The default is 1000.
  5. Enter the number of seconds that the refinery should be considered busy when the maximum number of conversions has been reached. The default is 120 (seconds). When the maximum number of conversion jobs for the refinery has been reached, Content Servers will wait this amount of time before attempting to communicate with the refinery again.
  6. Enter the maximum number of conversions that the refinery should process at the same time. The default is 5.
  7. Select the check box for each conversion that you want the refinery to accept.
    • By default, all conversions are selected and accepted.

    • To select all conversions, select the Accept check box in the column heading.

    • To deselect all conversions, deselect the Accept check box in the column heading.

  8. Set the maximum number of jobs (across all refinery queues) for each conversion type. The default is 0 (unlimited).
  9. Click Update to save your changes.
  10. Restart each Content Server that is an agent to the refinery so that your changes take effect in the Content Server's queuing immediately. Otherwise, the changes to the refinery's accepted conversions will not be known to the Content Server until the next time it polls the refinery.
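Several of the settings above use 0 to mean unlimited. The following Python sketch shows one way to read those limits together. It is a simplified model under stated assumptions, not the refinery's queuing logic: the function names are hypothetical, and combining the total limit, the per-conversion limit, and the Accept check box with a logical AND is an assumption.

```python
def within_limit(current_jobs: int, maximum: int) -> bool:
    """A maximum of 0 means unlimited, as stated for the queue settings."""
    return maximum == 0 or current_jobs < maximum


def refinery_can_queue(total_queued: int, total_max: int,
                       type_queued: int, type_max: int,
                       accepted: bool) -> bool:
    """Assumed rule: the conversion is accepted and both limits have room."""
    return (accepted
            and within_limit(total_queued, total_max)
            and within_limit(type_queued, type_max))


if __name__ == "__main__":
    # Defaults from the procedure: both maximums are 0 (unlimited).
    print(refinery_can_queue(250, 0, 40, 0, accepted=True))
    # A per-conversion maximum of 40 that has already been reached.
    print(refinery_can_queue(250, 0, 40, 40, accepted=True))
```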

21.6.3 Setting Multi-Page TIFF Files as the Primary Web-Viewable Rendition

Inbound Refinery includes Outside In Image Export, used to convert files to multi-page TIFF files as the primary web‐viewable rendition. This enables users to view the files through standard web browsers with a TIFF viewer plugin.

Other conversion options, such as PDF Export, are used to create other types of renditions as the primary web-viewable rendition. When conversion options that can generate a web-viewable rendition are enabled, additional settings for those options become available.

To set multi-page TIFF files as the primary web-viewable rendition that the refinery will generate:

  1. Log into the refinery.
  2. Choose Settings then Conversions.
  3. On the Conversions page, select Convert to Multipage TIFF image to convert files to multipage TIFF files as the primary web-viewable rendition.
  4. Click Update to save the changes.

21.6.4 Setting Up Thumbnails

Thumbnails are small preview images of content used on search results pages and typically link to the web-viewable file they represent. A thumbnail provides consumers with a visual sample of a file without actually opening the file itself. This enables them to check a file before committing to downloading the larger, original file.

Inbound Refinery includes Outside In Image Export, which can be used to create thumbnails of files. Please note the following important considerations:

  • You must configure the file formats and conversions in each Content Server to send files to the refinery for thumbnailing. For details, see Configuring Content Servers to Send Jobs to Refineries. The refinery must be configured to accept the conversions. For details, see Setting Accepted Conversions.

  • The Outside In Image Export thumbnail engine cannot successfully create thumbnails of PDF files with Type 3 Fonts. If a checked in PDF file contains Type 3 Fonts, the Outside In Image Export thumbnail engine will create a thumbnail with a blank page.

  • Thumbnail files are stored as JPEG, GIF, or PNG files in the Content Server weblayout directory with the characters @t in their filenames. For example, the file Report2001@t~2.jpg is the thumbnail that belongs to Report2001~2.pdf (which is revision 2 of a file called Report2001.xxx).

  • Thumbnails cannot be processed for any files that are encrypted or are password‐protected.

  • Thumbnails can be created for EML files. If you are using Internet Explorer and have installed the April, 2003, Cumulative Patch for Outlook Express, you will receive an error if you click on the thumbnail to view an EML file. This only applies if the primary web-viewable file is an EML file (a multi-page TIFF or a PDF version of the EML file was not generated by the refinery as the primary web-viewable file, and the native EML file was copied to the weblayout directory as the primary web-viewable file).

  • Thumbnails of EML files do not exactly match the look-and-feel of the EML file as opened in Outlook Express because the thumbnail is created based on a plain‐text rendition, whereas Outlook Express opens the file in its own format.

  • For details about changing the size of thumbnails displayed in the Content Server, see Changing the Size of Thumbnails.

  • Note that if thumbnails are turned off in Inbound Refinery, any thumbnails already created are still displayed on the search results pages.
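The thumbnail naming pattern mentioned in the considerations above (Report2001@t~2.jpg for Report2001~2.pdf) can be expressed as a small helper. The following Python sketch is illustrative: the function name is hypothetical, and the handling of filenames without a ~revision suffix is an assumption.

```python
import os


def thumbnail_filename(web_filename: str, thumb_ext: str = "jpg") -> str:
    """Derive the @t thumbnail name from a web-viewable filename."""
    stem, _ = os.path.splitext(web_filename)
    base, sep, revision = stem.rpartition("~")
    if sep:
        # Insert @t before the ~revision suffix, as in the documented example.
        return f"{base}@t~{revision}.{thumb_ext}"
    # Assumption: with no revision suffix, @t is appended to the name.
    return f"{stem}@t.{thumb_ext}"


if __name__ == "__main__":
    print(thumbnail_filename("Report2001~2.pdf"))
```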

Thumbnails are the only additional rendition available in Inbound Refinery by default. Other conversion options and custom conversions enable you to create additional renditions.

To enable thumbnails and configure thumbnail settings:

  1. Log into the refinery.
  2. Choose Settings then Conversions.
  3. On the Conversions page, in the Additional Renditions Options column, select Create Thumbnail Rendition.
  4. Click Update to save your changes.
  5. Click Settings, then Options.
  6. Under Thumbnail Options, click Configure.
  7. On the Thumbnail Options page, select the necessary thumbnail options. Click Update when done.

    Note:

    When using Inbound Refinery on a SPARC system running Solaris, or any system running Linux, by default Outside In Image Export uses its internal graphics code to render fonts and graphics. You can also choose to use the operating system's native graphics subsystem instead. For details, see Configuring Rendering Options on UNIX.

The following table describes the available options.

Element Description

Create Thumbnail Image from the Native Vault File check box

Specifies whether the thumbnail image is created from the native file or the primary web-viewable file.

Page Number of Native Vault File to Use to Create Thumbnail Image field

Specifies which page of the native file is used to create the thumbnail image. The default setting is 1. The first page of the native file is used to create the thumbnail image.

Use quick sizing radio button

Specifies the fastest conversion of color graphics but the quality of the converted graphic is somewhat degraded.

Use smooth sizing radio button

Specifies a more accurate representation of the original graphic, but requires a more complex process which slows down the conversion speed slightly. This is the default setting.

Smooth sizing for grayscale graphics radio button

Use the smooth sizing option for grayscale graphics and the quick sizing option for any color graphics.

Produce jpg thumbnails radio button

Specifies that all thumbnails be created as JPG files. This is the default thumbnail file type setting.

Produce gif thumbnails radio button

Specifies that all thumbnails be created as GIF files.

Produce png thumbnails radio button

Specifies that all thumbnails be created as PNG files.

Update button

Saves changes to settings.

Reset button

Reverts to the last saved settings.

21.6.5 Configuring Rendering Options on UNIX

When running Inbound Refinery on Linux or Solaris SPARC systems and creating multi-page TIFF files or thumbnails, by default Outside In uses its internal graphics code to render fonts and graphics. Therefore, access to a running X Window System display server (X Server) and the presence of either Motif (Solaris) or LessTif (Linux) is not required. The system only needs to be able to locate usable fonts. Fonts are not provided with Outside In. For information about setting the path to usable fonts, see Specifying the Font Path.

To configure Inbound Refinery so that Image Export uses the operating system's native graphics subsystem to render fonts and graphics instead of its internal graphics code:

  1. Log in to the Inbound Refinery computer as the Inbound Refinery user.

    The Inbound Refinery computer must have access to a running X Window System display server (X Server) and the presence of either Motif (Solaris) or LessTif (Linux).

  2. Ensure that the DISPLAY variable in the Inbound Refinery startup script (.profile, .login, .bashrc, and so on) points to the running X Server. For example:
    DISPLAY=:0.0
    export DISPLAY
    
  3. Source the new .profile. For example, using /usr/bin/sh, run the command:
    . .profile
    
  4. Give Outside In Image Export permission to use the running X Server with the following command:
    xhost +localhost
    
  5. Lock the console, leaving the Inbound Refinery user logged in.
  6. Log into the refinery.
  7. Choose Settings then Options.
  8. On the Options page, click Configure under the General OutsideIn Filter Options section.
  9. Select Use native operating system's native graphics subsystem.
  10. Click Update.

21.6.6 Specifying the Font Path

For Inbound Refinery to work properly, you must specify the path to fonts used to generate font images. By default, the font path is set to the font directory in the JVM used by Inbound Refinery: java.home/lib/fonts. However, the fonts included in the default directory are limited and may cause poor renditions. Also, in some cases if a non-standard JVM is used, then the JVM font path may be different than that specified as the default. If this is the case, an error message is displayed from both Inbound Refinery and Content Server. If this error occurs, ensure the font path is set to the directory containing the fonts necessary to properly render your conversions.

To configure Inbound Refinery to locate usable fonts:

  1. Log in to the Inbound Refinery computer as the Inbound Refinery user.
  2. Under Administration, select Admin Server, then General Configuration.
  3. Enter the path to the font directories to be used by Outside In in the text field. For example, on Linux:
    /usr/lib/X11/fonts/TTF
    

    For example, on Windows:

    C:\WINDOWS\Fonts
    

    If fonts are called for and cannot be found, Outside In will exit with an error. Only TrueType fonts (*.ttf or *.ttc files) are supported.

  4. Click Save.
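Before saving the font path, it can help to confirm that the directory actually contains TrueType fonts. The following Python sketch lists the *.ttf and *.ttc files in a directory. The function name is hypothetical, and matching the file extensions without regard to case is an assumption.

```python
from pathlib import Path
from typing import List


def truetype_fonts(font_dir: str) -> List[str]:
    """Return the sorted names of *.ttf and *.ttc files in font_dir."""
    directory = Path(font_dir)
    if not directory.is_dir():
        return []
    return sorted(
        entry.name
        for entry in directory.iterdir()
        # Assumption: extensions are matched case-insensitively.
        if entry.is_file() and entry.suffix.lower() in {".ttf", ".ttc"}
    )


if __name__ == "__main__":
    # Linux example path from step 3; substitute your own font directory.
    fonts = truetype_fonts("/usr/lib/X11/fonts/TTF")
    print(f"{len(fonts)} TrueType font file(s) found")
```

An empty result suggests that the path is wrong or that the directory holds only unsupported font types.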

21.6.7 Configuring Timeout Settings for Graphics Conversions

To configure timeout settings for graphics conversions:

  1. Log into the refinery.
  2. Choose Settings then Timeouts.
  3. On the Timeouts page, enter the Minimum (in minutes), Maximum (in minutes), and Factor for graphics conversions. For details, see Calculating Timeouts.
  4. Click Update to save the changes.