24 Capture Files from a WebCenter Content Archive

You can import content from an on-premise WebCenter Content instance into Oracle Content Management. To do this, create an archive of the on-premise WebCenter Content files using the Archiver admin applet from the Content Server, and then transfer the archive to the system on which you run the file import agent.

You can also export the folder tables, which contain the folder path information of the files, as part of the archive definition. In WebCenter Content 11g, these tables belong to the folders_g component (or to Framework Folders, which was introduced later); in WebCenter Content 12c, they belong to the Framework Folders component.
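
To illustrate what "folder path information" means in practice, here is a minimal sketch of how a full path can be rebuilt from folder rows that each point to a parent. The row layout (a folder ID mapped to a parent ID and a name) and the sample values are assumptions made for illustration only; they are not the actual schema of the folders_g or Framework Folders tables, and the import processor does not require you to do this yourself.

```python
from typing import Dict, Optional, Tuple

# Hypothetical row shape: folder_id -> (parent_id, folder_name).
# A parent_id of None marks the root folder.
FolderRow = Tuple[Optional[str], str]


def folder_path(folder_id: str, folders: Dict[str, FolderRow]) -> str:
    """Walk parent links up to the root and join the names into a path."""
    parts = []
    seen = set()
    current: Optional[str] = folder_id
    while current is not None:
        if current in seen:
            # Guard against malformed data that loops back on itself.
            raise ValueError(f"cycle detected at folder {current!r}")
        seen.add(current)
        parent_id, name = folders[current]
        parts.append(name)
        current = parent_id
    return "/" + "/".join(reversed(parts))


# Illustrative sample data, not taken from a real archive.
folders = {
    "F1": (None, "Enterprise"),
    "F2": ("F1", "Contracts"),
    "F3": ("F2", "2023"),
}
print(folder_path("F3", folders))  # /Enterprise/Contracts/2023
```

During an Archiver import, the equivalent values are exposed to you as system fields, so a sketch like this is useful mainly for checking exported data by hand.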

To import the on-premise content:
  1. Copy the on-premise archive file to the folder where you want to run the file import agent.
  2. Create a blank procedure. You'll need the procedure ID to configure the archive import job through the command-line utility in the next step.
  3. To configure the archive import job, run the capture configure-archiver-job command via the command-line utility.
  4. Go to the Metadata tab of your procedure to verify that the metadata is populated in the Metadata Fields table. These fields are read from the docmetadefinition.hda file of the archive export. In addition to these custom fields, this tab also includes default metadata fields that are always created when an archive is imported into a procedure.
  5. Verify that an import processor job is created under the Capture tab in the Import Processors table. Note that this job is initially offline. Before you bring it online, you must configure a commit profile so that the imported content can be stored in a repository.
  6. Create a Digital Content Type (required) and map its fields to the metadata fields in the commit profile that you'll create in the next step.
  7. Configure a commit profile with the Asset Repository as the destination. For versioned documents in WebCenter Content, enable the same version order in the target repository in Oracle Content Management by creating a document and attachment mapping on the Commit Driver Settings tab of the commit profile:
    1. For Asset Type, select the digital asset type that you created.
    2. For Asset Action, select Find by Search, Else Create.
    3. Add a search criterion, setting Asset Field to the field that is mapped to dDocName and Capture Field to dDocName.
    4. For the When more than one asset is found option, select Version most recent.

      Note:

      • dDocName is used here because it is unique to a document. You can also use dRevClassId.
      • Migration of versioned documents from WebCenter Content is limited to asset and business repositories in Oracle Content Management.
  8. Under the Capture tab in the Import Processors table, select the import processor job and then click Edit an import processor job to do the following:
    1. On the General Settings page, verify that Archiver Source is pre-selected as Import Source.
    2. On the Document Profile page, select the Import web viewable files check box if you want web viewable files to be processed, and then select an attachment type in the Attachment Type for web viewable files drop-down list.
      If you leave this check box unselected, the Attachment Type for web viewable files drop-down list remains disabled.

      Note:

      If the folders table is exported in the archive, the folder information it contains for each file can be mapped to any metadata attribute using the Folder Name and Folder Path system fields. These two fields are unique to Archiver imports and refer to the folder name and path values that exist in WebCenter Content.

    3. On the Import Source Settings page, in the Archiver Post Processing section, select the Delete Archiver output after import check box if you don't want to preserve the archive folder. In the Hours to preserve archiver original field, specify the number of hours for which the processed archive folder is preserved before it's deleted. If you enter 0, the processed archive folder is deleted the next time a scheduled job runs.
    4. If you created an attachment type on the Classification tab, then on the Document Profile page, in the Attachment Type for web viewable files drop-down list, select the type you want to process as part of the import processor job.
    5. On the Post Processing page, select only Commit Processor. Nothing else should be selected.
  9. To bring the import processor job online, go to the Capture tab, select the job in the Import Processors table, and click the Online/Offline toggle.
    The contents of the archive folder are uploaded to your repository in Oracle Content Management. The corresponding metadata is also copied in the format of the new content items that you created. A parallel folder named <foldername>-processed is then created, and the files are moved into it one by one. For example, if the folder name is Archiver Export, the parallel folder is named Archiver Export-processed.

    Note:

    If you chose to delete the archive folder on the Import Source Settings page, in the Archiver Post Processing section, only the Archiver Export-processed folder is deleted. The original archive file is retained for any future validation.
  10. To diagnose errors in the file import agent and the Content Capture Client:
    • Check the capture.log log file of the file import agent. This log file includes information such as the number of files imported and the number that failed.
    • Configure a client profile in Content Capture to analyze errors in the batch activity logs of the Content Capture Client. After resolving the errors, you can release a batch again from the Content Capture Client.
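
The folder naming in step 9 and the preservation window in step 8 can be sketched as follows. This is only an illustration of the rules as described above, not the file import agent's actual implementation; the function names `processed_folder_for` and `eligible_for_deletion` are hypothetical, and the assumption that the window is measured from the time processing finished is ours.

```python
from datetime import datetime, timedelta
from pathlib import Path


def processed_folder_for(archive_folder: Path) -> Path:
    """Return the parallel '<foldername>-processed' folder next to the archive folder."""
    return archive_folder.with_name(archive_folder.name + "-processed")


def eligible_for_deletion(processed_at: datetime, preserve_hours: int, now: datetime) -> bool:
    """Return True once the preservation window has elapsed.

    With preserve_hours == 0 the folder is eligible immediately, which matches
    the described behavior of deleting it when the next scheduled job runs.
    Assumption: the window starts when processing finished (processed_at).
    """
    return now >= processed_at + timedelta(hours=preserve_hours)


folder = Path("Archiver Export")
print(processed_folder_for(folder).name)  # Archiver Export-processed
```

A check like `eligible_for_deletion` can help you reason about when a processed folder should have disappeared while you are reading capture.log, for example when a folder is still present because no scheduled job has run since the window ended.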