Export Source File Metadata to CSV

Export the file metadata from your source repository. Databases generally provide this ability as the output of a query. The exported metadata is formatted into a CSV file. The CSV file can include:

Required Columns

The CSV file must include the following required columns with these exact names.

Column Name Data Type Details
FILEID STRING The unique identifier of the file as it appeared in the source system (the unique document identifier).

This identifier needs to be unique across the entire migration. If the migration is gathering data from different repositories that might have overlapping unique identifiers, the values need to be made unique at the scope of the whole migration.

For example, this could be accomplished by prepending an “A” to the unique identifiers of the first system, and a “B” to the unique identifiers of the second system. This would ensure the two sets of unique identifiers never overlapped.

FILEVERSION INTEGER Indicates the version of the file. Version numbers should be in ascending order (the most recent revision would have the highest revision value). There can be gaps in the version number sequence (some versions may have been deleted). There can be a file with version 1 and version 3 but not version 2. If the source content file system doesn't support versioning, use "1" as the version value in this column.
FILEEXTENSION STRING The extension of the file, such as pdf, doc, docx, xlsx.
OCIOSFILE STRING The object name of the file as stored in Oracle Cloud Infrastructure object storage. The object name is simply a string value, so it can include slashes that make it appear to be an entire folder path. If the content is uploaded from a directory structure those directory names will be part of the object names.
ORIGINALFILENAME STRING The original name of the file when it was uploaded into the original source system. This is the same name that will be given to this file if it is subsequently downloaded from Oracle Content Management.

Optional Columns

Your CSV can also include the following optional columns. If these details are provided, the values will be assigned to the system-level asset attributes, and the asset will appear to have been created or modified at the given point in the past by the given user.

Column Name Data Type Details
CREATOR STRING The user identifier to be assigned as the creator of the document. If there are multiple versions, data from oldest revision will be used.

The user identifiers must exist in the Oracle Content Management instance, those defined in the associated identity server. You're responsible for translating these user identifiers from the source system to the equivalent user identifiers available in Oracle Content Management. You can also specify a default user in the questionnaire to be assigned if a given user identity isn’t found.

CREATEDATE DATE The date to be assigned to the new asset as its creation date. If there are multiple versions, data from oldest revision will be used.
LASTMODIFIER STRING The user identifier to be assigned as the last modifier of the document. If there are multiple versions, data from newest revision will be used.

The user identifiers must exist in the Oracle Content Management instance, those defined in the associated identity server. You're responsible for translating these user identifiers from the source system to the equivalent user identifiers available in Oracle Content Management. You can also specify a default user in the questionnaire to be assigned if a given user identity isn’t found.

LASTMODIFIEDDATE DATE The date to be assigned to the new asset as it s last modification date. If there are multiple versions, data from newest revision will be used.

Custom Columns

You can also include custom columns. Make sure to include all metadata you want to see in the target assets (for example, an invoice number or customer name) and any metadata that will be used to assign taxonomy categories to assets (for example, supplier state or sales territory).

Columns names must be valid Oracle database column names. Generally, they must not:

  • Contain spaces
  • Contain special characters: &*$|~@%?()
  • Start with OCM_

Custom columns can specify the data type of the column following the name, in this format "ColumnName (DATATYPE)".

Datatype Example Notes
INTEGER
  • "InvoiceNumber (INTEGER)"
 
DECIMAL
  • "InvoiceAmount (DECIMAL)"
 
VARCHAR / STRING
  • "CustomerEmail (VARCHAR)“
  • "CustomerEmail (STRING)"
  • "CustomerEmail"
If no type is specified it is assumed to be STRING.
TIMESTAMP
  • "InvoiceDate (TIMESTAMP)"
  • "InvoiceDate (TIMESTAMP:dd-MMM-yy)"
Date types can also include the format of the date. The timestamp mask format is whatever is acceptable by the Java DateTimeFormatter class. The default format if none is given is “yyyy-MM-dd'T'HH:mm:ss.SSSXXX” an example of which is “2022-07-12T01:35:45.868+05:30”.

Sample SQL Query for WebCenter Content Migrations

The following sample SQL statement illustrates how to produce a result set from WebCenter Content, which can be exported to a CSV file. It can be adjusted based on your needs.

For other external systems we recommend working with a partner to generate CSV files.

SELECT
    REVISIONS.DDOCNAME    AS FILEID,
    (ROW_NUMBER() OVER (PARTITION BY REVISIONS.DDOCNAME ORDER BY REVISIONS.DREVRANK desc))    AS FILEVERSION,
    OCIDOCHASHES.HASH || '-' || OCIDOCHASHES.DCHECKINID    AS OCIOSFILE,
    DOCUMENTS.DFORMAT    AS FILEMIMETYPE,
    DOCUMENTS.DEXTENSION    AS FILEEXTENSION,
    DOCUMENTS.DORIGINALNAME    AS ORIGINALFILENAME,
    REVISIONS.DCREATEDATE    AS CREATEDATE,
    DOCMETA.XMKTCAMPAIGNHOSTCOUNTRY    AS XMKTCAMPAIGNHOSTCOUNTRY,
    DOCMETA.XLGLEVENTDATE    AS XLGLEVENTDATE
FROM REVISIONS,
    DOCUMENTS,
    OCIDOCHASHES,
    DOCMETA
WHERE REVISIONS.DID = DOCUMENTS.DID
    AND REVISIONS.DSECURITYGROUP = 'LEGAL-APPROVAL'
    AND REVISIONS.DID = OCIDOCHASHES.DID
    AND REVISIONS.DID = DOCMETA.DID
    AND OCIDOCHASHES.DRENDITIONID = 'primaryFile'
    AND DOCUMENTS.DISPRIMARY = 1
    AND DOCMETA.XSTORAGERULE = 'OCIObjectStorageRule'
ORDER BY REVISIONS.DDOCNAME

Here's a description of the SQL:

  • In the SELECT statement, the REVISIONS and DOCUMENTS clauses specify the required and optional CSV columns. The text after "AS" is what the column will be called in the CSV file.
  • In the SELECT statement, the OCIDOCHASHES clause assumes the files have been migrated from Oracle WebCenter Content to object storage. This clause produces the name of the object in object storage as the value of the OCIOSFILE column of the CSV file.
  • In the SELECT statement, the DOCMETA clauses specify custom data fields.
  • In the WHERE statement, the REVISIONS.DSECURITYGROUP clause limits the set of documents to a particular Oracle WebCenter Content security group. Additional WHERE clauses can be added to partition the data into smaller units to be migrated separately. This is necessary since each migration is limited to a single target repository and asset type.
  • In the WHERE statement, the OCIDOCHASHES.DRENDITIONID = 'primaryFile' and DOCUMENTS.DISPRIMARY = 1 clauses limit the set of files to only the primary (original) documents ignoring web renderings.
  • In the WHERE statement, the DOCMETA.XSTORAGERULE = 'OCIObjectStorageRule' clause limits the set of files to only those that have been migrated to the object storage provider.