Export Source File Metadata to CSV
Export the file metadata from your source repository. Databases generally provide this ability as the output of a query. The exported metadata is formatted into a CSV file. The CSV file can include:
Required Columns
The CSV file must include the following required columns with these exact names.
Column Name | Data Type | Details |
---|---|---|
FILEID | STRING | The unique identifier of the file as it appeared in the source system (the unique document identifier).
This identifier needs to be unique across the entire migration. If the migration is gathering data from different repositories that might have overlapping unique identifiers, the values need to be made unique at the scope of the whole migration. For example, this could be accomplished by prepending an “A” to the unique identifiers of the first system, and a “B” to the unique identifiers of the second system. This would ensure the two sets of unique identifiers never overlapped. |
FILEVERSION | INTEGER | Indicates the version of the file. Version numbers should be in ascending order (the most recent revision would have the highest revision value). There can be gaps in the version number sequence (some versions may have been deleted). There can be a file with version 1 and version 3 but not version 2. If the source content file system doesn't support versioning, use "1" as the version value in this column. |
FILEEXTENSION | STRING | The extension of the file, such as pdf, doc, docx, xlsx. |
OCIOSFILE | STRING | The object name of the file as stored in Oracle Cloud Infrastructure object storage. The object name is simply a string value, so it can include slashes that make it appear to be an entire folder path. If the content is uploaded from a directory structure those directory names will be part of the object names. |
ORIGINALFILENAME | STRING | The original name of the file when it was uploaded into the original source system. This is the same name that will be given to this file if it is subsequently downloaded from Oracle Content Management. |
Optional Columns
Your CSV can also include the following optional columns. If these details are provided, the values will be assigned to the system-level asset attributes, and the asset will appear to have been created or modified at the given point in the past by the given user.
Column Name | Data Type | Details |
---|---|---|
CREATOR | STRING | The user identifier to be assigned as the creator of the document. If there are multiple versions, data from oldest revision will be used.
The user identifiers must exist in the Oracle Content Management instance, those defined in the associated identity server. You're responsible for translating these user identifiers from the source system to the equivalent user identifiers available in Oracle Content Management. You can also specify a default user in the questionnaire to be assigned if a given user identity isn’t found. |
CREATEDATE | DATE | The date to be assigned to the new asset as its creation date. If there are multiple versions, data from oldest revision will be used. |
LASTMODIFIER | STRING | The user identifier to be assigned as the last modifier of the document. If there are multiple versions, data from newest revision will be used.
The user identifiers must exist in the Oracle Content Management instance, those defined in the associated identity server. You're responsible for translating these user identifiers from the source system to the equivalent user identifiers available in Oracle Content Management. You can also specify a default user in the questionnaire to be assigned if a given user identity isn’t found. |
LASTMODIFIEDDATE | DATE | The date to be assigned to the new asset as it s last modification date. If there are multiple versions, data from newest revision will be used. |
Custom Columns
You can also include custom columns. Make sure to include all metadata you want to see in the target assets (for example, an invoice number or customer name) and any metadata that will be used to assign taxonomy categories to assets (for example, supplier state or sales territory).
Columns names must be valid Oracle database column names. Generally, they must not:
- Contain spaces
- Contain special characters: &*$|~@%?()
- Start with OCM_
Custom columns can specify the data type of the column following the name, in this format "ColumnName (DATATYPE)".
Datatype | Example | Notes |
---|---|---|
INTEGER |
|
|
DECIMAL |
|
|
VARCHAR / STRING |
|
If no type is specified it is assumed to be STRING. |
TIMESTAMP |
|
Date types can also include the format of the date. The timestamp mask format is whatever is acceptable by the Java DateTimeFormatter class. The default format if none is given is “yyyy-MM-dd'T'HH:mm:ss.SSSXXX” an example of which is “2022-07-12T01:35:45.868+05:30”. |
Sample SQL Query for WebCenter Content Migrations
The following sample SQL statement illustrates how to produce a result set from WebCenter Content, which can be exported to a CSV file. It can be adjusted based on your needs.
For other external systems we recommend working with a partner to generate CSV files.
SELECT
REVISIONS.DDOCNAME AS FILEID,
(ROW_NUMBER() OVER (PARTITION BY REVISIONS.DDOCNAME ORDER BY REVISIONS.DREVRANK desc)) AS FILEVERSION,
OCIDOCHASHES.HASH || '-' || OCIDOCHASHES.DCHECKINID AS OCIOSFILE,
DOCUMENTS.DFORMAT AS FILEMIMETYPE,
DOCUMENTS.DEXTENSION AS FILEEXTENSION,
DOCUMENTS.DORIGINALNAME AS ORIGINALFILENAME,
REVISIONS.DCREATEDATE AS CREATEDATE,
DOCMETA.XMKTCAMPAIGNHOSTCOUNTRY AS XMKTCAMPAIGNHOSTCOUNTRY,
DOCMETA.XLGLEVENTDATE AS XLGLEVENTDATE
FROM REVISIONS,
DOCUMENTS,
OCIDOCHASHES,
DOCMETA
WHERE REVISIONS.DID = DOCUMENTS.DID
AND REVISIONS.DSECURITYGROUP = 'LEGAL-APPROVAL'
AND REVISIONS.DID = OCIDOCHASHES.DID
AND REVISIONS.DID = DOCMETA.DID
AND OCIDOCHASHES.DRENDITIONID = 'primaryFile'
AND DOCUMENTS.DISPRIMARY = 1
AND DOCMETA.XSTORAGERULE = 'OCIObjectStorageRule'
ORDER BY REVISIONS.DDOCNAME
Here's a description of the SQL:
- In the SELECT statement, the REVISIONS and DOCUMENTS clauses specify the required and optional CSV columns. The text after "AS" is what the column will be called in the CSV file.
- In the SELECT statement, the OCIDOCHASHES clause assumes the files have been migrated from Oracle WebCenter Content to object storage. This clause produces the name of the object in object storage as the value of the OCIOSFILE column of the CSV file.
- In the SELECT statement, the DOCMETA clauses specify custom data fields.
- In the WHERE statement, the REVISIONS.DSECURITYGROUP clause limits the set of documents to a particular Oracle WebCenter Content security group. Additional WHERE clauses can be added to partition the data into smaller units to be migrated separately. This is necessary since each migration is limited to a single target repository and asset type.
- In the WHERE statement, the OCIDOCHASHES.DRENDITIONID = 'primaryFile' and DOCUMENTS.DISPRIMARY = 1 clauses limit the set of files to only the primary (original) documents ignoring web renderings.
- In the WHERE statement, the DOCMETA.XSTORAGERULE = 'OCIObjectStorageRule' clause limits the set of files to only those that have been migrated to the object storage provider.