File Content Analysis

Users can upload arbitrary file content to the Transportation and Global Trade Management service in many ways; for example, scanned images can be attached to shipments, and structured data can be uploaded as CSV. In each case, the service analyzes the uploaded content to prevent accidental or malicious uploading of malware, attack vectors, or other inappropriate files.

The uploaded file content is analyzed to ensure that each uploaded file has valid content for its intended usage, regardless of the file name extension. A file name extension alone is not sufficient to determine the true content type: renaming a file does not change the underlying file type, and renaming a file to disguise its contents is a common tactic of malicious attackers. Simple file detection alone is not sufficient either, because loosely mimicking a file format to disguise the true contents is another common tactic.
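To illustrate why the extension cannot be trusted, the following minimal sketch guesses a type from the leading bytes of the content and never looks at the file name. The signature table and the `sniff_mime` function are hypothetical and cover only a handful of formats; they are not the detection logic used by the service.

```python
# Hypothetical signature table: leading bytes -> MIME type (small illustrative subset).
MAGIC_SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"%PDF-", "application/pdf"),
    (b"PK\x03\x04", "application/zip"),
    (b"MZ", "application/x-msdownload"),  # DOS/Windows executable header
]


def sniff_mime(content: bytes) -> str:
    """Guess a MIME type from the content itself; the file name is never consulted."""
    for signature, mime in MAGIC_SIGNATURES:
        if content.startswith(signature):
            return mime
    # No known signature: treat decodable text as text/plain, anything else as opaque binary.
    try:
        content.decode("utf-8")
    except UnicodeDecodeError:
        return "application/octet-stream"
    return "text/plain"


# An executable renamed to "invoice.png" still starts with the bytes "MZ".
renamed_executable = b"MZ\x90\x00\x03\x00\x00\x00"
print(sniff_mime(renamed_executable))
```

Matching leading bytes is exactly the kind of simple detection that an attacker can mimic, which is why it would be only the first of several checks.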

The uploaded file content analysis includes a file detection check, a file parsing check, and, if applicable, a file name extension check. Every check must pass for the upload to be allowed; the upload fails on the first check failure. First, the uploaded content is scanned to determine the type of file content being uploaded. If that type is permitted by the allowable MIME types, the content is then parsed (if applicable) according to the detected type to ensure that it has the file format structure the detected type should have. Finally, after the content has been parsed, the file name extension (if applicable) is checked to further ensure file and file name integrity.
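The order of the checks can be sketched as a small fail-fast pipeline. Everything here is an assumption made for illustration: the allow-list, the extension map, the crude `detect` function, and the `UploadRejected` exception are hypothetical and do not reflect the service's actual code.

```python
import os
import xml.etree.ElementTree as ET


class UploadRejected(Exception):
    """Raised by the first check that fails; later checks are not run."""


# Hypothetical configuration for a single use case.
ALLOWED_MIME_TYPES = {"application/xml", "text/plain"}
EXTENSIONS_BY_MIME = {
    "application/xml": {".xml"},
    "text/plain": {".txt", ".csv", ".log"},
}


def detect(content: bytes) -> str:
    """Check 1 helper: crude stand-in for real content detection."""
    try:
        text = content.decode("utf-8")
    except UnicodeDecodeError:
        return "application/octet-stream"
    return "application/xml" if text.lstrip().startswith("<?xml") else "text/plain"


def parse(content: bytes, mime: str) -> None:
    """Check 2: the content must have the structure its detected type implies."""
    if mime == "application/xml":
        try:
            ET.fromstring(content)
        except ET.ParseError as exc:
            raise UploadRejected(f"parsing check failed: {exc}") from exc


def check_extension(file_name, mime: str) -> None:
    """Check 3: applies only when a file name was supplied."""
    if file_name is None:
        return
    extension = os.path.splitext(file_name)[1].lower()
    if extension not in EXTENSIONS_BY_MIME.get(mime, set()):
        raise UploadRejected(f"extension check failed: {extension!r} does not match {mime}")


def analyze(content: bytes, file_name=None) -> str:
    """Run detection, parsing, and extension checks in order, stopping at the first failure."""
    mime = detect(content)
    if mime not in ALLOWED_MIME_TYPES:
        raise UploadRejected(f"detection check failed: {mime} is not permitted")
    parse(content, mime)
    check_extension(file_name, mime)
    return mime
```

For example, `analyze(b'<?xml version="1.0"?><a><b></a>', "data.xml")` passes detection in this sketch but is rejected by the parsing check, so the extension check never runs.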

File content is brought into the Oracle Transportation Management service through a number of different upload use cases. All of this external file content is analyzed before it is persisted or processed by the service.

File Content Analysis Use Cases

The following table describes the file content analysis use cases in Oracle Transportation Management and the file content types permitted for each (identified using Internet-standard MIME types).

| Use Case | Description | Default File MIME Types |
|---|---|---|
| Analytics-UploadCSVTargets | CSV content for TI targets | text/plain, text/csv |
| BatchCSVUtil | Loading CSV files through Integration | text/plain, text/csv |
| BrandingThemeUpdate | Updating a branding theme | image/gif, image/png, image/jpeg, image/x-wap-wbmp, image/tiff |
| BrandingImagesUpload | Uploading branding images | image/gif, image/png, image/jpeg, image/x-wap-wbmp, image/tiff |
| CSVContent | CSV content sent in a <CSVDataLoad> or <CSVFileContent> integration transaction | text/plain, text/csv |
| DBXML | DB XML Integration | application/xml, text/plain |
| DiagnosticLog | Planning diagnostic load import | text/plain |
| DocumentIntegration | Binary or text content sent in a <Document> integration transaction | application/msword, application/pdf, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet, application/vnd.openxmlformats-officedocument.wordprocessingml.document, application/vnd.ms-excel, application/vnd.ms-word, application/x-tika-msoffice, application/x-tika-ooxml, application/xml, image/gif, image/png, image/jpeg, image/x-wap-wbmp, image/tiff, text/csv, text/html, text/plain, text/xml |
| DocumentStorage | Retrieving document content from a CMS (e.g. WCC or Sharepoint) | application/msword, application/pdf, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet, application/vnd.openxmlformats-officedocument.wordprocessingml.document, application/vnd.ms-excel, application/vnd.ms-word, application/x-tika-msoffice, application/x-tika-ooxml, application/xml, image/gif, image/png, image/jpeg, image/x-wap-wbmp, image/tiff, text/csv, text/html, text/plain, text/xml |
| DocumentUpload | Uploading document content via the document finder or manager | application/msword, application/pdf, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet, application/vnd.openxmlformats-officedocument.wordprocessingml.document, application/vnd.ms-excel, application/vnd.ms-word, application/x-tika-msoffice, application/x-tika-ooxml, application/xml, image/gif, image/png, image/jpeg, image/x-wap-wbmp, image/tiff, text/csv, text/html, text/plain, text/xml |
| IntegrationManager | Uploading integration files | application/xml, application/xslt+xml, text/csv, application/zip, text/plain |
| MigrationProjectUpload | Uploading a migration project zip file | application/zip, application/x-zip-compressed, application/x-zip, multipart/x-zip (ZIP files can only contain text/plain and application/xml files) |
| MobileLogFile | Uploading log files from mobile app | text/plain (contained in application/zip) |
| OptimizeCSV | A container optimization problem staged as XML | application/xml, text/plain |
| OutXmlTemplate | Uploading an Out XML Template | application/xml, text/plain |
| ProcurementAttachment | Attachments uploaded as part of procurement bidding | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet, application/vnd.ms-excel, application/x-tika-msoffice, application/x-tika-ooxml, application/pdf |
| ProcurementBid | Carrier response data for procurement | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet, application/vnd.ms-excel, application/x-tika-msoffice |
| RateMaintenance | Uploading Rates | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet, application/vnd.ms-excel, application/x-tika-msoffice, application/x-tika-ooxml |
| ReportExternal | Retrieving report content from an external server other than BI Publisher | text/html, application/pdf, application/rtf, application/msword, application/vnd.ms-excel, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet, text/csv, text/plain, text/xml, application/xml, application/x-tika-msoffice, application/x-tika-ooxml |
| ReportExternalBIPublisher | Retrieving report content from a BI Publisher server | text/html, application/pdf, application/rtf, application/msword, application/vnd.ms-excel, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet, text/csv, text/plain, text/xml, application/xml, application/x-tika-msoffice, application/x-tika-ooxml |
| StylesheetContent | Uploading stylesheet content for notification | application/xslt+xml, application/xml, text/plain |
| TransmissionIntegration | Transmission content sent in through the integration servlets (i.e. DirLoadServlet, WMServlet, etc.) | application/xml, text/plain |
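The MigrationProjectUpload restriction, that a ZIP file can only contain text/plain and application/xml files, implies that each archive member is analyzed as well. A minimal sketch of such a member check follows; `member_mime`, `check_zip_members`, and `make_zip` are hypothetical helpers, and the detection logic is a crude stand-in rather than the service's own.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Member types permitted inside the archive, per the MigrationProjectUpload row.
ALLOWED_MEMBER_TYPES = {"text/plain", "application/xml"}


def member_mime(data: bytes) -> str:
    """Crude stand-in for detection: XML if it parses, else text if it decodes as UTF-8."""
    try:
        ET.fromstring(data)
        return "application/xml"
    except ET.ParseError:
        pass
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "application/octet-stream"
    return "text/plain"


def check_zip_members(archive: bytes) -> list:
    """Return the names of members whose content type is not permitted."""
    rejected = []
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            # Reads the whole member into memory; a real check would cap the size.
            if member_mime(zf.read(info)) not in ALLOWED_MEMBER_TYPES:
                rejected.append(info.filename)
    return rejected


def make_zip(files: dict) -> bytes:
    """Build an in-memory ZIP from a {name: bytes} mapping (used only for the demo)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buffer.getvalue()


project = make_zip({"readme.txt": b"hello", "data.xml": b"<a/>", "tool.bin": b"MZ\x90\x00\xff"})
print(check_zip_members(project))
```

In this sketch an upload would be rejected whenever the returned list is non-empty.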

Additional File Content Analysis Information

Additional File Content Validation Information

| Description | Default |
|---|---|
| Default not allowed: "Dangerous" file content types blocked to prevent risky files from being uploaded. | application/ecmascript, application/javascript, application/vnd.debian.binary-package, application/x-executable, application/vnd.microsoft.portable-executable, application/vnd.ms-office.activeX+xml, application/x-msdownload, application/x-sh, application/x-perl, application/x-python, application/x-python2.7, application/x-python3, application/java, application/java-byte-code, application/x-java-class, application/java-archive, application/jar, text/ecmascript, text/javascript, text/x-jsp, text/x-python |

  • The MIME types are in standard type/subtype format (as shown in the defaults above) and only in lowercase.
  • Note that Microsoft Office uses a common internal file structure for all Office document types (Word, Excel, PowerPoint, and so on), because one document can embed another, such as an Excel spreadsheet in a Word document, through Microsoft Object Linking and Embedding. As a result, the File Content Analysis code cannot tell the different document types apart. The Apache Tika library used by File Content Analysis identifies Microsoft Office documents generically with a metatype, and that metatype is the MIME type that is actually matched. The specific Office document subtypes are included in the default allowed MIME type lists only for documentation purposes and for possible support by a newer version of Tika in a future release of Oracle Transportation and Global Trade Management Cloud.
  • Microsoft radically changed this common Office file structure in Office 2007, so a different metatype identifies the newer files: application/x-tika-ooxml for Office 2007 and later, versus application/x-tika-msoffice for Office 2003 and earlier.
  • The use cases that accept XML files are somewhat tolerant of "loose" compliance with the XML specification, but files that are not strictly compliant may not be identified as the application/xml MIME type. The text/plain MIME type is included in these use cases so that such non-strict files are still accepted by Oracle Transportation and Global Trade Management Cloud, while files that are blatantly not XML are still rejected.
  • CSV (comma-separated values) files cannot be specifically identified, as any text file could have line breaks and commas in it. text/plain is used to identify such files and reject more complex file structures.
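The notes above can be pictured as a lookup of the detected MIME type against a use case's allow-list. The sketch below is an illustration under stated assumptions: the precedence of the blocked list over the allow-list, the `normalize` step, and the function names are hypothetical, while the MIME types themselves are copied from the default tables.

```python
# Subset of the default blocked types from the table above.
BLOCKED_MIME_TYPES = {
    "application/javascript",
    "application/x-sh",
    "application/x-msdownload",
    "application/java-archive",
    "text/x-python",
}

# Default allow-lists for two use cases, copied from the use case table.
ALLOWED_BY_USE_CASE = {
    "BatchCSVUtil": {"text/plain", "text/csv"},
    "RateMaintenance": {
        "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
        "application/vnd.ms-excel",
        "application/x-tika-msoffice",
        "application/x-tika-ooxml",
    },
}


def normalize(mime: str) -> str:
    """Reduce a detected value to lowercase type/subtype, dropping any parameters (assumption)."""
    return mime.split(";", 1)[0].strip().lower()


def is_permitted(use_case: str, detected_mime: str) -> bool:
    """Assumed rule: blocked types always lose; otherwise the use case allow-list decides."""
    mime = normalize(detected_mime)
    if mime in BLOCKED_MIME_TYPES:
        return False
    return mime in ALLOWED_BY_USE_CASE.get(use_case, set())


# A modern Excel workbook is matched through the generic metatype, not its specific subtype.
print(is_permitted("RateMaintenance", "application/x-tika-ooxml"))
# A CSV file is detected as plain text, which the CSV use cases allow.
print(is_permitted("BatchCSVUtil", "text/plain; charset=UTF-8"))
# A shell script is blocked regardless of the use case.
print(is_permitted("BatchCSVUtil", "application/x-sh"))
```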