Adding Custom Reference Knowledge Files

Add a custom reference knowledge file to your Oracle Big Data Preparation Cloud Service instance. You can use custom reference knowledge files to supplement the service’s knowledge base with specific enrichment classifications that are tailored for your data processing needs.

To add a custom reference knowledge file:
  1. On the Knowledge page, click Create Knowledge.
    The Create Knowledge page appears.
  2. In the Name field, enter a name to identify the new knowledge file.
  3. In the Description field, describe the purpose of the new knowledge file.
  4. Click the Browse button.
    A file browser for your local system appears.
  5. Go to the directory where your knowledge file is located, select it, and then click OK.
    Your knowledge file must be in comma- or tab-delimited format.
    Your knowledge file must contain at least one column, which serves as the classification key. Optionally, the file can contain one or more additional columns. The Oracle Big Data Preparation Cloud Service processing engine uses these additional columns as enrichment recommendations when a column in your data is classified using this reference knowledge.
    If your knowledge file contains international data such as double-byte characters, then it must be in UTF-8 encoding. To create a UTF-8 knowledge import file:
    1. In an application such as Microsoft Excel, save your data file as Unicode text.
    2. In a file editing utility such as Notepad, open the Unicode-encoded file and save it as UTF-8. You must save your file in UTF-8 encoding to preserve characters in your data throughout the ingestion and repair process.
  6. In the Curation Level field, set the value for the new knowledge file’s curation level.
    By default, the curation level for an uploaded custom reference knowledge file is 10. A value of 10 gives your custom reference knowledge priority over similar classifications from the Oracle Big Data Preparation Cloud Service processing engine.
    Curation levels break ties between data classifications when default and custom reference knowledge domains contain overlapping information. For example, ties may occur between default reference knowledge and custom reference knowledge for City classifications. The higher the curation level you assign, the higher the priority of that knowledge domain. Therefore, if you want to give preference to a particular custom reference knowledge file over the default knowledge service domains, raise the value of its curation level.
  7. Optionally, select Activate for Use if you want your new knowledge file to be immediately available for processing data.
  8. Click Submit.
Your custom reference knowledge file appears on the Knowledge page, and the service engine can begin enriching data with it.
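
If you prefer to script the UTF-8 conversion described in step 5 instead of using a text editor, the following Python sketch shows one possible approach. It is not part of the service. It assumes that the "Unicode text" export is tab-delimited UTF-16 with a byte order mark, which is how Microsoft Excel typically writes that format. The function names unicode_text_to_utf8 and count_columns are hypothetical helpers introduced here for illustration. The check that every row has the same number of columns is an added assumption, not a documented requirement of the service.

```python
import csv
import io


def unicode_text_to_utf8(raw: bytes) -> bytes:
    """Re-encode a "Unicode text" export as UTF-8.

    Assumption: the input is UTF-16 with a byte order mark.
    """
    text = raw.decode("utf-16")  # the BOM selects the byte order and is dropped
    return text.encode("utf-8")


def count_columns(utf8_data: bytes, delimiter: str = "\t") -> int:
    """Return the column count of a delimited knowledge file.

    Raises ValueError if the file is empty, if a row has an empty first
    column (the classification key), or if rows differ in width.
    The width check is an assumption made for this sketch.
    """
    text = utf8_data.decode("utf-8")
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    rows = [row for row in reader if row]  # skip blank lines
    if not rows:
        raise ValueError("file has no rows")
    width = len(rows[0])
    # Row numbers below count non-blank rows only.
    for number, row in enumerate(rows, start=1):
        if len(row) != width:
            raise ValueError(
                f"row {number} has {len(row)} columns, expected {width}"
            )
        if not row[0].strip():
            raise ValueError(f"row {number} has an empty classification key")
    return width


# Sample data with double-byte and accented characters, encoded the way
# a "Unicode text" export is assumed to be (UTF-16 with BOM).
sample = "\u6771\u4eac\tJapan\nZ\u00fcrich\tSwitzerland\n".encode("utf-16")
utf8_data = unicode_text_to_utf8(sample)
print(count_columns(utf8_data))  # prints 2
```

For a comma-delimited file, pass delimiter="," to count_columns. Write the returned bytes to a file opened in binary mode, and then select that file in step 5.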