Knowledge Base

A knowledge base, popularly called a model in data mining terminology, is a compact representation of the knowledge or patterns present in a data set.

Oracle Spend Classification creates the standard type of knowledge base, which is built using the Support Vector Machine algorithm. A knowledge base is created based on a taxonomy, and each knowledge base can use only one taxonomy. When a knowledge base is built using a specific data mining algorithm, it learns the knowledge or patterns present in the training data set by inspecting that data. During classification, the knowledge base uses the scoring method to apply what it learned and make predictions. To learn more about the various classification algorithms, data mining operations, and terminology, see the Oracle Advanced Analytics guide.

Create a Knowledge Base

To create a knowledge base:

  1. Go to the Configuration page > Knowledge Base tab and click Create Knowledge Base.

  2. In the Create Knowledge Base dialog box, provide a name for the knowledge base.
  3. Select one or more data sets that you want to use as training data for your knowledge base.
  4. Select the taxonomy you created for the selected data sets.
  5. Select the attributes, including any additional classification attributes, for which you want to incorporate phrase-based learning.

    The classification process treats the value of each of these attributes as a whole string rather than as separate keywords, and compares it only with other values from the same column. For example, Vision Corporation is treated as a single phrase and compared only with other supplier names. An invoice with a supplier name such as Vision College isn't considered a match for this training.

  6. Click Create. The status is set to Complete after the knowledge base is created.

You can view the knowledge base you created on the Knowledge Base tab of the Configuration page.
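
The difference between phrase-based and keyword-based treatment of an attribute can be illustrated with a short sketch. This is not how Oracle Spend Classification is implemented; the function names and the feature format are hypothetical and exist only to show the idea described in step 5.

```python
def keyword_features(column, value):
    """Split the value into separate lowercase keywords (keyword treatment)."""
    return {f"{column}:{word}" for word in value.lower().split()}


def phrase_features(column, value):
    """Treat the whole value as one phrase, tied to the column it came from."""
    normalized = " ".join(value.lower().split())
    return {f"{column}:{normalized}"}


trained = "Vision Corporation"
incoming = "Vision College"

# Keyword treatment: the shared word "vision" produces a partial match.
keyword_overlap = keyword_features("supplier", trained) & keyword_features("supplier", incoming)

# Phrase treatment: the two supplier names are different strings, so nothing matches.
phrase_overlap = phrase_features("supplier", trained) & phrase_features("supplier", incoming)

print(keyword_overlap)  # {'supplier:vision'}
print(phrase_overlap)   # set()
```

Because each feature carries its column name, the same text in a different column (for example, an item description) would not match a supplier name either.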

Improve a Knowledge Base

You might want to improve a knowledge base for these reasons:

  • To improve the accuracy of classification: You might find that the knowledge base isn't classifying some transactions the way you intended. Upon further analysis, you might see that the knowledge base isn't picking up the keywords correctly, or is assigning the wrong importance to some keywords. This might indicate that the transactions used for training aren't sufficient for the knowledge base to identify the patterns correctly.
  • To support new categories: Over time, organizations might start procuring new goods or services, which might be classified using new category codes. Because these category codes weren't part of the earlier training data set, the knowledge base won't classify such transactions accurately.

Before you start the knowledge base improvement process, prepare the data sets that you will use. Download the applicable data set; make manual corrections, append transactions, or delete transactions as required; and then upload the data set. After a batch is approved, use the Create data set using manual corrections option to collect all the corrections made to the batch and create a training data set. You can use this training data set along with the original training data sets to improve the knowledge base.
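
As a rough illustration of what collecting corrections from an approved batch means, the sketch below keeps only the transactions whose category was changed during review. The Transaction type, its field names, and the rule that a correction is any row where the approved category differs from the predicted one are assumptions made for this example, not the product's data model.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    description: str
    predicted_category: str  # category assigned by the knowledge base
    approved_category: str   # category after manual review (hypothetical field)


def corrections_as_training_rows(batch):
    """Return (description, category) pairs for manually corrected transactions."""
    return [
        (t.description, t.approved_category)
        for t in batch
        if t.approved_category != t.predicted_category
    ]


batch = [
    Transaction("toner cartridges", "Office Supplies", "Office Supplies"),
    Transaction("cloud hosting subscription", "IT Hardware", "IT Services"),
]

training_rows = corrections_as_training_rows(batch)
print(training_rows)  # [('cloud hosting subscription', 'IT Services')]
```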

Here are the steps to start the improvement process:

  1. On the Knowledge Base page, click the menu icon for a knowledge base and click Improve Knowledge Base.
  2. In the Improve Knowledge Base dialog box, the original data sets that were used for creating that knowledge base are prepopulated. Add or remove data sets as needed to improve the knowledge base.
    Note: Improving the knowledge base isn't an incremental process, but a complete refresh of the knowledge base. The knowledge base is built from scratch using only the data sets selected for the improvement process. Therefore, include the original training data sets along with the incremental training data sets.
  3. Select the attributes, including any additional classification attributes, for which you want to incorporate phrase-based learning.

    The classification process treats the value of each of these attributes as a whole string rather than as separate keywords, and compares it only with other values from the same column. For example, Vision Corporation is treated as a single phrase and compared only with other supplier names. An invoice with a supplier name such as Vision College isn't considered a match for this training.

  4. Click Improve. To track the progress of the improvement process, click the menu icon for the knowledge base and then click View Activity Log.
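
The note about improvement being a complete refresh can be illustrated with a toy sketch. The build_knowledge_base function below is a hypothetical stand-in that only records which categories appear in the selected data sets; it isn't the Support Vector Machine training that the product performs. It shows why leaving out the original training data sets loses what was learned from them.

```python
def build_knowledge_base(data_sets):
    """Toy stand-in for training: record the categories seen in the selected data sets."""
    known_categories = set()
    for data_set in data_sets:
        for _description, category in data_set:
            known_categories.add(category)
    return known_categories


original = [
    ("laptop docking station", "IT Hardware"),
    ("office chairs", "Furniture"),
]
corrections = [("cloud hosting subscription", "IT Services")]

# Rebuilding from the corrections alone discards the original categories.
corrections_only = build_knowledge_base([corrections])

# Rebuilding from the original and incremental data sets keeps everything.
combined = build_knowledge_base([original, corrections])

print(sorted(corrections_only))  # ['IT Services']
print(sorted(combined))          # ['Furniture', 'IT Hardware', 'IT Services']
```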

There's usually a time lag between knowledge base creation and knowledge base improvement. How often you improve a knowledge base depends on your requirements.