Oracle® Enterprise Data Quality for Product Data Knowledge Studio Reference Guide
Release 5.6.2

Part Number E23610-03

12 Classifying Data Further

This chapter describes techniques and information related to the classification of data using the Knowledge Studio.

Advanced Classification Rule Functions

This section describes the various classification functions that you can use to narrow the classification of your data, including an example.

Addition

The Addition function is intended to include two or more grammars whose union defines the classification of the item.

For example, consider 'Power Nail Stapler' versus 'Paper Stapler'. The term Stapler may be enough to classify the office product Stapler, but an additional attribute is needed to correctly classify Power Nail Stapler as a power tool. In this case, you would include Nail with the item Stapler.
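The idea behind Addition can be illustrated with a small toy model. This is only a sketch and not how Knowledge Studio evaluates grammars: the rule table, the category names, and the "most specific rule wins" tie-break are all assumptions made for illustration.

```python
# Toy model only: each rule pairs a set of required terms with a category.
# The category names and the tie-break policy are hypothetical.
RULES = [
    ({"stapler"}, "Office Stapler"),
    ({"stapler", "nail"}, "Power Tool Stapler"),  # Stapler plus the added term Nail
]


def classify(description):
    """Return the category of the most specific rule whose terms all appear."""
    tokens = set(description.lower().split())
    matches = [(terms, category) for terms, category in RULES if terms <= tokens]
    if not matches:
        return None
    # Assumption: the rule that requires the most terms is the best match.
    return max(matches, key=lambda match: len(match[0]))[1]


print(classify("Power Nail Stapler"))
print(classify("Paper Stapler"))
```

In this sketch, 'Paper Stapler' satisfies only the single-term rule, while 'Power Nail Stapler' satisfies both rules and is assigned by the rule that also requires Nail.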

Masking

The Masking function is intended to disqualify all grammars below the masked phrase. It is typically used for inclusions that are not part of the primary item to be classified.

For example, in 'Drill with Charger', the item is Drill and not Charger. An example showing the use of Masking and the associated grammars follows.

Negation

The Negation function is intended to disqualify all grammars where the inclusion or preposition is implied but not stated.

For example, in 'Toner Cartridge HP Printer', the item is a Toner Cartridge, not a printer.

Parent

The Parent function is intended to reference a grammar at a higher level in the classification tree. It applies inheritance from the higher level to a lower level where other discriminating attributes are defined.

For example, resistors include both Fixed and Variable types. The [product_resistor] term would reside at the resistor level in the schema, and variable or fixed would reside at a lower level in the tree. The connection between [product_resistor] and [attr_variable] is made through the term $parent + [attr_variable], where $parent references [product_resistor]. This is useful for bulk classifying data at a higher level to get an initial classification, and then refining the classification at a later stage.
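The two-stage behavior described for $parent can be sketched as a toy model. This is an illustration under stated assumptions, not the Knowledge Studio implementation: the token matching, the category names, and the [attr_fixed] term are hypothetical.

```python
# Toy model only. PARENT_TERMS stands in for [product_resistor];
# each child rule stands in for "$parent + [attr_...]".
PARENT_TERMS = {"resistor"}
CHILD_RULES = {
    "Variable Resistors": {"variable"},  # $parent + [attr_variable]
    "Fixed Resistors": {"fixed"},        # $parent + [attr_fixed] (hypothetical term)
}


def classify(description):
    """Classify at the parent level first, then refine if an attribute matches."""
    tokens = set(description.lower().split())
    if not PARENT_TERMS <= tokens:
        return None  # the parent grammar did not match, so no child can match
    for category, attribute_terms in CHILD_RULES.items():
        if attribute_terms <= tokens:
            return category
    return "Resistors"  # initial, higher-level classification


print(classify("Variable Resistor 10K"))
print(classify("Resistor 10K"))
```

An item without a discriminating attribute stays at the higher level, which mirrors the idea of classifying in bulk first and refining later.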

Function Example

This example shows a collection of items that pose complications in classification. Using the previously described functions removes the ambiguities, allowing each item to be uniquely classified.

Surrounding text describes funcex1.jpg.

Add Mask

To add masking, drag the root-level grammar to be masked over the mask icon at the top of the classification tree.

All grammars that appear under this grammar are then hidden from further classification. However, any grammar that is hidden under the masked grammar but is also visible in other phrase structures can still be used for classification.

Add Negation

To add negation, hold down the control and shift keys together while dragging the negated phrase next to the associated primary class item.

Surrounding text describes funcex2.png.

Add a Parent

To add a parent, use the following steps:

  1. Drag the primary item to the upper level classification node.

  2. Drag the secondary item to the lower level classification node.

  3. Right-click on the secondary item and select Add Parent.

Multiple Classification Schemas

You can configure as many classification types as you need using the Classification Type feature. The benefit is that the phrases and terms defined in a single data lens project can be reused to define any number of classification results.

For example, you may want to classify data to an UNSPSC schema and simultaneously to an eCl@ss schema or to a user-defined schema.

For the process for creating a new Classification Type, see "Classification Type".

You should apply the following considerations when configuring multiple classification schemas:

Classification Type Upgrade

At some point, you may need to upgrade from one classification type to a newer version of that same classification type. For example, you can upgrade from UNSPSC classification version 9.2 to the newer version 11.1. When you upgrade to a newer classification type, you retain all of the knowledge in the previous version.

Upgrading from one classification type to a new type requires basing the new type on the existing type. When using the process to create a new classification type, ensure that you select the Base classifications on other classifications check box so that you can select the appropriate schema to base the new type on from the Based On: list. For more information, see "Classification Type".

Surrounding text describes clasfurbaseon.png.

If Knowledge Studio encounters a classification mapping that existed in the previous version but no longer exists in the new version, a message is displayed that indicates the nature of the change in the category structure.

Both classification schemas are loaded and you can toggle between the two by using the black arrows.

User Defined Classification Types

The Knowledge Studio allows you to create your own custom schema that can be used to auto-classify; this is known as a User-Defined Classification Type.

Creating a User-Defined Schema Using Excel

The following example creates a Parent/Category/Description schema:

  1. Open Excel and create a new spreadsheet.

  2. In the first row, create a header consisting of the following three columns: Parent, Category, and Description.

    Surrounding text describes udsch1.jpg.
  3. Enter your schema hierarchy into each column as appropriate. The Parent column can be left blank for the highest tree nodes; however, it must be entered for all child nodes.

    Surrounding text describes udsch2.jpg.
  4. Save the spreadsheet as a comma-delimited file. From the File menu, select Save as….

    Surrounding text describes udsch3.jpg.
  5. From the Save as type list, select CSV (comma delimited) (*.csv).

  6. Enter a file name and click Save.

    The schema file you have just created is saved as a comma delimited file.
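Before loading the saved file, it can help to check it against the rules in the steps above. The following is a minimal sketch, not an Oracle-supplied tool. It assumes that each non-blank Parent value is expected to match a Category value elsewhere in the file; that assumption and the function name `find_schema_problems` are illustrative only.

```python
import csv
import io

EXPECTED_HEADER = ["Parent", "Category", "Description"]


def find_schema_problems(csv_text):
    """Return a list of human-readable problems found in the schema text."""
    # skipinitialspace tolerates a header written as "Parent, Category, Description".
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    rows = list(reader)
    if reader.fieldnames != EXPECTED_HEADER:
        return [f"unexpected header: {reader.fieldnames}"]

    categories = {(row["Category"] or "").strip() for row in rows}
    problems = []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(rows, start=2):
        parent = (row["Parent"] or "").strip()
        # A blank Parent marks a top-level node; anything else must be defined.
        if parent and parent not in categories:
            problems.append(
                f"line {line_no}: parent {parent!r} is not a defined category"
            )
    return problems


sample = "Parent,Category,Description\n,10,Tools\n10,1010,Power Tools\n"
print(find_schema_problems(sample))
```

To check a real file, read its contents into a string first and pass that string to the function.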

Creating a User-Defined Schema Using a Text Editor

You can use any text editor to create a comma-delimited file that contains the same information as described in the previous section. Ensure that the first line of this file contains the following header information:

Parent, Category, Description

A simpler Parent/Category schema can be created in the same manner if you have a classification with no codes or the category is the code.
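The same file can also be produced by a script instead of by hand. The following sketch uses Python's standard csv module. The category codes and descriptions are invented sample data, and the header is written without spaces after the commas, which is an assumption about what the import accepts.

```python
import csv
import io

# Hypothetical sample hierarchy: (Parent, Category, Description).
# A blank Parent marks a top-level node.
ROWS = [
    ("", "10", "Tools"),
    ("10", "1010", "Power Tools"),
    ("10", "1020", "Hand Tools"),
]


def build_schema_csv(rows):
    """Return the schema as comma-delimited text with the required header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Parent", "Category", "Description"])
    writer.writerows(rows)
    return buffer.getvalue()


print(build_schema_csv(ROWS))
```

Using the csv module rather than joining strings by hand means that descriptions containing commas are quoted correctly. The returned text can then be written to a file with a .csv extension.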

Creating a User-Defined Classification Type

You create a new User-Defined Classification Type as you would any other type, except that you choose the comma-delimited file that you created as the Master Classification File. The following is an example of what a user-defined schema might look like:

Surrounding text describes gloclass.jpg.

For information about creating Classification Types, see "Creating a Classification Type".

Global Classification Schemas

When creating a user-defined classification schema, you can make it global. This allows a single user to update the classification schema, and the changes are then made available to any data lens that uses that user-defined classification schema.

When adding a new user-defined classification type, ensure that you select the Global Classification File check box.

Surrounding text describes clasfurglobal.png.

For information about creating Classification Types, see "Creating a Classification Type".

Classification System Support

The Knowledge Studio supports extensions to the UNSPSC part classification system at the vendor specific level (level 5). Consult the Oracle Consulting Services customer training for details on how to create these extensions.

Data Lens Classification System File Format

If your company has proprietary or internally developed classification systems, the Knowledge Studio has a format that allows these schemas to be imported. This format allows up to five levels of classification hierarchy. Consult the Oracle Consulting Services customer training for details on how to create these extensions.
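The five-level limit can be checked before an import is attempted. The sketch below does not use the actual file format, which this guide does not describe. Instead, it assumes the hierarchy has already been loaded into a plain dictionary that maps each category to its parent, with an empty string for top-level categories. The function names are hypothetical.

```python
def depth(category, parent_of):
    """Count the levels from a category up to its top-level ancestor."""
    levels = 1
    seen = {category}
    while parent_of.get(category):
        category = parent_of[category]
        if category in seen:
            raise ValueError(f"cycle detected at {category!r}")
        seen.add(category)
        levels += 1
    return levels


def too_deep(parent_of, max_levels=5):
    """Return the categories that sit deeper than max_levels."""
    return sorted(c for c in parent_of if depth(c, parent_of) > max_levels)


# Hypothetical six-level chain: A is top level and F is the deepest node.
hierarchy = {"A": "", "B": "A", "C": "B", "D": "C", "E": "D", "F": "E"}
print(too_deep(hierarchy))
```

In this sample, only the sixth-level category F is reported.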