12 Classifying Data Further

This chapter describes techniques and information related to the classification of data using the Knowledge Studio.

Advanced Classification Rule Functions

This section describes the various classification functions that you can use to narrow the classification of your data, including an example.

Addition

The Addition function is intended to include two or more grammars whose union defines the classification of the item.

For example, consider 'Power Nail Stapler' versus 'Paper Stapler'. The term Stapler may be enough to classify the office product Stapler, but an additional attribute is needed to correctly classify Power Nail Stapler as a power tool. So you would include Nail with the item Stapler.

Masking

The Masking function is intended to disqualify all grammars below the masked phrase. It is typically used for inclusions that are not part of the primary item to be classified.

For example, 'Drill with Charger'. Here the item is Drill and not Charger. An example follows showing the use of Masking and the associated grammars.

Negation

The Negation function is intended to disqualify all grammars where the inclusion or preposition is implied but not stated.

For example, in 'Toner Cartridge HP Printer' the item is a Toner Cartridge, not a printer.

Parent

The Parent function is intended to reference a grammar at a higher level in the classification tree. Its application is to apply inheritance from the high level to a lower level where other discriminating attributes are defined.

For example, resistors include both Fixed and Variable types. The [product_resistor] term would reside at the resistor level in the schema, and variable or fixed would reside at a lower level in the tree. The connection between [product_resistor] and [attr_variable] is through the term $parent + [attr_variable], where $parent references [product_resistor]. This is useful for bulk classifying data at a higher level to get an initial classification, and then refining the classification at a later stage.
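The $parent reference described above can be pictured with a small Python sketch. This is only a conceptual model, not how Knowledge Studio stores or evaluates rules: the Node class, the expand function, and the [attr_fixed] term are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """A hypothetical classification tree node holding one rule string."""
    name: str
    rule: str
    parent: Optional["Node"] = None


def expand(node: Node) -> str:
    """Replace $parent in a node's rule with the parent's expanded rule."""
    if "$parent" not in node.rule:
        return node.rule
    if node.parent is None:
        raise ValueError(f"{node.name} uses $parent but has no parent node")
    return node.rule.replace("$parent", expand(node.parent))


# The higher level holds the product term; lower levels add an attribute.
resistor = Node("Resistor", "[product_resistor]")
variable = Node("Variable Resistor", "$parent + [attr_variable]", parent=resistor)
fixed = Node("Fixed Resistor", "$parent + [attr_fixed]", parent=resistor)  # hypothetical term

print(expand(variable))  # [product_resistor] + [attr_variable]
```

In this model, an item can first be matched at the resistor level and later refined to a lower level once the discriminating attribute is known.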

Function Example

This example shows a collection of items that pose complications in classification. The use of the previously described functions removes ambiguities allowing each item to be uniquely classified.
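As a rough illustration of how these functions remove ambiguity, the following Python sketch models Addition, Masking, and Negation over plain word sets. It is a toy model under stated assumptions, not the Knowledge Studio rule engine: the Rule class, the category names, the masked phrase, and the "most required terms wins" tie-breaking are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    category: str
    required: frozenset              # Addition: every term must be present
    negated: frozenset = frozenset() # Negation: none of these terms may be present


# Hypothetical rules built from the examples in this section.
RULES = [
    Rule("Office Stapler", frozenset({"stapler"})),
    Rule("Power Tool Stapler", frozenset({"stapler", "nail"})),
    Rule("Drill", frozenset({"drill"})),
    Rule("Charger", frozenset({"charger"})),
    Rule("Toner Cartridge", frozenset({"cartridge"})),
    Rule("Printer", frozenset({"printer"}), negated=frozenset({"cartridge"})),
]

# Masking: phrases removed from the text before any rule is applied.
MASKED_PHRASES = ("with charger",)


def classify(description, masked=MASKED_PHRASES):
    """Return the best-matching categories, sorted by name."""
    text = description.lower()
    for phrase in masked:
        text = text.replace(phrase, " ")
    tokens = set(text.split())
    matches = [r for r in RULES
               if r.required <= tokens and not (r.negated & tokens)]
    if not matches:
        return []
    # Assumption: the rule with the most required terms is the most specific.
    best = max(len(r.required) for r in matches)
    return sorted(r.category for r in matches if len(r.required) == best)


for item in ("Power Nail Stapler", "Paper Stapler",
             "Drill with Charger", "Toner Cartridge HP Printer"):
    print(item, "->", classify(item))
```

Calling classify("Drill with Charger", masked=()) returns both Charger and Drill, which shows the ambiguity that masking is meant to remove.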

Add Mask

To add masking, drag the root level grammar to be masked over the mask icon at the top of the classification tree.

All grammars that appear under this grammar are then hidden from further classification. However, any grammar hidden under the masked grammar that is also visible in other phrase structures can still be used for classification.

Add Negation

To add negation, hold down the control and shift keys together while dragging the negated phrase next to the associated primary class item.

Add a Parent

To add a parent, use the following steps:

Drag the primary item to the upper level classification node.
Drag the secondary item to the lower level classification node.
Right-click on the secondary item and select Add Parent.

Classifying to Several Schemas in a Data Lens

You can configure as many classification types as you need using the Classification Type feature.

For example, you may want to classify data to an UNSPSC schema and simultaneously to an eCl@ss schema or to a user-defined schema.

You should apply the following considerations when using several schemas in a single data lens:

When creating a name for the new classification type, you should include the classification version number information in the name to enable differentiation. For example, when using UNSPSC 11.1, use a name that is similar, like UNSPSC_11_1.
To reuse the rules already created in a previous classification type, select the Base classifications on other classifications check box, and then select the classification type on which you want to base the new type.
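The naming consideration above can be applied consistently with a small helper. The following Python sketch is only an illustration; the function name classification_type_name is hypothetical and is not part of Knowledge Studio.

```python
import re


def classification_type_name(schema: str, version: str) -> str:
    """Join a schema name and version into a name such as UNSPSC_11_1."""
    raw = f"{schema}_{version}"
    # Replace each run of characters other than letters and digits
    # with a single underscore, then trim underscores at the ends.
    return re.sub(r"[^A-Za-z0-9]+", "_", raw).strip("_")


print(classification_type_name("UNSPSC", "11.1"))  # UNSPSC_11_1
```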

See "Classification Type".

Upgrading a Classification Type

At some point, you may need to upgrade from one classification type to a newer version of that same classification type, for example from UNSPSC classification version 9.2 to version 11.1. You can upgrade to the newer classification type and retain all of the knowledge in the previous version.

Upgrading from one classification type to a new type requires basing the new type on the existing type. When using the process to create a new classification type, ensure that you select the Base classifications on other classifications check box so that you can select the appropriate schema to base the new type on from the Based On: list. See "Classification Type".

If Knowledge Studio encounters a classification mapping that existed in the previous version but no longer exists in the new version, a message is displayed that indicates the nature of the change in the category structure.

Both classification schemas are loaded and you can toggle between the two by using the black arrows.

Creating User-Defined Classification Types

The Knowledge Studio allows you to create your own custom schema that can be used to auto-classify; this is known as a User-Defined Classification Type. You can use one of the template schemas as a master classification file when creating your new classification type, or you can modify the examples in Excel directly to add your data. You must use the correct header row as shown in one of the following templates.

A set of Excel (.csv) templates is delivered with Knowledge Studio as follows:

UserDefined_Parent_Child_IDs_Template: Defines parent and child Item Definitions by ID.
UserDefined_Parent_Child_Names_Template: Defines parent and child Item Definitions by name.
UserDefined_All_Levels_Names_Template: Defines all levels of an Item Definition by name.
UserDefined_All_Levels_Names_IDs_Template: Defines all levels of an Item Definition by name then ID.
UserDefined_All_Levels_IDs_Template: Defines all levels of an Item Definition by ID.
UserDefined_All_Levels_IDs_Names_Template: Defines all levels of an Item Definition by ID then name.

These Excel template files are installed on the user's system in the C:\Users\user_name\AppData\Roaming\DataLens\system\schemas directory.

Creating a User-Defined Schema Using Excel

The following example creates a parent and child ID schema:

Open Excel to a new worksheet or open the UserDefined_Parent_Child_IDs_Template.csv template file.
Ensure that the first row of columns A through C contains Parent, Category, and Description, respectively.
Enter your schema hierarchy into each column as appropriate. The Parent column can be left blank for the highest tree nodes; however, it must be entered for all child nodes as in the template.
To save the spreadsheet as a comma-delimited file, select Save As from the File menu.
From the Save as type list, select CSV (comma delimited) (*.csv).
Enter a file name and click Save.

The schema file you have just created is saved as a comma delimited file. You can use it to create a new classification type. See "Classification Type".
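Before using the file as a Master Classification File, it can be worth checking the header row and the Parent column. The following Python sketch is one possible check, not a Knowledge Studio utility. It assumes that each Parent value refers to the Category value of another row, and the sample IDs and descriptions are made up for illustration.

```python
import csv
import io

EXPECTED_HEADER = ["Parent", "Category", "Description"]


def validate_schema(csv_text):
    """Return a list of problems found in a parent and child schema file."""
    rows = [[cell.strip() for cell in row]
            for row in csv.reader(io.StringIO(csv_text))]
    rows = [row for row in rows if any(row)]  # ignore blank lines
    if not rows or rows[0] != EXPECTED_HEADER:
        return ["first row must be Parent, Category, Description"]
    body = rows[1:]
    # Assumption: a Parent value names the Category of another row.
    categories = {row[1] for row in body if len(row) == 3}
    problems = []
    for row in body:
        if len(row) != 3:
            problems.append(f"expected 3 columns: {row}")
        elif not row[1]:
            problems.append(f"missing Category: {row}")
        elif row[0] and row[0] not in categories:
            problems.append(
                f"unknown Parent {row[0]!r} for Category {row[1]!r}")
    return problems


# Hypothetical sample: the top node has a blank Parent; the last row
# points at a parent that is not defined anywhere in the file.
SAMPLE = """Parent,Category,Description
,100,Resistors
100,110,Fixed resistors
100,120,Variable resistors
900,130,Orphaned node
"""

for problem in validate_schema(SAMPLE):
    print(problem)
```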

Creating a User-Defined Schema Using a Text Editor

You can use any text editor to create a comma-delimited file that contains the same information as described in the previous section. The first line of this file must contain the following header information:

Parent, Category, Description

A simpler Parent/Category schema can be created in the same manner if you have a classification with no codes or the category is the code.
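A file with the three-column header can also be generated by a script instead of being typed by hand. The following Python sketch uses the standard csv module; the category names and descriptions are hypothetical, and the ROWS and schema_csv names are introduced only for this example.

```python
import csv
import io

# Hypothetical hierarchy as (Parent, Category, Description) rows.
# The top-level node leaves Parent blank.
ROWS = [
    ("", "Resistor", "All resistors"),
    ("Resistor", "Fixed Resistor", "Resistors with a fixed value"),
    ("Resistor", "Variable Resistor", "Resistors with an adjustable value"),
]


def schema_csv(rows):
    """Return comma-delimited schema text with the required header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Parent", "Category", "Description"])
    writer.writerows(rows)
    return buffer.getvalue()


print(schema_csv(ROWS))
```

Using the csv module rather than joining strings by hand means that a description containing a comma is quoted correctly.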

Creating a User-Defined Classification Type

You create a new User-Defined Classification Type as you would any other type, except that you choose the comma-delimited file that you created as the Master Classification File. The following is an example of what a user-defined schema might look like:

Surrounding text describes gloclass.jpg.

For information about creating Classification Types, see "Creating a Classification Type".

Global Classification Schemas

You can create a user-defined schema that is global. A single user can then update the classification schema, and the changes are made available to every data lens that uses that user-defined classification schema.

When adding a new user-defined classification type, ensure that you select the Global Classification File check box.

For information about creating Classification Types, see "Creating a Classification Type".

Classification System Support

The Knowledge Studio supports extensions to the UNSPSC part classification system at the vendor-specific level (level 5).