Oracle® Enterprise Data Quality for Product Data AutoBuild Reference Guide
Release 11g R1 (11.1.1.6)

Part Number E29129-02

1 Overview

The AutoBuild application is included in the Services for Excel add-in, which adds a custom toolbar to Excel. For more information about this product, see Oracle Enterprise Data Quality for Product Data Services for Excel Reference Guide.

AutoBuild can rapidly leverage your existing product information to create initial data lenses specific to your enterprise content. Oracle Enterprise Data Quality for Product Data (EDQP) Smart Glossaries allow cross-domain knowledge to be recognized automatically in your data lenses. For example, a company with an inventory of writing instruments may have structured content containing information about each product category. The AutoBuild application can transform this knowledge from an Excel worksheet into a data lens, saving a significant amount of effort and cost.

AutoBuild constructs the initial data lens using the structured category and attribute information. Given sufficient information, the AutoBuild application can accomplish the following:

[Figure: abdiag.png]

The AutoBuild application significantly reduces the level of effort required to:

AutoBuild offers a familiar, easy-to-use graphical wizard interface to step you through the process from start to finish.

Quantity of Structured Information

AutoBuild can use as much or as little structured information as you have available. If you only have a list of categories, AutoBuild creates a base data lens that contains a set of conforming Item Definitions from the provided categories. If you have attribute names associated with the categories, then AutoBuild can add attributes to the data lens Item Definitions and create the associated phrase rules. If you have attribute values associated with the attributes, then AutoBuild creates terminology (term) rules to recognize the attribute information and automatically associate the phrases with the correct attributes in the Item Definitions. AutoBuild makes the best use of any structured product information you provide.

AutoBuild works best when you provide a few examples of structured items for each category. These example items should represent the product categories and attributes you want to capture in your enterprise data.
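
The tiers described in this section can be pictured with a minimal Python sketch. It is illustrative only and is not how AutoBuild is implemented: the `LensPlan` class, the `plan_lens` function, and the sample rows are all hypothetical names and data, chosen to show that each extra level of detail in the input (category, then attribute name, then attribute value) yields one more kind of generated content.

```python
from dataclasses import dataclass, field


@dataclass
class LensPlan:
    """Hypothetical summary of what a generated data lens would contain."""
    item_definitions: set = field(default_factory=set)  # category names
    attributes: set = field(default_factory=set)        # (category, attribute)
    term_rules: set = field(default_factory=set)        # (category, attribute, value)


def plan_lens(rows):
    """Build a LensPlan from rows of (category, attribute, value).

    attribute and value may be None when that level of detail is unavailable.
    """
    plan = LensPlan()
    for category, attribute, value in rows:
        # A category alone is enough for a base Item Definition.
        plan.item_definitions.add(category)
        if attribute:
            # Attribute names add attributes (and their phrase rules).
            plan.attributes.add((category, attribute))
            if value:
                # Attribute values add term rules tied to that attribute.
                plan.term_rules.add((category, attribute, value))
    return plan


# Hypothetical sample rows with increasing levels of detail.
rows = [
    ("Pens", None, None),
    ("Pencils", "Lead Grade", None),
    ("Markers", "Color", "Red"),
]
plan = plan_lens(rows)
print(sorted(plan.item_definitions))
print(sorted(plan.attributes))
print(sorted(plan.term_rules))
```

In this sketch, "Pens" contributes only an Item Definition, "Pencils" also contributes an attribute, and "Markers" contributes all three kinds of content.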

AutoBuild and Smart Glossaries

EDQP Smart Glossaries are data lenses designed to be applied to a broad range of data domains. The Smart Glossaries delivered with the product recognize, for example, materials, colors, and units of measure. Each Smart Glossary can be imported into an existing data lens or used as the basis for creating a new data lens. Individual Smart Glossaries can be combined to create a template that is used in the AutoBuild process. Smart Glossaries can be created and edited using the Knowledge Studio. For more information about Smart Glossaries, see Oracle Enterprise Data Quality for Product Data Knowledge Studio Reference Guide.

When AutoBuild creates a new data lens from your structured item content, a combination of Smart Glossaries is used as the basis for the generated data lens.

Default Smart Glossary

The default Smart Glossary used by AutoBuild is named DLS_Import_Template. It is installed on the server along with the other prepackaged Smart Glossaries. It is a composite of the glossaries for units of measure, counts, colors, materials and finishes, and product packaging.

Each time you create a data lens using AutoBuild, you can select a different Smart Glossary to be used in the AutoBuild process. You can also create and configure your own combination of Smart Glossaries that is most applicable to your domain data for use by AutoBuild. Although only one Smart Glossary can be applied when creating a data lens with AutoBuild, additional Smart Glossaries can be imported after the data lens has been autobuilt and opened.

The Smart Glossary used by AutoBuild to generate a data lens is also known as a data lens template.

Note:

The default Smart Glossary, DLS_Import_Template, is automatically available. To use any other Smart Glossary as a template to generate your data lens, you must first check it out from your Oracle DataLens Server. For information on Smart Glossaries, see Oracle Enterprise Data Quality for Product Data Knowledge Studio Reference Guide.

Smart Glossary Foundation for Generated Data Lens

The Smart Glossary used by AutoBuild provides all of the initial settings for the generated data lens. The standard data lens options set in the Knowledge Studio are copied from the selected Smart Glossary to the new data lens. Transformation types, including the classification types, standardization types, and match types, are also copied from the Smart Glossary to the generated data lens, as are the Unit Conversion rules.

AutoBuild combines your structured metadata and the selected Smart Glossary to create a new data lens that contains the generated item definitions with their associated attributes and values. If the Smart Glossary phrases and terms overlap with those in the metadata, AutoBuild may use the Smart Glossary rules and structure instead of those found in the metadata, which avoids creating ambiguous terms and phrases in the data lens. The primary exception to this rule is when the metadata attribute values are members of valuesets. Additionally, AutoBuild prunes unnecessary rules so that there are no unintended collisions between rules.
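
The overlap rule above can be sketched in a few lines of Python. This is only a reading of the paragraph, not AutoBuild's actual logic: the function name `choose_source`, the lowercase normalization, and the sample terms are assumptions made for illustration.

```python
def choose_source(value, glossary_terms, valueset_values):
    """Return which source would supply the rule for a metadata attribute value.

    glossary_terms: lowercase terms already covered by the selected Smart Glossary.
    valueset_values: lowercase metadata values that are members of a valueset.
    Both arguments are hypothetical stand-ins for the real lens contents.
    """
    normalized = value.strip().lower()
    if normalized in valueset_values:
        # Exception: valueset members keep the rule built from the metadata.
        return "metadata"
    if normalized in glossary_terms:
        # Overlap: prefer the existing Smart Glossary rule to avoid ambiguity.
        return "smart_glossary"
    # No overlap: the metadata value produces a new rule.
    return "metadata"


glossary_terms = {"red", "steel"}
valueset_values = {"red"}

for value in ("Red", "Steel", "Ballpoint"):
    print(value, "->", choose_source(value, glossary_terms, valueset_values))
```

Here "Red" stays with the metadata because it belongs to a valueset, "Steel" falls back to the Smart Glossary rule, and "Ballpoint" has no overlap at all.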

Replacing Item Definition Model Phrase Structures with Smart Glossary Phrase Structures

AutoBuild can drill down into the structure of the generated phrases and replace any unit-of-measure and count-type phrase productions discovered in the structured item information with the EDQP standard phrase structures from the Smart Glossary. One benefit of using the Smart Glossary version of these phrase structures is that they have already been standardized. Unless valid values and unit conversions have been specified in the metadata, the AutoBuilt data lens inherits the unit conversion rules from the Smart Glossary.

Prerequisites to Using AutoBuild

To use the AutoBuild application successfully, the following prerequisites must be met:

PIM Metadata Prerequisites

There are specific PIM metadata prerequisites you must observe when using AutoBuild with Oracle PIM systems. Use one of the following guides to identify the PIM metadata prerequisites for the PIM system you are using:

Oracle Enterprise Data Quality for Product Data Fusion PIM Integration Implementation and User's Guide at

http://docs.oracle.com/cd/E35636_01/doc.11116/e29148/toc.htm

Oracle Enterprise Data Quality for Product Data R12 PIM Connector User's Guide at

http://docs.oracle.com/cd/E35636_01/doc.11116/e29140/toc.htm

The remaining sections in this chapter should be used as reference material only.

Non-PIM Metadata Prerequisites

In order to make efficient use of the AutoBuild application, you must work with your metadata in the form of Excel worksheets that contain some or all of the following information about your company's products:

Category or Item Class

The categories that describe the item data you want to process. The categories may be hierarchically grouped, though this is not essential.

Item Form, Fit, and Function Attributes

Examples of these attributes include color, weight, size, material, packaging, and so on.

Valid Attributes

Examples of those attribute values that are valid for your data.

You can expect maximum benefit from the AutoBuild application when your structured item examples contain attribute values that are expressed in full, unabbreviated form. EDQP can leverage these full-form examples to automatically recognize a broad range of term abbreviations and variations in phrasing.

Identifying Structured Item Information

The main sources of structured item information that can be used with AutoBuild are the following:

  • Item categories

  • Item names and types

  • Item brand information

  • Item attributes

  • Attribute values

The following figure provides an example of each of the above types of structured item information from a sample metadata file:

[Figure: samp_data_file.jpg]

Given the structure of the information in the preceding example, AutoBuild can generate a data lens similar to the following:

[Figure: createdlens.png]

Sources of Structured Item Content

The source of structured item information could be any of the following:

  • Existing electronic product catalog information

  • Item master information

  • Product information management (PIM) structures

  • Category management or product marketing worksheets

  • Existing eCommerce site information

  • Structured examples used to define your current product categories

Supported Structured Item Information Formats

Product information exports can vary greatly among systems and database schemas. As a result, AutoBuild is designed to support a wide variety of product information export formats.

Supported Category Formats

  • Multiple column category names (each category column represents one level in a classification hierarchy).

  • Multiple column category code/name pairs with the category columns grouped in pairs. The first column in the pair is a category code and the second column in the pair is a category name; each column pair represents one level in a classification hierarchy.

  • Single column category names (single level classification hierarchy).

  • Single column category names that are character separated (multiple level classification hierarchy with a character string separating each category name).

  • Single column UNSPSC Category Codes.
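
As a rough illustration of how the first four category layouts differ, the following Python sketch reads one worksheet row in each layout into a list of hierarchy levels. The function names, the sample category names and codes, and the ">" separator are hypothetical; the sketch does not represent AutoBuild's own parsing.

```python
def from_name_columns(row):
    """Multiple column category names: one column per hierarchy level."""
    return [name.strip() for name in row if name.strip()]


def from_code_name_pairs(row):
    """Code/name pairs: columns alternate code, name for each level."""
    pairs = zip(row[0::2], row[1::2])
    return [(code.strip(), name.strip()) for code, name in pairs if name.strip()]


def from_separated(cell, separator):
    """Single column with a separator string between category names."""
    return [part.strip() for part in cell.split(separator) if part.strip()]


# Hypothetical rows; a single-level hierarchy is simply a one-element list.
print(from_name_columns(["Office Supplies", "Writing Instruments", "Pens"]))
print(from_code_name_pairs(["10", "Office Supplies", "1010", "Writing Instruments"]))
print(from_separated("Office Supplies > Writing Instruments > Pens", ">"))
print(from_separated("Pens", ">"))
```

Each helper returns the levels from the top of the hierarchy down, so the different layouts end up in one comparable shape.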

Supported Attribute Formats

  • Attribute names listed in the same row with the category information.

  • Attribute name/value pairs listed in the same row with the category information. A third column can be used to identify a valueset so that only the values present in the metadata worksheet are used to create term rules.

  • Attribute name/value/unit of measure (UOM) triplets listed in the same row with the category information.

  • Attributes in the same file as the category information; attribute names listed as the column headers.

  • Attributes in a separate file from the category information; attribute names listed as the column headers.

  • Multiple categories in a single worksheet, with each category grouped as a distinct set of rows separated by a blank line from another category group. Attributes are in the same file as the category information, with attribute names listed as column headers.

  • Category and attribute information can repeat across multiple rows within the same worksheet. In addition, category and attribute information can be listed across multiple worksheets in a workbook.
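
Two of the attribute layouts above, name/value/UOM triplets in the category row and attribute names as column headers, can be compared with a short Python sketch that reduces both to the same record shape. The function names, the record keys, and the sample values are assumptions made for illustration and do not reflect AutoBuild internals.

```python
def from_triplets(category, cells):
    """Name/value/UOM triplets that follow the category in the same row."""
    records = []
    # Step through the cells three at a time; an incomplete trailing group is ignored.
    for i in range(0, len(cells) - 2, 3):
        name, value, uom = cells[i:i + 3]
        if name:
            records.append({
                "category": category,
                "attribute": name,
                "value": value,
                "uom": uom or None,  # an empty UOM cell means no unit
            })
    return records


def from_header_columns(category, headers, cells):
    """Attribute names as column headers; each cell holds that attribute's value."""
    return [
        {"category": category, "attribute": name, "value": value, "uom": None}
        for name, value in zip(headers, cells)
        if name and value
    ]


# Hypothetical worksheet rows for a "Pens" category.
triplet_records = from_triplets(
    "Pens", ["Barrel Length", "5.5", "in", "Ink Color", "Blue", ""]
)
header_records = from_header_columns(
    "Pens", ["Ink Color", "Tip Type"], ["Blue", "Ballpoint"]
)
for record in triplet_records + header_records:
    print(record)
```

Reducing every layout to one record per category, attribute, and value is a design choice of this sketch; it makes repeated rows and multiple worksheets easy to combine by simple concatenation.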