3 Common Uses of EDQ-PDS 

EDQ-PDS is designed as a toolkit and starter project for profiling, extracting, and matching product data. It can be used out of the box, especially for profiling and matching. For extraction and standardization use cases, for example parsing and standardizing a Product Description into several standard attributes, EDQ provides a number of useful processors. However, because product data is by its nature highly variable, work will be needed to design suitable extraction and standardization processes for your data. The following sections describe scenarios in which EDQ Product Data Services may be used.

Ways To Use the EDQ-PDS Project

Providing Attached Matching Services

EDQ-PDS can be used to provide attached matching services (both batch and real-time) to an application or hub storing Product Data records. The matching services provided are designed to work well with either structured or unstructured product data records, and can be tailored and tuned for added accuracy where needed. The following starting points are recommended:

  • Look at the Interfaces, in particular the Product Input interface, to which your product data’s fields should be mapped. For more information, see EDQ-PDS Data Interfaces.

Optimizing Matching For Your Data

A common use case is to treat the project as a starter project, with the expectation that you will explore your data and make significant modifications to the project. In this case, the following starting points are recommended:

  • Look at the Product Input data interface. Import your product data, map it to this interface, and run it through the EDQ-PDS Profiling Process to explore the data and look for ways to optimize matching, for example by removing anonymous values or tokens, standardizing abbreviations, or extracting key product attributes that may help with matching. Note that it is useful both to gain an initial understanding of the data through profiling, and to run the data through a batch matching process 'raw' in order to determine how matching may be improved.

  • Look at the Standardize Product Data process for the provided functionality to prepare the product data for matching and key generation. Default reference data is provided with this process, and may be modified if required.

  • Look at the Key Generation section for provided Key Methods, and also for details of reference data used to prepare input data specifically for Key Generation.

  • Look at the Matching section for the provided matching functionality. This can be used straight out of the box, tuned by modifying settings such as the weightings of compound comparisons and the match threshold, or used simply as a starting point for your own matching functionality, as required.
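The general flow described in the starting points above (standardize, generate keys, then score candidate pairs against a threshold) can be illustrated outside EDQ with a minimal Python sketch. This is not how EDQ-PDS is implemented and it does not use any EDQ API. The reference data, field names, weights, threshold, and function names are all hypothetical, and the similarity measure is the Python standard library's `difflib.SequenceMatcher`, used here only as a stand-in for EDQ's comparisons.

```python
import re
from difflib import SequenceMatcher

# Hypothetical reference data: abbreviation -> standard form.
ABBREVIATIONS = {"ss": "stainless steel", "blk": "black", "pk": "pack"}
# Hypothetical noise tokens that add no value for matching.
NOISE_TOKENS = {"new", "misc", "n/a"}
# Hypothetical comparison weights and match threshold (not EDQ defaults).
WEIGHTS = {"description": 0.75, "manufacturer": 0.25}
MATCH_THRESHOLD = 0.85


def standardize(text: str) -> str:
    """Lowercase, drop noise tokens, and expand known abbreviations."""
    tokens = re.findall(r"[a-z0-9/]+", text.lower())
    kept = [ABBREVIATIONS.get(t, t) for t in tokens if t not in NOISE_TOKENS]
    return " ".join(kept)


def make_key(standardized: str) -> str:
    """Build a simple order-insensitive key from the first two characters of each token."""
    tokens = sorted(set(standardized.split()))
    return "".join(t[:2] for t in tokens)


def match_score(rec_a: dict, rec_b: dict) -> float:
    """Weighted sum of per-field similarities on standardized values."""
    score = 0.0
    for field, weight in WEIGHTS.items():
        a, b = standardize(rec_a[field]), standardize(rec_b[field])
        score += weight * SequenceMatcher(None, a, b).ratio()
    return score


def is_match(rec_a: dict, rec_b: dict) -> bool:
    """Compare records only if their keys agree, then apply the threshold."""
    key_a = make_key(standardize(rec_a["description"]))
    key_b = make_key(standardize(rec_b["description"]))
    return key_a == key_b and match_score(rec_a, rec_b) >= MATCH_THRESHOLD


if __name__ == "__main__":
    a = {"description": "SS Bolt M8 BLK", "manufacturer": "Acme"}
    b = {"description": "Stainless Steel bolt m8 black NEW", "manufacturer": "ACME"}
    print(standardize(a["description"]), "|", is_match(a, b))
```

In this sketch, raising `MATCH_THRESHOLD` or shifting weight toward `description` makes matching stricter. Running raw and standardized data through the same comparison is one way to see how much the standardization step contributes.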

Providing a Toolkit To Profile, Extract and Standardize Product Data

  • PDS provides several published processors designed for extracting information from product data. These come with default reference data, which is provided in the project. For that reason, it is recommended to start with the project as a whole even if you are building your processes from scratch, so that the relevant data is available for modification as required.

  • EDQ includes a number of useful processors for working with Product Data, such as Extract Attributes, Parse, Split Records from Array, and Make Attribute Arrays. For more information, see the Processor Library section of the Oracle Enterprise Data Quality Online Help, in the Oracle Enterprise Data Quality documentation at http://docs.oracle.com/en/middleware/.

  • It is always recommended to profile any product data sets you are working with, either by using standard EDQ profilers configured from scratch, or by running the data through the provided profiling process, which includes examples of how to analyze key common Product Data attributes such as Product Description. Profiling can give useful insights into the product data and a starting point for extracting or restructuring it. For more information on data preparation, see Data Preparation.
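To make the ideas of profiling a Product Description and extracting attributes from it more concrete, the following is a minimal Python sketch. It is an illustration only, not the behavior of the EDQ Extract Attributes or Parse processors. The attribute names, regular expressions, and function names are hypothetical, and a real project would normally drive extraction from reference data rather than hard-coded patterns.

```python
import re
from collections import Counter

# Hypothetical attribute patterns, applied in the order listed.
ATTRIBUTE_PATTERNS = {
    "length_mm": re.compile(r"(\d+(?:\.\d+)?)\s*mm\b", re.IGNORECASE),
    "pack_size": re.compile(r"\bpack of (\d+)\b", re.IGNORECASE),
    "color": re.compile(r"\b(black|white|red|blue)\b", re.IGNORECASE),
}


def token_frequencies(descriptions: list[str]) -> Counter:
    """Profile descriptions by counting lowercase tokens across all records."""
    counts = Counter()
    for description in descriptions:
        counts.update(description.lower().split())
    return counts


def extract_attributes(description: str) -> dict:
    """Pull out known attributes and keep the unparsed remainder as 'residual'."""
    attributes = {}
    remainder = description
    for name, pattern in ATTRIBUTE_PATTERNS.items():
        found = pattern.search(remainder)
        if found:
            attributes[name] = found.group(1).lower()
            # Remove the matched text so it is not reported in the residual.
            remainder = remainder[:found.start()] + remainder[found.end():]
    attributes["residual"] = " ".join(remainder.split())
    return attributes


if __name__ == "__main__":
    sample = ["Hex Bolt 40mm Black Pack of 10", "hex nut 8mm"]
    print(token_frequencies(sample).most_common(3))
    print(extract_attributes(sample[0]))
```

Frequent tokens surfaced by the profiling step are natural candidates for new extraction patterns or abbreviation mappings, and a large residual suggests that a description still holds attributes worth extracting.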