2 Overview of Data Quality
This chapter provides an overview of data quality functionality and products for Siebel CRM and Oracle Customer Hub. It includes the following topics:
Data Profiling
Data profiling capabilities are typically delivered in an application designed to give control of data quality processes to business information owners, such as data analysts and data stewards. Data profiling also provides data analysis, reporting, and monitoring capabilities.
When data quality is measured, it can be effectively managed. Data profiling provides the metrics and reports that business information owners need to continuously measure, monitor, track, and improve data quality at multiple points across the organization. Data profiling also enables business information owners and IT (information technology) to work together to deploy lasting data quality programs. Business information owners use data profiling to build data quality rules and define data quality targets together with the IT team, which then manages deployment enterprise-wide.
You can use data profiling to:
Analyze and rank data according to completeness, conformity, consistency, duplication, integrity, and accuracy. You must use rules and reference data to analyze and rank data.
Identify, categorize, and quantify low-quality data
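As a rough illustration of how two of these dimensions might be quantified, the following Python sketch computes per-field completeness and a duplication rate over a small set of records. The record layout, field names, and function names are hypothetical; they are not part of any Oracle data profiling product, which relies on configurable rules and reference data instead.

```python
from collections import Counter

def completeness(records, field):
    """Fraction of records whose value for `field` is non-empty."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if str(r.get(field) or "").strip())
    return filled / len(records)

def duplication_rate(records, fields):
    """Fraction of records that repeat another record on the given fields."""
    if not records:
        return 0.0
    # Normalize case and whitespace so trivially different values compare equal.
    keys = Counter(
        tuple(str(r.get(f) or "").strip().lower() for f in fields)
        for r in records
    )
    duplicates = sum(count - 1 for count in keys.values())
    return duplicates / len(records)

# Hypothetical sample data, for illustration only.
contacts = [
    {"name": "Ann Lee", "city": "San Mateo", "zip": "94401"},
    {"name": "ann lee", "city": "San Mateo", "zip": ""},
    {"name": "Raj Patel", "city": "", "zip": "10001"},
    {"name": "Mia Chen", "city": "Austin", "zip": "73301"},
]

zip_completeness = completeness(contacts, "zip")      # 3 of 4 records have a zip
name_dup_rate = duplication_rate(contacts, ["name"])  # 1 of 4 records repeats a name
```

Metrics of this kind can be tracked over time to check whether data quality targets are being met.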
For more information about data profiling and Oracle data profiling offerings, see Oracle Fusion Middleware Upgrade Guide for Oracle Data Integrator 11g Release 1 on Oracle Technology Network (http://www.oracle.com/technetwork/indexes/documentation/index.html).
Data Parsing and Standardization
Data parsing and standardization typically provides data standardization capabilities, enabling data analysts and data stewards to standardize and validate their customer data. It usually includes an interface that you can use to design, build, and manage data quality efforts.
The solution offers data parsing and standardization capabilities that can be used to:
Standardize, validate, enhance, and enrich your customer data
Standardize and validate mailing addresses for a wide range of countries
Parse and standardize freeform text data elements. You must use rules and reference data dictionaries to parse and standardize freeform text data elements.
For more information about data parsing and standardization and Oracle offerings within the data parsing and standardization arena, see Oracle Fusion Middleware Upgrade Guide for Oracle Data Integrator 11g Release 1 on Oracle Technology Network (http://www.oracle.com/technetwork/indexes/documentation/index.html).
Data Matching and Data Cleansing
The data stored in account, contact, and prospect records in Oracle’s Siebel Business Applications represents your existing and potential customers. Because of the importance of this data, maintaining its quality is essential. To ensure data quality, functionality is provided to clean this data and to remove duplicated data.
Data Cleansing
Data cleansing is used to correct data and make data consistent in new or modified customer records and typically consists of the following functions:
Automatic population of fields in addresses. If a user enters valid values for Zip Code, City, and Country, data quality automatically supplies a State field value. Likewise, if a user enters valid values for City, State, and Country, data quality automatically supplies a Zip Code value.
Address correction. Data quality stores street address, city, state, and postal code information in a uniform and consistent format, as mandated by U.S. postal requirements. For recognized U.S. addresses, address correction provides ZIP+4 data correction and stores the data in certified U.S. Postal Service format. For example, 100 South Main Street, San Mateo, CA 94401 becomes 100 S. Main St., San Mateo, CA 94401-3256.
Capitalization. Based on configuration, data quality converts fields for account, contact, prospect, and address to mixed case, all lowercase, or all uppercase.
Standardization. Data quality ensures account, contact, and prospect information is stored in a uniform and consistent format. For example, IBM Corporation becomes IBM Corp.
Data cleansing is supported for the Account, Business Address, Contact, and List Mgmt Prospective Contact business components. For each business component, particular fields are used in data cleansing and this set of fields is configurable.
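The capitalization and standardization functions described above can be sketched in Python as follows. This is a minimal illustration only: the abbreviation table and the function names are hypothetical, and a real cleansing engine is driven by vendor rules and reference data rather than a hard-coded dictionary.

```python
# Hypothetical abbreviation table; a real cleansing engine uses vendor reference data.
ABBREVIATIONS = {
    "corporation": "Corp.",
    "incorporated": "Inc.",
    "street": "St.",
    "south": "S.",
}

def apply_case(value, mode):
    """Convert a field value to mixed, lower, or upper case."""
    if mode == "mixed":
        return " ".join(word.capitalize() for word in value.split())
    if mode == "lower":
        return value.lower()
    if mode == "upper":
        return value.upper()
    raise ValueError(f"unknown case mode: {mode}")

def standardize(value):
    """Replace known long-form words with their standard abbreviation."""
    return " ".join(ABBREVIATIONS.get(word.lower(), word) for word in value.split())

standardized_name = standardize("IBM Corporation")
standardized_street = standardize("100 South Main Street")
mixed_city = apply_case("SAN MATEO", "mixed")
```

The sketch does not attempt ZIP+4 correction or automatic field population, because both depend on postal reference data.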
Data Matching
Data matching is the identification of potential duplicates for account, contact, and prospect records. Potential duplicate records are displayed in the Siebel application allowing you to manually merge duplicate records into a single record.
Data matching is supported for the Account, Contact, and List Mgmt Prospective Contact business components. For each business component, a set of fields is used for comparisons in the data matching process. The set of fields is configurable, and you can also specify other matching preferences such as the degree of matching required for records to be identified as potential duplicates.
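To illustrate the idea of a configurable field set and a required degree of matching, the following Python sketch scores a new record against existing records and returns those that meet a threshold. It uses the standard library class difflib.SequenceMatcher as a stand-in similarity measure. The scoring scheme, function names, and sample data are assumptions made for this example and do not reflect how any particular matching engine computes its scores.

```python
from difflib import SequenceMatcher

def field_score(a, b):
    """Similarity of two field values as an integer from 0 to 100."""
    ratio = SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()
    return round(ratio * 100)

def record_score(rec_a, rec_b, fields):
    """Average similarity across the configured comparison fields."""
    if not fields:
        return 0.0
    scores = [field_score(rec_a.get(f, ""), rec_b.get(f, "")) for f in fields]
    return sum(scores) / len(scores)

def potential_duplicates(new_record, candidates, fields, threshold):
    """Return the candidates whose score meets or exceeds the threshold."""
    return [c for c in candidates if record_score(new_record, c, fields) >= threshold]

# Hypothetical sample data, for illustration only.
new_account = {"name": "Acme Corp", "city": "San Mateo"}
existing_accounts = [
    {"name": "ACME Corp", "city": "San Mateo"},
    {"name": "Zenith Ltd", "city": "Boston"},
]

matches = potential_duplicates(new_account, existing_accounts, ["name", "city"], 90)
```

Raising the threshold reduces the number of records flagged for manual review, at the cost of missing looser matches.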
In data quality you can enable and use both data cleansing and data matching at the same time, or you can use data cleansing and data matching on their own.
Data Quality Products for Data Matching and Data Cleansing
The data quality products available for performing data quality functions within Siebel CRM enterprise and Oracle Customer Hub are divided into two categories:
Data quality products that are embedded into Siebel CRM enterprise and Oracle Customer Hub
Data quality products that use an open connector to connect to third-party data quality vendors
Embedded Data Quality Products
The data quality products that are embedded into Siebel CRM and Oracle Customer Hub for data matching and cleansing are:
Oracle Data Quality Matching Server. Provides real-time and batch data matching functionality using licensed third-party Informatica Identity Resolution (IIR) software. For more information, see Oracle Data Quality Matching Server.
Oracle Data Quality Address Validation Server. Provides address validation and standardization functionality using licensed third-party Informatica Identity Resolution (IIR) software. For more information, see Oracle Data Quality Address Validation Server.
Oracle Enterprise Data Quality Matching Server. Provides real-time and batch data matching functionality using licensed Oracle Enterprise Data Quality Matching Server. For more information, see Oracle Enterprise Data Quality Matching Server.
Oracle Enterprise Data Quality Address Validation Server. Provides address validation and standardization functionality using licensed Oracle Enterprise Data Quality Address Validation Server. For more information, see Oracle Enterprise Data Quality Address Validation Server.
Open Connector to Third-Party Data Quality Vendors
The Universal Connector provides real-time and batch data matching functionality and data cleansing functionality, as long as the associated third-party software also supports data cleansing.
If you use a third-party data quality vendor for data matching, then Siebel Data Quality is mandatory, because Siebel Data Quality provides the underlying infrastructure for enabling data quality. Integration between the Siebel application and the third-party data quality vendor is not possible without Siebel Data Quality.
Siebel Data Quality is a user-based license that contains the underlying infrastructure and business services for enabling data quality. All Siebel CRM data quality users must license data quality at the user level using Siebel Data Quality.
Related Topic
Oracle Data Quality Matching Server
The Oracle Data Quality Matching Server provides real-time and batch data matching functionality using licensed third-party Informatica Identity Resolution (IIR) software.
The Oracle Data Quality Matching Server is an identity search application that searches your identity data, finds duplicates in it, and matches any duplicates found to other identity data. Running as an application server or suite of servers, Oracle Data Quality Matching Server does the following:
Reads identity data from your databases, using specified instructions and permissions.
Does not change your data but instead keeps a copy of it, thereby ensuring data consistency.
Builds the SSA_NAME3 fuzzy indexes, thereby enabling the correct identity data to be found.
Provides several simple search client procedures, including single search, batch search, and duplicate finder.
About Using the Oracle Data Quality Matching Server
You can use the Oracle Data Quality Matching Server to do the following:
Perform real-time search for people, companies, contacts, addresses, and households.
Discover duplicates and establish relationships in real time.
Build relationship link tables.
Match external files and databases.
The Oracle Data Quality Matching Server connector uses the Universal Connector in a mode where match candidate acquisition takes place within the Oracle Data Quality Matching Server, not within Siebel CRM. Because the match keys are generated and stored within the Oracle Data Quality Matching Server, key generation and key refresh operations are eliminated within Siebel CRM. This integration, whereby match candidate acquisition takes place within the Oracle Data Quality Matching Server, cannot be used by other third-party data quality matching engines.
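The general idea of key-based match candidate acquisition can be sketched in Python as follows. The key function here is a deliberately simple toy and does not represent the SSA-NAME3 algorithm, which is proprietary; the class, method names, and record identifiers are likewise hypothetical. The sketch only shows why storing keys in an index on the matching side removes the need for the calling application to generate or refresh them.

```python
from collections import defaultdict

def match_key(name):
    """Toy key: first letter plus following consonants, truncated to 4 characters."""
    letters = [ch for ch in name.upper() if ch.isalpha()]
    if not letters:
        return ""
    head = letters[0]
    tail = [ch for ch in letters[1:] if ch not in "AEIOU"]
    return (head + "".join(tail))[:4]

class KeyIndex:
    """Hypothetical index that keeps match keys on the matching-server side."""

    def __init__(self):
        self._by_key = defaultdict(set)

    def add(self, record_id, name):
        """Generate and store the key for a record."""
        self._by_key[match_key(name)].add(record_id)

    def candidates(self, name):
        """Return the IDs of stored records that share a key with `name`."""
        return set(self._by_key.get(match_key(name), set()))

# Hypothetical record IDs and names, for illustration only.
index = KeyIndex()
index.add("1-ABC", "Acme Corp")
index.add("1-DEF", "Zenith Ltd")

found = index.candidates("ACME Corporation")
```

In a real deployment, the candidates returned by such a lookup would then be scored by the matching engine to decide which of them are potential duplicates.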
For more information about Oracle Data Quality Matching Server installation and configuration, see Process of Installing the Oracle Data Quality Matching Server and Configuring Oracle Data Quality Matching Server.
For more information about IIR, see the relevant documentation included in Siebel Business Applications Third-Party Bookshelf in the product media pack on Oracle Software Delivery Cloud.
Oracle Data Quality Address Validation Server
The Oracle Data Quality Address Validation Server is an address standardization application that provides capabilities to parse, standardize, transliterate, duplicate, and validate address data, resulting in improved address data quality. The validation capability requires the licensing of appropriate postal directories for the countries where address validation is required.
The Oracle Data Quality Address Validation Server uses a licensed version of the third-party software, IIR, for data cleansing.
Features of Oracle Data Quality Address Validation Server are:
Integrated single API supporting all countries:
Oracle Data Quality Address Validation Server lets you use a single API for all countries, so that you can start working immediately and add countries without the need for additional programming. The API is compatible with all major programming languages.
Advanced validation and correction of worldwide postal addresses, including address coverage for more than 240 countries:
Oracle Data Quality Address Validation Server matches and corrects all address data, filters out superfluous information, assesses deliverability, and generates a detailed report with suggestions for possible sources of address problems.
Parsing and standardization:
Oracle Data Quality Address Validation Server parses both structured and unstructured data, identifies residues, and formats and standardizes the data (without the need for payment of special data license fees).
Convenient updating:
Postal reference tables in many countries change frequently. Oracle Data Quality Address Validation Server has arrangements with many local postal organizations (including Informatica Address Doctor) that allow you to receive monthly, quarterly, or biannual updates. Reference tables for each country are provided in a separate, operating system-independent database that is easy to update from a CD, DVD, or by downloading over the Internet.
The Universal Connector is integrated with the Oracle Data Quality Address Validation Server for data cleansing.
About Using the Oracle Data Quality Address Validation Server
You can use the Oracle Data Quality Address Validation Server to cleanse account, contact, and prospect data from the UI in your Siebel application, or by running a batch job in Siebel CRM. You can also cleanse the data in EAI mode by sending the address data in Simple Object Access Protocol (SOAP) format.
When you enter a new address using the contact or account screen in your Siebel application, all address data is validated, cleansed, and standardized before being committed to the Siebel database. If the address cannot be validated, then the address is standardized to upper, lower, or camel case, depending on the Oracle Data Quality Address Validation Server configuration. In addition, the account name, contact name, and other attributes are standardized.
When new contacts, accounts, or addresses are entered into Siebel CRM through a batch job, address standardization is applied before committing any records to the Siebel database.
In all cases:
The Oracle Data Quality Address Validation Server evaluates and modifies the record according to configuration.
Oracle Data Quality Address Validation Server returns an address validation flag and the validation status.
The Siebel database is then updated with the cleansed data, which has been formatted and standardized with address validation.
In the Siebel application, the updated cleansed record is displayed on the UI.
For more information about Oracle Data Quality Address Validation Server installation and configuration, see Process of Installing the Oracle Data Quality Address Validation Server and Configuring Siebel Business Applications for the Oracle Data Quality Address Validation Server.
For more information about IIR, see the relevant documentation included in Siebel Business Applications Third-Party Bookshelf in the product media pack on Oracle Software Delivery Cloud.
Oracle Enterprise Data Quality Matching Server
Oracle Enterprise Data Quality Match and Merge provides matching capabilities that allow you to identify matching records and optionally link or merge matched records based on survivorship rules. Flexible rule configuration enables you to tune the rules to suit the task and support an iterative approach. A separate capability for simple reviews allows you to expose the match results for review without accessing the underlying rules configuration. Used in conjunction with other Oracle Data Quality products, Oracle Enterprise Data Quality Match and Merge is a flexible solution. Oracle Enterprise Data Quality Match and Merge also includes a connector that enables you to access data in Siebel CRM. Audit capabilities allow you to run data quality rules and flow control within your data quality processes. Dashboard functionality presents the results of the audit processes in graphical format, while real-time Web service functionality enables the assembled data quality process to be called as a real-time service.
Oracle Enterprise Data Quality Address Validation Server
The Oracle Enterprise Data Quality Address Validation Server is an address standardization application that provides capabilities to parse, standardize, transliterate, duplicate, and validate address data, resulting in improved address data quality. The validation capability requires the licensing of appropriate postal directories for the countries where address validation is required. The Oracle Enterprise Data Quality Address Validation Server uses a licensed version of the third-party software, Loqate, for address cleansing.
Universal Connector
The Universal Connector is a connector to third-party software that allows Siebel CRM to use the capabilities of a third-party application for data matching, data cleansing, or both data matching and data cleansing on account, contact, and prospect data within the Siebel application.
The Universal Connector supports data cleansing on account, contact, and prospect data in real-time and batch processing modes. The Universal Connector works across various languages and operating systems, though the support offered by particular third-party software for data matching or data cleansing might not cover all of the languages supported by Siebel Business Applications. For more information about:
Platforms supported, see Siebel System Requirements and Supported Platforms on Oracle Technology Network.
Third-party software, see the relevant documentation included in Siebel Business Applications Third-Party Bookshelf in the product media pack on Oracle Software Delivery Cloud.
To use the Universal Connector, you must obtain, license, and install third-party software in addition to obtaining Siebel Data Quality product licensing. The data matching and data cleansing capabilities of the Universal Connector are driven by the capabilities and configuration options of the third-party software.
The Universal Connector can be used in two different modes:
The Oracle Data Quality Matching Server connector uses the Universal Connector in a mode where match candidate acquisition takes place within the Oracle Data Quality Matching Server. This mode applies only to the Oracle Data Quality Matching Server.
Third-party data quality vendors use the Universal Connector in a mode where match candidate acquisition takes place within Siebel CRM.
You can configure the Universal Connector to specify which fields are used for data cleansing and data matching and their mapping to external application field names.
How Data Quality Relates to Other Entities in Siebel Business Applications
The data quality products integrate into the overall Siebel Business Applications environment from Oracle as follows:
In real-time mode, the Universal Connector is called by interactive object managers such as the Call Center object manager.
In batch mode, the Universal Connector is called by the preconfigured server component, Data Quality Manager (DQMgr), either from the Siebel application user interface, or by starting tasks with the Siebel Server Manager command-line interface, the srvrmgr program. For more information, see Siebel System Administration Guide on Siebel Bookshelf.
Note: The Siebel Bookshelf is available on Oracle Technology Network (http://www.oracle.com/technetwork/indexes/documentation/index.html) and Oracle Software Delivery Cloud. It might also be installed locally on your intranet or on a network location.
The Universal Connector obtains account, contact, and prospect field data from the Siebel database using the Deduplication business service for data matching, and the Data Cleansing business service for data cleansing. Like other business services, these are reusable modules containing a set of methods. Using data quality functionality, business services simplify the task of moving data and converting data formats between the Siebel application and external applications. The business services can also be accessed by Siebel VB or Siebel eScript code or directly from a workflow process.
The fields used in data cleansing and data matching are sent to the appropriate cleansing or matching engine.
Data matching and data cleansing can also be enabled for the Enterprise Application Integration (EAI) adapter and Oracle’s Siebel Universal Customer Master (UCM) products.
For more information about business services and enabling data quality when using EAI, see Integration Platform Technologies: Siebel Enterprise Application Integration.
