Bookshelf v7.8.2: Importing Syndicated Data into Siebel Data Warehouse

Siebel Analytics Applications Installation and Administration Guide > Siebel Data Warehouse for Life Sciences Data Considerations >

Importing Syndicated Data into Siebel Data Warehouse

Pharmaceutical companies purchase weekly and monthly sales and prescription data, known as syndicated data, from third-party vendors such as IMS, NDC, and Cegedim. Syndicated data vendors acquire data from drug wholesalers and retailers on a daily, weekly, and monthly basis and compile a master file of customers (wholesalers, pharmacies, hospitals, and doctors) and sales or prescription transactions for customers. Measures include indirect sales, indirect units, and prescriptions and differ by vendor and periodicity. It is used for sales force analysis reporting and customer targeting.

The data is derived from panels of physicians, pharmacies, and so on, and projected nationally. Since the panels may change on a monthly basis, syndicated data suppliers tend to change their projections of sources on a monthly basis leading to full restatements of historical data. Thus, pharmaceutical companies are required to refresh fully the data in their data warehouses. In addition, weekly data requires incremental loading.

After it is delivered by the vendor, the syndicated data must be fully reloaded into the Siebel Pharma data warehouse in a timely manner and made available to users in order for them to make use of sales force analysis reporting.

This section includes the following topics:

Syndicated Loading Definitions

Table 64 provides terms and definitions related to syndicated loading.

Table 64. Syndicated Loading Definitions
Term	Definition
Syndicated data	Third-party data from vendors that shows sales and prescription results for client defined markets.
Brick	Micro sales geographic area defined by vendor that contains prescription and sales trends for clients' product groupings or markets. Bricks do not exist in the US where the micro sales geographic area is commonly the postal code or zip code.
NRx	Abbreviation of new prescriptions. A new prescription is defined as dispensed prescriptions given a new number by the pharmacy, not necessarily new therapy for the patient.
TRx	Abbreviation of total prescriptions. TRx = NRx + Refills. After the first time a prescription is filled, when it is refilled, the pharmacy refers back to the previous drug ID number and notes this as a refill.
Indirect sales	Total drug wholesalers product sales values to drug retailers (pharmacies) by brick or zip code. Sales values are calculated as units multiplied by client-selected price, where the price is the warehouse acquisition cost unless the client has chosen to have a different price applied to its direct sales or government depot sales.
Indirect units	Total drug wholesalers product sales units to drug retailers (pharmacies) by brick or zip code. Sales unit quantity is determined by multiplying package factor to obtain unit value (for example, one carton x 24 bottles = 24 units).

Data Types Supported in the Siebel Data Warehouse

The Siebel Data Warehouse supports multiple data types defined by data source type and periodicity, as described in Table 65.

Table 65. Data Types Supported in Siebel Data Warehouse
Data Type	Description
Prescription data by contact	Monthly NRx and TRx data for client defined markets that include competitor products data by physician.
Indirect sales brick level	Monthly wholesaler sales and unit values data for client defined markets that include competitor products aggregated by brick.
Indirect sales account level	Monthly wholesaler sales and unit values for company products by pharmacy.
Indirect sales zip level	Monthly wholesaler sales and unit values for client defined markets that include competitor products aggregated by postal code.
Direct sales account level	Monthly direct factory sales data and unit values to wholesalers for company's products by wholesaler.
Direct sales brick level	Monthly direct factory sales data and unit values to wholesalers for company's products by brick.
Direct sales zip level	Monthly direct factory sales data and unit values to wholesalers for company's products by postal code.
Weekly prescription data	Weekly NRx and TRx data for client defined markets that include competitor products by physician.
Plan level prescription data	Monthly prescription data by managed care plan for client defined markets that includes competitor products.
Sales market	Incentives. Monthly incentive compensation data sourced from internal incentive compensation systems and loaded as a new data type in the Siebel Data Warehouse.
Sales market	Modified. Monthly incentive compensation data sourced from internal incentive compensation systems and loaded as a new data type in the Siebel Data Warehouse.
Plan level physician prescription data	Monthly prescription data for physicians associated with a managed care plan.
Prescription data by zip code	Monthly prescription data for client defined markets that includes competitor products aggregated by postal code.
Prescription data by brick	Monthly prescription data for client defined markets that include competitor products aggregated by brick.

Loading Syndicated Data into the Siebel Data Warehouse

The options for loading syndicated data into the Siebel Data Warehouse include the following:

Loading the Siebel Data Warehouse from the Siebel transactional database (OLTP) S_SYND_DATA table.
Customers who are already using the EIM process or SQL Loader to load external syndicated data into the Siebel operational system can use data populated in S_SYND_DATA as the source to the Siebel Data Warehouse. Existing syndicated data from the Siebel transactional database, stored in the S_SYND_DATA table, is extracted, transformed, and loaded by prebuilt routines and populated into the W_SYND_DATA_F and W_SYND_MKT_F fact tables in the Siebel Data Warehouse. W_SYND_DATA_F stores Rx or sales measures by product. W_SYND_MKT_F stores Rx or sales measures by market and allows calculation of market share data.

The data is first copied from external data tapes (or other provided media) into the interface tables for the Siebel Industry Application (SIA), using a native database data loading utility, such as SQL* Loader, and following the rules in Table 66. Using EIM, data is transferred from the Siebel interface tables to predefined destination columns in the base tables of the Siebel Life Sciences database. The application administrator uses SQL*Loader to populate the ID field with a unique sequential value, a process that provides either full or partial table level extraction.

In order to correctly load the Siebel Data Warehouse, there is a set of population rules that are required to load S_SYND_DATA. This will guarantee minimal configuration required on the Analytics side and correct population of the Siebel Data Warehouse.

Once the data is loaded into S_SYND_DATA, you can use the LS SYNDICATED FACT GROUP, in the Pharma subject area, to load the syndicated data into the Siebel Data Warehouse.
Loading from a flat file source using syndicated data staging tables, cross-referencing against extension columns in Siebel base tables.
Syndicated data source files can be loaded directly into Siebel Data Warehouse staging tables. In the staging tables, Siebel row IDs are matched with external sources keys for joining the external source keys with Siebel Analytics keys.

You can cross-reference between Siebel row IDs and the external source keys by using existing cross-reference columns in S_CONTACT and S_ORG_EXT for customer data. For example, the IMS ID for a physician may be populated on the ME number column. The configured ETL will perform a lookup against the new column to resolve the IMS ID to Siebel row ID relationship. The process entails first populating the necessary cross-references in the Siebel base tables. The next step is to configure the SDE ETL to extract from external source data files. The prebuilt SIL ETL performs a lookup against the cross-reference to associate the syndicated data record with the correct Siebel row ID and creates the fact record and warehouse ID with the correct relationship.
Loading product data into the Siebel Data Warehouse by creating a new staging table that looks like the source data and then creating the market aggregation from the product table.
This method requires the most customization. It involves creating a new staging table with columns for each time period metric and configuring the SDE and SIL mappings to load the data directly into the Data Warehouse. You then need to create a mapping to populate the staging table and a new SIL mapping that will use the staging table as a source and the W_SYND_DATA_F table as a target. The W_SYND_MKT_F table can then be loaded from W_SYND_DATA_F.

NOTE: To make sure the data is correctly loaded in the Siebel Data Warehouse, the Siebel source table S_SYND_DATA needs to be populated correctly, based on Table 66.

Table 66. S_SYND_DATA Data Population Rules
Required Codes	Data Types	Required fields (x) and data population rules for correct rollup in Siebel Analytics
Syndicated Data mandatory codes to be populated in DATA_SRC_CD	Data Type supported with DATA_SRC_CODE	Territory ID TERR_ID	Period ID PERIOD_ID	Position ID POSITION_ID	Postal Code ZIPCODE	Account ID OU_EXT_ID	Contact ID CON_ID	Brick ID AREA_ID	Payor Level PLAN_ID Foreign Key to S_INS_PLAN	Product Level PRDINT_ID (always try to populate data at the lowest level - form strength)	Market Level MARKET_ID	Attrbute 1 (Attrib_01)	Attrbute 1 (Attrib_02)	Attrbute 1 (Attrib_03)	Attrbute 1 (Attrib_04)	Siebel Pharma Analytics Business Model Data Source Codes
RXPrf	Prescription data by Contact		x	x			x		Total	Form Strength	x	NRx	TRx	Market NRx	Market TRx	1
SlsIndBrk	Indirect Sales Brick Level		x	x				x	Total	Form Strength	x	Product Sales $	Product Sales Units	Market Sales $	Market Sales Units	22
SlsIndAct	Indirect Sales Account Level		x	x		x			Total	Form Strength	x	Product Sales $	Product Sales Units	Market Sales $	Market Sales Units	21
SlsIndZip	Indirect Sales Zip Level		x	x	x				Total	Form Strength	x	Product Sales $	Product Sales Units	Market Sales $	Market Sales Units	23
SlsDirAct	Direct Sales Account Level		x	x		x			Total	Form Strength	x	Product Sales $	Product Sales Units			11
SlsDirBrk	Direct Sales Brick Level		x	x				x	Total	Form Strength	x	Product Sales $	Product Sales Units			12
SlsDirZip	Direct Sales Zip Level		x	x	x				Total	Form Strength	x	Product Sales $	Product Sales Units			13
RXEVM	Weekly RX Data by Contact		x	x			x		Total	Form Strength	x	Product NRx	Product TRx	Market NRx	Market TRx	0
RXPT	Plan Level Rx Data		x	x					Plan	Form Strength	x	Product NRx	Product TRx	Market NRx	Market TRx	6
RXSMI	Sales Market - Incentives		x	x	x				Total	Form Strength	x	Product NRx	Product TRx	Market NRx	Market TRx	3
RXSMM	Sales Market - Modified		x	x	x				Total	Form Strength	x	Product NRx	Product TRx	Market NRx	Market TRx	4
RXXPT	Plan Level Physician Rx Data		x	x			x		Plan	Form Strength	x	Product NRx	Product TRx	Market NRx	Market TRx	5
RXZip	Prescription data by Zipcode		x	x	x				Total	Form Strength	x	Product NRx	Product TRx	Market NRx	Market TRx	7
RXBrk	Prescription data by Brick		x	x				x	Total	Form Strength	x	Product NRx	Product TRx	Market NRx	Market TRx	8

W_SYND_DATA_F Data Population Rules for Populating TRx Data

This section provides information about various columns in the W_SYND_DATA_F table related to populating TRx data.

INTEGRATION_ID
- Data type is varchar(30).
- Value can be any unique number or combination of values coming in from the external file source. If the data is not already unique, the EXPTRANS transformation will manipulate the value to make it unique.
- Typically, contains the row ID of the source table record in the Siebel transactional database.
DATASOURCE_NUM_ID
- ETL run specific (a numeric value that indicates the source for the data).
- Used in standard Siebel mappings to ensure each record carries a value indicating its source.
- Used in source qualifiers to ensure that lookups on dimension tables when populating fact tables are referencing data that was loaded from a uniform data source.
- Data source number 1 is reserved for the Siebel transactional database. The external source should use a number other than 1.
- Combination of INTEGRATION_ID and DATASOURCE_NUM. Must be a unique value (mandated by a unique index on both columns).
  This means that the system will not accept the same record from the same source with identical unique IDs.
DATA_SRC_CD
Standard Siebel mapping transforms an incoming character data source code value into a numeric range between 1 and 8, using the following formula:

DECODE(DATA_SRC_CD

RXPrf',1,
RXEVM',0,
RXSMI',3,
RXSMM',4,
RXXPT',5,
RXPT',6,
RXZip',7,
RXBrk',8

The resulting numeric value is used to populate the DATA_TYPE column in the fact table, which is used in the Analytics business models to logically partition data when running different types of reports against the W_SYND_DATA table.
PREFIX
The following transformation is performed on DATA_SRC_CD to populate the PREFIX column:

DECODE(DATA_SRC_CD,

RXPrf','Prf',
RXEVM','EVM',
RXSMI','SMI',
RXSMM','SMM',
RXXPT','XPT',
RXPT','PT',
RXZip','Zip',
RXBrk','Brk')

The resulting three-character code is used to prefix the ROW_WID of the W_SYND_DATA table. Instead of 1, the first record is populated with Prf1 when loading prescriptions by professional. This is done in order to facilitate parallel load streams in Informatica so that ROW_WIDs of records for different data types do not conflict with each other.
PERIOD_TYPE_WID
- Stores a foreign key in the W_LOV_D table for the record corresponding to the period for the data.
- PERIOD_CD comes from the S_SYND_DATA as Month or Week, and so on.
- The standard mapping uses the code to do a lookup against the W_LOV_D table to secure the ROW_WID of that period type and inserts it into the W_SYND_DATA_F table PERIOD_TYPE_WID column.
OWNER_POSTN_WID
- W_SYND_DATA maps each record to a Siebel position using OWNER_POSTN_ID. Typically, when the source of the ETL is S_SYND_DATA in the transactional database, the field is already populated, but this is not the case when the source is the syndicated data file.
- The W_ALIGNMT_DH table in the data warehouse contains information that maps Siebel positions to zip codes, bricks, and accounts. ALIGN_ITEM_TYPE contains literal type values for this data, such as zip code, brick or account.
- Since the alignment process could assign a territory to multiple zip codes, the W_ALIGNMT_DH table could contain multiple records for the same zip code to position and territory combinations. In such cases, the ALIGN_TYPE_FLG column will designate whether the relationship between the territory and the zip code is primary (P), mirror (M), share (S), and so on.
- The Source qualifier in the custom mapping that populates W_SYND_DATA_F then retrieves the correct data warehouse position ID based on the following join criteria:
SELECT
SYND.IMS_NUM,
SYND.CLIENT_NUM,
SYND.START_DT,
SYND.IMS_SPEC_CD,
SYND.PROD_GRP,
SYND.ME_NUM,
SYND.PR_CITY,
SYND.PR_COUNTRY,
SYND.ZIPCODE,
ALIGN.LVL1ANC_POSTN_ID as OWNER_POSTN_ID,
PER.ROW_WID as MAP_CON_ROW_WID,
PROD_MKT.ROW_WID as MAP_MKT_ROW_WID,
PROD.ROW_WID as PROD_ROW_WID,
PROD.RX_AVG_PRICE,
POSTN.ROW_WID as POSTN_ROW_WID,
POSTN.INTEGRATION_ID as OWNER_POSTN_ID,
SYND.TRX01,
SYND.TRX02,
SYND.TRX03,
SYND.TRX04,
SYND.TRX05,
SYND.TRX06,
SYND.TRX07,
SYND.TRX08,
SYND.TRX09,
SYND.TRX10,
SYND.TRX11,
SYND.TRX12,
SYND.TRX13,
ALIGN.ALIGN_WID,
ALIGN.ALIGN_TYPE_FLG,
PER.INTEGRATION_ID,
PROD_MKT.PROD_NAME,
PROD.PROD_NAME
from WC_SYND_DATA_FS as SYND,

W_ALIGNMT_DH as ALIGN,
W_PERSON_D as PER,
W_PRODUCT_D as PROD_MKT,
W_PRODUCT_D as PROD,
W_POSITION_D as POST

where
(SYND.ZIPCODE = ALIGN.ALIGN_ITEM and ALIGN.ALIGN_ITEM_TYPE = 'Zipcode' and ALIGN.ALIGN_TYPE_FLG = 'P') and
SYND.ME_NUM = PER.CSN and
SYND.PHYNTYPE = PROD_MKT.PROD_NAME and
SYND.PROD_GRP = PROD.PROD_NAME and
ALIGN.LVL1ANC_POSTN_ID = POSTN.INTEGRATION_ID
GEO_WID
- In the Siebel Data Warehouse, the GEO_WID is a foreign key to the W_GEO_D dimension, which contains a geographical hierarchy based on zip code as the lowest level of detail. It is preconfigured to allow users to roll up a TRx measure to levels such as city, state or province, county, country, and continent.
- In the standard Siebel mappings that populate W_SYND_DATA_F, the following three attributes are brought directly from the S_SYND_DATA table: city, country, and zip code.
- A complex lookup is then performed utilizing two possible logic paths:
  - MPLT_GEO_WID_ZIPCODE (returns IN_ZIPCODE_GEO_WID): Takes in the zip code value, and a literal NO_DUP_FLG = 'N' ports do a lookup against W_GEO_D for any record where the ZIPCODE column matches the incoming zip code value and the DUP_ZIPCODE column = 'N'. (Sample data in the W_GEO_D table contains multiple city records per zip code, only one of which is marked as DUP_ZIPCODE = 'N'). The mapping also ensures that even with multiple matches only the first record is retrieved, and with no matches, the data comes back with the appropriate Unspecified code.
  - EXP_CITY_COUNTRY_ZIP_CHKNULL & MPLT_GEO_WID1 (returns IN_GEO_WID): Uses the combination of city, country, and zip code to retrieve the first available record from W_GEO_D even if duplicates exists and regardless of the NO_DUP_FLG.
- Once the two values are brought into the EXPTRANS transformation, the following formula is used:
IIF(DATA_SRC_CD = 'RXZip' OR DATA_SRC_CD = 'RXBrk' OR DATA_SRC_CD = 'SlsDirBrk' OR DATA_SRC_CD = 'SlsDirZip' OR DATA_SRC_CD = 'SlsIndBrk' OR DATA_SRC_CD = 'SlsIndZip',IN_ZIPCODE_GEO_WID,IN_GEO_WID)

Siebel Analytics Applications Installation and Administration Guide