Siebel Analytics Applications Installation and Administration Guide > Siebel Data Warehouse for Life Sciences Data Considerations >
Importing Syndicated Data into Siebel Data Warehouse
Pharmaceutical companies purchase weekly and monthly sales and prescription data, known as syndicated data, from third-party vendors such as IMS, NDC, and Cegedim. Syndicated data vendors acquire data from drug wholesalers and retailers on a daily, weekly, and monthly basis and compile a master file of customers (wholesalers, pharmacies, hospitals, and doctors) and sales or prescription transactions for customers. Measures include indirect sales, indirect units, and prescriptions and differ by vendor and periodicity. It is used for sales force analysis reporting and customer targeting. The data is derived from panels of physicians, pharmacies, and so on, and projected nationally. Since the panels may change on a monthly basis, syndicated data suppliers tend to change their projections of sources on a monthly basis leading to full restatements of historical data. Thus, pharmaceutical companies are required to refresh fully the data in their data warehouses. In addition, weekly data requires incremental loading. After it is delivered by the vendor, the syndicated data must be fully reloaded into the Siebel Pharma data warehouse in a timely manner and made available to users in order for them to make use of sales force analysis reporting. This section includes the following topics:
Syndicated Loading Definitions
Table 64 provides terms and definitions related to syndicated loading.
Table 64. Syndicated Loading Definitions
|
|
Syndicated data |
Third-party data from vendors that shows sales and prescription results for client defined markets. |
Brick |
Micro sales geographic area defined by vendor that contains prescription and sales trends for clients' product groupings or markets. Bricks do not exist in the US where the micro sales geographic area is commonly the postal code or zip code. |
NRx |
Abbreviation of new prescriptions. A new prescription is defined as dispensed prescriptions given a new number by the pharmacy, not necessarily new therapy for the patient. |
TRx |
Abbreviation of total prescriptions. TRx = NRx + Refills. After the first time a prescription is filled, when it is refilled, the pharmacy refers back to the previous drug ID number and notes this as a refill. |
Indirect sales |
Total drug wholesalers product sales values to drug retailers (pharmacies) by brick or zip code. Sales values are calculated as units multiplied by client-selected price, where the price is the warehouse acquisition cost unless the client has chosen to have a different price applied to its direct sales or government depot sales. |
Indirect units |
Total drug wholesalers product sales units to drug retailers (pharmacies) by brick or zip code. Sales unit quantity is determined by multiplying package factor to obtain unit value (for example, one carton x 24 bottles = 24 units). |
Data Types Supported in the Siebel Data Warehouse
The Siebel Data Warehouse supports multiple data types defined by data source type and periodicity, as described in Table 65.
Table 65. Data Types Supported in Siebel Data Warehouse
|
|
Prescription data by contact |
Monthly NRx and TRx data for client defined markets that include competitor products data by physician. |
Indirect sales brick level |
Monthly wholesaler sales and unit values data for client defined markets that include competitor products aggregated by brick. |
Indirect sales account level |
Monthly wholesaler sales and unit values for company products by pharmacy. |
Indirect sales zip level |
Monthly wholesaler sales and unit values for client defined markets that include competitor products aggregated by postal code. |
Direct sales account level |
Monthly direct factory sales data and unit values to wholesalers for company's products by wholesaler. |
Direct sales brick level |
Monthly direct factory sales data and unit values to wholesalers for company's products by brick. |
Direct sales zip level |
Monthly direct factory sales data and unit values to wholesalers for company's products by postal code. |
Weekly prescription data |
Weekly NRx and TRx data for client defined markets that include competitor products by physician. |
Plan level prescription data |
Monthly prescription data by managed care plan for client defined markets that includes competitor products. |
Sales market |
Incentives. Monthly incentive compensation data sourced from internal incentive compensation systems and loaded as a new data type in the Siebel Data Warehouse. |
Sales market |
Modified. Monthly incentive compensation data sourced from internal incentive compensation systems and loaded as a new data type in the Siebel Data Warehouse. |
Plan level physician prescription data |
Monthly prescription data for physicians associated with a managed care plan. |
Prescription data by zip code |
Monthly prescription data for client defined markets that includes competitor products aggregated by postal code. |
Prescription data by brick |
Monthly prescription data for client defined markets that include competitor products aggregated by brick. |
Loading Syndicated Data into the Siebel Data Warehouse
The options for loading syndicated data into the Siebel Data Warehouse include the following: - Loading the Siebel Data Warehouse from the Siebel transactional database (OLTP) S_SYND_DATA table.
Customers who are already using the EIM process or SQL Loader to load external syndicated data into the Siebel operational system can use data populated in S_SYND_DATA as the source to the Siebel Data Warehouse. Existing syndicated data from the Siebel transactional database, stored in the S_SYND_DATA table, is extracted, transformed, and loaded by prebuilt routines and populated into the W_SYND_DATA_F and W_SYND_MKT_F fact tables in the Siebel Data Warehouse. W_SYND_DATA_F stores Rx or sales measures by product. W_SYND_MKT_F stores Rx or sales measures by market and allows calculation of market share data.
The data is first copied from external data tapes (or other provided media) into the interface tables for the Siebel Industry Application (SIA), using a native database data loading utility, such as SQL* Loader, and following the rules in Table 66. Using EIM, data is transferred from the Siebel interface tables to predefined destination columns in the base tables of the Siebel Life Sciences database. The application administrator uses SQL*Loader to populate the ID field with a unique sequential value, a process that provides either full or partial table level extraction.
In order to correctly load the Siebel Data Warehouse, there is a set of population rules that are required to load S_SYND_DATA. This will guarantee minimal configuration required on the Analytics side and correct population of the Siebel Data Warehouse.
Once the data is loaded into S_SYND_DATA, you can use the LS SYNDICATED FACT GROUP, in the Pharma subject area, to load the syndicated data into the Siebel Data Warehouse.
- Loading from a flat file source using syndicated data staging tables, cross-referencing against extension columns in Siebel base tables.
Syndicated data source files can be loaded directly into Siebel Data Warehouse staging tables. In the staging tables, Siebel row IDs are matched with external sources keys for joining the external source keys with Siebel Analytics keys.
You can cross-reference between Siebel row IDs and the external source keys by using existing cross-reference columns in S_CONTACT and S_ORG_EXT for customer data. For example, the IMS ID for a physician may be populated on the ME number column. The configured ETL will perform a lookup against the new column to resolve the IMS ID to Siebel row ID relationship. The process entails first populating the necessary cross-references in the Siebel base tables. The next step is to configure the SDE ETL to extract from external source data files. The prebuilt SIL ETL performs a lookup against the cross-reference to associate the syndicated data record with the correct Siebel row ID and creates the fact record and warehouse ID with the correct relationship.
- Loading product data into the Siebel Data Warehouse by creating a new staging table that looks like the source data and then creating the market aggregation from the product table.
This method requires the most customization. It involves creating a new staging table with columns for each time period metric and configuring the SDE and SIL mappings to load the data directly into the Data Warehouse. You then need to create a mapping to populate the staging table and a new SIL mapping that will use the staging table as a source and the W_SYND_DATA_F table as a target. The W_SYND_MKT_F table can then be loaded from W_SYND_DATA_F.
NOTE: To make sure the data is correctly loaded in the Siebel Data Warehouse, the Siebel source table S_SYND_DATA needs to be populated correctly, based on Table 66.
Table 66. S_SYND_DATA Data Population Rules
|
|
Required fields (x) and data population rules for correct rollup in Siebel Analytics |
Syndicated Data mandatory codes to be populated in DATA_SRC_CD |
Data Type supported with DATA_SRC_CODE |
|
|
|
|
|
|
|
Payor Level PLAN_ID Foreign Key to S_INS_PLAN |
Product Level PRDINT_ID (always try to populate data at the lowest level - form strength) |
|
|
|
|
|
Siebel Pharma Analytics Business Model Data Source Codes |
RXPrf |
Prescription data by Contact |
|
x |
x |
|
|
x |
|
Total |
Form Strength |
x |
NRx |
TRx |
Market NRx |
Market TRx |
1 |
SlsIndBrk |
Indirect Sales Brick Level |
|
x |
x |
|
|
|
x |
Total |
Form Strength |
x |
Product Sales $ |
Product Sales Units |
Market Sales $ |
Market Sales Units |
22 |
SlsIndAct |
Indirect Sales Account Level |
|
x |
x |
|
x |
|
|
Total |
Form Strength |
x |
Product Sales $ |
Product Sales Units |
Market Sales $ |
Market Sales Units |
21 |
SlsIndZip |
Indirect Sales Zip Level |
|
x |
x |
x |
|
|
|
Total |
Form Strength |
x |
Product Sales $ |
Product Sales Units |
Market Sales $ |
Market Sales Units |
23 |
SlsDirAct |
Direct Sales Account Level |
|
x |
x |
|
x |
|
|
Total |
Form Strength |
x |
Product Sales $ |
Product Sales Units |
|
|
11 |
SlsDirBrk |
Direct Sales Brick Level |
|
x |
x |
|
|
|
x |
Total |
Form Strength |
x |
Product Sales $ |
Product Sales Units |
|
|
12 |
SlsDirZip |
Direct Sales Zip Level |
|
x |
x |
x |
|
|
|
Total |
Form Strength |
x |
Product Sales $ |
Product Sales Units |
|
|
13 |
RXEVM |
Weekly RX Data by Contact |
|
x |
x |
|
|
x |
|
Total |
Form Strength |
x |
Product NRx |
Product TRx |
Market NRx |
Market TRx |
0 |
RXPT |
Plan Level Rx Data |
|
x |
x |
|
|
|
|
Plan |
Form Strength |
x |
Product NRx |
Product TRx |
Market NRx |
Market TRx |
6 |
RXSMI |
Sales Market - Incentives |
|
x |
x |
x |
|
|
|
Total |
Form Strength |
x |
Product NRx |
Product TRx |
Market NRx |
Market TRx |
3 |
RXSMM |
Sales Market - Modified |
|
x |
x |
x |
|
|
|
Total |
Form Strength |
x |
Product NRx |
Product TRx |
Market NRx |
Market TRx |
4 |
RXXPT |
Plan Level Physician Rx Data |
|
x |
x |
|
|
x |
|
Plan |
Form Strength |
x |
Product NRx |
Product TRx |
Market NRx |
Market TRx |
5 |
RXZip |
Prescription data by Zipcode |
|
x |
x |
x |
|
|
|
Total |
Form Strength |
x |
Product NRx |
Product TRx |
Market NRx |
Market TRx |
7 |
RXBrk |
Prescription data by Brick |
|
x |
x |
|
|
|
x |
Total |
Form Strength |
x |
Product NRx |
Product TRx |
Market NRx |
Market TRx |
8 |
W_SYND_DATA_F Data Population Rules for Populating TRx Data
This section provides information about various columns in the W_SYND_DATA_F table related to populating TRx data. - INTEGRATION_ID
- Data type is varchar(30).
- Value can be any unique number or combination of values coming in from the external file source. If the data is not already unique, the EXPTRANS transformation will manipulate the value to make it unique.
- Typically, contains the row ID of the source table record in the Siebel transactional database.
- DATASOURCE_NUM_ID
- DATA_SRC_CD
Standard Siebel mapping transforms an incoming character data source code value into a numeric range between 1 and 8, using the following formula:
DECODE(DATA_SRC_CD
RXPrf',1, RXEVM',0, RXSMI',3, RXSMM',4, RXXPT',5, RXPT',6, RXZip',7, RXBrk',8
The resulting numeric value is used to populate the DATA_TYPE column in the fact table, which is used in the Analytics business models to logically partition data when running different types of reports against the W_SYND_DATA table.
- PREFIX
The following transformation is performed on DATA_SRC_CD to populate the PREFIX column:
DECODE(DATA_SRC_CD,
RXPrf','Prf', RXEVM','EVM', RXSMI','SMI', RXSMM','SMM', RXXPT','XPT', RXPT','PT', RXZip','Zip', RXBrk','Brk')
The resulting three-character code is used to prefix the ROW_WID of the W_SYND_DATA table. Instead of 1, the first record is populated with Prf1 when loading prescriptions by professional. This is done in order to facilitate parallel load streams in Informatica so that ROW_WIDs of records for different data types do not conflict with each other.
- PERIOD_TYPE_WID
- Stores a foreign key in the W_LOV_D table for the record corresponding to the period for the data.
- PERIOD_CD comes from the S_SYND_DATA as Month or Week, and so on.
- The standard mapping uses the code to do a lookup against the W_LOV_D table to secure the ROW_WID of that period type and inserts it into the W_SYND_DATA_F table PERIOD_TYPE_WID column.
- OWNER_POSTN_WID
- W_SYND_DATA maps each record to a Siebel position using OWNER_POSTN_ID. Typically, when the source of the ETL is S_SYND_DATA in the transactional database, the field is already populated, but this is not the case when the source is the syndicated data file.
- The W_ALIGNMT_DH table in the data warehouse contains information that maps Siebel positions to zip codes, bricks, and accounts. ALIGN_ITEM_TYPE contains literal type values for this data, such as zip code, brick or account.
- Since the alignment process could assign a territory to multiple zip codes, the W_ALIGNMT_DH table could contain multiple records for the same zip code to position and territory combinations. In such cases, the ALIGN_TYPE_FLG column will designate whether the relationship between the territory and the zip code is primary (P), mirror (M), share (S), and so on.
- The Source qualifier in the custom mapping that populates W_SYND_DATA_F then retrieves the correct data warehouse position ID based on the following join criteria:
SELECT SYND.IMS_NUM, SYND.CLIENT_NUM, SYND.START_DT, SYND.IMS_SPEC_CD, SYND.PROD_GRP, SYND.ME_NUM, SYND.PR_CITY, SYND.PR_COUNTRY, SYND.ZIPCODE, ALIGN.LVL1ANC_POSTN_ID as OWNER_POSTN_ID, PER.ROW_WID as MAP_CON_ROW_WID, PROD_MKT.ROW_WID as MAP_MKT_ROW_WID, PROD.ROW_WID as PROD_ROW_WID, PROD.RX_AVG_PRICE, POSTN.ROW_WID as POSTN_ROW_WID, POSTN.INTEGRATION_ID as OWNER_POSTN_ID, SYND.TRX01, SYND.TRX02, SYND.TRX03, SYND.TRX04, SYND.TRX05, SYND.TRX06, SYND.TRX07, SYND.TRX08, SYND.TRX09, SYND.TRX10, SYND.TRX11, SYND.TRX12, SYND.TRX13, ALIGN.ALIGN_WID, ALIGN.ALIGN_TYPE_FLG, PER.INTEGRATION_ID, PROD_MKT.PROD_NAME, PROD.PROD_NAME from WC_SYND_DATA_FS as SYND,
W_ALIGNMT_DH as ALIGN, W_PERSON_D as PER, W_PRODUCT_D as PROD_MKT, W_PRODUCT_D as PROD, W_POSITION_D as POST
where (SYND.ZIPCODE = ALIGN.ALIGN_ITEM and ALIGN.ALIGN_ITEM_TYPE = 'Zipcode' and ALIGN.ALIGN_TYPE_FLG = 'P') and SYND.ME_NUM = PER.CSN and SYND.PHYNTYPE = PROD_MKT.PROD_NAME and SYND.PROD_GRP = PROD.PROD_NAME and ALIGN.LVL1ANC_POSTN_ID = POSTN.INTEGRATION_ID
- GEO_WID
- In the Siebel Data Warehouse, the GEO_WID is a foreign key to the W_GEO_D dimension, which contains a geographical hierarchy based on zip code as the lowest level of detail. It is preconfigured to allow users to roll up a TRx measure to levels such as city, state or province, county, country, and continent.
- In the standard Siebel mappings that populate W_SYND_DATA_F, the following three attributes are brought directly from the S_SYND_DATA table: city, country, and zip code.
- A complex lookup is then performed utilizing two possible logic paths:
- MPLT_GEO_WID_ZIPCODE (returns IN_ZIPCODE_GEO_WID): Takes in the zip code value, and a literal NO_DUP_FLG = 'N' ports do a lookup against W_GEO_D for any record where the ZIPCODE column matches the incoming zip code value and the DUP_ZIPCODE column = 'N'. (Sample data in the W_GEO_D table contains multiple city records per zip code, only one of which is marked as DUP_ZIPCODE = 'N'). The mapping also ensures that even with multiple matches only the first record is retrieved, and with no matches, the data comes back with the appropriate Unspecified code.
- EXP_CITY_COUNTRY_ZIP_CHKNULL & MPLT_GEO_WID1 (returns IN_GEO_WID): Uses the combination of city, country, and zip code to retrieve the first available record from W_GEO_D even if duplicates exists and regardless of the NO_DUP_FLG.
- Once the two values are brought into the EXPTRANS transformation, the following formula is used:
IIF(DATA_SRC_CD = 'RXZip' OR DATA_SRC_CD = 'RXBrk' OR DATA_SRC_CD = 'SlsDirBrk' OR DATA_SRC_CD = 'SlsDirZip' OR DATA_SRC_CD = 'SlsIndBrk' OR DATA_SRC_CD = 'SlsIndZip',IN_ZIPCODE_GEO_WID,IN_GEO_WID)
|