10 Oracle Utilities Data Model Data Mining Model

This chapter provides reference information about the data mining models provided with Oracle Utilities Data Model.

This chapter includes the following sections:

About Data Mining in Oracle Utilities Data Model

Oracle Utilities Data Model data mining includes data mining intra-ETL package, data mining core package, source data views, apply data views, target tables, support tables, and setting tables. The source views are defined on source derived tables. These source views are used to train the models. The target tables contain the mining model rules and mining prediction results. Data mining core package builds mining models using data in source views as training data, and applies mining models on the data in apply views. Mining target tables are populated with mining model rules and prediction results. The data in the target tables can be presented in reports.

As shown in Table 10-1, the Oracle Utilities Data Model mining models use the specified algorithms for the specific mining problem.

Table 10-1 Oracle Utilities Data Model Algorithm Used

Model Algorithms Used by Data Mining Model

Model 1: Customer Savings and Customer Profile by DR Program

Decision Tree (DT), Support Vector Machine (SVM)


Understanding the Mining Architecture

Figure 10-1 shows the architecture of data mining in Oracle Utilities Data Model. Oracle Utilities Data Model schema, oudm_sys, includes the following:

  • Mining Model Source Views: Views defined on source derived table, DWD_CUST_DR_PROG_PROFILE. These views are used to train mining models.

  • Mining Model Apply Views: Views defined on source derived tables, DWD_CUST_DR_PROG_PROFILE. These views are used to apply trained mining models.

  • Mining Model Support Tables: Mining algorithm settings for different algorithms used in Oracle Utilities Data Model are stored in the support tables. These support tables start with "DM". Building a mining model creates few tables and views which start with "DM$".

Note:

Do not delete tables and views that start with "DM" and "DM$". Deleting "DM$" tables and views would also delete the trained mining model.
  • Mining Model Target Tables: Mining model target tables used for storing mining model rules and prediction results. Mining model rules are generated from the trained model and predictions results are produced when a trained model is applied on apply data.

  • Mining Model Core Package: This is the core package for Oracle Utilities Data Model data mining. Each mining model has separate procedure in this package. Each procedure builds, tests, and applied mining model. It uses source views as training data, and applies trained model on apply views.

Figure 10-1 Oracle utilities Data Model Mining Packages Tables and Views

Description of Figure 10-1 follows
Description of "Figure 10-1 Oracle utilities Data Model Mining Packages Tables and Views"

Oracle Utilities Data Model Mining Result Tables

Table 10-2 shows the DWR_CUST_SGMNT table.

Table 10-2 DWR_CUST_SGMNT Data Mining Model Details Table

Name Data Type Description

DEMAND_RESPN_PROG_KEY

NUMBER(30)

Surrogate key of demand response program

CUST_SGMNT_KEY

NUMBER (30,0)

Surrogate key for customer segment

CUST_SGMNT_CD

VARCHAR2 (120)

Natural key for customer segment

CUST_SGMNT_MDL_KEY

NUMBER (30,0)

Customer segment model key

PRNT_CUST_SGMNT_KEY

NUMBER (30,0)

Parent segment key

CUST_SGMNT_NAME

VARCHAR2 (400)

Customer segment name

SGMNT_CRTRA_KEY

NUMBER (30,0)

Segment criteria key

CUST_SGMNT_DSCR

VARCHAR2 (1000)

Customer segment description

IS_LEAF_IND

CHAR (1)

Indicates whether the node is a leaf indicator. The prediction of lead node is the final prediction

TREE_LVL

NUMBER (4,0)

Tree level

SGMNT_DISPRSN

VARCHAR2 (500)

Segment dispersion, which is also known as intra cluster distance

SPPRTG_REC_CNT

NUMBER (16,0)

Supporting record count - number of customers in the segment

STAT_CD

VARCHAR2 (120)

Status code of the record

EFF_FROM_DT

DATE

Effective from date

EFF_TO_DT

DATE

Effective to date


Table 10-2 shows the DWR_CUST_SGMNT_DTL table.

Table 10-3 DWR_CUST_SGMNT_DTL Data Mining Model Details Table

Name Data Type Description

DEMAND_RESPN_PROG_KEY

NUMBER(30)

Surrogate key of demand response program

SGMNT_ID

NUMBER

Customer segment Identifier

ATTRIBUTE_NAME

VARCHAR2(4000)

Name of customer attribute

MEAN

NUMBER

Mean of the customer attribute if the attribute is numeric

MODE_VALUE

VARCHAR2(4000)

Model of the customer attribute if the attribute is categorical


Model 1: Customer Savings and Customer Profile by DR Program

Customers can reduce their usage by participating in a Demand Response (DR) program. The utility company needs to let customers know the savings the customer can obtain by participating in the demand response program. For this purpose, there is the need to segment whole customer population in two steps.

For each segment in STEP1, determine the average actual usage of customers not participating in DR program during the time of DR program instance; let us call it Average actual usage of non participants. In the same segment, for each participant customer determine delta (Average actual usage of non participants - customer actual usage) as a percentage of Average actual usage of non participants during the time of DR program instance.

%saving = 100*(Average actual usage of non participants - customer actual usage)/Average actual usage of non participants

"%saving" measure gives the energy saved by customer during the time of DR program instance due to participation.

STEP1: Segmentation Using Oracle Data Mining Clustering Algorithm

In this segmentation, the complete customer population is segmented into a predefined number of segments using customers' demographic attributes. By default, Oracle Data Mining chooses K-Means algorithm as the clustering function and 10 as number of segments. The default settings can be overridden using a setting a table.

Table 10-4 shows the structure of source table, which is used as training data.

Table 10-4 DWD_CUST_DR_PROG_PROFILE

Attribute Name Description Column Name Source Table Mapping

Customer Key

Customer Identifier

CUST_KEY

DWR_CUST

CUST_KEY

Demand Response Program Key

DR program Identifier

DEMAND_RESPN_PROG_KEY

   

DEMOGRAPHIC ATTRIBUTES

       

Customer Kind

Kind of Customer

CUST_KIND_CD

DWR_CUST

CUST_KIND_CD

Customer Type

Type of customer. For example: Industrial, Commercial, Residential

CUST_TYP_CD

DWR_CUST

CUST_TYP_CD

Dwelling Type

Facility Type

DWLNG_TYP

DWR_CUST

DWLNG_TYP

Dwelling Status

Facility Status

DWLNG_STAT

DWR_CUST

DWLNG_STAT

Dwelling Size

Facility Size

DWLNG_SZ

DWR_CUST

DWLNG_SZ

Dwelling Tenure

Facility Tenure

DWLNG_TENR

DWR_CUST

DWLNG_TENR

Income Group

Income Group

INCM_GRP

DWR_HH

INCM_GRP

Number of Children

Number of children in the household

NBR_OF_CHLDRN

DWR_HH

NBR_OF_CHLDRN

Number of Teens

Number of teens in the household

NBR_OF_TEENS

DWR_HH

NBR_OF_TEENS

Number of Adults

Number of adults in the household

NBR_OF_ADLTS

DWR_HH

NBR_OF_ADLTS

Number of Seniors

Number of seniors in the household

NBR_OF_SNRS

DWR_HH

NBR_OF_SNRS

Number of Persons

Number of persons in the household

NBR_OF_PRSN

DWR_HH

NBR_OF_PRSN

Number of Earners

Number of earners in the household

NBR_OF_ERNR

DWR_HH

NBR_OF_ERNR

Business Legal Status

The legal status of the company. For example: Public, Private, and so on.

BSNS_LEGAL_STAT_CD

DWR_CUST

BSNS_LEGAL_STAT_CD

Customer Revenue Band

Customer Revenue Band

CUST_RVN_BND_CD

DWR_CUST

CUST_RVN_BND_CD (Derive it from payment and cost)

Nationality

Nationality

NTNLTY_CD

DWR_CUST

NTNLTY_CD

Education

Education Qualification

EDU_CD

DWR_CUST

EDU_CD

Marital Status

Marital Status

MRTL_STAT_CD

DWR_CUST

MRTL_STAT_CD

Gender

Gender

GNDR_CD

DWR_CUST

GNDR_CD

Job Position

Job Position

JB_POSN

DWR_CUST

JB_POSN

Annual Revenue

Annual Revenue

ANNUAL_RVN

DWR_CUST

ANNUAL_RVN (o for residential customers)

Annual Sales

Annual Sales

ANNUAL_SL

DWR_CUST

ANNUAL_SL (o for residential customers)

Equity Amount

Equity Amount

EQTY_AMT

DWR_CUST

EQTY_AMT (o for residential customers)

City

City

CITY

DWR_CUST

CITY

State

State

STATE

DWR_CUST

STATE

Country

Country

COUNTRY

DWR_CUST

COUNTRY

Ethnic Background

Ethnic Background

ETHNIC_BCKGRND

DWR_CUST

ETHNIC_BCKGRND

Source of Income

Source of Income

SRC_OF_INCM

DWR_CUST

SRC_OF_INCM

Special Need

Special Need

SPL_NEED

DWR_CUST

SPL_NEED

Economically Active Indicator

Economically Active Indicator

ECNMCLY_ACTV_IND

DWR_CUST

ECNMCLY_ACTV_IND

Domestic Indicator

Domestic company Indicator

DMSTC_IND

DWR_CUST

DMSTC_IND

Mail Allowed Indicator

Mail Allowed Indicator

MAIL_ALWD_IND

DWR_CUST

MAIL_ALWD_IND

Third Party Marketing Allowed Indicator

Third Party Marketing Allowed Indicator

THIRD_PRTY_MKTG_ALWD_IND

DWR_CUST

THIRD_PRTY_MKTG_ALWD_IND

Customer Payment Responsible Indicator

Customer Payment Responsible Indicator

CUST_PYMT_RESPBL_IND

DWR_CUST

CUST_PYMT_RESPBL_IND

VIP

VIP Flag

VIP

DWR_CUST

VIP

TARGET/OUTPUT ATTRIBUTES

       

Customer Segment Code

Customer Segment Code

CUST_SGMNT_CD

   

TECHNICAL QUALITY

       

Creation Date

Date when this record is created

CREATED_DT

SYSDATE

 

Created By

User who created this record

CREATED_BY

USER

 

Updated Date

Date when this record updated

UPDATE_DT

SYSDATE

 

Updated By

User who updated this record

UPDATE_BY

USER

 

Effective from Date

Date from when this record is effective

EFF_FROM_DT

SYSDATE

 

Effective to Date

Date until when the record is effective

EFF_TO_DT

NULL

 

Current Indicator

Whether this record is current or not. 'Y' - Yes, 'N' - No

CURR_IND

'Y'

 

Status Code

Status of this record. 'A' - Active, 'I' - Inactive

STAT_CD

'A'

 

Algorithms Used

Following algorithm is used to segment customers:

  • K-Means

Algorithm Setting Table

DM_STNG_CUST_PROFILE is the setting table used for STEP1 segmentation using Oracle Data Mining segmentation algorithm. Following table shows the different settings used:

Table 10-5 shows setting values for STEP1 segmentation.

Table 10-5 Setting Values for STEP1 Segmentation

SETTING_NAME SETTING_VALUE

DBMS_DATA_MINING.ALGO_NAME

ALGO_KMEANS

DBMS_DATA_MINING.KMNS_DISTANCE

KMNS_EUCLIDEAN

DBMS_DATA_MINING.KMNS_ITERATIONS

4

DBMS_DATA_MINING.CLUS_NUM_CLUSTERS

5


STEP2 Segmentation and Customer Saving Calculation

As mentioned earlier, each STEP1 segment can have both DR program participant customers and non-participant customers. Each STEP1 segment is further segmented into two segments using customer participation indicator; that is, one sub-segment for participant customers and the other sub-segment for non-participant customers.

For each STEP1 segment:

  • Calculate the average of actual usage of non-participant customers during the DR program, let us call it Average actual usage of non participants

  • For each participant customer, compute %saving as follows:

%saving =

Description of dm_calc1.png follows
Description of the illustration dm_calc1.png

For each participant customer, PERCNT_CUST_SAVNG column in mining target table DWD_CUST_DR_PROG_PROFILE is updated with the calculated %saving.

Oracle Utilities Data Model Mining Setting Tables

Use the algorithm settings tables to override default values of different settings of mining algorithms.

The following two setting tables have the same structure:

  • DM_STNG_USER_ALL

  • DM_STNG_PROFILE_KMEANS

Table 10-6 shows the structure of the two setting tables.

Table 10-6 Data Mining Setting Table

Name Data Type Description

SETTING_NAME

VARCHAR2(500)

Setting Name

SETTING_VALUE

VARCHAR2(500)

Setting Value