33.2 Example: Predicting Likely Candidates for a Sales Promotion

This example shows PREDICTION query to target customers in Brazil for a special promotion that offers coupons and an affinity card.

The query uses data on marital status, education, and income to predict the customers who are most likely to take advantage of the incentives. The query applies a Decision Tree model called dt_sh_clas_sample to score the customer data. The model is created by the oml4sql-classification-decision-tree.sql example.

Example 33-1 Predict Best Candidates for an Affinity Card

SELECT cust_id
  FROM mining_data_apply_v
  WHERE
      PREDICTION(dt_sh_clas_sample 
                   USING cust_marital_status, education, cust_income_level ) = 1
  AND country_name IN 'Brazil';

  

The output is as follows:


CUST_ID
----------
    100404
    100607
    101113

The same query, but with a bias to favor false positives over false negatives, is shown here.

SELECT cust_id
  FROM mining_data_apply_v
  WHERE
      PREDICTION(dt_sh_clas_sample COST MODEL
                   USING cust_marital_status, education, cust_income_level ) = 1
  AND country_name IN 'Brazil';

The output is as follows:


CUST_ID
----------
    100139
    100163
    100275
    100404
    100607
    101113
    101170
    101463

The COST MODEL keywords cause the cost matrix associated with the model to be used in making the prediction. The cost matrix, stored in a table called dt_sh_sample_costs, specifies that a false negative is eight times more costly than a false positive. Overlooking a likely candidate for the promotion is far more costly than including an unlikely candidate.

SELECT * FROM dt_sh_sample_cost;

The output is as follows:


ACTUAL_TARGET_VALUE PREDICTED_TARGET_VALUE       COST
------------------- ---------------------- ----------
                  0                      0          0
                  0                      1          1
                  1                      0          8
                  1                      1          0