XGBoost Feature Constraints
Feature interaction constraints allow users to specify which variables can and cannot interact. By focusing on key interactions and eliminating noise, they can improve predictive performance, which in turn may lead to predictions that generalize better.
Feature interaction constraints are expressed as groups of features that are allowed to interact. Variables that appear together along a traversal path in a decision tree interact with one another, because the condition at a child node depends on the condition at its parent node. These additional controls on model fitting benefit users who understand the modeling task well and can apply domain knowledge. With these constraints, Oracle Machine Learning for SQL supports more of the available XGBoost capabilities.
Monotonic constraints allow you to impose monotonicity constraints on the features in your boosted model. In many circumstances there is a strong prior belief that the true relationship is constrained in some way. This may be due to business considerations (only certain feature interactions are of interest) or to the nature of the scientific question under investigation. A common form of constraint is that some features have a monotonic relationship with the predicted response. In these situations, monotonic constraints may be used to improve the model's predictive performance. For example, let X be the feature vector with features [x1, …, xi, …, xn] and let ƒ(X) be the predicted response. Let X’ be a feature vector that is identical to X except that its i-th feature takes the value xi’. Then ƒ(X) ≤ ƒ(X’) whenever xi ≤ xi’ is an increasing constraint, and ƒ(X) ≥ ƒ(X’) whenever xi ≤ xi’ is a decreasing constraint. These feature constraints are listed in DBMS_DATA_MINING — Algorithm Settings: XGBoost.
The interaction_constraints setting is used to specify the interaction constraints. The following example predicts the customers most likely to respond positively to an affinity card loyalty program.
-----------------------------------------------------------------------
-- Build a Classification Model using Interaction Constraints
-----------------------------------------------------------------------
-- The interaction constraints setting can be used to specify permitted
-- interactions in the model. The constraints must be specified
-- in the form of a nested list, where each inner list is a group of
-- features (column names) that are allowed to interact with each other.
-- For example, assume x0, x1, x2, x3, x4, x5 and x6 are
-- the feature names (column names) of interest.
-- Then setting value [[x0,x1,x2],[x0,x4],[x5,x6]] specifies that:
-- * Features x0, x1 and x2 are allowed to interact with each other.
-- * Features x0 and x4 are allowed to interact with one another.
-- * Features x5 and x6 are allowed to interact with each other
--   but with no other feature.
-- * Because x0 appears in two groups, it may interact with x1, x2 and x4.
-------------------------------------------------------------------------
-- Drop any earlier copy of the model; ignore the error if it does not exist.
BEGIN
  DBMS_DATA_MINING.DROP_MODEL('XGB_CLASS_MODEL_INTERACTIONS');
EXCEPTION
  WHEN OTHERS THEN NULL;
END;
/
DECLARE
  v_setlst DBMS_DATA_MINING.SETTING_LIST;
BEGIN
  -- Algorithm and data preparation settings
  v_setlst('ALGO_NAME') := 'ALGO_XGBOOST';
  v_setlst('PREP_AUTO') := 'ON';
  -- XGBoost training parameters
  v_setlst('max_depth') := '2';
  v_setlst('eta')       := '1';
  v_setlst('num_round') := '100';
  -- Each inner list is a group of columns that are allowed to interact.
  v_setlst('interaction_constraints') :=
    '[[YRS_RESIDENCE, OCCUPATION], ' ||
    '[OCCUPATION, Y_BOX_GAMES], ' ||
    '[BULK_PACK_DISKETTES, BOOKKEEPING_APPLICATION]]';

  DBMS_DATA_MINING.CREATE_MODEL2(
    MODEL_NAME          => 'XGB_CLASS_MODEL_INTERACTIONS',
    MINING_FUNCTION     => 'CLASSIFICATION',
    DATA_QUERY          => 'SELECT * FROM TRAIN_DATA_CLAS',
    SET_LIST            => v_setlst,
    CASE_ID_COLUMN_NAME => 'CUST_ID',
    TARGET_COLUMN_NAME  => 'AFFINITY_CARD');

  DBMS_OUTPUT.PUT_LINE('Created model: XGB_CLASS_MODEL_INTERACTIONS');
END;
/
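After the model is built, it can be useful to confirm that the constraint was recorded and to score new data. The following is a minimal sketch and not part of the original example. It assumes that user-supplied XGBoost settings appear in the USER_MINING_MODEL_SETTINGS view, that a table named TEST_DATA_CLAS (a hypothetical name) has the same columns as TRAIN_DATA_CLAS, and that a positive response is stored as AFFINITY_CARD = 1.

```sql
-- Check the settings stored with the model, including the constraint string.
-- Assumption: the XGBoost settings supplied at build time are listed here.
SELECT SETTING_NAME, SETTING_VALUE
FROM   USER_MINING_MODEL_SETTINGS
WHERE  MODEL_NAME = 'XGB_CLASS_MODEL_INTERACTIONS'
ORDER  BY SETTING_NAME;

-- Score held-out customers and list those most likely to respond.
-- Assumption: TEST_DATA_CLAS is a hypothetical table with the same
-- columns as TRAIN_DATA_CLAS, and class 1 means a positive response.
SELECT CUST_ID,
       PREDICTION(XGB_CLASS_MODEL_INTERACTIONS USING *) AS PREDICTED_CLASS,
       PREDICTION_PROBABILITY(XGB_CLASS_MODEL_INTERACTIONS, 1 USING *) AS PROB_RESPOND
FROM   TEST_DATA_CLAS
ORDER  BY PROB_RESPOND DESC
FETCH  FIRST 10 ROWS ONLY;
```

Building a second model without the interaction_constraints setting and comparing its scores on the same rows is one way to judge whether the constraint helps on your data.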
To view the complete example, see https://github.com/oracle-samples/oracle-db-examples/tree/main/machine-learning/sql/23ai.