RALG_SCORE_FUNCTION
Use the RALG_SCORE_FUNCTION setting to specify an existing registered R script for an R algorithm machine learning model to use for scoring data.
The specified R script defines an R function. The first input argument defines the model object. The second input argument defines the R data.frame that is used for scoring data.
Example 6-7 Example of RALG_SCORE_FUNCTION
The function argument object is the LM model. The argument newdata is a data.frame containing the data to score.
function(object, newdata) {
res <- predict.lm(object, newdata = newdata, se.fit = TRUE);
data.frame(fit=res$fit, se=res$se.fit, df=summary(object)$df[1L])}
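The following is a minimal local sketch of how the score function above behaves when called directly in an R session. It assumes the built-in mtcars data set and a hypothetical model formula (mpg ~ wt + hp) as stand-ins for real training and scoring data; the names score_fun, model, and scored are illustrative only and do not come from the product.

```r
# Same body as the score function in the example above.
score_fun <- function(object, newdata) {
  res <- predict.lm(object, newdata = newdata, se.fit = TRUE)
  data.frame(fit = res$fit, se = res$se.fit, df = summary(object)$df[1L])
}

# Assumption: mtcars stands in for the training and scoring tables.
model <- lm(mpg ~ wt + hp, data = mtcars)

# Score the first five rows; the result has one row per input row.
scored <- score_fun(model, mtcars[1:5, ])
print(scored)
```

Calling the function this way is only a quick check that the returned data.frame has one row per scored row and the three columns fit, se, and df.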
The output of the R function must be a data.frame. Each row represents the prediction for the corresponding row of scoring data in the input data.frame. The columns of the data.frame are specific to the machine learning function, as follows:
Regression: A single numeric column for the predicted target value, with two optional columns containing the standard error of the model fit and the degrees of freedom. The optional columns are needed for the SQL function PREDICTION_BOUNDS to work.
Example 6-8 Example of RALG_SCORE_FUNCTION for Regression
This example shows how to specify the name of the R script MY_LM_PREDICT_SCRIPT that is used to score the model in the model settings table model_setting_table.
Begin
insert into model_setting_table values
(dbms_data_mining.ralg_score_function, 'MY_LM_PREDICT_SCRIPT');
End;
/
MY_LM_PREDICT_SCRIPT is registered as:
function(object, newdata) {data.frame(pre = predict(object, newdata = newdata))}
Classification: Each column represents the predicted probability of one target class. The column name is the target class name.
Example 6-9 Example of RALG_SCORE_FUNCTION for Classification
This example shows how to specify the name of the R script MY_LOGITGLM_PREDICT_SCRIPT that is used to score the logit Classification model in the model settings table model_setting_table.
Begin
insert into model_setting_table values
(dbms_data_mining.ralg_score_function, 'MY_LOGITGLM_PREDICT_SCRIPT');
End;
/
MY_LOGITGLM_PREDICT_SCRIPT is registered as follows. It is a logit Classification model with two target classes, "0" and "1".
'function(object, newdata) {
pred <- predict(object, newdata = newdata, type="response");
res <- data.frame(1-pred, pred);
names(res) <- c("0", "1");
res}'
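As a rough local illustration of the Classification output shape, the sketch below runs the same function body against a logistic regression fitted with glm. The mtcars data set and the formula am ~ wt are assumptions chosen only because they give a two-class target; score_fun, model, and probs are hypothetical names.

```r
# Same body as MY_LOGITGLM_PREDICT_SCRIPT above.
score_fun <- function(object, newdata) {
  pred <- predict(object, newdata = newdata, type = "response")
  res <- data.frame(1 - pred, pred)
  names(res) <- c("0", "1")
  res
}

# Assumption: a logit model on mtcars with binary target am (0 or 1).
model <- glm(am ~ wt, data = mtcars, family = binomial())

# One column per target class; each row holds the class probabilities.
probs <- score_fun(model, mtcars[1:5, ])
print(probs)
```

Each row of the result should sum to 1, because the two columns are the complementary probabilities of class "0" and class "1".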
Clustering: Each column represents the predicted probability of one cluster. The columns are arranged in order of cluster ID. Each cluster is assigned a cluster ID; the IDs are consecutive values starting from 1. To support CLUSTER_DISTANCE in the R model, the output of the R score function includes additional columns, one per cluster, containing the distance to each cluster. These columns follow the predicted probability columns and are also arranged in order of cluster ID.
Example 6-10 Example of RALG_SCORE_FUNCTION for Clustering
This example shows how to specify the name of the R script MY_CLUSTER_PREDICT_SCRIPT that is used to score the model in the model settings table model_setting_table.
Begin
insert into model_setting_table values
(dbms_data_mining.ralg_score_function, 'MY_CLUSTER_PREDICT_SCRIPT');
End;
/
MY_CLUSTER_PREDICT_SCRIPT is registered as:
'function(object, dat){
mod <- object[[1L]]; ce <- object[[2L]]; sc <- object[[3L]];
newdata = scale(dat, center = ce, scale = sc);
centers <- mod$centers;
ss <- sapply(as.data.frame(t(centers)),
function(v) rowSums(scale(newdata, center=v, scale=FALSE)^2));
if (!is.matrix(ss)) ss <- matrix(ss, ncol=length(ss));
disp <- -1 / (2* mod$tot.withinss/length(mod$cluster));
distr <- exp(disp*ss);
prob <- distr / rowSums(distr);
as.data.frame(cbind(prob, sqrt(ss)))}'
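The sketch below exercises the clustering score function locally to show the column layout: probability columns first, then distance columns. It assumes that the model object is a list holding a kmeans fit, the centering vector, and the scaling vector, which is what the function body implies but the text does not state; the iris data set and the names score_fun, object, and out are illustrative.

```r
# Same body as MY_CLUSTER_PREDICT_SCRIPT above.
score_fun <- function(object, dat) {
  mod <- object[[1L]]; ce <- object[[2L]]; sc <- object[[3L]]
  newdata <- scale(dat, center = ce, scale = sc)
  centers <- mod$centers
  # Squared distance from every row to every cluster center.
  ss <- sapply(as.data.frame(t(centers)),
               function(v) rowSums(scale(newdata, center = v, scale = FALSE)^2))
  if (!is.matrix(ss)) ss <- matrix(ss, ncol = length(ss))
  disp <- -1 / (2 * mod$tot.withinss / length(mod$cluster))
  distr <- exp(disp * ss)
  prob <- distr / rowSums(distr)
  as.data.frame(cbind(prob, sqrt(ss)))
}

# Assumption: a 3-cluster kmeans model built on scaled iris measurements.
x <- as.matrix(iris[, 1:4])
xs <- scale(x)
set.seed(1)
mod <- kmeans(xs, centers = 3)
object <- list(mod, attr(xs, "scaled:center"), attr(xs, "scaled:scale"))

# Columns 1-3 are cluster probabilities; columns 4-6 are distances.
out <- score_fun(object, x[1:6, ])
print(out)
```

With three clusters the result has six columns, and the most probable cluster for a row is also the one with the smallest distance.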
Feature Extraction: Each column represents the coefficient value of one feature. The columns are arranged in order of feature ID. Each feature is assigned a feature ID; the IDs are consecutive values starting from 1.
Example 6-11 Example of RALG_SCORE_FUNCTION for Feature Extraction
This example shows how to specify the name of the R script MY_FEATURE_EXTRACTION_SCRIPT that is used to score the model in the model settings table model_setting_table.
Begin
insert into model_setting_table values
(dbms_data_mining.ralg_score_function, 'MY_FEATURE_EXTRACTION_SCRIPT');
End;
/
MY_FEATURE_EXTRACTION_SCRIPT is registered as:
'function(object, dat) { as.data.frame(predict(object, dat)) }'
The function fetches the centers of the features from the R model, and computes the feature coefficient based on the distance of the score data to the corresponding feature center.
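To illustrate the Feature Extraction output shape, the sketch below calls the same function body on a principal components model. The use of prcomp is an assumption made for this illustration, since the text does not name the underlying R model; the iris data set and the names score_fun, model, and features are likewise hypothetical.

```r
# Same body as MY_FEATURE_EXTRACTION_SCRIPT above.
score_fun <- function(object, dat) {
  as.data.frame(predict(object, dat))
}

# Assumption: a PCA model from prcomp stands in for the R feature model.
model <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)

# One column per feature, in feature order (PC1, PC2, ...).
features <- score_fun(model, iris[1:5, 1:4])
print(features)
```

Under this assumption the result has one column per extracted feature and one row per scored row.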