Regression¶

Regression is a type of modeling in which the output is continuous; examples include price, height, sales, and length. These models have their own specific metrics that help to benchmark the model: how close is close enough?

The prevailing metrics for evaluating a regression model are:

• R-squared: Also known as the coefficient of determination, it is the proportion of the variance in the data that is explained by the model, see [Read More].

• Explained variance score: The proportion of the variance in the data that is accounted for by the model’s predictions; it equals one minus the variance of the residuals divided by the variance of the data, see [Read More].

• Mean squared error (MSE): The mean of the squared difference between the true values and predicted values, see [Read More].

• Root mean squared error (RMSE): The square root of the mean squared error, see [Read More].

• Mean absolute error (MAE): The mean of the absolute difference between the true values and predicted values, see [Read More].

• Mean residuals: The mean of the difference between the true values and predicted values, see [Read More].
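The metric definitions above can be sketched from scratch with NumPy. This is a minimal illustration, not how ADSEvaluator computes them internally; the function name regression_metrics and the use of population variance are assumptions made for the sketch.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Compute the regression metrics listed above directly from their definitions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred

    mse = np.mean(residuals ** 2)
    ss_res = np.sum(residuals ** 2)                      # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)       # total sum of squares

    return {
        "r2": 1.0 - ss_res / ss_tot,
        # one minus the variance of the residuals over the variance of the data
        "explained_variance": 1.0 - np.var(residuals) / np.var(y_true),
        "mse": mse,
        "rmse": np.sqrt(mse),
        "mae": np.mean(np.abs(residuals)),
        "mean_residuals": np.mean(residuals),
    }
```

Note that R-squared and the explained variance score coincide only when the mean of the residuals is zero; the mean residuals entry makes that easy to check.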

The prevailing charts and plots for regression are:

• Observed vs. predicted: A plot of the observed, or actual, values against the values predicted by the model.

• Residuals QQ: A quantile-quantile plot of the residuals against the quantiles of a standard normal distribution. It should be close to a straight line for a good model.

• Residuals vs. predicted: A plot of residuals versus predicted values. This should not carry a lot of structure in a good model.

• Residuals vs. observed: A plot of residuals versus observed values. This should not carry a lot of structure in a good model.
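As a rough illustration of what these four charts plot, the following Matplotlib sketch draws them from a pair of arrays. The helper name regression_diagnostics and the 2x2 layout are assumptions for the sketch; ADSEvaluator renders its own versions of these charts.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # off-screen backend so this runs headless
import matplotlib.pyplot as plt
from statistics import NormalDist

def regression_diagnostics(y_true, y_pred):
    """Draw the four diagnostic plots described above on a 2x2 grid."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    fig, axes = plt.subplots(2, 2, figsize=(8, 8))

    # Observed vs. predicted: points should hug the 45-degree line.
    axes[0, 0].scatter(y_pred, y_true, s=10)
    lo, hi = y_true.min(), y_true.max()
    axes[0, 0].plot([lo, hi], [lo, hi])
    axes[0, 0].set(title="Observed vs Predicted", xlabel="predicted", ylabel="observed")

    # Residuals QQ: sorted residuals against standard normal quantiles.
    n = len(residuals)
    theoretical = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    axes[0, 1].scatter(theoretical, np.sort(residuals), s=10)
    axes[0, 1].set(title="Residual Q-Q Plot", xlabel="normal quantiles",
                   ylabel="residual quantiles")

    # Residuals vs. predicted / observed: should look like structureless noise.
    axes[1, 0].scatter(y_pred, residuals, s=10)
    axes[1, 0].axhline(0)
    axes[1, 0].set(title="Residuals vs Predicted", xlabel="predicted", ylabel="residual")
    axes[1, 1].scatter(y_true, residuals, s=10)
    axes[1, 1].axhline(0)
    axes[1, 1].set(title="Residuals vs Observed", xlabel="observed", ylabel="residual")

    fig.tight_layout()
    return fig
```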

This code snippet demonstrates how to generate the above metrics and charts. The data must be split into training and testing sets, with the features in X_train and X_test and the responses in y_train and y_test.

from sklearn.linear_model import Lasso, LinearRegression
from ads.common.data import ADSData
from ads.common.model import ADSModel
from ads.evaluations.evaluator import ADSEvaluator

lin_reg = LinearRegression().fit(X_train, y_train)
lasso_reg = Lasso(alpha=0.1).fit(X_train, y_train)

lin_reg_model = ADSModel.from_estimator(lin_reg)
lasso_reg_model = ADSModel.from_estimator(lasso_reg)

evaluator = ADSEvaluator(ADSData(X_test, y_test), models=[lin_reg_model, lasso_reg_model])

To show all of the metrics in a table, run:

evaluator.metrics


[Image: Evaluator Metrics (repr)]

To show all of the charts, run:

evaluator.show_in_notebook()


[Images: Observed vs Predicted, Residual Q-Q Plot, Residual vs Predicted, Residual vs Observed]

This code snippet demonstrates how to add a custom metric, Number Correct, to the evaluator.

from ads.evaluations.evaluator import ADSEvaluator
evaluator = ADSEvaluator(test, models=[modelA, modelB, modelC, modelD])

def num_correct(y_true, y_pred):
    return sum(y_true == y_pred)

evaluator.add_metrics([num_correct], ["Number Correct"])