Metrics for Spatial Regression

Data analysis is essential to building better machine learning models, particularly for spatial regression. Common tasks include checking for multicollinearity, departures from normality, nonstationarity or heterogeneity, and spatial autocorrelation.

After training a regression model, you can use a variety of statistics-based metrics to assess its results and choose the best spatial model for the task at hand. The following table describes some of these statistics, which are available in the oraclesai.metrics module. Each method receives a spatial regression model as a parameter.

| Metric | Description |
| --- | --- |
| koenker_bassett | Tests for non-constant variance (heteroskedasticity) in the residuals, which can be caused by spatial heteroskedasticity, a particular type of heterogeneity. |
| lm_error | Lagrange multiplier test that indicates whether a regression algorithm that includes the spatial lag of the error term is needed. |
| lm_lag | Lagrange multiplier test that indicates whether a regression algorithm that includes the spatial lag of the target variable is needed. |
| rlm_error | Robust Lagrange multiplier test for the Spatial Error model. |
| rlm_lag | Robust Lagrange multiplier test for the Spatial Lag model. |
| moran_res | Tests the correlation between the residuals and the spatial lag of the residuals. A positive and significant value indicates spatial clustering, where regions with similar values tend to be located together, reflecting the effect of spatial dependence. A negative and significant value indicates spatial dispersion (a checkerboard pattern), where neighboring regions tend to have dissimilar values, reflecting the effect of spatial heterogeneity. |
| log_likelihood | Returns the log-likelihood of the regression model, a measure of model fit. |
| aic | The Akaike Information Criterion (AIC) estimates the relative amount of information lost by the model. When comparing models fitted to the same data, a lower AIC is preferred. |
| jarque_bera | Tests for normality of the residuals of a spatial regression model. |
| vif | The variance inflation factor (VIF) helps detect multicollinearity, which occurs when the explanatory variables of a spatial regression model are correlated with each other. The VIF measures how much the variance of an estimated regression coefficient increases because of collinearity. |
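
To make the moran_res interpretation concrete, the following is a minimal NumPy sketch of the Moran's I statistic applied to a vector of residuals. It is not the oraclesai implementation: the function names morans_i and rook_weights, the 4x4 grid, and the residual patterns are all assumptions made for illustration, and the sketch computes only the statistic, not its significance.

```python
import numpy as np


def rook_weights(rows, cols):
    """Binary rook-contiguity weights for a rows x cols grid (hypothetical helper)."""
    n = rows * cols
    w = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if r + 1 < rows:              # neighbor in the next row
                j = (r + 1) * cols + c
                w[i, j] = w[j, i] = 1.0
            if c + 1 < cols:              # neighbor in the next column
                j = i + 1
                w[i, j] = w[j, i] = 1.0
    return w


def morans_i(residuals, weights):
    """Moran's I: (n / S0) * (e' W e) / (e' e), with e the mean-centered residuals."""
    e = np.asarray(residuals, dtype=float)
    w = np.asarray(weights, dtype=float)
    e = e - e.mean()
    s0 = w.sum()                          # sum of all spatial weights
    return (e.size / s0) * (e @ w @ e) / (e @ e)


w = rook_weights(4, 4)

# Made-up residuals: the top two rows are positive and the bottom two negative.
clustered = np.repeat([1.0, -1.0], 8)

# Made-up residuals that alternate in sign between every pair of neighbors.
checkerboard = np.array([(-1.0) ** (r + c) for r in range(4) for c in range(4)])

print("clustered:   ", morans_i(clustered, w))
print("checkerboard:", morans_i(checkerboard, w))
```

The clustered pattern yields a positive statistic and the checkerboard pattern a negative one, which matches the two cases described for moran_res. A full test would also compare the statistic against its expected value under no spatial autocorrelation.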

In some cases, a feature that is highly collinear with another feature should be removed from the spatial model.
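
As an illustration of what a variance inflation factor measures, the sketch below computes it with plain NumPy by regressing each explanatory variable on the others. It is not the oraclesai.metrics implementation: the function name variance_inflation_factors and the synthetic data are assumptions, and the oraclesai method takes a trained model rather than a feature matrix.

```python
import numpy as np


def variance_inflation_factors(X):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing column j
    on the remaining columns plus an intercept (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = np.empty(k)
    for j in range(k):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])   # add intercept column
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        ss_res = resid @ resid
        ss_tot = ((y - y.mean()) ** 2).sum()
        vifs[j] = ss_tot / ss_res                   # equals 1 / (1 - R^2)
    return vifs


# Synthetic features: x2 is almost a copy of x1, x3 is independent of both.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.1 * rng.normal(size=200)
x3 = rng.normal(size=200)

vifs = variance_inflation_factors(np.column_stack([x1, x2, x3]))
print(vifs)
```

In this synthetic example the two nearly identical features receive large values while the independent feature stays close to 1. A common rule of thumb treats values above roughly 5 to 10 as a sign that a feature may be a candidate for removal, although the appropriate cutoff depends on the application.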

See the oraclesai.metrics module in the Python API Reference for Oracle Spatial AI for more information.