Spatial Pipeline
The SpatialPipeline class shares spatial information
through a pipeline of transformers, other estimators, and a final estimator.
Note that the final estimator step of the pipeline is optional in this case. A typical scenario consists of a preprocessing pipeline that handles tasks such as cleaning the data, filling missing values, and standardizing the data. The preprocessing pipeline then becomes a step in another pipeline that ends with a final estimator, either a regressor or a classifier.
The following table describes the main methods of the
SpatialPipeline class.
| Method | Description |
|---|---|
| fit | Calls the fit method of the pipeline transformers and the final estimator. |
| fit_predict | Calls the fit and transform methods of the pipeline transformer and the fit and predict methods of the final estimator. |
| predict | Calls the transform method of all the transformers in the pipeline and calls the predict method of the final estimator. |
See the SpatialPipeline class in Python API Reference for Oracle Spatial AI for more information.
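The call order described in the table can be illustrated with a small stand-alone sketch. It does not use oraclesai: `ToyPipeline`, `Center`, and `LineRegressor` are hypothetical names invented for this illustration, and the assumption that `fit` also transforms the data between steps is borrowed from the scikit-learn pipeline convention rather than confirmed for SpatialPipeline.

```python
class Center:
    """Toy transformer (hypothetical): subtracts the mean learned in fit."""

    def __init__(self, log):
        self.log = log

    def fit(self, X, y=None):
        self.log.append("Center.fit")
        self.mean_ = sum(X) / len(X)
        return self

    def transform(self, X):
        self.log.append("Center.transform")
        return [x - self.mean_ for x in X]


class LineRegressor:
    """Toy final estimator (hypothetical): one-feature least-squares line."""

    def __init__(self, log):
        self.log = log

    def fit(self, X, y):
        self.log.append("LineRegressor.fit")
        x_mean = sum(X) / len(X)
        y_mean = sum(y) / len(y)
        sxx = sum((x - x_mean) ** 2 for x in X) or 1.0  # guard constant X
        sxy = sum((x - x_mean) * (t - y_mean) for x, t in zip(X, y))
        self.slope_ = sxy / sxx
        self.intercept_ = y_mean - self.slope_ * x_mean
        return self

    def predict(self, X):
        self.log.append("LineRegressor.predict")
        return [self.intercept_ + self.slope_ * x for x in X]


class ToyPipeline:
    """Assumed semantics: every step but the last is a transformer."""

    def __init__(self, steps):
        self.steps = steps  # list of (name, object) pairs

    def _fit_transformers(self, X, y):
        # Fit each transformer, then pass its output to the next step
        for _, step in self.steps[:-1]:
            X = step.fit(X, y).transform(X)
        return X

    def _transform(self, X):
        # Apply the already fitted transformers without refitting them
        for _, step in self.steps[:-1]:
            X = step.transform(X)
        return X

    def fit(self, X, y):
        self.steps[-1][1].fit(self._fit_transformers(X, y), y)
        return self

    def predict(self, X):
        return self.steps[-1][1].predict(self._transform(X))

    def fit_predict(self, X, y):
        Xt = self._fit_transformers(X, y)
        return self.steps[-1][1].fit(Xt, y).predict(Xt)


log = []
pipe = ToyPipeline([("center", Center(log)), ("regressor", LineRegressor(log))])

pipe.fit([1, 2, 3], [10, 20, 30])
fit_calls = list(log)

log.clear()
predictions = pipe.predict([4, 5])
predict_calls = list(log)

print(fit_calls)
print(predict_calls)
print(predictions)
```

The shared `log` list exists only to make the order of method calls visible. A real transformer or estimator would not need it.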
The following example uses the block_groups
SpatialDataFrame and SpatialColumnTransformer to
define a feature-engineering step, which creates new columns representing the spatial
lag of specific columns. Then, the feature-engineering step is added into a
SpatialPipeline, along with a pre-processing step that standardizes
the data and a final estimator consisting of a spatial error regression model.
```python
from oraclesai.pipeline import SpatialColumnTransformer, SpatialPipeline
from oraclesai.weights import KNNWeightsDefinition
from oraclesai.preprocessing import SpatialLagTransformer
from oraclesai.regression import SpatialErrorRegressor
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler

# Select the explanatory variables, the target column (MEDIAN_INCOME), and the geometry
X = block_groups[["MEAN_AGE", "HOUSE_VALUE", "MEDIAN_INCOME", "geometry"]]

# Define spatial weights
weights_definition = KNNWeightsDefinition(k=10)

# Define a Spatial Lag Transformer
spatial_lag_transformer = SpatialLagTransformer(spatial_weights_definition=weights_definition)

# Create an instance of SpatialErrorRegressor
spatial_error_regressor = SpatialErrorRegressor(spatial_weights_definition=weights_definition)

# Use SpatialColumnTransformer to concatenate column subsets
feature_engineering_step = SpatialColumnTransformer([
    ("imputer", SimpleImputer(), ["MEAN_AGE", "HOUSE_VALUE"]),
    ("spatial_lag", spatial_lag_transformer, ["HOUSE_VALUE"])])

# Create a pipeline with three steps: Feature-Engineering, Scaler, Regressor
regression_pipeline = SpatialPipeline([
    ("feature_engineering", feature_engineering_step),
    ("scaler", StandardScaler()),
    ("regressor", spatial_error_regressor)
])

# Train the model
regression_pipeline.fit(X, y="MEDIAN_INCOME")

# Print the score of the training set
print(f"r2_score = {regression_pipeline.score(X, y='MEDIAN_INCOME')}")
```

The output consists of the R-squared metric from the final estimator. The example calls the score method, which runs the transform methods of all the transformers in the pipeline before scoring with the final estimator.
```
r2_score = 0.5559292598577543
```
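To make the spatial lag used in the feature-engineering step more concrete, the following is a minimal sketch of one common definition: the average of a variable over each observation's k nearest neighbors. It is an assumption that this matches what SpatialLagTransformer computes with KNNWeightsDefinition, because the library may weight or standardize neighbors differently. The function `knn_spatial_lag` and the sample coordinates and values are hypothetical and are not part of oraclesai.

```python
import math


def knn_spatial_lag(points, values, k):
    """Mean of `values` over the k nearest neighbors of each point.

    Hypothetical helper for illustration only; equivalent to multiplying
    the values by a row-standardized k-nearest-neighbor weights matrix.
    """
    if not 1 <= k < len(points):
        raise ValueError("k must be between 1 and len(points) - 1")
    lags = []
    for i, p in enumerate(points):
        # Distance from point i to every other point (itself excluded)
        by_distance = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbors = [j for _, j in by_distance[:k]]
        lags.append(sum(values[j] for j in neighbors) / k)
    return lags


# Made-up locations and house values, for illustration only
points = [(0, 0), (1, 0), (2, 0), (10, 0)]
house_value = [100, 200, 300, 400]

house_value_lag = knn_spatial_lag(points, house_value, k=2)
print(house_value_lag)
```

In this sketch the isolated point at (10, 0) still gets a lag value, because k-nearest-neighbor weights always assign exactly k neighbors regardless of distance.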