Spatial Lag Transformer
The spatial lag of a particular feature reflects the average value of that feature in the neighborhood around each observation.
For example, in a given neighborhood, the spatial lag of the house price is the average house price surrounding a specific house or location. Computing the spatial lag is a feature engineering method, and the resulting values can be used directly to train any machine learning model.
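The house price example can be worked through with plain NumPy. The following is a minimal sketch, not the library's implementation: the prices and the adjacency matrix are made-up values, and neighbors are averaged with equal weights (a row-standardized weights matrix).

```python
import numpy as np

# Hypothetical house prices for four locations (illustrative values only).
prices = np.array([200_000.0, 300_000.0, 400_000.0, 500_000.0])

# Assumed binary adjacency: entry (i, j) is 1 when location j is a neighbor of location i.
adjacency = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

# Row-standardize so each row sums to 1; the matrix product is then a neighborhood mean.
weights = adjacency / adjacency.sum(axis=1, keepdims=True)

# Spatial lag: for each location, the average price of its neighbors.
spatial_lag = weights @ prices
print(spatial_lag)
```

Here the first location has two neighbors priced at 300,000 and 400,000, so its spatial lag is 350,000. Its own price of 200,000 does not enter the average.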
The SpatialLagTransformer class calculates the spatial lag of training data and changes the value of an observation to its spatial lag. In other words, it changes an observation's value to the average value of its neighbors.
To create an instance of SpatialLagTransformer, it is necessary to define the spatial_weights_definition parameter, which establishes the relationship between neighboring locations.
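To make the role of a weights definition concrete, the sketch below shows how a k-nearest-neighbors rule could determine which observations count as neighbors and how the lag follows from it. It uses only NumPy. The function name knn_spatial_lag, the coordinates, and the values are hypothetical, and the sketch assumes Euclidean distance with equally weighted neighbors, which may differ from what the library does internally.

```python
import numpy as np

def knn_spatial_lag(coords, values, k):
    """Average `values` over the k nearest neighbors of each point (self excluded)."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    # Pairwise Euclidean distances between all points.
    diffs = coords[:, None, :] - coords[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # A point is not its own neighbor.
    np.fill_diagonal(dists, np.inf)
    # Indices of the k closest points for every observation.
    neighbor_idx = np.argsort(dists, axis=1)[:, :k]
    # Mean of the neighbors' values, one entry (or row) per observation.
    return values[neighbor_idx].mean(axis=1)

# Hypothetical points along a line, with a single feature.
coords = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [10.0, 0.0]]
values = [10.0, 20.0, 30.0, 40.0]

lag = knn_spatial_lag(coords, values, k=2)
print(lag)
```

With k=2, the isolated point at x=10 still receives a lag, taken from its two closest points (values 20 and 30). A distance-band definition would behave differently for such outliers, so the choice of weights definition directly shapes the transformed feature.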
The main methods of the class are described in the following table.
Method | Description |
---|---|
fit | Computes the spatial lag for all the features in the training set. |
transform | Replaces feature values with their spatial lag, depending on the use_fit_lag parameter. If use_fit_lag=True, it calculates the spatial lag from the training set. Otherwise, it computes the spatial lag from the data passed into the transform method. Returns a NumPy array. |
fit_transform | Calls the fit and transform methods in sequence with the training data. |
See the SpatialLagTransformer class in Python API Reference for Oracle Spatial AI for more information.
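The difference between the two use_fit_lag modes can be illustrated with a toy stand-in. The class below is not the oraclesai class and does not reproduce its signatures: ToyLagTransformer, its positional arguments, the 1-D positions, and the default use_fit_lag=False are all assumptions made for illustration. It reflects one reading of the table above, in which use_fit_lag=True returns the lag stored during fit.

```python
import numpy as np

class ToyLagTransformer:
    """Illustrative stand-in (NOT the oraclesai class): 1-D positions, k nearest neighbors."""

    def __init__(self, k):
        self.k = k
        self.fit_lag_ = None

    def _lag(self, positions, values):
        positions = np.asarray(positions, dtype=float)
        values = np.asarray(values, dtype=float)
        # Distances along a line; a point is not its own neighbor.
        dists = np.abs(positions[:, None] - positions[None, :])
        np.fill_diagonal(dists, np.inf)
        idx = np.argsort(dists, axis=1)[:, :self.k]
        return values[idx].mean(axis=1)

    def fit(self, positions, values):
        # Store the spatial lag of the training data.
        self.fit_lag_ = self._lag(positions, values)
        return self

    def transform(self, positions, values, use_fit_lag=False):
        # Assumed semantics: reuse the training lag, or recompute from the given data.
        if use_fit_lag:
            return self.fit_lag_
        return self._lag(positions, values)

    def fit_transform(self, positions, values):
        return self.fit(positions, values).transform(positions, values, use_fit_lag=True)

toy = ToyLagTransformer(k=1)
train_lag = toy.fit_transform([0.0, 1.0, 3.0], [10.0, 20.0, 30.0])
reused = toy.transform([0.0, 2.0, 3.0], [1.0, 2.0, 3.0], use_fit_lag=True)
recomputed = toy.transform([0.0, 2.0, 3.0], [1.0, 2.0, 3.0], use_fit_lag=False)
print(train_lag, reused, recomputed)
```

In this toy version, reused equals the training lag regardless of the data passed in, while recomputed depends only on the new data. Consult the API reference for the actual arguments and defaults of the real class.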
The following example uses the block_groups SpatialDataFrame and the SpatialLagTransformer class to replace the values of the MEAN_AGE and HOUSE_VALUE features with their spatial lag values. Note that the MEDIAN_INCOME feature is ignored since it is defined as the target variable. The geometry feature is used to calculate the spatial lag, but it is not part of the output from the transformer.
from oraclesai.weights import KNNWeightsDefinition
from oraclesai.preprocessing import SpatialLagTransformer
# Define the variables
X = block_groups[["MEDIAN_INCOME", "MEAN_AGE", "HOUSE_VALUE", "geometry"]]
# Print original data
print(f">> Original data:\n {X[['MEAN_AGE', 'HOUSE_VALUE']].get_values()[:5]}")
# Define spatial weights
weights_definition = KNNWeightsDefinition(k=5)
# Create an instance of SpatialLagTransformer
spatial_lag_transformer = SpatialLagTransformer(spatial_weights_definition=weights_definition)
# Compute the spatial lag and print the transformed data
X_spatial_lag = spatial_lag_transformer.fit_transform(X, y="MEDIAN_INCOME", geometries="geometry")
print(f"\n>> Transformed data:\n {X_spatial_lag[:5, :]}")
The resulting output is a NumPy array with the spatial lag of the MEAN_AGE and HOUSE_VALUE features.
>> Original data:
[[4.75847626e+01 4.56300000e+05]
[3.88231812e+01 8.36300000e+05]
[4.78076096e+01 1.12630000e+06]
[4.65636330e+01 9.60400000e+05]
[5.11550865e+01 1.01090000e+06]]
>> Transformed data:
[[4.03809292e+01 6.23460000e+05]
[3.95882790e+01 8.20100000e+05]
[4.69466225e+01 1.22280000e+06]
[4.25439751e+01 1.04664000e+06]
[4.43390564e+01 1.14368000e+06]]