oraclesai.preprocessing.spatial_train_test_split

spatial_train_test_split(X, y=None, geometries=None, test_size=0.3, numpy_result=False, random_state=None) → Tuple

Splits data into training and test subsets. Each subset consists of the explanatory variables X, the geometries, and the target variable y, where X is a two-dimensional array of shape (n_samples, n_features), and geometries and y are one-dimensional arrays of length n_samples.

Parameters:
  • X – An oraclesai.SpatialDataFrame, geopandas.GeoDataFrame, pandas.DataFrame or numpy array. When X is a SpatialDataFrame or a DataFrame, it can also contain the geometry column and the target column y.

  • y – (default=None) The name of the target variable column in X, or a 1-d numpy array.

  • geometries – (default=None) The name of the spatial column in X, or a 1-d numpy array of shapely geometries.

  • test_size – (default=0.3) The proportion of the samples assigned to the test set, given as a value between 0 and 1.

  • numpy_result – (default=False) If True, the returned values are always numpy arrays. If False, the returned types match the types of the input data.

  • random_state – (default=None) The seed for the random number generator. Pass a fixed value to get a reproducible split.

Returns:

A tuple containing X_train, X_test, y_train, y_test, geometries_train, geometries_test.
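
The order of the returned tuple and the role of test_size and random_state can be illustrated with a small stand-in written in plain Python. This is only a sketch of the behavior described above, not the oraclesai implementation: the function name sketch_spatial_split is hypothetical, the shuffle-then-slice strategy and the rounding of the test set size are assumptions, and (x, y) coordinate tuples stand in for shapely geometries.

```python
import random


def sketch_spatial_split(X, y, geometries, test_size=0.3, random_state=None):
    """Hypothetical stand-in for illustration; not the oraclesai implementation."""
    n = len(X)
    if not (len(y) == n and len(geometries) == n):
        raise ValueError("X, y and geometries must have the same number of samples")

    # Shuffle the row indices once so that X, y and geometries are split
    # with the same permutation and stay aligned row by row.
    indices = list(range(n))
    random.Random(random_state).shuffle(indices)

    # Assumption: the test set size is n * test_size rounded to the nearest integer.
    n_test = round(n * test_size)
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    def take(seq, idx):
        return [seq[i] for i in idx]

    # Same ordering as the documented return tuple.
    return (
        take(X, train_idx), take(X, test_idx),
        take(y, train_idx), take(y, test_idx),
        take(geometries, train_idx), take(geometries, test_idx),
    )


# Ten samples with two features each; (x, y) tuples stand in for geometries.
X = [[i, i * 10] for i in range(10)]
y = [i * 100 for i in range(10)]
geometries = [(float(i), float(i)) for i in range(10)]

X_train, X_test, y_train, y_test, g_train, g_test = sketch_spatial_split(
    X, y, geometries, test_size=0.3, random_state=42
)
print(len(X_train), len(X_test))
```

Because a single permutation of the row indices drives all three splits, each row of X_train still lines up with its entries in y_train and g_train, and calling the sketch again with the same random_state reproduces the same split.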