UDFs for OML Embedded Python Execution
Registering UDFs from Python
|
Register Spatial AI UDFs so they can be executed through OML Embedded Python Execution for SQL and REST. |
UDFs
User Defined Functions to be executed through OML Embedded Python Execution for SQL and REST APIs.
Function compute_spatial_weights
Computes the spatial weights for the given spatial table. Stores the spatial weights object in the data store specified by the save_weights_as parameter.
Parameter | Type | Description |
---|---|---|
table | String | The name of a database table. |
weights_def | JSON (Spatial Weights Definition) | Specifies the type of spatial weights to be computed. |
save_weights_as | JSON (Datastore Save Specification) | Specifies how the computed spatial weights will be stored in the datastore. |
spatial_col | String | (Optional) - The name of the spatial column for which the spatial weights will be computed. If the table contains only one spatial column, this value can be omitted. |
crs | String or number | (Optional) - The spatial coordinate system associated with the geometries of the spatial column. It can be specified as an SRID number, an authority string such as EPSG:4326, or a WKT string. |
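As a rough illustration of how the parameters above fit together, the following sketch assembles them into a single JSON payload. The helper name `build_weights_args`, the table name, and the datastore names are hypothetical; only the parameter names and JSON shapes come from this page, and the actual SQL or REST invocation is not shown.

```python
import json


def build_weights_args(table, weights_def, save_weights_as,
                       spatial_col=None, crs=None):
    """Collect the compute_spatial_weights parameters into one dict.

    Hypothetical helper: it mirrors the parameter table only and does
    not call the database.
    """
    args = {
        "table": table,
        "weights_def": weights_def,
        "save_weights_as": save_weights_as,
    }
    # Optional parameters are included only when the caller supplies them.
    if spatial_col is not None:
        args["spatial_col"] = spatial_col
    if crs is not None:
        args["crs"] = crs
    return args


args = build_weights_args(
    table="MY_SPATIAL_TABLE",  # hypothetical table name
    weights_def={"type": "KNN", "k": 5},
    save_weights_as={"ds_name": "datastore1", "obj_name": "my_obj1", "append": True},
    crs="EPSG:4326",
)
payload = json.dumps(args)
print(payload)
```

Leaving `spatial_col` out of the payload matches the case where the table has a single spatial column.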
Function compute_global_spatial_autocorrelation
Computes the Moran’s I index for the given spatial table and column. Returns the following statistics: I, expected I, p value, and z value.
Parameter | Type | Description |
---|---|---|
table | String | The name of a database table. |
column | String | The name of the column for which the spatial autocorrelation is calculated. |
weights | JSON (Datastore Object Location) | (Optional) - Existing spatial weights object, previously calculated for the current spatial table’s geometries. If not specified, weights_def must be provided. |
weights_def | JSON (Spatial Weights Definition) | (Optional) - Specifies the type of spatial weights to be computed. If not specified, weights must be provided. |
save_weights_as | JSON (Datastore Save Specification) | (Optional) - Specifies how the computed spatial weights will be stored in the datastore. It is only used if weights_def is provided. |
spatial_col | String | (Optional) - The name of the spatial column for which the spatial weights will be computed. If the table contains only one spatial column, this value can be omitted. |
crs | String or number | (Optional) - The spatial coordinate system associated with the geometries of the spatial column. It can be specified as an SRID number, an authority string such as EPSG:4326, or a WKT string. |
Returns
FIELD | Type |
---|---|
I | NUMBER |
expected_I | NUMBER |
p_value | NUMBER |
z_value | NUMBER |
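To make the I and expected_I fields concrete, here is a minimal pure-Python sketch of the textbook Moran’s I formula on a tiny dense weights matrix. It is not the UDF’s implementation: the function name `morans_i` and the sample data are illustrative, and the p value and z value are not computed here.

```python
def morans_i(values, weights):
    """Return (I, expected_I) for a list of values and a dense weights matrix.

    weights[i][j] is the spatial weight between observations i and j.
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]              # deviations from the mean
    s0 = sum(sum(row) for row in weights)       # sum of all weights
    cross = sum(weights[i][j] * z[i] * z[j]
                for i in range(n) for j in range(n))
    denom = sum(d * d for d in z)
    i_stat = (n / s0) * (cross / denom)
    expected_i = -1.0 / (n - 1)                 # expectation under no autocorrelation
    return i_stat, expected_i


# Four locations on a line; each one is a neighbor of the adjacent locations.
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
i_stat, expected_i = morans_i([1.0, 2.0, 3.0, 4.0], w)
print(i_stat, expected_i)
```

A value of I above expected_I, as in this steadily increasing sequence, points toward positive spatial autocorrelation.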
Function compute_local_spatial_autocorrelation
Computes the Local Spatial Autocorrelation statistics for all the rows of the given spatial table using Local Moran. Returns a tabular result containing the statistics for each row from the input table. Returned statistics include: I, p value, z value, and quadrant.
Parameter | Type | Description |
---|---|---|
table | String | The name of a database table. |
column | String | The name of the column for which the local spatial autocorrelation is calculated. |
result_table | String | (Optional) - If specified, the result will be stored in this table. |
key_column | String | (Optional) - A column from the input table used to associate rows from the input table with the results of this operation. If not specified, ROWNUM from the table will be used. |
weights | JSON (Datastore Object Location) | (Optional) - Existing spatial weights object, previously calculated for the current spatial table’s geometries. If not specified, weights_def must be provided. |
weights_def | JSON (Spatial Weights Definition) | (Optional) - Specifies the type of spatial weights to be computed. If not specified, weights must be provided. |
save_weights_as | JSON (Datastore Save Specification) | (Optional) - Specifies how the computed spatial weights will be stored in the datastore. It is only used if weights_def is provided. |
spatial_col | String | (Optional) - The name of the spatial column for which the spatial weights will be computed. If the table contains only one spatial column, this value can be omitted. |
crs | String or number | (Optional) - The spatial coordinate system associated with the geometries of the spatial column. It can be specified as an SRID number, an authority string such as EPSG:4326, or a WKT string. |
Returns
A resultset containing the same number of rows as the input table.
FIELD | Type |
---|---|
value of key_column parameter or ‘id’ if no key_column param is provided | Depends on the type of the column referenced by key_column |
local_moran_I | NUMBER |
p_value | NUMBER |
z_value | NUMBER |
quadrant | NUMBER (HOTSPOT/high-high=1, DOUGHNUT/high-low=2, COLDSPOT/low-low=3, DIAMOND/low-high=4) |
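The numeric quadrant codes are easier to read once they are mapped back to their labels. The sketch below post-processes rows shaped like the result set above. The sample rows, the helper name `significant_clusters`, and the 0.05 significance threshold are assumptions made for illustration; only the field names and the code-to-label mapping come from this page.

```python
# Quadrant codes as documented in the Returns table.
QUADRANT_LABELS = {1: "HOTSPOT", 2: "DOUGHNUT", 3: "COLDSPOT", 4: "DIAMOND"}


def significant_clusters(rows, alpha=0.05):
    """Keep rows whose p_value is below alpha and attach a quadrant label."""
    result = []
    for row in rows:
        if row["p_value"] < alpha:
            labelled = dict(row)  # copy so the input rows are left untouched
            labelled["quadrant_label"] = QUADRANT_LABELS[int(row["quadrant"])]
            result.append(labelled)
    return result


# Hypothetical rows, as they might be fetched from the result set.
rows = [
    {"id": 1, "local_moran_I": 1.8, "p_value": 0.01, "z_value": 2.6, "quadrant": 1},
    {"id": 2, "local_moran_I": 0.1, "p_value": 0.40, "z_value": 0.3, "quadrant": 2},
    {"id": 3, "local_moran_I": 1.2, "p_value": 0.03, "z_value": 2.1, "quadrant": 3},
]
hits = significant_clusters(rows)
for hit in hits:
    print(hit["id"], hit["quadrant_label"])
```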
Function create_spatial_lag
Creates a spatial lag for the given column of the provided spatial table. Returns a tabular result which includes the calculated spatial lag for each row from the input table.
Parameter | Type | Description |
---|---|---|
table | String | The name of a database table. |
column | String | The name of the column for which the spatial lag will be calculated. |
result_table | String | (Optional) - If specified, the result will be stored in this table. |
key_column | String | (Optional) - A column from the input table used to associate rows from the input table with the results of this operation. If not specified, ROWNUM from the table will be used. |
weights | JSON (Datastore Object Location) | (Optional) - Existing spatial weights object, previously calculated for the current spatial table’s geometries. If not specified, weights_def must be provided. |
weights_def | JSON (Spatial Weights Definition) | (Optional) - Specifies the type of spatial weights to be computed. If not specified, weights must be provided. |
save_weights_as | JSON (Datastore Save Specification) | (Optional) - Specifies how the computed spatial weights will be stored in the datastore. It is only used if weights_def is provided. |
spatial_col | String | (Optional) - The name of the spatial column for which the spatial weights will be computed. If the table contains only one spatial column, this value can be omitted. |
crs | String or number | (Optional) - The spatial coordinate system associated with the geometries of the spatial column. It can be specified as an SRID number, an authority string such as EPSG:4326, or a WKT string. |
Returns
A resultset containing the spatial lag column, with one value for each row from the input table.
FIELD | Type |
---|---|
value of key_column parameter or ‘id’ if no key_column param is provided | Depends on the type of the column referenced by key_column |
<column>_SLAG (Same name as column input param with the suffix _SLAG) | Depends on the type of the column input param |
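For readers unfamiliar with the term, a spatial lag is commonly defined as the weighted average of a variable over each observation’s neighbors. The sketch below shows that common definition with row-standardized weights in pure Python. Whether the UDF row-standardizes the weights is an assumption here, and the function name and data are illustrative.

```python
def spatial_lag(values, weights):
    """Return the row-standardized spatial lag of values.

    weights[i][j] is the spatial weight between observations i and j.
    """
    lag = []
    for row in weights:
        total = sum(row)
        if total == 0:
            lag.append(0.0)  # isolated observation: no neighbors to average
        else:
            lag.append(sum(w * v for w, v in zip(row, values)) / total)
    return lag


# Three locations on a line; the middle one neighbors both ends.
w = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
lags = spatial_lag([10.0, 20.0, 60.0], w)
print(lags)  # [20.0, 35.0, 20.0]
```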
Function clustering
Performs clustering on the given spatial table, using the given columns, or all the columns of the table if the columns parameter is not provided. Available clustering methods are DBSCAN, Agglomerative, and KMeans.
Parameter | Type | Description |
---|---|---|
table | String | The name of a database table. |
columns | String | The names of the columns to be considered as features by the clustering algorithm. |
method | String | One of the supported clustering algorithms. Possible values: KMEANS, DBSCAN, AGGLOMERATIVE. |
scale | Boolean | (Optional, default=true) - If true, all the values for the feature columns will be scaled. |
result_table | String | (Optional) - If specified, the result will be stored in this table. |
key_column | String | (Optional) - A column from the input table used to associate rows from the input table with the results of this operation. If not specified, ROWNUM from the table will be used. |
weights | JSON (Datastore Object Location) | (Optional) - Existing spatial weights object, previously calculated for the current spatial table’s geometries. If not specified, weights_def must be provided. |
weights_def | JSON (Spatial Weights Definition) | (Optional) - Specifies the type of spatial weights to be computed. If not specified, weights must be provided. |
save_weights_as | JSON (Datastore Save Specification) | (Optional) - Specifies how the computed spatial weights will be stored in the datastore. It is only used if weights_def is provided. |
geometry_as_feature | Boolean | (Optional, default=false) - If true, and no spatial weights or spatial weights definition is provided, the spatial column will be used as a feature for the clustering. |
spatial_col | String | (Optional) - The name of the spatial column for which the spatial weights will be computed when performing regionalization, or which is used as a clustering feature if geometry_as_feature is set to true; otherwise, it is ignored. If the table contains only one spatial column, this value can be omitted. |
crs | String or number | (Optional) - The spatial coordinate system associated with the geometries of the spatial column. It can be specified as an SRID number, an authority string such as EPSG:4326, or a WKT string. |
plot | JSON (Cluster Plotting Parameters) | (Optional) - If provided, the clustering results will be plotted and an image will be returned. |
The following parameters are specific to clustering algorithms.
KMEANS Parameters
Parameter | Type | Description |
---|---|---|
n_clusters | Number | (Optional) - The number of clusters to form as well as the number of centroids to generate. Elbow init method is used if not provided. |
init | String | (Optional, default=k-means++) - Method for cluster initialization. Possible values: k-means++, random. |
n_init | Number | (Optional, default=10) - Number of times k-means will run with different centroid seeds. |
max_iter | Number | (Optional, default=300) - Maximum number of iterations of the k-means algorithm for a single run. |
tol | Float | (Optional, default=1e-4) - Relative tolerance with regard to the Frobenius norm of the difference in the cluster centers of two consecutive iterations to declare convergence. |
random_state | Number | (Optional) - Determines random number generation for centroid initialization. Use an int to make the randomness deterministic. |
algorithm | String | (Optional, default=auto) - K-means algorithm to use. The classical EM-style is “full”. The “elkan” variation is more efficient on data with well-defined clusters, by using the triangle inequality. However, it is more memory intensive due to the allocation of an extra array of shape (n_samples, n_clusters). Possible values: auto, full, elkan. |
init_method | String | (Optional, default=elbow) - Possible values: elbow, silhouette, gmeans. |
DBSCAN Parameters
Parameter | Type | Description |
---|---|---|
eps | Float | (Optional) - The maximum distance between two samples for one to be considered as in the neighborhood of the other. If eps is None, the K-Distance method is used to estimate the best value for eps. |
min_samples | Number | (Optional) - The number of samples in a neighborhood for a point to be considered as a core point. If min_samples is None, it is estimated using the number of features in the data. |
metric | String | (Optional, default=euclidean) - The metric used to calculate the distance between instances in a feature array. Possible values: cityblock, cosine, euclidean, haversine, manhattan. |
algorithm | String | (Optional, default=auto) - The algorithm to be used by the NearestNeighbors module to compute pointwise distances and find nearest neighbors. Possible values: auto, ball_tree, kd_tree, brute. |
leaf_size | Number | (Optional, default=30) - Leaf size passed to BallTree or cKDTree. This can affect the speed of the construction and query, as well as the memory required to store the tree. The optimal value depends on the nature of the problem. |
p | Float | (Optional) - The power of the Minkowski metric to be used to calculate distance between points. If None, then |
init_method | Boolean | (Optional, default=true) - If true, it will use the spatial weight matrix as distance. If false, it will set the distance to all neighbors to zero. |
AGGLOMERATIVE Parameters
Parameter | Type | Description |
---|---|---|
n_clusters | Number | (Optional, default=2) - The number of clusters to form. |
affinity | String | (Optional, default=euclidean) - The metric to use when calculating the distance between observations. Possible values: cityblock, cosine, euclidean, haversine, manhattan. |
linkage | String | (Optional, default=ward) - Determines the distance to use. The algorithm merges the pairs of clusters that minimize this criterion. Possible values: ward (minimizes the variance of the clusters), average (uses the average of the distances of each observation of the two clusters), complete (uses the maximum distance between all observations of the two clusters), single (uses the minimum distance between all observations of the two clusters). |
distance_threshold | Float | (Optional) - The linkage distance threshold. If specified, then n_clusters must not be specified. |
Returns
A resultset containing the label assigned to each row from the input table.
Optionally, a plot image can be returned.
FIELD | Type |
---|---|
value of key_column parameter or ‘id’ if no key_column param is provided | Depends on the type of the column referenced by key_column |
label | Number |
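Because the clustering function takes both common and algorithm-specific parameters, it can help to check them before making a call. The following sketch is one possible client-side check based only on the rules stated in the tables above (supported methods, and distance_threshold excluding n_clusters). The helper name `build_clustering_args` and the table and column names are hypothetical, and no database call is made.

```python
SUPPORTED_METHODS = {"KMEANS", "DBSCAN", "AGGLOMERATIVE"}


def build_clustering_args(table, method, columns=None, **algo_params):
    """Validate and collect clustering parameters into one dict."""
    method = method.upper()
    if method not in SUPPORTED_METHODS:
        raise ValueError(f"Unsupported clustering method: {method}")
    # Documented rule: distance_threshold and n_clusters are mutually exclusive.
    if (method == "AGGLOMERATIVE"
            and "distance_threshold" in algo_params
            and "n_clusters" in algo_params):
        raise ValueError("Specify either n_clusters or distance_threshold, not both")
    args = {"table": table, "method": method}
    if columns is not None:
        args["columns"] = columns
    args.update(algo_params)  # algorithm-specific parameters, passed through as given
    return args


kmeans_args = build_clustering_args(
    "MY_SPATIAL_TABLE",        # hypothetical table name
    "kmeans",
    columns="MEDIAN_INCOME",   # hypothetical column name
    n_clusters=4,
    random_state=42,
)
print(kmeans_args)
```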
JSON Types
The following are JSON types used across REST and SQL functions.
Spatial Weights Definition
Describes the spatial weights to be computed.
Fields:
type: can contain one of the following values: KNN, DistanceBand, Kernel, Queen, Rook.
[swdef_type_fields]: Properties from the equivalent SpatialWeightsDefinition classes. The fields used are the same as the parameters taken by the constructor of the equivalent python classes.
KNN fields: See oraclesai.weights.KNNWeightsDefinition
DistanceBand fields: See oraclesai.weights.DistanceBandWeightsDefinition
Kernel fields: See oraclesai.weights.KernelBasedWeightsDefinition
Queen fields: See oraclesai.weights.QueenWeightsDefinition
Rook fields: See oraclesai.weights.RookWeightsDefinition
Examples:
{
"type": "KNN",
"k": 5
}
{
"type": "DistanceBand",
"threshold": 2000.0
}
{
"type": "Queen"
}
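A definition like the examples above can be checked on the client side before it is sent. The sketch below only verifies that the JSON parses and that type is one of the five documented values. The function name `parse_weights_def` is hypothetical, and the type-specific fields (such as k or threshold) are deliberately not validated because their full list lives in the referenced Python classes.

```python
import json

# Allowed values of the "type" field, as listed in this section.
WEIGHTS_TYPES = {"KNN", "DistanceBand", "Kernel", "Queen", "Rook"}


def parse_weights_def(text):
    """Parse a Spatial Weights Definition and check its type field."""
    spec = json.loads(text)
    if spec.get("type") not in WEIGHTS_TYPES:
        raise ValueError(f"Unknown weights type: {spec.get('type')!r}")
    return spec


knn = parse_weights_def('{"type": "KNN", "k": 5}')
queen = parse_weights_def('{"type": "Queen"}')
print(knn, queen)
```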
Datastore Save Specification
Specifies how an object can be saved in an OML datastore.
Fields:
ds_name: Name of the datastore where the object will be saved
obj_name: The name used to save the object
append: If true, the object is appended to the datastore
overwrite: If true and an object with the same name already exists, the object will be overwritten. Otherwise the operation will fail.
Example:
{
"ds_name": "datastore1",
"obj_name": "my_ob1",
"append": true,
"overwrite_obj": false
}
Datastore Object Location
Specifies the location of an object in a datastore.
Fields:
ds_name: Name of an existing datastore
obj_name: The name of an object in the datastore
Example:
{
"ds_name": "datastore1",
"obj_name": "my_obj1"
}
Cluster Plotting Parameters
Contains parameters used for plotting clustering results. In its simplest form, it can be empty; as long as the OML control parameter oml_graphics_flag is set to true, a plot will be generated.
Fields:
width: Width of the image
height: Height of the image
title: Title of the plot
with_noise: (default=false) if true and DBSCAN is used, noise points will be shown.
with_bounds: (default=false) if true, clusters will be drawn as polygons.
with_basemap: (default=false) if true, a basemap will be added to the background.
with_legend: (default=true) if true, a legend with the clusters labels is added to the plot.
Example:
{
"width": 20,
"height": 15,
"title": "Clusters",
"with_noise": true,
"with_bounds": false,
"with_basemap": true,
"with_legend": true
}