oraclesai.clustering.AgglomerativeClustering
- class AgglomerativeClustering(n_clusters=2, metric='euclidean', linkage='ward', distance_threshold=None, n_jobs=None, spatial_weights_definition=None)
Agglomerative Clustering Algorithm. Each observation starts in its own cluster; the two closest clusters are then merged into one, and the process is repeated until a stopping condition is met or until only one cluster remains. When spatial weights are defined, the algorithm performs regionalization: a spatial constraint is added so that the elements of each cluster are geographically connected.
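The merge loop described above can be sketched in plain Python. This is only an illustration of the general technique, not the library's implementation: the function names, the `neighbors` argument, and the use of single linkage are assumptions made for the example.

```python
from itertools import combinations
from math import dist, inf


def single_linkage(a, b, points):
    """Smallest pairwise distance between members of clusters a and b."""
    return min(dist(points[i], points[j]) for i in a for j in b)


def agglomerate(points, n_clusters, neighbors=None):
    """Merge the two closest clusters until n_clusters remain.

    neighbors is a hypothetical argument: a symmetric mapping from an
    observation index to the set of indices adjacent to it. When given,
    two clusters may merge only if some pair of their members is adjacent.
    """
    clusters = [{i} for i in range(len(points))]  # one cluster per observation
    while len(clusters) > n_clusters:
        best, best_d = None, inf
        for a, b in combinations(range(len(clusters)), 2):
            if neighbors is not None and not any(
                j in neighbors[i] for i in clusters[a] for j in clusters[b]
            ):
                continue  # spatial constraint: skip non-adjacent clusters
            d = single_linkage(clusters[a], clusters[b], points)
            if d < best_d:
                best, best_d = (a, b), d
        if best is None:
            break  # no admissible merge remains
        a, b = best  # a < b, so deleting b leaves index a valid
        clusters[a] |= clusters[b]
        del clusters[b]
    return clusters


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0), (0.5, 0.5)]

# Unconstrained: observation 4 joins the nearby observations 0 and 1.
print(agglomerate(points, n_clusters=2))

# Constrained: 0-1-2-3 form a chain and observation 4 has no neighbors,
# so it cannot be merged even though it is geometrically close to 0 and 1.
chain = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}, 4: set()}
print(agglomerate(points, n_clusters=2, neighbors=chain))
```

The second call shows the effect of the spatial constraint: cluster membership follows adjacency rather than raw distance alone.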
- Parameters:
n_clusters – int, default=2. The number of clusters to form.
metric – str or callable, default='euclidean'. The metric to use when calculating the distance between observations.
linkage – {'ward', 'complete', 'average', 'single'}, default='ward'. Determines the distance to use between clusters. The algorithm merges the pair of clusters that minimizes this criterion. 'ward' minimizes the variance of the merged clusters. 'average' uses the average of the distances between the observations of the two clusters. 'complete' uses the maximum distance between the observations of the two clusters. 'single' uses the minimum distance between the observations of the two clusters.
distance_threshold – float, default=None. The linkage distance threshold. If not None, then n_clusters must be None.
n_jobs – int, default=None. The number of parallel jobs to run.
spatial_weights_definition – SpatialWeightsDefinition, default=None. Spatial relationship specification. Defines the criteria used to identify neighbors, for example, KNNWeightsDefinition, DistanceBandWeightsDefinition, etc.
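To make the four linkage options concrete, the following sketch computes each criterion for two small clusters using the Euclidean metric. The function names are illustrative and not part of the library. The Ward value uses the standard formula for the increase in within-cluster sum of squares caused by a merge.

```python
from math import dist
from statistics import mean


def pairwise(a, b):
    """All Euclidean distances between a point of a and a point of b."""
    return [dist(p, q) for p in a for q in b]


def single(a, b):
    return min(pairwise(a, b))  # closest pair


def complete(a, b):
    return max(pairwise(a, b))  # farthest pair


def average(a, b):
    return mean(pairwise(a, b))  # mean over all pairs


def centroid(cluster):
    return tuple(mean(coord) for coord in zip(*cluster))


def ward_increase(a, b):
    """Increase in within-cluster sum of squares if a and b are merged."""
    weight = len(a) * len(b) / (len(a) + len(b))
    return weight * dist(centroid(a), centroid(b)) ** 2


a = [(0.0, 0.0), (0.0, 2.0)]
b = [(3.0, 0.0), (3.0, 2.0)]

for name, criterion in [("single", single), ("complete", complete),
                        ("average", average), ("ward", ward_increase)]:
    print(name, criterion(a, b))
```

At each step the algorithm would merge the pair of clusters for which the chosen criterion is smallest.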
Methods
__init__([n_clusters, metric, linkage, ...])
fit(X[, y, geometries, spatial_weights, crs]) – Initially, all observations are associated with a different cluster; the method then merges the two closest clusters according to the linkage parameter and continues until the number of clusters is equal to n_clusters or until the distance between the two nearest clusters is greater than distance_threshold.
fit_predict(X[, y, geometries, ...]) – Trains the clustering model and returns the labels assigned to each observation.
get_params([deep]) – Get parameters for this estimator.
set_params(**params) – Set the parameters of this estimator.
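The two stopping conditions that fit describes can be illustrated with a small stand-alone sketch on one-dimensional data. It is not the library's code: the function `cluster_1d`, the choice of single linkage, and the rule that exactly one of the two arguments must be set are assumptions made for the example.

```python
from itertools import combinations


def cluster_1d(values, n_clusters=None, distance_threshold=None):
    """Agglomerate 1-D values, stopping on a cluster count or a distance."""
    if (n_clusters is None) == (distance_threshold is None):
        raise ValueError("set exactly one of n_clusters or distance_threshold")
    clusters = [[v] for v in values]
    while len(clusters) > 1:
        if n_clusters is not None and len(clusters) <= n_clusters:
            break  # reached the requested number of clusters
        # Find the nearest pair of clusters (single linkage).
        d, a, b = min(
            (min(abs(x - y) for x in clusters[i] for y in clusters[j]), i, j)
            for i, j in combinations(range(len(clusters)), 2)
        )
        if distance_threshold is not None and d > distance_threshold:
            break  # the nearest clusters are farther apart than the threshold
        clusters[a] += clusters[b]
        del clusters[b]
    return clusters


values = [1.0, 1.5, 2.0, 10.0, 10.4, 30.0]
print(cluster_1d(values, n_clusters=3))
print(cluster_1d(values, distance_threshold=1.0))
print(cluster_1d(values, distance_threshold=9.0))
```

With a threshold, the number of clusters is an outcome rather than an input: a larger threshold allows more merges and therefore yields fewer clusters.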
Attributes
METRIC_PRECOMPUTED
NON_NEIGHBOR_DISTANCE
The Isoperimetric quotient (IPQ) for the resulting clusters.
Array indicating the cluster associated with each sample.
The number of clusters.
The Silhouette score for the resulting clusters.
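For reference, the isoperimetric quotient of a shape with area A and perimeter P is 4πA / P², which equals 1 for a circle and approaches 0 for elongated shapes. The sketch below computes it for a simple polygon. How the library derives the shape of each cluster and aggregates the values is not stated here, so the function is only an illustration of the formula.

```python
from math import dist, pi


def isoperimetric_quotient(vertices):
    """IPQ = 4 * pi * area / perimeter**2 for a simple polygon."""
    n = len(vertices)
    edges = [(vertices[i], vertices[(i + 1) % n]) for i in range(n)]
    # Shoelace formula for the polygon area.
    area = abs(sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in edges)) / 2
    perimeter = sum(dist(p, q) for p, q in edges)
    return 4 * pi * area / perimeter ** 2


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
strip = [(0, 0), (4, 0), (4, 0.25), (0, 0.25)]  # same area, less compact

print(isoperimetric_quotient(square))
print(isoperimetric_quotient(strip))
```

Both polygons have the same area, but the compact square scores higher than the thin strip.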