public class KMeans
extends java.lang.Object
Modifier and Type | Class and Description |
---|---|
static class | KMeans.KMeansCounters: Counters incremented by the KMeans algorithm |
static class | KMeans.KMeansIterationResult: Holds the result of a cluster iteration |
Modifier and Type | Field and Description |
---|---|
static java.lang.String | CLUSTERS_FILE_PREFIX: The file name prefix of the intermediate clusters file generated between iterations |
static java.lang.String | CONF_CLUSTERS_FILE: Job configuration: the name of the file that will contain the cluster information between iterations |
static java.lang.String | CONF_CRIT_FUN_CLASS: Job configuration: the CriterionFunction implementation class that should be used |
static java.lang.String | CONF_CRIT_FUN_TERMINATION_VALUE: Job configuration: the minimum double value used to determine if a criterion function is converging |
static java.lang.String | CONF_K: Job configuration: the number of clusters to generate, as an integer value |
static java.lang.String | CONF_MAX_MEMBER_DISTANCE: Maximum distance between a cluster center and a cluster member |
static java.lang.String | CONF_SHAPE_GEN_CLASS: Job configuration: the ClusterShapeGenerator implementation class used |
static java.lang.String | JSON_CLUSTER_FILE: The name of the resulting JSON file |
static java.lang.String | MR_OUT_DIR: The name of the directory where the intermediate results are stored |
Constructor and Description |
---|
KMeans() |
Modifier and Type | Method and Description |
---|---|
static Path | buildClustersFilePath(Path outPath, int iteration): Gets the path where the results of the given iteration can be stored |
static ClusterInfo[] | readClusters(FileStatus[] clusterSeqFiles, int k, Configuration conf): Reads the information from a list of cluster files |
static void | writeClusters(Path clustersFile, ClusterInfo[] clusters, Configuration conf): Writes the given ClusterInfo array to a sequence file |
static void | writeJsonClusters(Path clustersFile, ClusterInfo[] clusters, Configuration conf): Writes the given ClusterInfo array to a JSON file |
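public static final java.lang.String CLUSTERS_FILE_PREFIX
The file name prefix of the intermediate clusters file generated between iterations
public static final java.lang.String CONF_CLUSTERS_FILE
Job configuration: the name of the file that will contain the cluster information between iterations
public static final java.lang.String CONF_CRIT_FUN_CLASS
Job configuration: the CriterionFunction implementation class that should be used
public static final java.lang.String CONF_CRIT_FUN_TERMINATION_VALUE
Job configuration: the minimum double value used to determine if a criterion function is converging
public static final java.lang.String CONF_K
Job configuration: the number of clusters to generate, as an integer value
public static final java.lang.String CONF_MAX_MEMBER_DISTANCE
Maximum distance between a cluster center and a cluster member
public static final java.lang.String CONF_SHAPE_GEN_CLASS
Job configuration: the ClusterShapeGenerator implementation class used
public static final java.lang.String JSON_CLUSTER_FILE
The name of the resulting JSON file
public static final java.lang.String MR_OUT_DIR
The name of the directory where the intermediate results are stored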
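To make the roles of CONF_K and CONF_CRIT_FUN_TERMINATION_VALUE more concrete, here is a minimal, self-contained sketch of an iterative K-means loop on one-dimensional data. It does not use this library: the class KMeansLoopSketch, its methods, the sum-of-squared-errors criterion, and the rule "stop when the criterion changes by less than the termination value" are all assumptions made for illustration, and the actual CriterionFunction contract may differ.

```java
import java.util.Arrays;

// Illustrative only: a plain-Java stand-in, not the Oracle KMeans implementation.
public class KMeansLoopSketch {

    // Index of the center closest to p (ties go to the lower index).
    static int nearest(double p, double[] centers) {
        int best = 0;
        for (int i = 1; i < centers.length; i++) {
            if (Math.abs(p - centers[i]) < Math.abs(p - centers[best])) {
                best = i;
            }
        }
        return best;
    }

    // Assumed criterion function: sum of squared distances to the nearest center.
    static double sse(double[] points, double[] centers) {
        double total = 0.0;
        for (double p : points) {
            double d = p - centers[nearest(p, centers)];
            total += d * d;
        }
        return total;
    }

    // One iteration: assign points to the nearest center, then recompute each center as a mean.
    static double[] iterate(double[] points, double[] centers) {
        double[] sums = new double[centers.length];
        int[] counts = new int[centers.length];
        for (double p : points) {
            int c = nearest(p, centers);
            sums[c] += p;
            counts[c]++;
        }
        double[] next = new double[centers.length];
        for (int i = 0; i < centers.length; i++) {
            // An empty cluster keeps its previous center.
            next[i] = counts[i] == 0 ? centers[i] : sums[i] / counts[i];
        }
        return next;
    }

    // k plays the role of CONF_K; terminationValue the role of CONF_CRIT_FUN_TERMINATION_VALUE.
    // Requires k <= points.length because the first k points seed the centers.
    static double[] cluster(double[] points, int k, double terminationValue, int maxIterations) {
        double[] centers = Arrays.copyOf(points, k);
        double previous = sse(points, centers);
        for (int i = 0; i < maxIterations; i++) {
            centers = iterate(points, centers);
            double current = sse(points, centers);
            if (Math.abs(previous - current) < terminationValue) {
                break; // criterion has (by assumption) converged
            }
            previous = current;
        }
        return centers;
    }

    public static void main(String[] args) {
        double[] points = {1.0, 2.0, 3.0, 10.0, 11.0, 12.0};
        System.out.println(Arrays.toString(cluster(points, 2, 1e-6, 50)));
    }
}
```

In the real job these values would be read from the Hadoop job configuration under the keys named by the constants above; the sketch passes them as plain method arguments instead.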
public static Path buildClustersFilePath(Path outPath, int iteration)
Gets the path where the results of the given iteration can be stored
Parameters:
outPath - the current K-means clustering out path
iteration - the current iteration number
public static ClusterInfo[] readClusters(FileStatus[] clusterSeqFiles, int k, Configuration conf) throws java.io.IOException
Reads the information from a list of cluster files
Parameters:
clusterSeqFiles - an array of FileStatus entries for the cluster files
k - the number of clusters
conf - the job configuration
Returns:
ClusterInfo instances
Throws:
java.io.IOException
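buildClustersFilePath derives a per-iteration location from an output path and an iteration number. The sketch below shows one plausible way such a path could be composed. It is not the library's code: it uses java.nio.file.Path as a stand-in for the Hadoop Path type, and the prefix value "clusters_" and the "prefix plus iteration number" naming scheme are hypothetical, since this reference does not state the value of CLUSTERS_FILE_PREFIX.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative only: java.nio.file.Path stands in for org.apache.hadoop.fs.Path.
public class ClustersPathSketch {

    // Hypothetical value; the real CLUSTERS_FILE_PREFIX is not given in this reference.
    static final String PREFIX = "clusters_";

    // Assumed naming scheme: <outPath>/<prefix><iteration>.
    static Path buildClustersFilePath(Path outPath, int iteration) {
        return outPath.resolve(PREFIX + iteration);
    }

    public static void main(String[] args) {
        Path out = Paths.get("kmeans_out");
        for (int iteration = 0; iteration < 3; iteration++) {
            System.out.println(buildClustersFilePath(out, iteration));
        }
    }
}
```

Keeping one file per iteration lets each iteration read the clusters written by the previous one without overwriting them.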
public static void writeClusters(Path clustersFile, ClusterInfo[] clusters, Configuration conf) throws java.io.IOException
Writes the given ClusterInfo array to a sequence file
Parameters:
clustersFile - the sequence file path where the clusters will be written
clusters - an array of ClusterInfo instances to be written
conf - the job configuration
Throws:
java.io.IOException
public static void writeJsonClusters(Path clustersFile, ClusterInfo[] clusters, Configuration conf) throws java.io.IOException
Writes the given ClusterInfo array to a JSON file
Parameters:
clustersFile - the JSON file path where the clusters will be written
clusters - an array of ClusterInfo instances to be written
conf - the job configuration
Throws:
java.io.IOException
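As a rough illustration of what writing an array of clusters to a JSON file involves, the following self-contained sketch serializes a small array of cluster records and writes it to a local file. Everything in it is an assumption: the Cluster class is a hypothetical stand-in for ClusterInfo (whose fields are not described here), the JSON layout is invented for the example, and the local file system replaces the Hadoop file system and Configuration used by the real method.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative only: not the JSON schema produced by KMeans.writeJsonClusters.
public class ClusterJsonSketch {

    // Hypothetical stand-in for ClusterInfo: an id, a 2D center, and a member count.
    static final class Cluster {
        final int id;
        final double x;
        final double y;
        final long members;

        Cluster(int id, double x, double y, long members) {
            this.id = id;
            this.x = x;
            this.y = y;
            this.members = members;
        }
    }

    // Builds the JSON text by hand; assumes finite coordinates (NaN or Infinity would not be valid JSON).
    static String toJson(Cluster[] clusters) {
        StringBuilder sb = new StringBuilder("{\"clusters\":[");
        for (int i = 0; i < clusters.length; i++) {
            Cluster c = clusters[i];
            if (i > 0) {
                sb.append(',');
            }
            sb.append("{\"id\":").append(c.id)
              .append(",\"center\":[").append(c.x).append(',').append(c.y)
              .append("],\"members\":").append(c.members).append('}');
        }
        return sb.append("]}").toString();
    }

    // Writes the JSON text to a local file as UTF-8.
    static void writeJsonClusters(Path clustersFile, Cluster[] clusters) throws IOException {
        Files.write(clustersFile, toJson(clusters).getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        Cluster[] clusters = {new Cluster(0, 2.0, 11.0, 3), new Cluster(1, 7.5, 1.25, 5)};
        Path out = Files.createTempFile("clusters", ".json");
        writeJsonClusters(out, clusters);
        System.out.println(new String(Files.readAllBytes(out), StandardCharsets.UTF_8));
    }
}
```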
Copyright © 2017, 2019 Oracle and/or its affiliates. All Rights Reserved.