KMeans$ (flink 1.2-SNAPSHOT API)

java.lang.Object
- org.apache.flink.examples.scala.clustering.KMeans$

```
public class KMeans$
extends Object
```
This example implements a basic K-Means clustering algorithm.
K-Means is an iterative clustering algorithm that works as follows. It is given a set of data points to be clustered and an initial set of K cluster centers. In each iteration, the algorithm computes the distance from each data point to each cluster center and assigns each point to its closest center. Each cluster center is then moved to the center (mean) of all points assigned to it. The moved cluster centers are fed into the next iteration. The algorithm terminates after a fixed number of iterations (as in this implementation) or when the cluster centers no longer move significantly in an iteration. See the Wikipedia entry for the K-Means Clustering algorithm (.org/wiki/K-means_clustering).
This implementation works on two-dimensional data points. It computes an assignment of data points to cluster centers, i.e., each data point is annotated with the id of the final cluster (center) it belongs to.
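The loop described above can be sketched in plain Java without any Flink dependency. This is a minimal illustration, not the code of this class: the name KMeansSketch, its methods, and the sample data are hypothetical, and keeping the old center for an empty cluster is an assumption of the sketch.

```java
import java.util.Arrays;

/** Plain-Java sketch of the K-Means loop; hypothetical, not the Flink implementation. */
final class KMeansSketch {

    /** Returns the index of the center closest to (x, y). */
    static int nearest(double x, double y, double[][] centers) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int c = 0; c < centers.length; c++) {
            double dx = x - centers[c][0];
            double dy = y - centers[c][1];
            double dist = dx * dx + dy * dy; // squared distance keeps the same ordering
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    /** One iteration: assign every point, then move each center to the mean of its points. */
    static double[][] step(double[][] points, double[][] centers) {
        double[][] sums = new double[centers.length][2];
        int[] counts = new int[centers.length];
        for (double[] p : points) {
            int c = nearest(p[0], p[1], centers);
            sums[c][0] += p[0];
            sums[c][1] += p[1];
            counts[c]++;
        }
        double[][] moved = new double[centers.length][2];
        for (int c = 0; c < centers.length; c++) {
            if (counts[c] == 0) {
                moved[c] = centers[c].clone(); // assumption: an empty cluster keeps its center
            } else {
                moved[c][0] = sums[c][0] / counts[c];
                moved[c][1] = sums[c][1] / counts[c];
            }
        }
        return moved;
    }

    /** Runs a fixed number of iterations and returns the final centers. */
    static double[][] run(double[][] points, double[][] centers, int iterations) {
        double[][] current = centers;
        for (int i = 0; i < iterations; i++) {
            current = step(points, current);
        }
        return current;
    }

    public static void main(String[] args) {
        double[][] points = {{1.0, 1.0}, {1.0, 3.0}, {9.0, 9.0}, {11.0, 9.0}};
        double[][] centers = {{0.0, 0.0}, {10.0, 10.0}};
        // prints [[1.0, 2.0], [10.0, 9.0]]
        System.out.println(Arrays.deepToString(run(points, centers, 10)));
    }
}
```

The sketch stops after a fixed iteration count, matching the termination rule this implementation is described as using.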
Input files are plain text files and must be formatted as follows:
- Data points are represented as two double values separated by a blank character. Data points are separated by newline characters. For example "1.2 2.3\n5.3 7.2\n" gives two data points (x=1.2, y=2.3) and (x=5.3, y=7.2).
- Cluster centers are represented by an integer id and a point value. For example "1 6.2 3.2\n2 2.9 5.7\n" gives two centers (id=1, x=6.2, y=3.2) and (id=2, x=2.9, y=5.7).
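To make the two input formats concrete, here is a small plain-Java parsing sketch. It only illustrates the line layout described above; KMeansInputSketch and its Center holder are hypothetical names, and the class does not show how this example actually reads its files.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical parser for the two input formats; not the reader used by this example. */
final class KMeansInputSketch {

    /** Hypothetical holder for one cluster center line: "id x y". */
    static final class Center {
        final int id;
        final double x;
        final double y;

        Center(int id, double x, double y) {
            this.id = id;
            this.x = x;
            this.y = y;
        }
    }

    /** Parses newline-separated "x y" lines into {x, y} pairs. */
    static List<double[]> parsePoints(String text) {
        List<double[]> points = new ArrayList<>();
        for (String line : text.split("\n")) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines, such as the one after a trailing newline
            }
            String[] fields = line.trim().split(" ");
            points.add(new double[] {
                Double.parseDouble(fields[0]), Double.parseDouble(fields[1])});
        }
        return points;
    }

    /** Parses newline-separated "id x y" lines into centers. */
    static List<Center> parseCenters(String text) {
        List<Center> centers = new ArrayList<>();
        for (String line : text.split("\n")) {
            if (line.trim().isEmpty()) {
                continue;
            }
            String[] fields = line.trim().split(" ");
            centers.add(new Center(
                Integer.parseInt(fields[0]),
                Double.parseDouble(fields[1]),
                Double.parseDouble(fields[2])));
        }
        return centers;
    }

    public static void main(String[] args) {
        List<double[]> points = parsePoints("1.2 2.3\n5.3 7.2\n");
        List<Center> centers = parseCenters("1 6.2 3.2\n2 2.9 5.7\n");
        System.out.println(points.size() + " points, " + centers.size() + " centers");
    }
}
```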
Usage:
```
KMeans --points <path> --centroids <path> --output <path> --iterations <n>
```
If no parameters are provided, the program is run with default data from KMeansData and 10 iterations.
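The option handling can be pictured with a hand-rolled stand-in. The method signatures on this page take a ParameterTool, which this sketch deliberately does not use so that it runs without Flink; KMeansArgsSketch and its methods are hypothetical, and only the default of 10 iterations comes from the description above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical "--key value" parser; a stand-in for illustration, not Flink's ParameterTool. */
final class KMeansArgsSketch {

    /** Collects "--key value" pairs into a map keyed by the option name without dashes. */
    static Map<String, String> parse(String[] args) {
        if (args.length % 2 != 0) {
            throw new IllegalArgumentException("Each --key needs a value");
        }
        Map<String, String> options = new HashMap<>();
        for (int i = 0; i < args.length; i += 2) {
            if (!args[i].startsWith("--")) {
                throw new IllegalArgumentException("Expected --key, got: " + args[i]);
            }
            options.put(args[i].substring(2), args[i + 1]);
        }
        return options;
    }

    /** Returns the iteration count, falling back to the documented default of 10. */
    static int iterations(Map<String, String> options) {
        return Integer.parseInt(options.getOrDefault("iterations", "10"));
    }

    /** Assumption of this sketch: default points are used whenever --points is absent. */
    static boolean usesDefaultPoints(Map<String, String> options) {
        return !options.containsKey("points");
    }

    public static void main(String[] args) {
        Map<String, String> options = parse(args);
        System.out.println("iterations = " + iterations(options));
        System.out.println("default points = " + usesDefaultPoints(options));
    }
}
```

Run with no arguments, the sketch reports 10 iterations and default points, mirroring the fallback behavior described above.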
This example shows how to use:
- Bulk iterations
- Broadcast variables in bulk iterations
- Scala case classes

Field Summary

Fields
Modifier and Type	Field and Description
`static KMeans$`	`MODULE$` Static reference to the singleton instance of this Scala object.

Constructor Summary

Constructors
Constructor and Description
`KMeans$()`

Method Summary

Methods
Modifier and Type	Method and Description
`DataSet<KMeans.Centroid>`	`getCentroidDataSet(ParameterTool params, ExecutionEnvironment env)`
`DataSet<KMeans.Point>`	`getPointDataSet(ParameterTool params, ExecutionEnvironment env)`
`void`	`main(String[] args)`

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Detail
- MODULE$
```
public static final KMeans$ MODULE$
```
  Static reference to the singleton instance of this Scala object.

Constructor Detail
- KMeans$
```
public KMeans$()
```

Method Detail

main
```
public void main(String[] args)
```

getCentroidDataSet
```
public DataSet<KMeans.Centroid> getCentroidDataSet(ParameterTool params,
                                                   ExecutionEnvironment env)
```

getPointDataSet
```
public DataSet<KMeans.Point> getPointDataSet(ParameterTool params,
                                             ExecutionEnvironment env)
```
