public class OnlineKMeans extends Object implements Estimator<OnlineKMeans,OnlineKMeansModel>, OnlineKMeansParams<OnlineKMeans>
KMeans
, supporting to train a K-Means model
continuously according to an unbounded stream of train data.
OnlineKMeans makes updates with the "mini-batch" KMeans rule, generalized to incorporate forgetfulness (i.e. decay). After the centroids estimated on the current batch are acquired, OnlineKMeans computes the new centroids from the weighted average between the original and the estimated centroids. The weight of the estimated centroids is the number of points assigned to them. The weight of the original centroids is also the number of points, but additionally multiplying with the decay factor.
The decay factor scales the contribution of the clusters as estimated thus far. If the decay factor is 1, all batches are weighted equally. If the decay factor is 0, new centroids are determined entirely by recent data. Lower values correspond to more forgetting.
BATCH_STRATEGY, COUNT_STRATEGY
GLOBAL_BATCH_SIZE
DECAY_FACTOR
K
DISTANCE_MEASURE
FEATURES_COL
PREDICTION_COL
Constructor and Description |
---|
OnlineKMeans() |
Modifier and Type | Method and Description |
---|---|
OnlineKMeansModel |
fit(org.apache.flink.table.api.Table... inputs)
Trains on the given inputs and produces a Model.
|
Map<Param<?>,Object> |
getParamMap()
Returns a map which should contain value for every parameter that meets one of the following
conditions.
|
static OnlineKMeans |
load(org.apache.flink.table.api.bridge.java.StreamTableEnvironment tEnv,
String path) |
void |
save(String path)
Saves the metadata AND bounded model data table (if exists) to the given path.
|
OnlineKMeans |
setInitialModelData(org.apache.flink.table.api.Table initModelDataTable)
Sets the initial model data of the online training process with the provided model data
table.
|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
getBatchStrategy
getGlobalBatchSize, setGlobalBatchSize
getDecayFactor, setDecayFactor
getK, setK
getDistanceMeasure, setDistanceMeasure
getFeaturesCol, setFeaturesCol
getPredictionCol, setPredictionCol
get, getParam, set
public OnlineKMeansModel fit(org.apache.flink.table.api.Table... inputs)
Estimator
fit
in interface Estimator<OnlineKMeans,OnlineKMeansModel>
inputs
- a list of tablespublic void save(String path) throws IOException
save
in interface Stage<OnlineKMeans>
IOException
public static OnlineKMeans load(org.apache.flink.table.api.bridge.java.StreamTableEnvironment tEnv, String path) throws IOException
IOException
public Map<Param<?>,Object> getParamMap()
WithParams
1) set(...) has been called to set value for this parameter.
2) The parameter is a public final field of this WithParams instance. This includes fields inherited from its interfaces and super-classes.
The subclass which implements this interface could meet this requirement by returning a
member field of the given map type, after having initialized this member field using the
ParamUtils.initializeMapWithDefaultValues(Map, WithParams)
method.
getParamMap
in interface WithParams<OnlineKMeans>
public OnlineKMeans setInitialModelData(org.apache.flink.table.api.Table initModelDataTable)
Copyright © 2019–2023 The Apache Software Foundation. All rights reserved.