public class StringIndexer extends Object implements Estimator<StringIndexer,StringIndexerModel>, StringIndexerParams<StringIndexer>
A string indexer maps one or more columns (string/numerical value) of the input to one or more indexed output columns (integer value). The output indices of two data points are the same iff their corresponding input columns are the same. The indices are in [0, numDistinctValuesInThisColumn].
The input columns are cast to string if they are numeric values. By default, the output model is arbitrarily ordered. Users can control this by setting StringIndexerParams.STRING_ORDER_TYPE.
Users can also control the maximum number of output indices by setting StringIndexerParams.MAX_INDEX_NUM. This parameter only takes effect if StringIndexerParams.STRING_ORDER_TYPE is set to 'frequencyDesc'.
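To make the ordering options concrete, the following is a minimal plain-Java sketch of how distinct values could be assigned indices under an alphabetical-ascending and a frequency-descending order. It is not the Flink ML implementation: the class `IndexOrderSketch` and its methods are hypothetical, the alphabetical tie-break in `frequencyDesc` is an assumption, and the effect of MAX_INDEX_NUM is deliberately left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Illustrative only: a hypothetical helper, not part of Flink ML.
class IndexOrderSketch {

    // Alphabetical ascending: distinct values sorted lexicographically
    // receive the indices 0, 1, 2, ...
    static Map<String, Integer> alphabetAsc(List<String> column) {
        Map<String, Integer> index = new LinkedHashMap<>();
        for (String value : new TreeSet<>(column)) {
            index.put(value, index.size());
        }
        return index;
    }

    // Frequency descending: the most frequent value receives index 0.
    // Ties are broken alphabetically here; that tie-break is an assumption.
    static Map<String, Integer> frequencyDesc(List<String> column) {
        Map<String, Integer> counts = new HashMap<>();
        for (String value : column) {
            counts.merge(value, 1, Integer::sum);
        }
        List<String> distinct = new ArrayList<>(counts.keySet());
        distinct.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        Map<String, Integer> index = new LinkedHashMap<>();
        for (String value : distinct) {
            index.put(value, index.size());
        }
        return index;
    }
}
```

In both cases equal input values always map to the same index and distinct values map to distinct indices, which is the property stated in the class description.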
The `keep` option of HasHandleInvalid means that invalid input is transformed into a special index, whose value is the number of distinct values in this column.
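The `keep` rule can be sketched in plain Java as a lookup against a fitted value-to-index map. This is only an illustration under stated assumptions: `HandleInvalidSketch` is a hypothetical class, the option strings "keep", "skip" and "error" are assumed to correspond to KEEP_INVALID, SKIP_INVALID and ERROR_INVALID, and the skip and error behaviours shown (drop the value, throw an exception) are inferred from the option names rather than from the text above.

```java
import java.util.Map;
import java.util.OptionalInt;

// Illustrative only: a hypothetical helper, not part of Flink ML.
class HandleInvalidSketch {

    // Looks up a value in a fitted value-to-index map for one column.
    static OptionalInt lookup(Map<String, Integer> model, String value, String handleInvalid) {
        Integer index = model.get(value);
        if (index != null) {
            return OptionalInt.of(index);
        }
        switch (handleInvalid) {
            case "keep":
                // Special index: the number of distinct values seen in this column.
                return OptionalInt.of(model.size());
            case "skip":
                // Assumed behaviour: the row is dropped, so no index is produced.
                return OptionalInt.empty();
            default:
                // Assumed behaviour for "error": fail on unseen input.
                throw new IllegalArgumentException("Unseen value: " + value);
        }
    }
}
```

With a model that knows two values (indices 0 and 1), an unseen value under `keep` maps to index 2.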
ALPHABET_ASC_ORDER, ALPHABET_DESC_ORDER, ARBITRARY_ORDER, FREQUENCY_ASC_ORDER, FREQUENCY_DESC_ORDER, MAX_INDEX_NUM, STRING_ORDER_TYPE
INPUT_COLS
OUTPUT_COLS
ERROR_INVALID, HANDLE_INVALID, KEEP_INVALID, SKIP_INVALID
Constructor and Description |
---|
StringIndexer() |
Modifier and Type | Method and Description |
---|---|
StringIndexerModel | fit(org.apache.flink.table.api.Table... inputs) Trains on the given inputs and produces a Model. |
Map<Param<?>,Object> | getParamMap() Returns a map which should contain value for every parameter that meets one of the following conditions. |
static StringIndexer | load(org.apache.flink.table.api.bridge.java.StreamTableEnvironment tEnv, String path) |
void | save(String path) Saves the metadata and bounded data of this stage to the given path. |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
getMaxIndexNum, getStringOrderType, setMaxIndexNum, setStringOrderType
getInputCols, setInputCols
getOutputCols, setOutputCols
getHandleInvalid, setHandleInvalid
get, getParam, set
public void save(String path) throws IOException
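Description copied from interface: Stage
Saves the metadata and bounded data of this stage to the given path.
Specified by: save in interface Stage<StringIndexer>
Throws: IOException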
public static StringIndexer load(org.apache.flink.table.api.bridge.java.StreamTableEnvironment tEnv, String path) throws IOException
Throws: IOException
public Map<Param<?>,Object> getParamMap()
Description copied from interface: WithParams
Returns a map which should contain a value for every parameter that meets one of the following conditions.
1) set(...) has been called to set value for this parameter.
2) The parameter is a public final field of this WithParams instance. This includes fields inherited from its interfaces and super-classes.
The subclass which implements this interface could meet this requirement by returning a member field of the given map type, after having initialized this member field using the ParamUtils.initializeMapWithDefaultValues(Map, WithParams) method.
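The pattern described above (a member map initialized with defaults, then updated by set(...)) can be sketched without any Flink ML dependency. Everything in this block is hypothetical: `SimpleParam` stands in for the real Param type, the constructor stands in for what ParamUtils.initializeMapWithDefaultValues is described as doing, and the default values "arbitrary" and "error" are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a hypothetical stage, not the Flink ML WithParams contract itself.
class ParamMapSketch {

    // Hypothetical stand-in for a parameter definition with a default value.
    static final class SimpleParam<T> {
        final String name;
        final T defaultValue;

        SimpleParam(String name, T defaultValue) {
            this.name = name;
            this.defaultValue = defaultValue;
        }
    }

    // Condition 2: parameters declared as public final fields.
    public static final SimpleParam<String> STRING_ORDER_TYPE =
            new SimpleParam<>("stringOrderType", "arbitrary"); // assumed default
    public static final SimpleParam<String> HANDLE_INVALID =
            new SimpleParam<>("handleInvalid", "error"); // assumed default

    // The member field that getParamMap() returns.
    private final Map<SimpleParam<?>, Object> paramMap = new HashMap<>();

    ParamMapSketch() {
        // Seed the map with a default value for every declared parameter.
        paramMap.put(STRING_ORDER_TYPE, STRING_ORDER_TYPE.defaultValue);
        paramMap.put(HANDLE_INVALID, HANDLE_INVALID.defaultValue);
    }

    // Condition 1: an explicit set(...) overrides the stored value.
    <T> ParamMapSketch set(SimpleParam<T> param, T value) {
        paramMap.put(param, value);
        return this;
    }

    Map<SimpleParam<?>, Object> getParamMap() {
        return paramMap;
    }
}
```

Because the map is seeded in the constructor, it holds an entry for every declared parameter even before any setter is called, and later set(...) calls only replace values.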
Specified by: getParamMap in interface WithParams<StringIndexer>
public StringIndexerModel fit(org.apache.flink.table.api.Table... inputs)
Description copied from interface: Estimator
Trains on the given inputs and produces a Model.
Specified by: fit in interface Estimator<StringIndexer,StringIndexerModel>
Parameters: inputs - a list of tables
Copyright © 2019–2023 The Apache Software Foundation. All rights reserved.