public class RegexTokenizer extends Object implements Transformer<RegexTokenizer>, RegexTokenizerParams<RegexTokenizer>
RegexTokenizer provides parameters to drop tokens shorter than a minimum length and to convert the input to lowercase. For each input string, the output is an array of strings, which may be empty.
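The effect of these parameters can be modeled in plain Java. The sketch below is not the Flink ML implementation: `RegexTokenizerSketch` and `tokenize` are hypothetical names, and it assumes that `gaps = true` means the pattern matches the separators between tokens, while `gaps = false` means the pattern matches the tokens themselves.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Plain-Java model of regex tokenization; a sketch, not the Flink ML code. */
final class RegexTokenizerSketch {

    /**
     * Hypothetical helper that mirrors the four parameters listed on this page.
     * Assumption: gaps == true splits on the pattern, gaps == false collects matches.
     */
    static String[] tokenize(
            String input, String pattern, boolean gaps, int minTokenLength, boolean toLowercase) {
        String text = toLowercase ? input.toLowerCase(Locale.ROOT) : input;
        Pattern regex = Pattern.compile(pattern);
        List<String> tokens = new ArrayList<>();
        if (gaps) {
            // The pattern describes the gaps, so split on it.
            for (String piece : regex.split(text)) {
                if (piece.length() >= minTokenLength) {
                    tokens.add(piece);
                }
            }
        } else {
            // The pattern describes the tokens, so collect every match.
            Matcher matcher = regex.matcher(text);
            while (matcher.find()) {
                String token = matcher.group();
                if (token.length() >= minTokenLength) {
                    tokens.add(token);
                }
            }
        }
        return tokens.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // prints hello|flink|ml
        System.out.println(String.join("|", tokenize("Hello  Flink ML", "\\s+", true, 1, true)));
        // prints te|,|st|.|punct
        System.out.println(String.join("|", tokenize("Te,st. punct", "\\w+|\\p{Punct}", false, 1, true)));
        // prints 0: every token is shorter than the minimum length, so the array is empty
        System.out.println(tokenize("a b", "\\s+", true, 2, true).length);
    }
}
```

The last call shows how the output array can end up empty: the length filter removes every token that the split produced.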
Modifier and Type | Class and Description |
---|---|
static class | RegexTokenizer.RegexTokenizerUdf The main logic of RegexTokenizer, which converts the input string to an array of tokens. |
GAPS, MIN_TOKEN_LENGTH, PATTERN, TO_LOWERCASE
INPUT_COL
OUTPUT_COL
Constructor and Description |
---|
RegexTokenizer() |
Modifier and Type | Method and Description |
---|---|
Map<Param<?>,Object> | getParamMap() Returns a map which should contain value for every parameter that meets one of the following conditions. |
static RegexTokenizer | load(org.apache.flink.table.api.bridge.java.StreamTableEnvironment tEnv, String path) |
void | save(String path) Saves the metadata and bounded data of this stage to the given path. |
org.apache.flink.table.api.Table[] | transform(org.apache.flink.table.api.Table... inputs) Applies the AlgoOperator on the given input tables and returns the result tables. |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
get, getParam, set
getGaps, getMinTokenLength, getPattern, getToLowercase, setGaps, setMinTokenLength, setPattern, setToLowercase
getInputCol, setInputCol
getOutputCol, setOutputCol
public org.apache.flink.table.api.Table[] transform(org.apache.flink.table.api.Table... inputs)
AlgoOperator
transform in interface AlgoOperator<RegexTokenizer>
inputs - a list of tables
public void save(String path) throws IOException
Stage
save in interface Stage<RegexTokenizer>
IOException
public Map<Param<?>,Object> getParamMap()
WithParams
Returns a map which should contain a value for every parameter that meets one of the following conditions:
1) set(...) has been called to set a value for this parameter.
2) The parameter is a public final field of this WithParams instance. This includes fields inherited from its interfaces and superclasses.
A subclass that implements this interface can meet this requirement by returning a member field of the given map type, after initializing that field with the ParamUtils.initializeMapWithDefaultValues(Map, WithParams) method.
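The contract described here can be illustrated with a small plain-Java model. Everything in the sketch is a hypothetical stand-in: `SimpleParam` and `SimpleStage` are not Flink ML types, the default values are chosen only for illustration, and the parameters are registered by hand rather than through `ParamUtils.initializeMapWithDefaultValues(Map, WithParams)`.

```java
import java.util.HashMap;
import java.util.Map;

/** Plain-Java model of the getParamMap() contract; all types are hypothetical stand-ins. */
final class ParamMapSketch {

    /** Stand-in for a parameter definition with a name and a default value. */
    static final class SimpleParam<T> {
        final String name;
        final T defaultValue;

        SimpleParam(String name, T defaultValue) {
            this.name = name;
            this.defaultValue = defaultValue;
        }
    }

    /** Stand-in for a stage that declares its parameters as public final fields. */
    static final class SimpleStage {
        // Illustrative defaults only; they are assumptions of this sketch.
        public final SimpleParam<Boolean> gaps = new SimpleParam<>("gaps", true);
        public final SimpleParam<Integer> minTokenLength = new SimpleParam<>("minTokenLength", 1);

        // The member field that getParamMap() returns.
        private final Map<SimpleParam<?>, Object> paramMap = new HashMap<>();

        SimpleStage() {
            // Condition 2: every declared parameter has an entry from the start.
            paramMap.put(gaps, gaps.defaultValue);
            paramMap.put(minTokenLength, minTokenLength.defaultValue);
        }

        // Condition 1: an explicit set(...) call overwrites the stored value.
        <T> SimpleStage set(SimpleParam<T> param, T value) {
            paramMap.put(param, value);
            return this;
        }

        Map<SimpleParam<?>, Object> getParamMap() {
            return paramMap;
        }
    }

    public static void main(String[] args) {
        SimpleStage stage = new SimpleStage();
        System.out.println(stage.getParamMap().get(stage.minTokenLength)); // prints 1
        stage.set(stage.minTokenLength, 3);
        System.out.println(stage.getParamMap().get(stage.minTokenLength)); // prints 3
        System.out.println(stage.getParamMap().size()); // prints 2
    }
}
```

Because the map is filled with defaults in the constructor, a caller sees an entry for every declared parameter even if `set(...)` is never called.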
getParamMap in interface WithParams<RegexTokenizer>
public static RegexTokenizer load(org.apache.flink.table.api.bridge.java.StreamTableEnvironment tEnv, String path) throws IOException
IOException
Copyright © 2019–2023 The Apache Software Foundation. All rights reserved.