Class PythonAggregateFunction
- java.lang.Object
-
- org.apache.flink.table.functions.UserDefinedFunction
-
- org.apache.flink.table.functions.ImperativeAggregateFunction<T,ACC>
-
- org.apache.flink.table.functions.AggregateFunction
-
- org.apache.flink.table.functions.python.PythonAggregateFunction
-
- All Implemented Interfaces:
Serializable
,FunctionDefinition
,PythonFunction
@Internal public class PythonAggregateFunction extends AggregateFunction implements PythonFunction
The wrapper of user defined python aggregate function.- See Also:
- Serialized Form
-
-
Constructor Summary
Constructors Constructor Description PythonAggregateFunction(String name, byte[] serializedAggregateFunction, String[] inputTypesString, String resultTypeString, String accumulatorTypeString, PythonFunctionKind pythonFunctionKind, boolean deterministic, boolean takesRowAsInput, PythonEnv pythonEnv)
PythonAggregateFunction(String name, byte[] serializedAggregateFunction, PythonFunctionKind pythonFunctionKind, boolean deterministic, boolean takesRowAsInput, PythonEnv pythonEnv)
PythonAggregateFunction(String name, byte[] serializedAggregateFunction, DataType[] inputTypes, DataType resultType, DataType accumulatorType, PythonFunctionKind pythonFunctionKind, boolean deterministic, boolean takesRowAsInput, PythonEnv pythonEnv)
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description void
accumulate(Object accumulator, Object... args)
Object
createAccumulator()
Creates and initializes the accumulator for thisImperativeAggregateFunction
.TypeInformation
getAccumulatorType()
Returns theTypeInformation
of theImperativeAggregateFunction
's accumulator.PythonEnv
getPythonEnv()
Returns the Python execution environment.PythonFunctionKind
getPythonFunctionKind()
Returns the kind of the user-defined python function.TypeInformation
getResultType()
Returns theTypeInformation
of theImperativeAggregateFunction
's result.byte[]
getSerializedPythonFunction()
Returns the serialized representation of the user-defined python function.TypeInference
getTypeInference(DataTypeFactory typeFactory)
Returns the logic for performing type inference of a call to this function definition.Object
getValue(Object accumulator)
Called every time when an aggregation result should be materialized.boolean
isDeterministic()
Returns information about the determinism of the function's results.boolean
takesRowAsInput()
Returns Whether the Python function takes row as input instead of each columns of a row.String
toString()
Returns the name of the UDF that is used for plan explanation and logging.-
Methods inherited from class org.apache.flink.table.functions.AggregateFunction
getKind
-
Methods inherited from class org.apache.flink.table.functions.UserDefinedFunction
close, functionIdentifier, open
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface org.apache.flink.table.functions.FunctionDefinition
getRequirements, supportsConstantFolding
-
-
-
-
Constructor Detail
-
PythonAggregateFunction
public PythonAggregateFunction(String name, byte[] serializedAggregateFunction, DataType[] inputTypes, DataType resultType, DataType accumulatorType, PythonFunctionKind pythonFunctionKind, boolean deterministic, boolean takesRowAsInput, PythonEnv pythonEnv)
-
PythonAggregateFunction
public PythonAggregateFunction(String name, byte[] serializedAggregateFunction, String[] inputTypesString, String resultTypeString, String accumulatorTypeString, PythonFunctionKind pythonFunctionKind, boolean deterministic, boolean takesRowAsInput, PythonEnv pythonEnv)
-
PythonAggregateFunction
public PythonAggregateFunction(String name, byte[] serializedAggregateFunction, PythonFunctionKind pythonFunctionKind, boolean deterministic, boolean takesRowAsInput, PythonEnv pythonEnv)
-
-
Method Detail
-
getValue
public Object getValue(Object accumulator)
Description copied from class:AggregateFunction
Called every time when an aggregation result should be materialized. The returned value could be either an early and incomplete result (periodically emitted as data arrives) or the final result of the aggregation.- Specified by:
getValue
in classAggregateFunction
- Parameters:
accumulator
- the accumulator which contains the current intermediate results- Returns:
- the aggregation result
-
createAccumulator
public Object createAccumulator()
Description copied from class:ImperativeAggregateFunction
Creates and initializes the accumulator for thisImperativeAggregateFunction
.The accumulator is an intermediate data structure that stores the aggregated values until a final aggregation result is computed.
- Specified by:
createAccumulator
in classImperativeAggregateFunction
- Returns:
- the accumulator with the initial value
-
getSerializedPythonFunction
public byte[] getSerializedPythonFunction()
Description copied from interface:PythonFunction
Returns the serialized representation of the user-defined python function.- Specified by:
getSerializedPythonFunction
in interfacePythonFunction
-
getPythonEnv
public PythonEnv getPythonEnv()
Description copied from interface:PythonFunction
Returns the Python execution environment.- Specified by:
getPythonEnv
in interfacePythonFunction
-
getPythonFunctionKind
public PythonFunctionKind getPythonFunctionKind()
Description copied from interface:PythonFunction
Returns the kind of the user-defined python function.- Specified by:
getPythonFunctionKind
in interfacePythonFunction
-
takesRowAsInput
public boolean takesRowAsInput()
Description copied from interface:PythonFunction
Returns Whether the Python function takes row as input instead of each columns of a row.- Specified by:
takesRowAsInput
in interfacePythonFunction
-
isDeterministic
public boolean isDeterministic()
Description copied from interface:FunctionDefinition
Returns information about the determinism of the function's results.It returns
true
if and only if a call to this function is guaranteed to always return the same result given the same parameters.true
is assumed by default. If the function is not purely functional likerandom(), date(), now(), ...
this method must returnfalse
.Furthermore, return
false
if the planner should always execute this function on the cluster side. In other words: the planner should not perform constant expression reduction during planning for constant calls to this function.- Specified by:
isDeterministic
in interfaceFunctionDefinition
-
getResultType
public TypeInformation getResultType()
Description copied from class:ImperativeAggregateFunction
Returns theTypeInformation
of theImperativeAggregateFunction
's result.- Overrides:
getResultType
in classImperativeAggregateFunction
- Returns:
- The
TypeInformation
of theImperativeAggregateFunction
's result ornull
if the result type should be automatically inferred.
-
getAccumulatorType
public TypeInformation getAccumulatorType()
Description copied from class:ImperativeAggregateFunction
Returns theTypeInformation
of theImperativeAggregateFunction
's accumulator.- Overrides:
getAccumulatorType
in classImperativeAggregateFunction
- Returns:
- The
TypeInformation
of theImperativeAggregateFunction
's accumulator ornull
if the accumulator type should be automatically inferred.
-
getTypeInference
public TypeInference getTypeInference(DataTypeFactory typeFactory)
Description copied from class:UserDefinedFunction
Returns the logic for performing type inference of a call to this function definition.The type inference process is responsible for inferring unknown types of input arguments, validating input arguments, and producing result types. The type inference process happens independent of a function body. The output of the type inference is used to search for a corresponding runtime implementation.
Instances of type inference can be created by using
TypeInference.newBuilder()
.See
BuiltInFunctionDefinitions
for concrete usage examples.The type inference for user-defined functions is automatically extracted using reflection. It does this by analyzing implementation methods such as
eval() or accumulate()
and the generic parameters of a function class if present. If the reflective information is not sufficient, it can be supported and enriched withDataTypeHint
andFunctionHint
annotations.Note: Overriding this method is only recommended for advanced users. If a custom type inference is specified, it is the responsibility of the implementer to make sure that the output of the type inference process matches with the implementation method:
The implementation method must comply with each
DataType.getConversionClass()
returned by the type inference. For example, ifDataTypes.TIMESTAMP(3).bridgedTo(java.sql.Timestamp.class)
is an expected argument type, the method must accept a calleval(java.sql.Timestamp)
.Regular Java calling semantics (including type widening and autoboxing) are applied when calling an implementation method which means that the signature can be
eval(java.lang.Object)
.The runtime will take care of converting the data to the data format specified by the
DataType.getConversionClass()
coming from the type inference logic.- Specified by:
getTypeInference
in interfaceFunctionDefinition
- Overrides:
getTypeInference
in classAggregateFunction
-
toString
public String toString()
Description copied from class:UserDefinedFunction
Returns the name of the UDF that is used for plan explanation and logging.- Overrides:
toString
in classUserDefinedFunction
-
-