@Internal public class PythonAggregateFunction extends AggregateFunction implements PythonFunction
Constructor and Description |
---|
PythonAggregateFunction(String name, byte[] serializedAggregateFunction, DataType[] inputTypes, DataType resultType, DataType accumulatorType, PythonFunctionKind pythonFunctionKind, boolean deterministic, boolean takesRowAsInput, PythonEnv pythonEnv) |
PythonAggregateFunction(String name, byte[] serializedAggregateFunction, PythonFunctionKind pythonFunctionKind, boolean deterministic, boolean takesRowAsInput, PythonEnv pythonEnv) |
PythonAggregateFunction(String name, byte[] serializedAggregateFunction, String[] inputTypesString, String resultTypeString, String accumulatorTypeString, PythonFunctionKind pythonFunctionKind, boolean deterministic, boolean takesRowAsInput, PythonEnv pythonEnv) |
Modifier and Type | Method and Description |
---|---|
void | accumulate(Object accumulator, Object... args) |
Object | createAccumulator() Creates and initializes the accumulator for this ImperativeAggregateFunction. |
TypeInformation | getAccumulatorType() Returns the TypeInformation of the ImperativeAggregateFunction's accumulator. |
PythonEnv | getPythonEnv() Returns the Python execution environment. |
PythonFunctionKind | getPythonFunctionKind() Returns the kind of the user-defined python function. |
TypeInformation | getResultType() Returns the TypeInformation of the ImperativeAggregateFunction's result. |
byte[] | getSerializedPythonFunction() Returns the serialized representation of the user-defined python function. |
TypeInference | getTypeInference(DataTypeFactory typeFactory) Returns the logic for performing type inference of a call to this function definition. |
Object | getValue(Object accumulator) Called every time when an aggregation result should be materialized. |
boolean | isDeterministic() Returns information about the determinism of the function's results. |
boolean | takesRowAsInput() Returns whether the Python function takes a row as input instead of each column of a row. |
String | toString() Returns the name of the UDF that is used for plan explanation and logging. |
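Methods inherited from class AggregateFunction: getKind
Methods inherited from class UserDefinedFunction: close, functionIdentifier, open
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface FunctionDefinition: getRequirements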
public PythonAggregateFunction(String name, byte[] serializedAggregateFunction, DataType[] inputTypes, DataType resultType, DataType accumulatorType, PythonFunctionKind pythonFunctionKind, boolean deterministic, boolean takesRowAsInput, PythonEnv pythonEnv)
public PythonAggregateFunction(String name, byte[] serializedAggregateFunction, String[] inputTypesString, String resultTypeString, String accumulatorTypeString, PythonFunctionKind pythonFunctionKind, boolean deterministic, boolean takesRowAsInput, PythonEnv pythonEnv)
public PythonAggregateFunction(String name, byte[] serializedAggregateFunction, PythonFunctionKind pythonFunctionKind, boolean deterministic, boolean takesRowAsInput, PythonEnv pythonEnv)
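public Object getValue(Object accumulator)
Description copied from class: AggregateFunction
Called every time when an aggregation result should be materialized.
Specified by: getValue in class AggregateFunction
Parameters: accumulator - the accumulator which contains the current intermediate results
public Object createAccumulator()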
Description copied from class: ImperativeAggregateFunction
Creates and initializes the accumulator for this ImperativeAggregateFunction.
The accumulator is an intermediate data structure that stores the aggregated values until a final aggregation result is computed.
Specified by: createAccumulator in class ImperativeAggregateFunction
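The accumulator lifecycle described above (create, accumulate, materialize) can be sketched in plain Java. This is only an illustration of the general contract: `SumAccumulator` and `SumAggregateSketch` are hypothetical names, the classes do not extend any Flink type, and nothing here reflects how PythonAggregateFunction itself executes.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical accumulator: a mutable holder for the running sum. */
class SumAccumulator {
    long sum = 0L;
}

/** Plain-Java stand-in that mirrors the three lifecycle methods (not a Flink class). */
class SumAggregateSketch {

    // Creates and initializes an empty accumulator.
    SumAccumulator createAccumulator() {
        return new SumAccumulator();
    }

    // Folds one input value into the accumulator.
    void accumulate(SumAccumulator acc, long value) {
        acc.sum += value;
    }

    // Materializes the current (possibly intermediate) result.
    long getValue(SumAccumulator acc) {
        return acc.sum;
    }

    // Drives the lifecycle over a list of inputs, roughly as a runtime might.
    static long aggregate(List<Long> inputs) {
        SumAggregateSketch fn = new SumAggregateSketch();
        SumAccumulator acc = fn.createAccumulator();
        for (long value : inputs) {
            fn.accumulate(acc, value);
        }
        return fn.getValue(acc);
    }

    public static void main(String[] args) {
        System.out.println(aggregate(Arrays.asList(1L, 2L, 3L))); // prints 6
    }
}
```

Because `getValue` only reads the accumulator, it can be called at any point to obtain an intermediate result without disturbing later calls to `accumulate`.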
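public byte[] getSerializedPythonFunction()
Description copied from interface: PythonFunction
Returns the serialized representation of the user-defined python function.
Specified by: getSerializedPythonFunction in interface PythonFunction
public PythonEnv getPythonEnv()
Description copied from interface: PythonFunction
Returns the Python execution environment.
Specified by: getPythonEnv in interface PythonFunction
public PythonFunctionKind getPythonFunctionKind()
Description copied from interface: PythonFunction
Returns the kind of the user-defined python function.
Specified by: getPythonFunctionKind in interface PythonFunction
public boolean takesRowAsInput()
Description copied from interface: PythonFunction
Returns whether the Python function takes a row as input instead of each column of a row.
Specified by: takesRowAsInput in interface PythonFunction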
public boolean isDeterministic()
Description copied from interface: FunctionDefinition
Returns information about the determinism of the function's results.
It returns true if and only if a call to this function is guaranteed to always return the same result given the same parameters. true is assumed by default. If the function is not purely functional, like random(), date(), now(), ..., this method must return false.
Furthermore, return false if the planner should always execute this function on the cluster side. In other words: the planner should not perform constant expression reduction during planning for constant calls to this function.
Specified by: isDeterministic in interface FunctionDefinition
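To make the effect of the determinism flag on constant expression reduction concrete, here is a minimal plain-Java sketch. It assumes a toy "planner" that pre-computes a constant call once when the function reports itself as deterministic and defers evaluation to runtime otherwise. `FunctionSketch` and `ConstantReductionSketch` are hypothetical names and do not correspond to Flink's planner classes.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical stand-in for a function definition that carries a determinism flag. */
class FunctionSketch {
    private final Supplier<Long> body;
    private final boolean deterministic;

    FunctionSketch(Supplier<Long> body, boolean deterministic) {
        this.body = body;
        this.deterministic = deterministic;
    }

    boolean isDeterministic() {
        return deterministic;
    }

    Long call() {
        return body.get();
    }
}

/** Toy planner: folds a constant call at planning time only if the function is deterministic. */
class ConstantReductionSketch {

    static Supplier<Long> plan(FunctionSketch fn) {
        if (fn.isDeterministic()) {
            Long folded = fn.call(); // evaluated once, during planning
            return () -> folded;
        }
        return fn::call; // evaluated again on every runtime call
    }

    // Counts how often the function body runs for a given number of runtime calls.
    static int countEvaluations(boolean deterministic, int runtimeCalls) {
        AtomicInteger evaluations = new AtomicInteger();
        FunctionSketch fn = new FunctionSketch(() -> {
            evaluations.incrementAndGet();
            return 42L;
        }, deterministic);
        Supplier<Long> planned = plan(fn);
        for (int i = 0; i < runtimeCalls; i++) {
            planned.get();
        }
        return evaluations.get();
    }

    public static void main(String[] args) {
        System.out.println(countEvaluations(true, 3));  // prints 1
        System.out.println(countEvaluations(false, 3)); // prints 3
    }
}
```

In this sketch a function such as now() would have to report false; otherwise the value computed during planning would be reused for every later call.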
public TypeInformation getResultType()
Description copied from class: ImperativeAggregateFunction
Returns the TypeInformation of the ImperativeAggregateFunction's result.
Overrides: getResultType in class ImperativeAggregateFunction
Returns: The TypeInformation of the ImperativeAggregateFunction's result or null if the result type should be automatically inferred.
public TypeInformation getAccumulatorType()
Description copied from class: ImperativeAggregateFunction
Returns the TypeInformation of the ImperativeAggregateFunction's accumulator.
Overrides: getAccumulatorType in class ImperativeAggregateFunction
Returns: The TypeInformation of the ImperativeAggregateFunction's accumulator or null if the accumulator type should be automatically inferred.
public TypeInference getTypeInference(DataTypeFactory typeFactory)
Description copied from class: UserDefinedFunction
Returns the logic for performing type inference of a call to this function definition.
The type inference process is responsible for inferring unknown types of input arguments, validating input arguments, and producing result types. The type inference process happens independently of a function body. The output of the type inference is used to search for a corresponding runtime implementation.
Instances of type inference can be created by using TypeInference.newBuilder().
See BuiltInFunctionDefinitions for concrete usage examples.
The type inference for user-defined functions is automatically extracted using reflection. It does this by analyzing implementation methods such as eval() or accumulate() and the generic parameters of a function class if present. If the reflective information is not sufficient, it can be supported and enriched with DataTypeHint and FunctionHint annotations.
Note: Overriding this method is only recommended for advanced users. If a custom type inference is specified, it is the responsibility of the implementer to make sure that the output of the type inference process matches the implementation method:
The implementation method must comply with each DataType.getConversionClass() returned by the type inference. For example, if DataTypes.TIMESTAMP(3).bridgedTo(java.sql.Timestamp.class) is an expected argument type, the method must accept a call eval(java.sql.Timestamp).
Regular Java calling semantics (including type widening and autoboxing) are applied when calling an implementation method, which means that the signature can be eval(java.lang.Object).
The runtime will take care of converting the data to the data format specified by the DataType.getConversionClass() coming from the type inference logic.
Specified by: getTypeInference in interface FunctionDefinition
Overrides: getTypeInference in class AggregateFunction
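The point about regular Java calling semantics can be illustrated without any Flink dependency. The sketch below assumes a hypothetical class `CallingSemanticsSketch` with one narrow and one wide method; it only shows that a method declared with java.lang.Object also accepts a java.sql.Timestamp (reference widening) and an int (autoboxing), which is why such a signature can satisfy the conversion class requirement.

```java
import java.sql.Timestamp;

/** Hypothetical class; the method names mirror eval() but this is not a Flink function. */
class CallingSemanticsSketch {

    // Narrow signature: matches a conversion class of java.sql.Timestamp exactly.
    static String evalExact(Timestamp value) {
        return value.getClass().getName();
    }

    // Wide signature: accepts any reference type, including Timestamp and boxed primitives.
    static String eval(Object value) {
        return value.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(evalExact(new Timestamp(0L))); // prints java.sql.Timestamp
        System.out.println(eval(new Timestamp(0L)));      // prints java.sql.Timestamp
        System.out.println(eval(42));                     // prints java.lang.Integer
    }
}
```

The wide signature is more permissive, but the method body then has to handle whatever runtime class actually arrives.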
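public String toString()
Description copied from class: UserDefinedFunction
Returns the name of the UDF that is used for plan explanation and logging.
Overrides: toString in class UserDefinedFunction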
Copyright © 2014–2024 The Apache Software Foundation. All rights reserved.