I1
- The type of the first input DataSet of the Join transformation.I2
- The type of the second input DataSet of the Join transformation.OUT
- The type of the result of the Join transformation.@Public public static class JoinOperator.ProjectJoin<I1,I2,OUT extends Tuple> extends JoinOperator.EquiJoin<I1,I2,OUT>
JoinOperator.DefaultJoin<I1,I2>, JoinOperator.EquiJoin<I1,I2,OUT>, JoinOperator.JoinOperatorSets<I1,I2>, JoinOperator.ProjectJoin<I1,I2,OUT extends Tuple>
joinType, keys1, keys2
minResources, name, parallelism, preferredResources
Modifier | Constructor and Description |
---|---|
protected |
ProjectJoin(DataSet<I1> input1,
DataSet<I2> input2,
Keys<I1> keys1,
Keys<I2> keys2,
JoinOperatorBase.JoinHint hint,
int[] fields,
boolean[] isFromFirst,
TupleTypeInfo<OUT> returnType) |
protected |
ProjectJoin(DataSet<I1> input1,
DataSet<I2> input2,
Keys<I1> keys1,
Keys<I2> keys2,
JoinOperatorBase.JoinHint hint,
int[] fields,
boolean[] isFromFirst,
TupleTypeInfo<OUT> returnType,
org.apache.flink.api.java.operators.JoinOperator.JoinProjection<I1,I2> joinProj) |
Modifier and Type | Method and Description |
---|---|
protected DualInputSemanticProperties |
extractSemanticAnnotationsFromUdf(Class<?> udfClass) |
protected org.apache.flink.api.java.operators.JoinOperator.ProjectFlatJoinFunction<I1,I2,OUT> |
getFunction() |
<OUT extends Tuple> |
projectFirst(int... firstFieldIndexes)
Continues a ProjectJoin transformation and adds fields of the first join input to the
projection.
|
<OUT extends Tuple> |
projectSecond(int... secondFieldIndexes)
Continues a ProjectJoin transformation and adds fields of the second join input to the
projection.
|
<OUT extends Tuple> |
types(Class<?>... types)
Deprecated.
Deprecated method only kept for compatibility.
|
JoinOperator<I1,I2,OUT> |
withForwardedFieldsFirst(String... forwardedFieldsFirst)
Adds semantic information about forwarded fields of the first input of the user-defined
function.
|
JoinOperator<I1,I2,OUT> |
withForwardedFieldsSecond(String... forwardedFieldsSecond)
Adds semantic information about forwarded fields of the second input of the user-defined
function.
|
getSemanticProperties, translateToDataFlow, udfWithForwardedFieldsFirstAnnotation, udfWithForwardedFieldsSecondAnnotation
getJoinHint, getJoinType, getKeys1, getKeys2, getPartitioner, withPartitioner
getAnalyzedUdfSemanticsFlag, getBroadcastSets, getParameters, returns, returns, returns, setAnalyzedUdfSemanticsFlag, setSemanticProperties, withBroadcastSet, withParameters
getInput1, getInput1Type, getInput2, getInput2Type
getMinResources, getName, getParallelism, getPreferredResources, getResultType, name, setParallelism
aggregate, checkSameExecutionContext, clean, coGroup, collect, combineGroup, count, cross, crossWithHuge, crossWithTiny, distinct, distinct, distinct, distinct, fillInType, filter, first, flatMap, fullOuterJoin, fullOuterJoin, getExecutionEnvironment, getType, groupBy, groupBy, groupBy, iterate, iterateDelta, join, join, joinWithHuge, joinWithTiny, leftOuterJoin, leftOuterJoin, map, mapPartition, max, maxBy, min, minBy, output, partitionByHash, partitionByHash, partitionByHash, partitionByRange, partitionByRange, partitionByRange, partitionCustom, partitionCustom, partitionCustom, print, print, printOnTaskManager, printToErr, printToErr, project, rebalance, reduce, reduceGroup, rightOuterJoin, rightOuterJoin, runOperation, sortPartition, sortPartition, sortPartition, sum, union, write, write, writeAsCsv, writeAsCsv, writeAsCsv, writeAsCsv, writeAsFormattedText, writeAsFormattedText, writeAsText, writeAsText
protected ProjectJoin(DataSet<I1> input1, DataSet<I2> input2, Keys<I1> keys1, Keys<I2> keys2, JoinOperatorBase.JoinHint hint, int[] fields, boolean[] isFromFirst, TupleTypeInfo<OUT> returnType)
protected org.apache.flink.api.java.operators.JoinOperator.ProjectFlatJoinFunction<I1,I2,OUT> getFunction()
getFunction
in class JoinOperator.EquiJoin<I1,I2,OUT extends Tuple>
public <OUT extends Tuple> JoinOperator.ProjectJoin<I1,I2,OUT> projectFirst(int... firstFieldIndexes)
If the first join input is a Tuple
DataSet
, fields can be selected by
their index. If the first join input is not a Tuple DataSet, no parameters should be
passed.
Additional fields of the first and second input can be added by chaining the method
calls of projectFirst(int...)
and
projectSecond(int...)
.
Note: With the current implementation, the Project transformation loses type information.
firstFieldIndexes
- If the first input is a Tuple DataSet, the indexes of the
selected fields. For a non-Tuple DataSet, do not provide parameters. The order of
fields in the output tuple is defined by to the order of field indexes.Tuple
,
DataSet
,
JoinOperator.ProjectJoin
public <OUT extends Tuple> JoinOperator.ProjectJoin<I1,I2,OUT> projectSecond(int... secondFieldIndexes)
If the second join input is a Tuple
DataSet
, fields can be selected by
their index. If the second join input is not a Tuple DataSet, no parameters should be
passed.
Additional fields of the first and second input can be added by chaining the method
calls of projectFirst(int...)
and
projectSecond(int...)
.
Note: With the current implementation, the Project transformation loses type information.
secondFieldIndexes
- If the second input is a Tuple DataSet, the indexes of the
selected fields. For a non-Tuple DataSet, do not provide parameters. The order of
fields in the output tuple is defined by to the order of field indexes.Tuple
,
DataSet
,
JoinOperator.ProjectJoin
@Deprecated @PublicEvolving public <OUT extends Tuple> JoinOperator<I1,I2,OUT> types(Class<?>... types)
types
- public JoinOperator<I1,I2,OUT> withForwardedFieldsFirst(String... forwardedFieldsFirst)
TwoInputUdfOperator
Fields that are forwarded at the same position are specified by their position. The
specified position must be valid for the input and output data type and have the same type.
For example withForwardedFieldsFirst("f2")
declares that the third field of a
Java input tuple from the first input is copied to the third field of an output tuple.
Fields which are unchanged copied from the first input to another position in the output
are declared by specifying the source field reference in the first input and the target field
reference in the output. withForwardedFieldsFirst("f0->f2")
denotes that the first
field of the first input Java tuple is unchanged copied to the third field of the Java output
tuple. When using a wildcard ("*") ensure that the number of declared fields and their types
in first input and output type match.
Multiple forwarded fields can be annotated in one (withForwardedFieldsFirst("f2;
f3->f0; f4")
) or separate Strings (withForwardedFieldsFirst("f2", "f3->f0", "f4")
).
Please refer to the JavaDoc of Function
or
Flink's documentation for details on field references such as nested fields and wildcard.
It is not possible to override existing semantic information about forwarded fields of the
first input which was for example added by a FunctionAnnotation.ForwardedFieldsFirst
class
annotation.
NOTE: Adding semantic information for functions is optional! If used correctly, semantic information can help the Flink optimizer to generate more efficient execution plans. However, incorrect semantic information can cause the optimizer to generate incorrect execution plans which compute wrong results! So be careful when adding semantic information.
withForwardedFieldsFirst
in class TwoInputUdfOperator<I1,I2,OUT extends Tuple,JoinOperator<I1,I2,OUT extends Tuple>>
forwardedFieldsFirst
- A list of forwarded field expressions for the first input of the
function.FunctionAnnotation
,
FunctionAnnotation.ForwardedFieldsFirst
public JoinOperator<I1,I2,OUT> withForwardedFieldsSecond(String... forwardedFieldsSecond)
TwoInputUdfOperator
Fields that are forwarded at the same position are specified by their position. The
specified position must be valid for the input and output data type and have the same type.
For example withForwardedFieldsSecond("f2")
declares that the third field of a
Java input tuple from the second input is copied to the third field of an output tuple.
Fields which are unchanged copied from the second input to another position in the output
are declared by specifying the source field reference in the second input and the target
field reference in the output. withForwardedFieldsSecond("f0->f2")
denotes that the
first field of the second input Java tuple is unchanged copied to the third field of the Java
output tuple. When using a wildcard ("*") ensure that the number of declared fields and their
types in second input and output type match.
Multiple forwarded fields can be annotated in one (withForwardedFieldsSecond("f2;
f3->f0; f4")
) or separate Strings (withForwardedFieldsSecond("f2", "f3->f0", "f4")
).
Please refer to the JavaDoc of Function
or
Flink's documentation for details on field references such as nested fields and wildcard.
It is not possible to override existing semantic information about forwarded fields of the
second input which was for example added by a FunctionAnnotation.ForwardedFieldsSecond
class
annotation.
NOTE: Adding semantic information for functions is optional! If used correctly, semantic information can help the Flink optimizer to generate more efficient execution plans. However, incorrect semantic information can cause the optimizer to generate incorrect execution plans which compute wrong results! So be careful when adding semantic information.
withForwardedFieldsSecond
in class TwoInputUdfOperator<I1,I2,OUT extends Tuple,JoinOperator<I1,I2,OUT extends Tuple>>
forwardedFieldsSecond
- A list of forwarded field expressions for the second input of
the function.FunctionAnnotation
,
FunctionAnnotation.ForwardedFieldsSecond
protected DualInputSemanticProperties extractSemanticAnnotationsFromUdf(Class<?> udfClass)
extractSemanticAnnotationsFromUdf
in class JoinOperator.EquiJoin<I1,I2,OUT extends Tuple>
Copyright © 2014–2023 The Apache Software Foundation. All rights reserved.