public class JoinDataSet<L,R> extends DataSet<scala.Tuple2<L,R>> implements JoinFunctionAssigner<L,R>
DataSet
that results from a join
operation. The result of a default join is a
tuple containing the two values from the two sides of the join. The result of the join can be
changed by specifying a custom join function using the apply
method or by providing a
RichFlatJoinFunction
.
Example:
val left = ...
val right = ...
val joinResult = left.join(right).where(0, 2).equalTo(0, 1) {
(left, right) => new MyJoinResult(left, right)
}
Or, using key selector functions with tuple data types:
val left = ...
val right = ...
val joinResult = left.join(right).where({_._1}).equalTo({_._1) {
(left, right) => new MyJoinResult(left, right)
}
Constructor and Description |
---|
JoinDataSet(JoinOperator.EquiJoin<L,R,scala.Tuple2<L,R>> defaultJoin,
DataSet<L> leftInput,
DataSet<R> rightInput,
Keys<L> leftKeys,
Keys<R> rightKeys) |
Modifier and Type | Method and Description |
---|---|
<O> DataSet<O> |
apply(FlatJoinFunction<L,R,O> joiner,
TypeInformation<O> evidence$5,
scala.reflect.ClassTag<O> evidence$6)
Creates a new
DataSet by passing each pair of joined values to the given function. |
<O> DataSet<O> |
apply(scala.Function2<L,R,O> fun,
TypeInformation<O> evidence$1,
scala.reflect.ClassTag<O> evidence$2)
Creates a new
DataSet where the result for each pair of joined elements is the result
of the given function. |
<O> DataSet<O> |
apply(scala.Function3<L,R,Collector<O>,scala.runtime.BoxedUnit> fun,
TypeInformation<O> evidence$3,
scala.reflect.ClassTag<O> evidence$4)
Creates a new
DataSet by passing each pair of joined values to the given function. |
<O> DataSet<O> |
apply(JoinFunction<L,R,O> fun,
TypeInformation<O> evidence$7,
scala.reflect.ClassTag<O> evidence$8)
Creates a new
DataSet by passing each pair of joined values to the given function. |
<K> Partitioner<K> |
getPartitioner()
Gets the custom partitioner used by this join, or null, if none is set.
|
<K> JoinDataSet<L,R> |
withPartitioner(Partitioner<K> partitioner,
TypeInformation<K> evidence$9) |
aggregate, aggregate, clean, coGroup, collect, combineGroup, combineGroup, count, cross, crossWithHuge, crossWithTiny, distinct, distinct, distinct, distinct, filter, filter, first, flatMap, flatMap, flatMap, fullOuterJoin, fullOuterJoin, getExecutionEnvironment, getParallelism, getType, groupBy, groupBy, groupBy, iterate, iterateDelta, iterateDelta, iterateDelta, iterateDelta, iterateWithTermination, javaSet, join, join, joinWithHuge, joinWithTiny, leftOuterJoin, leftOuterJoin, map, map, mapPartition, mapPartition, mapPartition, max, max, min, min, name, output, partitionByHash, partitionByHash, partitionByHash, partitionByRange, partitionByRange, partitionByRange, partitionCustom, partitionCustom, partitionCustom, print, print, printOnTaskManager, printToErr, printToErr, rebalance, reduce, reduce, reduceGroup, reduceGroup, reduceGroup, registerAggregator, rightOuterJoin, rightOuterJoin, setParallelism, sortPartition, sortPartition, sortPartition, sum, sum, union, withBroadcastSet, withForwardedFields, withForwardedFieldsFirst, withForwardedFieldsSecond, withParameters, write, writeAsCsv, writeAsText
public <O> DataSet<O> apply(scala.Function2<L,R,O> fun, TypeInformation<O> evidence$1, scala.reflect.ClassTag<O> evidence$2)
DataSet
where the result for each pair of joined elements is the result
of the given function.apply
in interface JoinFunctionAssigner<L,R>
public <O> DataSet<O> apply(scala.Function3<L,R,Collector<O>,scala.runtime.BoxedUnit> fun, TypeInformation<O> evidence$3, scala.reflect.ClassTag<O> evidence$4)
DataSet
by passing each pair of joined values to the given function.
The function can output zero or more elements using the Collector
which will form the
result.apply
in interface JoinFunctionAssigner<L,R>
public <O> DataSet<O> apply(FlatJoinFunction<L,R,O> joiner, TypeInformation<O> evidence$5, scala.reflect.ClassTag<O> evidence$6)
DataSet
by passing each pair of joined values to the given function.
The function can output zero or more elements using the Collector
which will form the
result.
A RichFlatJoinFunction
can be used to access the
broadcast variables and the RuntimeContext
.
apply
in interface JoinFunctionAssigner<L,R>
public <O> DataSet<O> apply(JoinFunction<L,R,O> fun, TypeInformation<O> evidence$7, scala.reflect.ClassTag<O> evidence$8)
DataSet
by passing each pair of joined values to the given function.
The function must output one value. The concatenation of those will be new the DataSet.
A RichJoinFunction
can be used to access the
broadcast variables and the RuntimeContext
.
apply
in interface JoinFunctionAssigner<L,R>
public <K> JoinDataSet<L,R> withPartitioner(Partitioner<K> partitioner, TypeInformation<K> evidence$9)
withPartitioner
in interface JoinFunctionAssigner<L,R>
public <K> Partitioner<K> getPartitioner()
Copyright © 2014–2017 The Apache Software Foundation. All rights reserved.