IN1
- The data type of the first input data stream.IN2
- The data type of the second input data stream.O
- The data type of the returned elements.@Public @FunctionalInterface public interface CoGroupFunction<IN1,IN2,O> extends Function, Serializable
DataStream
s by first
grouping each data stream after a key and then "joining" the groups by calling this function with
the two streams for each key. If a key is present in only one of the two inputs, it may be that
one of the groups is empty.
The basic syntax for using CoGroup on two data streams is as follows:
DataStream<X> stream1 = ...;
DataStream<Y> stream2 = ...;
stream1.coGroup(stream2)
.where(<key-definition>)
.equalTo(<key-definition>)
.window(<windowAssigner>)
.apply(new MyCoGroupFunction());
stream1
is here considered the first input, stream2
the second input.
Some keys may only be contained in one of the two original data streams. In that case, the CoGroup function is invoked with in empty input for the side of the data stream that did not contain elements with that specific key.
Modifier and Type | Method and Description |
---|---|
void |
coGroup(Iterable<IN1> first,
Iterable<IN2> second,
Collector<O> out)
This method must be implemented to provide a user implementation of a coGroup.
|
void coGroup(Iterable<IN1> first, Iterable<IN2> second, Collector<O> out) throws Exception
first
- The records from the first input.second
- The records from the second.out
- A collector to return elements.Exception
- The function may throw Exceptions, which will cause the program to cancel,
and may trigger the recovery logic.Copyright © 2014–2024 The Apache Software Foundation. All rights reserved.