T - The type of the elements in the list of the original untransformed broadcast variable.
O - The type of the transformed broadcast variable.

@Public
@FunctionalInterface
public interface BroadcastVariableInitializer<T,O>
In many cases, the broadcast variable initializer is executed by only one parallel function instance per TaskManager: the instance that first acquires the broadcast variable inside that TaskManager. A broadcast variable may nevertheless be read and initialized multiple times if the tasks that use it do not overlap in their execution time. In such cases, one function instance may release the broadcast variable before another function instance materializes it again.
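The sharing behavior described above can be pictured with a small reference-counting sketch. This is not Flink's implementation: the class `SharedVariableSketch` and its `acquire`, `release`, and `initializationCount` methods are hypothetical names introduced only for illustration, and a plain `java.util.function.Function` stands in for the initializer.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical model of one broadcast variable inside one TaskManager.
// It is an illustration of the described behavior, not Flink code.
final class SharedVariableSketch<T, O> {
    private final List<T> rawData;
    private final Function<Iterable<T>, O> initializer; // stand-in for the initializer
    private O materialized;      // dropped once no function instance uses the variable
    private int references;      // function instances currently holding the variable
    private int initializations; // how many times the initializer has run

    SharedVariableSketch(List<T> rawData, Function<Iterable<T>, O> initializer) {
        this.rawData = rawData;
        this.initializer = initializer;
    }

    // The first acquirer runs the initializer; overlapping acquirers share the result.
    synchronized O acquire() {
        if (references == 0) {
            materialized = initializer.apply(rawData);
            initializations++;
        }
        references++;
        return materialized;
    }

    // When the last holder releases, the transformed value is discarded.
    synchronized void release() {
        if (references == 0) {
            throw new IllegalStateException("release without matching acquire");
        }
        references--;
        if (references == 0) {
            materialized = null;
        }
    }

    synchronized int initializationCount() {
        return initializations;
    }
}
```

In this model, two instances that overlap in time trigger a single initialization, while an instance that acquires the variable after all others have released it triggers a second one. An initializer should therefore not rely on being called exactly once.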
This is an example of how to use a broadcast variable initializer, transforming the shared list of values into a shared map.
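import java.util.HashMap;
import java.util.Map;

import org.apache.flink.api.common.functions.BroadcastVariableInitializer;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;

public class MyFunction extends RichMapFunction<Long, String> {

    private Map<Long, String> map;

    @Override
    public void open(Configuration cfg) throws Exception {
        // store the transformed broadcast variable so that map() can use it
        this.map = getRuntimeContext().getBroadcastVariableWithInitializer("mapvar",
            new BroadcastVariableInitializer<Tuple2<Long, String>, Map<Long, String>>() {

                @Override
                public Map<Long, String> initializeBroadcastVariable(Iterable<Tuple2<Long, String>> data) {
                    Map<Long, String> map = new HashMap<>();
                    for (Tuple2<Long, String> t : data) {
                        map.put(t.f0, t.f1);
                    }
                    return map;
                }
            });
    }

    @Override
    public String map(Long value) {
        // replace the long by the String, based on the map
        return map.get(value);
    }
}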
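Because the interface is annotated with @FunctionalInterface, the same list-to-map transformation can also be written as a lambda. The sketch below is self-contained so that it compiles without Flink on the classpath: `InitializerSketch` is a hypothetical local stand-in that mirrors the single method of the interface, and `Map.Entry` stands in for `Tuple2`. Passing such a lambda to `getBroadcastVariableWithInitializer` is assumed to work in the same way as passing the anonymous class.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in with the same single-method shape as the real interface.
@FunctionalInterface
interface InitializerSketch<T, O> {
    O initializeBroadcastVariable(Iterable<T> data);
}

final class LambdaInitializerExample {

    // The list-to-map transformation from the example above, written as a lambda.
    static final InitializerSketch<Map.Entry<Long, String>, Map<Long, String>> TO_MAP = data -> {
        Map<Long, String> map = new HashMap<>();
        for (Map.Entry<Long, String> e : data) {
            map.put(e.getKey(), e.getValue());
        }
        return map;
    };

    // Applies the initializer to a small, made-up list of key/value pairs.
    static Map<Long, String> demo() {
        List<Map.Entry<Long, String>> broadcast =
            List.of(Map.entry(1L, "one"), Map.entry(2L, "two"));
        return TO_MAP.initializeBroadcastVariable(broadcast);
    }
}
```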
Modifier and Type | Method and Description |
---|---|
O | initializeBroadcastVariable(Iterable<T> data): The method that reads the data elements from the broadcast variable and creates the transformed data structure. |
O initializeBroadcastVariable(Iterable<T> data)
The iterable with the data elements can be traversed only once, i.e., only the first call to iterator() will succeed.
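The single-traversal contract means that an initializer should collect everything it needs in one pass. The following sketch illustrates this with a hypothetical `OneShotIterable` that rejects a second call to iterator(). The class names and the use of `IllegalStateException` are assumptions made for this illustration; the exception that the runtime actually throws may differ.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical iterable that imitates the traverse-once behavior described above.
final class OneShotIterable<T> implements Iterable<T> {
    private final List<T> elements;
    private boolean consumed;

    OneShotIterable(List<T> elements) {
        this.elements = elements;
    }

    @Override
    public Iterator<T> iterator() {
        if (consumed) {
            // assumption: the real runtime may signal this with a different exception
            throw new IllegalStateException("the data can be traversed only once");
        }
        consumed = true;
        return elements.iterator();
    }
}

final class SinglePassExample {

    // Copies the elements in a single pass; later processing uses the copy,
    // never a second iteration over the original data.
    static <T> List<T> copyOnce(Iterable<T> data) {
        List<T> copy = new ArrayList<>();
        for (T element : data) {
            copy.add(element);
        }
        return copy;
    }
}
```

An initializer that needs, for example, both the number of elements and a lookup structure would derive both from such a copy, or compute both inside the same loop.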
data - The sequence of elements in the broadcast variable.

Copyright © 2014–2024 The Apache Software Foundation. All rights reserved.