@Internal public enum SubtaskStateMapper extends Enum<SubtaskStateMapper>
SubtaskStateMapper
narrows down the subtasks that need to be read during rescaling to
recover from a particular subtask when in-flight data has been stored in the checkpoint.
Mappings of old subtasks to new subtasks may be unique or non-unique. A unique assignment
means that a particular old subtask is only assigned to exactly one new subtask. Non-unique
assignments require filtering downstream. That means that the receiver side has to cross-verify
for a deserialized record if it truly belongs to the new subtask or not. Most SubtaskStateMapper
will only produce unique assignments and are thus optimal. Some rescaler,
such as RANGE
, create a mixture of unique and non-unique mappings, where downstream
tasks need to filter on some mapped subtasks.
Enum Constant and Description |
---|
ARBITRARY
Extra state is redistributed to other subtasks without any specific guarantee (only that up-
and downstream are matched).
|
FIRST
Restores extra subtasks to the first subtask.
|
FULL
Replicates the state to all subtasks.
|
RANGE
Remaps old ranges to new ranges.
|
ROUND_ROBIN
Redistributes subtask state in a round robin fashion.
|
UNSUPPORTED |
Modifier and Type | Method and Description |
---|---|
RescaleMappings |
getNewToOldSubtasksMapping(int oldParallelism,
int newParallelism)
Returns a mapping new subtask index to all old subtask indexes.
|
abstract int[] |
getOldSubtasks(int newSubtaskIndex,
int oldNumberOfSubtasks,
int newNumberOfSubtasks)
Returns all old subtask indexes that need to be read to restore all buffers for the given new
subtask index on rescale.
|
boolean |
isAmbiguous()
Returns true iff this mapper can potentially lead to ambiguous mappings where the different
new subtasks map to the same old subtask.
|
static SubtaskStateMapper |
valueOf(String name)
Returns the enum constant of this type with the specified name.
|
static SubtaskStateMapper[] |
values()
Returns an array containing the constants of this enum type, in
the order they are declared.
|
public static final SubtaskStateMapper ARBITRARY
public static final SubtaskStateMapper FIRST
public static final SubtaskStateMapper FULL
This strategy should only be used as a fallback.
public static final SubtaskStateMapper RANGE
Example:
old assignment: 0 -> [0;43); 1 -> [43;87); 2 -> [87;128)
new assignment: 0 -> [0;64]; 1 -> [64;128)
subtask 0 recovers data from old subtask 0 + 1 and subtask 1 recovers data from old subtask 1
+ 2
For all downscale from n to [n-1 .. n/2], each new subtasks get exactly two old subtasks assigned.
For all upscale from n to [n+1 .. 2*n-1], most subtasks get two old subtasks assigned, except the two outermost.
Larger scale factors (<n/2
, >2*n
), will increase the number of old
subtasks accordingly. However, they will also create more unique assignment, where an old
subtask is exclusively assigned to a new subtask. Thus, the number of non-unique mappings is
upper bound by 2*n.
public static final SubtaskStateMapper ROUND_ROBIN
newIndex ->
oldIndexes
. The mapping is accessed by using Bitset oldIndexes =
mapping.get(newIndex)
.
For oldParallelism < newParallelism
, that mapping is trivial. For example if
oldParallelism = 6 and newParallelism = 10.
New index | Old indexes |
0 | 0 |
1 | 1 |
... | |
5 | 5 |
6 | |
... | |
9 |
For oldParallelism > newParallelism
, new indexes get multiple assignments by
wrapping around assignments in a round-robin fashion. For example if oldParallelism = 10 and
newParallelism = 4.
New index | Old indexes |
0 | 0, 4, 8 |
1 | 1, 5, 9 |
2 | 2, 6 |
3 | 3, 7 |
public static final SubtaskStateMapper UNSUPPORTED
public static SubtaskStateMapper[] values()
for (SubtaskStateMapper c : SubtaskStateMapper.values()) System.out.println(c);
public static SubtaskStateMapper valueOf(String name)
name
- the name of the enum constant to be returned.IllegalArgumentException
- if this enum type has no constant with the specified nameNullPointerException
- if the argument is nullpublic abstract int[] getOldSubtasks(int newSubtaskIndex, int oldNumberOfSubtasks, int newNumberOfSubtasks)
public RescaleMappings getNewToOldSubtasksMapping(int oldParallelism, int newParallelism)
public boolean isAmbiguous()
Copyright © 2014–2024 The Apache Software Foundation. All rights reserved.