public class TaskStateSnapshot extends Object implements CompositeStateHandle
One instance of this class contains the information that one task sends to acknowledge a
checkpoint request from the checkpoint coordinator. Tasks run operator instances in parallel, so
the union of all TaskStateSnapshot instances that the checkpoint coordinator collects from
all tasks represents the whole state of a job at the time of the checkpoint.
This class should be called TaskState once the old class with this name that we keep for backwards compatibility goes away.
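The "union of per-task snapshots" idea can be illustrated with a small, self-contained sketch. The classes below are hypothetical stand-ins, not the Flink types: `TaskSnapshot` plays the role of one task's acknowledgement and maps an operator id (a plain `String` here, an assumption) to a state size, and `jobStateSize` aggregates over everything the coordinator has collected.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; these are NOT the Flink classes. */
public class SnapshotUnionSketch {

    /** Hypothetical per-task snapshot: operator id -> state size in bytes. */
    public static final class TaskSnapshot {
        private final Map<String, Long> stateSizeByOperatorId = new HashMap<>();

        /** Records the state size reported for one operator of this task. */
        public TaskSnapshot put(String operatorId, long sizeInBytes) {
            stateSizeByOperatorId.put(operatorId, sizeInBytes);
            return this;
        }

        /** Sums the sizes of all operator states held by this task. */
        public long stateSize() {
            long total = 0;
            for (long size : stateSizeByOperatorId.values()) {
                total += size;
            }
            return total;
        }
    }

    /** Aggregates over the snapshots acknowledged by all tasks of a job. */
    public static long jobStateSize(List<TaskSnapshot> acknowledged) {
        long total = 0;
        for (TaskSnapshot snapshot : acknowledged) {
            total += snapshot.stateSize();
        }
        return total;
    }

    public static void main(String[] args) {
        TaskSnapshot task0 = new TaskSnapshot().put("source", 10L).put("map", 5L);
        TaskSnapshot task1 = new TaskSnapshot().put("source", 12L).put("map", 7L);
        System.out.println(jobStateSize(Arrays.asList(task0, task1))); // prints 34
    }
}
```

Only the aggregation across tasks is modelled here; the real class additionally tracks finished-task flags and rescaling descriptors.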
Modifier and Type | Field and Description |
---|---|
static TaskStateSnapshot | FINISHED_ON_RESTORE |
Constructor and Description |
---|
TaskStateSnapshot() |
TaskStateSnapshot(int size, boolean isTaskFinished) |
TaskStateSnapshot(Map<OperatorID,OperatorSubtaskState> subtaskStatesByOperatorID) |
Modifier and Type | Method and Description |
---|---|
static TaskStateSnapshot | deserializeTaskStateSnapshot(SerializedValue<TaskStateSnapshot> subtaskState, ClassLoader classLoader) |
void | discardState() Discards the state referred to and solely owned by this handle, to free up resources in the persistent storage. |
boolean | equals(Object o) |
long | getCheckpointedSize() Returns the persisted data size during checkpoint execution in bytes. |
InflightDataRescalingDescriptor | getInputRescalingDescriptor() Returns the input channel mapping for rescaling with in-flight data or InflightDataRescalingDescriptor.NO_RESCALE. |
InflightDataRescalingDescriptor | getOutputRescalingDescriptor() Returns the output channel mapping for rescaling with in-flight data or InflightDataRescalingDescriptor.NO_RESCALE. |
long | getStateSize() Returns the size of the state in bytes. |
OperatorSubtaskState | getSubtaskStateByOperatorID(OperatorID operatorID) Returns the subtask state for the given operator id (or null if not contained). |
Set<Map.Entry<OperatorID,OperatorSubtaskState>> | getSubtaskStateMappings() Returns the set of all mappings from operator id to the corresponding subtask state. |
int | hashCode() |
boolean | hasState() Returns true if at least one OperatorSubtaskState in subtaskStatesByOperatorID has state. |
boolean | isTaskDeployedAsFinished() Returns whether all the operators of the task are already finished on restoring. |
boolean | isTaskFinished() Returns whether all the operators of the task have called finished methods. |
OperatorSubtaskState | putSubtaskStateByOperatorID(OperatorID operatorID, OperatorSubtaskState state) Maps the given operator id to the given subtask state. |
void | registerSharedStates(SharedStateRegistry stateRegistry, long checkpointID) Register both newly created and already referenced shared states in the given SharedStateRegistry. |
static SerializedValue<TaskStateSnapshot> | serializeTaskStateSnapshot(TaskStateSnapshot subtaskState) |
String | toString() |
public static final TaskStateSnapshot FINISHED_ON_RESTORE
public TaskStateSnapshot()
public TaskStateSnapshot(int size, boolean isTaskFinished)
public TaskStateSnapshot(Map<OperatorID,OperatorSubtaskState> subtaskStatesByOperatorID)
public boolean isTaskDeployedAsFinished()
public boolean isTaskFinished()
@Nullable public OperatorSubtaskState getSubtaskStateByOperatorID(OperatorID operatorID)
public OperatorSubtaskState putSubtaskStateByOperatorID(@Nonnull OperatorID operatorID, @Nonnull OperatorSubtaskState state)
public Set<Map.Entry<OperatorID,OperatorSubtaskState>> getSubtaskStateMappings()
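public boolean hasState()
Returns true if at least one OperatorSubtaskState in subtaskStatesByOperatorID has state.
public InflightDataRescalingDescriptor getInputRescalingDescriptor()
Returns the input channel mapping for rescaling with in-flight data or InflightDataRescalingDescriptor.NO_RESCALE.
public InflightDataRescalingDescriptor getOutputRescalingDescriptor()
Returns the output channel mapping for rescaling with in-flight data or InflightDataRescalingDescriptor.NO_RESCALE.
public void discardState() throws Exception
Description copied from interface: StateObject
Discards the state referred to and solely owned by this handle, to free up resources in the persistent storage.
Specified by: discardState in interface StateObject
Throws: Exception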
public long getStateSize()
Description copied from interface: StateObject
Returns the size of the state in bytes. If the size is not known, this method should return 0.
The values produced by this method are only used for informational purposes and for metrics/monitoring. If this method returns wrong values, checkpoints and recovery will still behave correctly. However, efficiency may be impacted (wrong space pre-allocation) and functionality that depends on metrics (like monitoring) will be impacted.
Note for implementors: This method should not perform any I/O operations while obtaining
the state size (hence it does not declare throwing an IOException). Instead, the
state size should be stored in the state object, or should be computable from the state
stored in this object. The reason is that this method is called frequently by several parts
of the checkpointing mechanism, and issuing I/O requests from this method accumulates a heavy
I/O load on the storage system at higher scale.
Specified by: getStateSize in interface StateObject
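The note for implementors can be made concrete with a minimal sketch. It assumes a hypothetical handle class (not part of Flink) that is told how many bytes were written when the state was persisted and simply stores that number, so that `getStateSize()` never touches the storage system. The `location` field and its value are illustrative placeholders.

```java
/** Illustrative sketch only; a hypothetical handle, NOT a Flink class. */
public final class CachedSizeHandleSketch {

    private final String location; // hypothetical pointer to the persisted bytes
    private final long stateSize;  // recorded once, when the state was written

    public CachedSizeHandleSketch(String location, long bytesWritten) {
        if (bytesWritten < 0) {
            throw new IllegalArgumentException("bytesWritten must be non-negative");
        }
        this.location = location;
        this.stateSize = bytesWritten;
    }

    /** Returns the stored size; performs no I/O and therefore declares no IOException. */
    public long getStateSize() {
        return stateSize;
    }

    public String getLocation() {
        return location;
    }
}
```

Because the size is a plain field, frequent calls from metrics or monitoring code cost nothing beyond a memory read.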
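public long getCheckpointedSize()
Description copied from interface: CompositeStateHandle
Returns the persisted data size during checkpoint execution in bytes. If incremental checkpointing is enabled, this value represents the incremental persisted data size and is usually smaller than StateObject.getStateSize(). If the size is unknown, this method returns the same result as StateObject.getStateSize().
Specified by: getCheckpointedSize in interface CompositeStateHandle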
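The fallback from the checkpointed size to the full state size can be sketched as follows. This is a hypothetical stand-in class, not Flink code; in particular the `UNKNOWN` sentinel and the factory method name are assumptions made purely for illustration.

```java
/** Illustrative sketch only; a hypothetical handle, NOT a Flink class. */
public final class CheckpointedSizeSketch {

    private static final long UNKNOWN = -1L; // hypothetical sentinel for "not tracked"

    private final long stateSize;
    private final long persistedDuringCheckpoint;

    public CheckpointedSizeSketch(long stateSize, long persistedDuringCheckpoint) {
        this.stateSize = stateSize;
        this.persistedDuringCheckpoint = persistedDuringCheckpoint;
    }

    /** Creates a handle for which the persisted size was not tracked. */
    public static CheckpointedSizeSketch withUnknownCheckpointedSize(long stateSize) {
        return new CheckpointedSizeSketch(stateSize, UNKNOWN);
    }

    public long getStateSize() {
        return stateSize;
    }

    /** Falls back to the full state size when the persisted size is unknown. */
    public long getCheckpointedSize() {
        return persistedDuringCheckpoint == UNKNOWN ? stateSize : persistedDuringCheckpoint;
    }
}
```

For example, a handle built with a state size of 100 bytes and 20 bytes persisted reports 20, while one built through `withUnknownCheckpointedSize(100)` reports 100.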
public void registerSharedStates(SharedStateRegistry stateRegistry, long checkpointID)
Description copied from interface: CompositeStateHandle
Register both newly created and already referenced shared states in the given SharedStateRegistry. This method is called when the checkpoint successfully completes or is
recovered from failures.
After this is completed, newly created shared state is considered published and is no
longer owned by this handle. This means that it should no longer be deleted as part of calls
to StateObject.discardState(). Instead, StateObject.discardState() will trigger an unregistration
from the registry.
Specified by: registerSharedStates in interface CompositeStateHandle
Parameters: stateRegistry - The registry where shared states are registered.
@Nullable public static SerializedValue<TaskStateSnapshot> serializeTaskStateSnapshot(TaskStateSnapshot subtaskState)
@Nullable public static TaskStateSnapshot deserializeTaskStateSnapshot(SerializedValue<TaskStateSnapshot> subtaskState, ClassLoader classLoader)
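Returning to registerSharedStates and discardState: the ownership hand-off they describe can be illustrated with a deliberately simplified sketch. `Registry` and `Handle` below are hypothetical stand-ins, not the Flink `SharedStateRegistry` or any real state handle, and the registry here only tracks a set of keys, which is an assumption rather than a description of the actual implementation.

```java
import java.util.HashSet;
import java.util.Set;

/** Illustrative sketch only; these are NOT the Flink classes. */
public class SharedStateOwnershipSketch {

    /** Hypothetical registry that only remembers which keys are registered. */
    public static final class Registry {
        private final Set<String> registered = new HashSet<>();

        public void register(String key) { registered.add(key); }
        public void unregister(String key) { registered.remove(key); }
        public boolean contains(String key) { return registered.contains(key); }
    }

    /** Hypothetical handle that refers to one piece of shared state. */
    public static final class Handle {
        private final String sharedStateKey;
        private Registry registry;       // null until the state has been published
        private boolean deletedByHandle; // true if the handle deleted the state itself

        public Handle(String sharedStateKey) {
            this.sharedStateKey = sharedStateKey;
        }

        /** Publishes the shared state; afterwards the handle no longer owns it. */
        public void registerSharedStates(Registry stateRegistry) {
            stateRegistry.register(sharedStateKey);
            this.registry = stateRegistry;
        }

        /** Deletes state that is still owned, otherwise only unregisters it. */
        public void discardState() {
            if (registry == null) {
                deletedByHandle = true;
            } else {
                registry.unregister(sharedStateKey);
            }
        }

        public boolean wasDeletedByHandle() {
            return deletedByHandle;
        }
    }
}
```

The point of the split is that state which other checkpoints may still reference is never deleted directly by one handle; only the registry decides when it can go away.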
Copyright © 2014–2023 The Apache Software Foundation. All rights reserved.