Class TaskStateSnapshot

  • All Implemented Interfaces:
    Serializable, CompositeStateHandle, StateObject

    public class TaskStateSnapshot
    extends Object
    implements CompositeStateHandle
    This class encapsulates state handles to the snapshots of all operator instances executed within one task. A task can run multiple operator instances as a result of operator chaining, and all operator instances from the chain can register their state under their operator id. Each operator instance is a physical execution responsible for processing a partition of the data that goes through a logical operator. This partitioning happens to parallelize execution of logical operators, e.g. distributing a map function.

    One instance of this class contains the information that one task sends to acknowledge a checkpoint request from the checkpoint coordinator. Tasks run operator instances in parallel, so the union of all TaskStateSnapshot instances that the checkpoint coordinator collects from all tasks represents the whole state of a job at the time of the checkpoint.

    This class should be renamed to TaskState once the old class of that name, which is kept for backwards compatibility, is removed.
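
    The relationship described above, one snapshot per task and the job state as the union of all snapshots, can be pictured with a small model. This is only a sketch: TaskSnapshot, totalStateByOperator and the String operator ids are hypothetical stand-ins, not the Flink classes, and the per-operator state is reduced to a size in bytes.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in types for illustration; these are not the Flink classes.
final class SnapshotUnionSketch {

    /** Stand-in for one task's snapshot: operator id -> state size of that operator instance. */
    static final class TaskSnapshot {
        final Map<String, Long> stateSizeByOperatorId = new HashMap<>();

        TaskSnapshot put(String operatorId, long stateSize) {
            stateSizeByOperatorId.put(operatorId, stateSize);
            return this;
        }
    }

    /** Combines the snapshots acknowledged by all tasks into per-operator totals. */
    static Map<String, Long> totalStateByOperator(List<TaskSnapshot> acknowledged) {
        Map<String, Long> totals = new HashMap<>();
        for (TaskSnapshot snapshot : acknowledged) {
            snapshot.stateSizeByOperatorId.forEach(
                    (operatorId, size) -> totals.merge(operatorId, size, Long::sum));
        }
        return totals;
    }

    public static void main(String[] args) {
        // Two parallel tasks, each running the operator chain source -> map.
        TaskSnapshot task0 = new TaskSnapshot().put("source", 10L).put("map", 5L);
        TaskSnapshot task1 = new TaskSnapshot().put("source", 20L).put("map", 7L);
        System.out.println(totalStateByOperator(List.of(task0, task1)));
    }
}
```

    Each task contributes one entry per chained operator, and only the combination of all tasks covers every partition of a logical operator.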

    See Also:
    Serialized Form
    • Constructor Detail

      • TaskStateSnapshot

        public TaskStateSnapshot()
      • TaskStateSnapshot

        public TaskStateSnapshot(int size,
                                 boolean isTaskFinished)
    • Method Detail

      • isTaskDeployedAsFinished

        public boolean isTaskDeployedAsFinished()
        Returns whether all the operators of the task are already finished on restoring.
      • isTaskFinished

        public boolean isTaskFinished()
        Returns whether all the operators of the task have called their finish methods.
      • getSubtaskStateByOperatorID

        @Nullable
        public OperatorSubtaskState getSubtaskStateByOperatorID(OperatorID operatorID)
        Returns the subtask state for the given operator id (or null if not contained).
      • putSubtaskStateByOperatorID

        public OperatorSubtaskState putSubtaskStateByOperatorID(@Nonnull OperatorID operatorID,
                                                                @Nonnull OperatorSubtaskState state)
        Maps the given operator id to the given subtask state. Returns the subtask state of the previous mapping if one existed, or null otherwise.
      • hasState

        public boolean hasState()
        Returns true if at least one OperatorSubtaskState in subtaskStatesByOperatorID has state.
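
        The three accessors above behave much like operations on a map keyed by operator id. The sketch below models only that documented behavior. SubtaskStateMapSketch is a hypothetical class, String stands in for OperatorID, and a Boolean ("does this operator instance hold state?") stands in for OperatorSubtaskState.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model; String and Boolean stand in for OperatorID and OperatorSubtaskState.
final class SubtaskStateMapSketch {
    // operator id -> whether that operator instance holds any state
    private final Map<String, Boolean> subtaskStatesByOperatorId = new HashMap<>();

    /** Returns the entry for the id, or null if the id is not contained. */
    Boolean get(String operatorId) {
        return subtaskStatesByOperatorId.get(operatorId);
    }

    /** Stores the mapping and returns the previous entry, or null if there was none. */
    Boolean put(String operatorId, boolean holdsState) {
        return subtaskStatesByOperatorId.put(operatorId, holdsState);
    }

    /** True if at least one stored entry holds state. */
    boolean hasState() {
        return subtaskStatesByOperatorId.containsValue(Boolean.TRUE);
    }

    public static void main(String[] args) {
        SubtaskStateMapSketch snapshot = new SubtaskStateMapSketch();
        System.out.println(snapshot.put("map", false)); // null: no previous mapping
        System.out.println(snapshot.hasState());        // false: the only entry is empty
        System.out.println(snapshot.put("map", true));  // false: the replaced entry
        System.out.println(snapshot.hasState());        // true
        System.out.println(snapshot.get("sink"));       // null: unknown operator id
    }
}
```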
      • discardState

        public void discardState()
                          throws Exception
        Description copied from interface: StateObject
        Discards the state referred to and solely owned by this handle, to free up resources in the persistent storage. This method is called when the state represented by this object will no longer be used.
        Specified by:
        discardState in interface StateObject
        Throws:
        Exception
      • getStateSize

        public long getStateSize()
        Description copied from interface: StateObject
        Returns the size of the state in bytes. If the size is not known, this method should return 0.

        The values produced by this method are only used for informational purposes and for metrics/monitoring. If this method returns wrong values, the checkpoints and recovery will still behave correctly. However, efficiency may be impacted (wrong space pre-allocation) and functionality that depends on metrics (like monitoring) will be impacted.

        Note for implementors: This method should not perform any I/O operations while obtaining the state size (hence it does not declare throwing an IOException). Instead, the state size should be stored in the state object, or should be computable from the state stored in this object. The reason is that this method is called frequently by several parts of the checkpointing mechanism, and issuing I/O requests from it would put a heavy load on the storage system at larger scale.

        Specified by:
        getStateSize in interface StateObject
        Returns:
        Size of the state in bytes.
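
        The note for implementors can be illustrated with a handle that records its size when the state is written, so that later size queries never touch storage. This is a minimal sketch under stated assumptions: StoredHandle and totalStateSize are hypothetical names, not part of the Flink API.

```java
import java.util.List;

// Hypothetical types for illustration; not the Flink StateObject implementations.
final class StateSizeSketch {

    /** Handle to persisted state whose size is captured once, at write time. */
    static final class StoredHandle {
        private final String location;
        private final long sizeInBytes; // 0 if the size is not known

        StoredHandle(String location, long sizeInBytes) {
            this.location = location;
            this.sizeInBytes = sizeInBytes;
        }

        String getLocation() {
            return location;
        }

        /** Answers from the stored field; no file system or network access. */
        long getStateSize() {
            return sizeInBytes;
        }
    }

    /** A composite size computed purely from the sizes stored in the children. */
    static long totalStateSize(List<StoredHandle> handles) {
        long total = 0L;
        for (StoredHandle handle : handles) {
            total += handle.getStateSize();
        }
        return total;
    }
}
```

        Because the total is derived from fields only, it can be queried repeatedly for metrics without generating any I/O.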
      • getCheckpointedSize

        public long getCheckpointedSize()
        Description copied from interface: CompositeStateHandle
        Returns the size in bytes of the data persisted during checkpoint execution. If incremental checkpointing is enabled, this value represents the incrementally persisted data size and is usually smaller than StateObject.getStateSize(). If the size is unknown, this method returns the same result as StateObject.getStateSize().
        Specified by:
        getCheckpointedSize in interface CompositeStateHandle
        Returns:
        The persisted data size during checkpoint execution in bytes.
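
        The fallback rule described above can be sketched as follows. The class name and the UNKNOWN sentinel are assumptions made for this illustration; how an actual handle tracks an unknown checkpointed size is not specified here.

```java
// Hypothetical model of the documented fallback; not a Flink class.
final class CheckpointedSizeSketch {
    /** Assumed marker for "checkpointed size not known" (illustration only). */
    static final long UNKNOWN = -1L;

    private final long stateSize;        // full size of the referenced state
    private final long checkpointedSize; // bytes persisted by this checkpoint, or UNKNOWN

    CheckpointedSizeSketch(long stateSize, long checkpointedSize) {
        this.stateSize = stateSize;
        this.checkpointedSize = checkpointedSize;
    }

    long getStateSize() {
        return stateSize;
    }

    /** Falls back to the full state size when the persisted size is unknown. */
    long getCheckpointedSize() {
        return checkpointedSize == UNKNOWN ? stateSize : checkpointedSize;
    }
}
```

        For example, an incremental checkpoint that uploads 64 bytes of a 1024-byte state would report 64 here and 1024 from getStateSize().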
      • registerSharedStates

        public void registerSharedStates(SharedStateRegistry stateRegistry,
                                         long checkpointID)
        Description copied from interface: CompositeStateHandle
        Register both newly created and already referenced shared states in the given SharedStateRegistry. This method is called when the checkpoint successfully completes or is recovered from failures.

        After this is completed, newly created shared state is considered published and is no longer owned by this handle. This means that it should no longer be deleted as part of calls to StateObject.discardState(). Instead, StateObject.discardState() will trigger an unregistration from the registry.

        Specified by:
        registerSharedStates in interface CompositeStateHandle
        Parameters:
        stateRegistry - The registry where shared states are registered.
        checkpointID - The ID of the checkpoint that uses the shared states.
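
        The ownership hand-over described above can be modeled roughly as follows. Registry and Handle are hypothetical, heavily simplified stand-ins: the real SharedStateRegistry keeps more bookkeeping than a set of keys, and the checkpoint id is omitted from the sketch.

```java
import java.util.HashSet;
import java.util.Set;

// Simplified ownership model; not the Flink SharedStateRegistry or its handles.
final class SharedStateOwnershipSketch {

    /** Hypothetical registry that remembers which shared-state keys are published. */
    static final class Registry {
        final Set<String> registered = new HashSet<>();
    }

    /** Hypothetical handle with one private and one shared piece of state. */
    static final class Handle {
        final String privateKey;
        final String sharedKey;
        final Set<String> deleted = new HashSet<>(); // records what discardState() deleted
        private Registry registry;                   // null until the state is published

        Handle(String privateKey, String sharedKey) {
            this.privateKey = privateKey;
            this.sharedKey = sharedKey;
        }

        /** Publishes the shared state; from now on the handle no longer owns it. */
        void registerSharedStates(Registry stateRegistry) {
            stateRegistry.registered.add(sharedKey);
            this.registry = stateRegistry;
        }

        void discardState() {
            deleted.add(privateKey); // private state is always owned by the handle
            if (registry == null) {
                deleted.add(sharedKey); // never published: still owned, so delete it
            } else {
                registry.registered.remove(sharedKey); // published: only unregister
            }
        }
    }
}
```

        In this model a handle that is discarded before registration cleans up everything it created, while a registered handle leaves the fate of the shared state to the registry.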
      • hashCode

        public int hashCode()
        Overrides:
        hashCode in class Object