@Internal public class DefaultOperatorStateBackend extends Object implements OperatorStateBackend
Modifier and Type | Field and Description |
---|---|
static String | DEFAULT_OPERATOR_STATE_NAME: The default namespace for state in cases where no state name is provided. |
Constructor and Description |
---|
DefaultOperatorStateBackend(ExecutionConfig executionConfig, CloseableRegistry closeStreamOnCancelRegistry, Map<String,PartitionableListState<?>> registeredOperatorStates, Map<String,BackendWritableBroadcastState<?,?>> registeredBroadcastStates, Map<String,PartitionableListState<?>> accessedStatesByName, Map<String,BackendWritableBroadcastState<?,?>> accessedBroadcastStatesByName, AbstractSnapshotStrategy<OperatorStateHandle> snapshotStrategy) |
Modifier and Type | Method and Description |
---|---|
void | close() |
void | dispose(): Disposes the object and releases all resources. |
<K,V> BroadcastState<K,V> | getBroadcastState(MapStateDescriptor<K,V> stateDescriptor): Creates (or restores) a broadcast state. |
ExecutionConfig | getExecutionConfig() |
<S> ListState<S> | getListState(ListStateDescriptor<S> stateDescriptor): Creates (or restores) a list state. |
<S> ListState<S> | getOperatorState(ListStateDescriptor<S> stateDescriptor): Deprecated. This was deprecated as part of a refinement to the function names. Please use getListState(ListStateDescriptor) instead. |
Set<String> | getRegisteredBroadcastStateNames(): Returns a set with the names of all currently registered broadcast states. |
Set<String> | getRegisteredStateNames(): Returns a set with the names of all currently registered states. |
<T extends Serializable> ListState<T> | getSerializableListState(String stateName): Deprecated. Using Java serialization for persisting state is not encouraged. Please use getListState(ListStateDescriptor) instead. |
<S> ListState<S> | getUnionListState(ListStateDescriptor<S> stateDescriptor): Creates (or restores) a list state. |
RunnableFuture<SnapshotResult<OperatorStateHandle>> | snapshot(long checkpointId, long timestamp, CheckpointStreamFactory streamFactory, CheckpointOptions checkpointOptions): Operation that writes a snapshot into a stream that is provided by the given CheckpointStreamFactory and returns a RunnableFuture that gives a state handle to the snapshot. |
public static final String DEFAULT_OPERATOR_STATE_NAME
public DefaultOperatorStateBackend(ExecutionConfig executionConfig, CloseableRegistry closeStreamOnCancelRegistry, Map<String,PartitionableListState<?>> registeredOperatorStates, Map<String,BackendWritableBroadcastState<?,?>> registeredBroadcastStates, Map<String,PartitionableListState<?>> accessedStatesByName, Map<String,BackendWritableBroadcastState<?,?>> accessedBroadcastStatesByName, AbstractSnapshotStrategy<OperatorStateHandle> snapshotStrategy)
public ExecutionConfig getExecutionConfig()
public Set<String> getRegisteredStateNames()
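Description copied from interface: OperatorStateStore
Returns a set with the names of all currently registered states.
Specified by: getRegisteredStateNames in interface OperatorStateStore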
public Set<String> getRegisteredBroadcastStateNames()
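Description copied from interface: OperatorStateStore
Returns a set with the names of all currently registered broadcast states.
Specified by: getRegisteredBroadcastStateNames in interface OperatorStateStore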
public void close() throws IOException
Specified by: close in interface Closeable
Specified by: close in interface AutoCloseable
Throws: IOException
public void dispose()
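Description copied from interface: Disposable
Disposes the object and releases all resources.
Specified by: dispose in interface OperatorStateBackend
Specified by: dispose in interface Disposable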
public <K,V> BroadcastState<K,V> getBroadcastState(MapStateDescriptor<K,V> stateDescriptor) throws StateMigrationException
Description copied from interface: OperatorStateStore
Creates (or restores) a broadcast state. This type of state can only be created to store the state of a BroadcastStream. Each state is registered under a unique name. The provided serializer is used to de/serialize the state in case of checkpointing (snapshot/restore). The returned broadcast state has key-value format.
CAUTION: the user has to guarantee that all task instances store the same elements in this type of state.
Each operator instance individually maintains and stores elements in the broadcast state. Because the incoming stream is a broadcast stream, all instances see all the elements. Upon recovery or rescaling, the same state is given to each of the instances. To avoid hotspots, each task reads its previous partition, and if there are more tasks (scale-up), the new instances read from the old instances in a round-robin fashion. This is why each instance has to guarantee that it stores the same elements as the rest. If it does not, recovery or rescaling may redistribute the partitions unpredictably and therefore produce unpredictable results.
Specified by: getBroadcastState in interface OperatorStateStore
Type Parameters:
K - The type of the keys in the broadcast state.
V - The type of the values in the broadcast state.
Parameters:
stateDescriptor - The descriptor for this state, providing a name, a serializer for the keys and one for the values.
Throws: StateMigrationException
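The restore behaviour described for getBroadcastState can be illustrated with a small, self-contained model. This is only a sketch and not Flink's internal restore code: the class BroadcastRestoreSketch and its methods sourcePartition and restore are hypothetical names, and the modulo rule is an assumed reading of "round-robin" from the description above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative model only; names and the modulo rule are assumptions, not Flink internals.
class BroadcastRestoreSketch {

    // Hypothetical helper: index of the old partition a task reads after rescaling.
    // Tasks that existed before read their own partition; additional tasks wrap around.
    static int sourcePartition(int newTaskIndex, int oldParallelism) {
        return newTaskIndex % oldParallelism;
    }

    // Gives every new task a copy of the old partition it reads from.
    static <K, V> List<Map<K, V>> restore(List<Map<K, V>> oldPartitions, int newParallelism) {
        List<Map<K, V>> restored = new ArrayList<>();
        for (int task = 0; task < newParallelism; task++) {
            int source = sourcePartition(task, oldPartitions.size());
            restored.add(Map.copyOf(oldPartitions.get(source)));
        }
        return restored;
    }

    public static void main(String[] args) {
        // Old instance 1 violated the contract and stored a different value for "rule".
        List<Map<String, Integer>> old = List.of(Map.of("rule", 1), Map.of("rule", 2));
        // After scaling from 2 to 4 tasks, the inconsistency is copied to the new tasks.
        System.out.println(restore(old, 4));
    }
}
```

In this model, tasks 0 and 2 restore one version of the state and tasks 1 and 3 the other, which is the kind of divergence the CAUTION note warns about when instances do not store identical elements.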
public <S> ListState<S> getListState(ListStateDescriptor<S> stateDescriptor) throws Exception
Description copied from interface: OperatorStateStore
Creates (or restores) a list state.
Note the semantic differences between an operator list state and a keyed list state (see KeyedStateStore.getListState(ListStateDescriptor)). In the context of operator state, the list is a collection of state items that are independent of each other and eligible for redistribution across operator instances when the operator parallelism changes. In other words, these state items are the finest granularity at which non-keyed state can be redistributed, and should not be correlated with each other.
The redistribution scheme of this list state upon operator rescaling is a round-robin pattern: the logical whole state (a concatenation of all the lists of state elements previously managed by each operator before the restore) is evenly divided into as many sublists as there are parallel operators.
Specified by: getListState in interface OperatorStateStore
Type Parameters:
S - The generic type of the state.
Parameters:
stateDescriptor - The descriptor for this state, providing a name and serializer.
Throws: Exception
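The round-robin scheme described for getListState can be modelled in a few lines of plain Java. This is a sketch under assumptions: RoundRobinRedistributionSketch and redistribute are hypothetical names, and dealing item j to operator j modulo the new parallelism is one plausible reading of "round-robin"; the exact assignment order used by Flink may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only; not Flink's repartitioning code.
class RoundRobinRedistributionSketch {

    // Concatenates the old sublists and deals the items out to the new operators in turn.
    static <S> List<List<S>> redistribute(List<List<S>> oldSublists, int newParallelism) {
        // The "logical whole state": all old lists concatenated.
        List<S> whole = new ArrayList<>();
        for (List<S> sublist : oldSublists) {
            whole.addAll(sublist);
        }
        List<List<S>> result = new ArrayList<>();
        for (int i = 0; i < newParallelism; i++) {
            result.add(new ArrayList<>());
        }
        // Assumed assignment rule: item j goes to operator j % newParallelism.
        for (int j = 0; j < whole.size(); j++) {
            result.get(j % newParallelism).add(whole.get(j));
        }
        return result;
    }

    public static void main(String[] args) {
        // Two old operators holding 3 and 2 items, rescaled to 3 operators.
        System.out.println(redistribute(List.of(List.of(1, 2, 3), List.of(4, 5)), 3));
    }
}
```

Because any item may end up on any operator, the model also shows why the items should not be correlated with each other.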
public <S> ListState<S> getUnionListState(ListStateDescriptor<S> stateDescriptor) throws Exception
Description copied from interface: OperatorStateStore
Creates (or restores) a list state.
Note the semantic differences between an operator list state and a keyed list state (see KeyedStateStore.getListState(ListStateDescriptor)). In the context of operator state, the list is a collection of state items that are independent of each other and eligible for redistribution across operator instances when the operator parallelism changes. In other words, these state items are the finest granularity at which non-keyed state can be redistributed, and should not be correlated with each other.
The redistribution scheme of this list state upon operator rescaling is a broadcast pattern: the logical whole state (a concatenation of all the lists of state elements previously managed by each operator before the restore) is restored to all parallel operators, so that each of them gets the union of all state items before the restore.
Specified by: getUnionListState in interface OperatorStateStore
Type Parameters:
S - The generic type of the state.
Parameters:
stateDescriptor - The descriptor for this state, providing a name and serializer.
Throws: Exception
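For contrast with the round-robin scheme, the union (broadcast) scheme described for getUnionListState can be sketched the same way. UnionRedistributionSketch and redistribute are hypothetical names used only for illustration; this is a model of the described behaviour, not Flink's implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only; not Flink's repartitioning code.
class UnionRedistributionSketch {

    // Every new operator receives the concatenation of all old sublists.
    static <S> List<List<S>> redistribute(List<List<S>> oldSublists, int newParallelism) {
        List<S> union = new ArrayList<>();
        for (List<S> sublist : oldSublists) {
            union.addAll(sublist);
        }
        List<List<S>> result = new ArrayList<>();
        for (int i = 0; i < newParallelism; i++) {
            // Each operator gets its own copy of the complete state.
            result.add(new ArrayList<>(union));
        }
        return result;
    }

    public static void main(String[] args) {
        // Two old operators holding 3 and 2 items, rescaled to 3 operators.
        System.out.println(redistribute(List.of(List.of(1, 2, 3), List.of(4, 5)), 3));
    }
}
```

In this model the restored state per operator grows with the total number of items, so each operator would typically need to pick out the items it is responsible for after the restore.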
@Deprecated public <S> ListState<S> getOperatorState(ListStateDescriptor<S> stateDescriptor) throws Exception
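Deprecated. This was deprecated as part of a refinement to the function names. Please use getListState(ListStateDescriptor) instead.
Description copied from interface: OperatorStateStore
The items in the list are repartitionable by the system in case of changed operator parallelism.
Specified by: getOperatorState in interface OperatorStateStore
Type Parameters:
S - The generic type of the state.
Parameters:
stateDescriptor - The descriptor for this state, providing a name and serializer.
Throws: Exception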
@Deprecated public <T extends Serializable> ListState<T> getSerializableListState(String stateName) throws Exception
Deprecated. Using Java serialization for persisting state is not encouraged. Please use getListState(ListStateDescriptor) instead.
Description copied from interface: OperatorStateStore
This is a simple convenience method. For more flexibility on how state serialization should happen, use the OperatorStateStore.getListState(ListStateDescriptor) method.
Specified by: getSerializableListState in interface OperatorStateStore
Parameters:
stateName - The name of the state to create.
Throws: Exception
@Nonnull public RunnableFuture<SnapshotResult<OperatorStateHandle>> snapshot(long checkpointId, long timestamp, @Nonnull CheckpointStreamFactory streamFactory, @Nonnull CheckpointOptions checkpointOptions) throws Exception
Description copied from interface: SnapshotStrategy
Operation that writes a snapshot into a stream that is provided by the given CheckpointStreamFactory and returns a RunnableFuture that gives a state handle to the snapshot. It is up to the implementation whether the operation is performed synchronously or asynchronously. In the latter case, the returned Runnable must be executed first before obtaining the handle.
Specified by: snapshot in interface SnapshotStrategy<SnapshotResult<OperatorStateHandle>>
Parameters:
checkpointId - The ID of the checkpoint.
timestamp - The timestamp of the checkpoint.
streamFactory - The factory that we can use for writing our state to streams.
checkpointOptions - Options for how to perform this checkpoint.
Returns: A runnable future that will yield a StateObject.
Throws: Exception
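The requirement that the returned Runnable be executed before the handle is obtained follows from how java.util.concurrent.RunnableFuture works. The sketch below uses only JDK classes; RunnableFutureSketch, its snapshot stand-in and the String "handle" are hypothetical placeholders, not the real backend or a real OperatorStateHandle.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;
import java.util.concurrent.RunnableFuture;

// Illustrative model only; the String result stands in for a state handle.
class RunnableFutureSketch {

    // Hypothetical stand-in for snapshot(...): the work is wrapped, not yet performed.
    static RunnableFuture<String> snapshot(long checkpointId) {
        return new FutureTask<String>(() -> "handle-for-checkpoint-" + checkpointId);
    }

    // Runs the deferred part first, then fetches the result.
    static String runAndGet(RunnableFuture<String> future) {
        future.run();
        try {
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    public static void main(String[] args) {
        RunnableFuture<String> future = snapshot(42L);
        // Nothing has executed yet; calling get() at this point would block.
        System.out.println("done before run(): " + future.isDone());
        System.out.println(runAndGet(future));
    }
}
```

A caller may also hand the future to an executor instead of calling run() directly; either way, get() only returns once the wrapped operation has been executed.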
Copyright © 2014–2020 The Apache Software Foundation. All rights reserved.