public class DefaultVertexParallelismAndInputInfosDecider extends Object implements VertexParallelismAndInputInfosDecider
VertexParallelismAndInputInfosDecider
. This implementation will
decide parallelism and JobVertexInputInfo
s as follows:
1. For job vertices whose inputs are all ALL_TO_ALL edges, evenly distribute data to downstream subtasks, make different downstream subtasks consume roughly the same amount of data.
2. For other cases, evenly distribute subpartitions to downstream subtasks, make different downstream subtasks consume roughly the same number of subpartitions.
Modifier and Type | Method and Description |
---|---|
int |
computeSourceParallelismUpperBound(JobVertexID jobVertexId,
int maxParallelism)
Compute source parallelism upper bound for the source vertex.
|
ParallelismAndInputInfos |
decideParallelismAndInputInfosForVertex(JobVertexID jobVertexId,
List<BlockingResultInfo> consumedResults,
int vertexInitialParallelism,
int vertexMinParallelism,
int vertexMaxParallelism)
Decide the parallelism and
JobVertexInputInfo s for this job vertex. |
long |
getDataVolumePerTask()
Get the average size of data volume to expect each task instance to process.
|
public ParallelismAndInputInfos decideParallelismAndInputInfosForVertex(JobVertexID jobVertexId, List<BlockingResultInfo> consumedResults, int vertexInitialParallelism, int vertexMinParallelism, int vertexMaxParallelism)
VertexParallelismAndInputInfosDecider
JobVertexInputInfo
s for this job vertex.decideParallelismAndInputInfosForVertex
in interface VertexParallelismAndInputInfosDecider
jobVertexId
- The job vertex idconsumedResults
- The information of consumed blocking resultsvertexInitialParallelism
- The initial parallelism of the job vertex. If it's a positive
number, it will be respected. If it's not set(equals to ExecutionConfig.PARALLELISM_DEFAULT
), a parallelism will be automatically decided for
the vertex.vertexMinParallelism
- The min parallelism of the job vertex.vertexMaxParallelism
- The max parallelism of the job vertex.public int computeSourceParallelismUpperBound(JobVertexID jobVertexId, int maxParallelism)
VertexParallelismAndInputInfosDecider
computeSourceParallelismUpperBound
in interface VertexParallelismAndInputInfosDecider
jobVertexId
- The job vertex idmaxParallelism
- The max parallelism of the job vertex.public long getDataVolumePerTask()
VertexParallelismAndInputInfosDecider
getDataVolumePerTask
in interface VertexParallelismAndInputInfosDecider
Copyright © 2014–2024 The Apache Software Foundation. All rights reserved.