Modifier and Type | Field and Description |
---|---|
protected HashMap<String,DistributedCache.DistributedCacheEntry> |
cacheFile
Hash map for files in the distributed cache: registered name to cache entry.
|
protected int |
defaultParallelism
The default parallelism to use for nodes that have no explicitly specified parallelism.
|
protected ExecutionConfig |
executionConfig
Config object for runtime execution parameters.
|
protected String |
jobName
The name of the job.
|
protected List<GenericDataSinkBase<?>> |
sinks
A collection of all sinks in the plan.
|
Constructor and Description |
---|
Plan(Collection<? extends GenericDataSinkBase<?>> sinks)
Creates a new program plan, describing the data flow that ends at the given data sinks.
|
Plan(Collection<? extends GenericDataSinkBase<?>> sinks,
int defaultParallelism)
Creates a new program plan with the given default parallelism, describing the data flow that
ends at the given data sinks.
|
Plan(Collection<? extends GenericDataSinkBase<?>> sinks,
String jobName)
Creates a new dataflow plan with the given name, describing the data flow that ends at the
given data sinks.
|
Plan(Collection<? extends GenericDataSinkBase<?>> sinks,
String jobName,
int defaultParallelism)
Creates a new program plan with the given name and default parallelism, describing the data
flow that ends at the given data sinks.
|
Plan(GenericDataSinkBase<?> sink)
Creates a new program plan with single data sink.
|
Plan(GenericDataSinkBase<?> sink,
int defaultParallelism)
Creates a new program plan with single data sink and the given default parallelism.
|
Plan(GenericDataSinkBase<?> sink,
String jobName)
Creates a new program plan with the given name, containing initially a single data sink.
|
Plan(GenericDataSinkBase<?> sink,
String jobName,
int defaultParallelism)
Creates a new program plan with the given name and default parallelism, containing initially
a single data sink.
|
Modifier and Type | Method and Description |
---|---|
void |
accept(Visitor<Operator<?>> visitor)
Traverses the job depth first from all data sinks on towards the sources.
|
void |
addDataSink(GenericDataSinkBase<?> sink)
Adds a data sink to the set of sinks in this program.
|
Set<Map.Entry<String,DistributedCache.DistributedCacheEntry>> |
getCachedFiles()
Return the registered cached files.
|
Collection<? extends GenericDataSinkBase<?>> |
getDataSinks()
Gets all the data sinks of this job.
|
int |
getDefaultParallelism()
Gets the default parallelism for this job.
|
ExecutionConfig |
getExecutionConfig()
Gets the execution config object.
|
JobID |
getJobId()
Gets the ID of the job that the dataflow plan belongs to.
|
String |
getJobName()
Gets the name of this job.
|
int |
getMaximumParallelism() |
String |
getPostPassClassName()
Gets the optimizer post-pass class for this job.
|
void |
registerCachedFile(String name,
DistributedCache.DistributedCacheEntry entry)
Register cache files at program level.
|
void |
setDefaultParallelism(int defaultParallelism)
Sets the default parallelism for this plan.
|
void |
setExecutionConfig(ExecutionConfig executionConfig)
Sets the runtime config object defining execution parameters.
|
void |
setJobId(JobID jobId)
Sets the ID of the job that the dataflow plan belongs to.
|
void |
setJobName(String jobName)
Sets the jobName for this Plan.
|
protected final List<GenericDataSinkBase<?>> sinks
protected String jobName
protected int defaultParallelism
protected HashMap<String,DistributedCache.DistributedCacheEntry> cacheFile
protected ExecutionConfig executionConfig
public Plan(Collection<? extends GenericDataSinkBase<?>> sinks, String jobName)
If not all of the sinks of a data flow are given to the plan, the flow might not be translated entirely.
sinks
- The collection will the sinks of the job's data flow.jobName
- The name to display for the job.public Plan(Collection<? extends GenericDataSinkBase<?>> sinks, String jobName, int defaultParallelism)
If not all of the sinks of a data flow are given to the plan, the flow might not be translated entirely.
sinks
- The collection will the sinks of the job's data flow.jobName
- The name to display for the job.defaultParallelism
- The default parallelism for the job.public Plan(GenericDataSinkBase<?> sink, String jobName)
If not all of the sinks of a data flow are given, the flow might not be translated entirely, but only the parts of the flow reachable by traversing backwards from the given data sinks.
sink
- The data sink of the data flow.jobName
- The name to display for the job.public Plan(GenericDataSinkBase<?> sink, String jobName, int defaultParallelism)
If not all of the sinks of a data flow are given, the flow might not be translated entirely, but only the parts of the flow reachable by traversing backwards from the given data sinks.
sink
- The data sink of the data flow.jobName
- The name to display for the job.defaultParallelism
- The default parallelism for the job.public Plan(Collection<? extends GenericDataSinkBase<?>> sinks)
If not all of the sinks of a data flow are given, the flow might not be translated entirely, but only the parts of the flow reachable by traversing backwards from the given data sinks.
sinks
- The collection will the sinks of the data flow.public Plan(Collection<? extends GenericDataSinkBase<?>> sinks, int defaultParallelism)
If not all of the sinks of a data flow are given, the flow might not be translated entirely, but only the parts of the flow reachable by traversing backwards from the given data sinks.
sinks
- The collection will the sinks of the data flow.defaultParallelism
- The default parallelism for the job.public Plan(GenericDataSinkBase<?> sink)
If not all of the sinks of a data flow are given to the plan, the flow might not be translated entirely.
sink
- The data sink of the data flow.public Plan(GenericDataSinkBase<?> sink, int defaultParallelism)
If not all of the sinks of a data flow are given to the plan, the flow might not be translated entirely.
sink
- The data sink of the data flow.defaultParallelism
- The default parallelism for the job.public void addDataSink(GenericDataSinkBase<?> sink)
sink
- The data sink to add.public Collection<? extends GenericDataSinkBase<?>> getDataSinks()
public String getJobName()
public void setJobName(String jobName)
jobName
- The jobName to set.public JobID getJobId()
public void setJobId(JobID jobId)
null
,
then the dataflow represents its own independent job.jobId
- The ID of the job that the dataflow plan belongs to.public int getDefaultParallelism()
public void setDefaultParallelism(int defaultParallelism)
defaultParallelism
- The default parallelism for the plan.public String getPostPassClassName()
public ExecutionConfig getExecutionConfig()
public void setExecutionConfig(ExecutionConfig executionConfig)
executionConfig
- The execution config to use.public void accept(Visitor<Operator<?>> visitor)
accept
in interface Visitable<Operator<?>>
visitor
- The visitor to be called with this object as the parameter.Visitable.accept(Visitor)
public void registerCachedFile(String name, DistributedCache.DistributedCacheEntry entry) throws IOException
entry
- contains all relevant informationname
- user defined name of that fileIOException
public Set<Map.Entry<String,DistributedCache.DistributedCacheEntry>> getCachedFiles()
public int getMaximumParallelism()
Copyright © 2014–2024 The Apache Software Foundation. All rights reserved.