@PublicEvolving public final class Pipeline extends Object implements Estimator<Pipeline,Pipeline>, Transformer<Pipeline>, Model<Pipeline>
Estimator
s and Transformer
s to
execute an algorithm.
A pipeline itself can either act as an Estimator or a Transformer, depending on the stages it includes. More specifically:
Estimator
, one needs to call fit(TableEnvironment, Table)
before use the pipeline as a Transformer
.
In this case the Pipeline is an Estimator
and can produce a Pipeline as a Model
.
Estimator
, it is a Transformer
and can be applied to a
Table directly. In this case, fit(TableEnvironment, Table)
will simply
return the pipeline itself.
In addition, a pipeline can also be used as a PipelineStage
in another pipeline, just
like an ordinary Estimator
or Transformer
as describe above.
Constructor and Description |
---|
Pipeline() |
Pipeline(List<org.apache.flink.ml.api.core.PipelineStage> stages) |
Pipeline(String pipelineJson) |
Modifier and Type | Method and Description |
---|---|
Pipeline |
appendStage(org.apache.flink.ml.api.core.PipelineStage stage)
Appends a PipelineStage to the tail of this pipeline.
|
Pipeline |
fit(TableEnvironment tEnv,
Table input)
Train the pipeline to fit on the records in the given
Table . |
Params |
getParams()
Returns the all the parameters.
|
List<org.apache.flink.ml.api.core.PipelineStage> |
getStages()
Returns a list of all stages in this pipeline in order, the list is immutable.
|
void |
loadJson(String json) |
boolean |
needFit()
Check whether the pipeline acts as an
Estimator or not. |
String |
toJson() |
Table |
transform(TableEnvironment tEnv,
Table input)
Generate a result table by applying all the stages in this pipeline to the input table in
order.
|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
get, set
public Pipeline()
public Pipeline(String pipelineJson)
public Pipeline(List<org.apache.flink.ml.api.core.PipelineStage> stages)
public Pipeline appendStage(org.apache.flink.ml.api.core.PipelineStage stage)
stage
- the stage to be appendedpublic List<org.apache.flink.ml.api.core.PipelineStage> getStages()
public boolean needFit()
Estimator
or not. When the return value is
true, that means this pipeline contains an Estimator
and thus users must invoke
fit(TableEnvironment, Table)
before they can use this pipeline as a Transformer
. Otherwise, the pipeline can be used as a Transformer
directly.true
if this pipeline has an Estimator, false
otherwisepublic Params getParams()
WithParams
public Pipeline fit(TableEnvironment tEnv, Table input)
Table
.
This method go through all the PipelineStage
s in order and does the following on
each stage until the last Estimator
(inclusive).
Estimator
, invoke Estimator.fit(TableEnvironment,
Table)
with the input table to generate a Model
, transform the the input table
with the generated Model
to get a result table, then pass the result table to
the next stage as input.
Transformer
, invoke Transformer.transform(TableEnvironment, Table)
on the input table to get a result
table, and pass the result table to the next stage as input.
After all the Estimator
s are trained to fit their input tables, a new pipeline
will be created with the same stages in this pipeline, except that all the Estimators in the
new pipeline are replaced with their corresponding Models generated in the above process.
If there is no Estimator
in the pipeline, the method returns a copy of this
pipeline.
fit
in interface Estimator<Pipeline,Pipeline>
tEnv
- the table environment to which the input table is bound.input
- the table with records to train the Pipeline.public Table transform(TableEnvironment tEnv, Table input)
transform
in interface Transformer<Pipeline>
tEnv
- the table environment to which the input table is bound.input
- the table to be transformedpublic String toJson()
public void loadJson(String json)
Copyright © 2014–2021 The Apache Software Foundation. All rights reserved.