This documentation is for an out-of-date version of Apache Flink. We recommend you use the latest stable version.
$$ \newcommand{\R}{\mathbb{R}} \newcommand{\E}{\mathbb{E}} \newcommand{\x}{\mathbf{x}} \newcommand{\y}{\mathbf{y}} \newcommand{\wv}{\mathbf{w}} \newcommand{\av}{\mathbf{\alpha}} \newcommand{\bv}{\mathbf{b}} \newcommand{\N}{\mathbb{N}} \newcommand{\id}{\mathbf{I}} \newcommand{\ind}{\mathbf{1}} \newcommand{\0}{\mathbf{0}} \newcommand{\unit}{\mathbf{e}} \newcommand{\one}{\mathbf{1}} \newcommand{\zero}{\mathbf{0}} \newcommand\rfrac[2]{^{#1}\!/_{#2}} \newcommand{\norm}[1]{\left\lVert#1\right\rVert} $$

Polynomial Features

Description

The polynomial features transformer maps a vector into the polynomial feature space of degree $d$. The dimension of the input vector determines the number of polynomial factors whose values are the respective vector entries. Given a vector $(x, y, z, \ldots)^T$ the resulting feature vector looks like:

Flinkā€™s implementation orders the polynomials in decreasing order of their degree.

Given the vector $\left(3,2\right)^T$, the polynomial features vector of degree 3 would look like

This transformer can be prepended to all Transformer and Predictor implementations which expect an input of type LabeledVector or any sub-type of Vector.

Operations

PolynomialFeatures is a Transformer. As such, it supports the fit and transform operation.

Fit

PolynomialFeatures is not trained on data and, thus, supports all types of input data.

Transform

PolynomialFeatures transforms all subtypes of Vector and LabeledVector into their respective types:

  • transform[T <: Vector]: DataSet[T] => DataSet[T]
  • transform: DataSet[LabeledVector] => DataSet[LabeledVector]

Parameters

The polynomial features transformer can be controlled by the following parameters:

Parameters Description
Degree

The maximum polynomial degree. (Default value: 10)

Examples

// Obtain the training data set
val trainingDS: DataSet[LabeledVector] = ...

// Setup polynomial feature transformer of degree 3
val polyFeatures = PolynomialFeatures()
.setDegree(3)

// Setup the multiple linear regression learner
val mlr = MultipleLinearRegression()

// Control the learner via the parameter map
val parameters = ParameterMap()
.add(MultipleLinearRegression.Iterations, 20)
.add(MultipleLinearRegression.Stepsize, 0.5)

// Create pipeline PolynomialFeatures -> MultipleLinearRegression
val pipeline = polyFeatures.chainPredictor(mlr)

// train the model
pipeline.fit(trainingDS)