## Class MultipleLinearRegression

• All Implemented Interfaces:
WithParameters, Estimator<MultipleLinearRegression>, Predictor<MultipleLinearRegression>

public class MultipleLinearRegression
extends Object
implements Predictor<MultipleLinearRegression>

Multiple linear regression using the ordinary least squares (OLS) estimator.

Linear regression fits the model

y = w_0 + w_1*x_1 + w_2*x_2 + ... + w_n*x_n = w_0 + w^T*x

by choosing the weights w and the intercept w_0 such that the sum of squared residuals over all training examples is minimized:

min_{w, w_0} \sum (y - w^T*x - w_0)^2
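
The objective above can be made concrete with a small, self-contained Java sketch. It is not taken from the Flink ML sources: the class `SquaredResidualSketch`, its method names, and the convention of storing the intercept w_0 in `w[0]` are assumptions made only for illustration.

```java
// Hypothetical helper class for illustration; not part of the Flink ML API.
final class SquaredResidualSketch {

    // Prediction w_0 + w^T * x, with w[0] holding the intercept w_0
    // and w[i + 1] the weight of feature x[i].
    static double predict(double[] w, double[] x) {
        double sum = w[0];
        for (int i = 0; i < x.length; i++) {
            sum += w[i + 1] * x[i];
        }
        return sum;
    }

    // Sum of squared residuals over all labeled vectors (xs[j], ys[j]).
    static double squaredResidualSum(double[] w, double[][] xs, double[] ys) {
        double total = 0.0;
        for (int j = 0; j < xs.length; j++) {
            double residual = ys[j] - predict(w, xs[j]);
            total += residual * residual;
        }
        return total;
    }

    public static void main(String[] args) {
        // Toy data generated from y = 1 + 2 * x.
        double[][] xs = {{1.0}, {2.0}, {3.0}};
        double[] ys = {3.0, 5.0, 7.0};
        // Exact fit: every residual is zero.
        System.out.println(squaredResidualSum(new double[] {1.0, 2.0}, xs, ys));
        // All-zero weights: residuals are 3, 5 and 7.
        System.out.println(squaredResidualSum(new double[] {0.0, 0.0}, xs, ys));
    }
}
```

On the toy data the exact weights give a residual sum of 0, while all-zero weights give 9 + 25 + 49 = 83.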

The minimization problem is solved by (stochastic) gradient descent. In each iteration, the gradient is calculated for every labeled vector (x, y). The weighted average of all gradients is then subtracted from the current weight vector w, which yields the new value w_new. The weight applied to the averaged gradient is stepsize/math.sqrt(iteration), so the effective step shrinks as the iteration count grows.
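
A single update of this kind might look like the following Java sketch. It is an illustration under stated assumptions rather than the library's implementation: the class `GradientStepSketch` and its methods are hypothetical, the intercept w_0 is stored in `w[0]`, iterations are counted from 1, and the per-example gradient is taken as (prediction - y) * x, with the constant factor 2 of the squared loss absorbed into the step size.

```java
// Hypothetical helper class for illustration; not part of the Flink ML API.
final class GradientStepSketch {

    // Weight applied to the averaged gradient: stepsize / sqrt(iteration).
    // Assumes iterations are counted from 1.
    static double effectiveStep(double stepsize, int iteration) {
        return stepsize / Math.sqrt(iteration);
    }

    // One full-batch update. w[0] is the intercept w_0, w[i + 1] pairs with x[i].
    // Assumption: per-example gradient is (prediction - y) * x (factor 2 dropped).
    static double[] step(double[] w, double[][] xs, double[] ys,
                         double stepsize, int iteration) {
        double[] gradientSum = new double[w.length];
        for (int j = 0; j < xs.length; j++) {
            double prediction = w[0];
            for (int i = 0; i < xs[j].length; i++) {
                prediction += w[i + 1] * xs[j][i];
            }
            double error = prediction - ys[j];
            gradientSum[0] += error; // gradient with respect to w_0
            for (int i = 0; i < xs[j].length; i++) {
                gradientSum[i + 1] += error * xs[j][i];
            }
        }
        // Average the gradients and scale by the decaying step.
        double scale = effectiveStep(stepsize, iteration) / xs.length;
        double[] next = new double[w.length];
        for (int i = 0; i < w.length; i++) {
            next[i] = w[i] - scale * gradientSum[i];
        }
        return next;
    }

    public static void main(String[] args) {
        double[][] xs = {{1.0}, {2.0}, {3.0}};
        double[] ys = {3.0, 5.0, 7.0}; // generated from y = 1 + 2 * x
        double[] w = {0.0, 0.0};
        for (int k = 1; k <= 3; k++) {
            w = step(w, xs, ys, 0.1, k);
            System.out.println(java.util.Arrays.toString(w));
        }
    }
}
```

Starting from all-zero weights with a step size of 0.1, the first update moves the intercept to 0.5 and the slope to 34/30, both toward the generating values 1 and 2.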

The optimization runs for at most the configured maximum number of iterations or, if a convergence threshold has been set, until the convergence criterion is met. The convergence criterion is the relative change of the sum of squared residuals:

(S_{k-1} - S_k)/S_{k-1} < \rho

with S_k being the sum of squared residuals in iteration k and \rho being the convergence threshold.
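
The criterion can be written as a one-line check. The Java sketch below is illustrative only: `ConvergenceSketch` and `hasConverged` are hypothetical names, and treating a previous residual sum of exactly zero as converged is an assumption added here to avoid dividing by zero.

```java
// Hypothetical helper class for illustration; not part of the Flink ML API.
final class ConvergenceSketch {

    // Returns true when (S_{k-1} - S_k) / S_{k-1} < rho.
    // Assumption: a previous sum of exactly zero (perfect fit) counts as converged.
    // Note that, read literally, the formula also reports convergence when
    // the residual sum increases, because the relative change is then negative.
    static boolean hasConverged(double previousSum, double currentSum, double rho) {
        if (previousSum == 0.0) {
            return true;
        }
        return (previousSum - currentSum) / previousSum < rho;
    }

    public static void main(String[] args) {
        // Relative change 0.5: still improving quickly.
        System.out.println(hasConverged(100.0, 50.0, 0.001));
        // Relative change of about 0.0005: below the threshold.
        System.out.println(hasConverged(100.0, 99.95, 0.001));
    }
}
```

In a training loop, such a check would run once per iteration after the residual sum has been recomputed, and the loop would stop on the first `true` or when the maximum number of iterations is reached.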

At the moment, the whole partition is used for SGD, which makes it effectively batch gradient descent. Once a sampling operator has been introduced, the algorithm can be optimized.

• ### Nested Class Summary

Nested Classes
Modifier and Type Class and Description
static class  MultipleLinearRegression.ConvergenceThreshold$
static class  MultipleLinearRegression.Iterations$
static class  MultipleLinearRegression.LearningRateMethodValue$
static class  MultipleLinearRegression.Stepsize$
• ### Constructor Summary

Constructors
Constructor and Description
MultipleLinearRegression()
• ### Method Summary

All Methods
Modifier and Type Method and Description
static MultipleLinearRegression apply()
static <Testing,PredictionValue>DataSet<scala.Tuple2<PredictionValue,PredictionValue>> evaluate(DataSet<Testing> testing, ParameterMap evaluateParameters, EvaluateDataSetOperation<Self,Testing,PredictionValue> evaluator)
static <Testing,PredictionValue>ParameterMap evaluate$default$2()
static <Training> void fit(DataSet<Training> training, ParameterMap fitParameters, FitOperation<Self,Training> fitOperation)
static <Training> ParameterMap fit$default$2()
static Object fitMLR()
Trains the linear model to fit the training data.
static GenericLossFunction lossFunction()
static ParameterMap parameters()
static <Testing,Prediction>DataSet<Prediction> predict(DataSet<Testing> testing, ParameterMap predictParameters, PredictDataSetOperation<Self,Testing,Prediction> predictor)
static <Testing,Prediction>ParameterMap predict$default$2()
static <T extends Vector>Object predictVectors()
MultipleLinearRegression setConvergenceThreshold(double convergenceThreshold)
MultipleLinearRegression setIterations(int iterations)
MultipleLinearRegression setLearningRateMethod(LearningRateMethod.LearningRateMethodTrait learningRateMethod)
MultipleLinearRegression setStepsize(double stepsize)
DataSet<Object> squaredResidualSum(DataSet<LabeledVector> input)
scala.Option<DataSet<WeightVector>> weightsOption()
static String WEIGHTVECTOR_BROADCAST()
• ### Methods inherited from class java.lang.Object

clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
• ### Methods inherited from interface org.apache.flink.ml.pipeline.Predictor

evaluate, predict
• ### Methods inherited from interface org.apache.flink.ml.pipeline.Estimator

fit
• ### Methods inherited from interface org.apache.flink.ml.common.WithParameters

parameters
• ### Constructor Detail

• #### MultipleLinearRegression

public MultipleLinearRegression()
• ### Method Detail

• #### WEIGHTVECTOR_BROADCAST

public static String WEIGHTVECTOR_BROADCAST()
• #### lossFunction

public static GenericLossFunction lossFunction()
• #### apply

public static MultipleLinearRegression apply()
• #### fitMLR

public static Object fitMLR()
Trains the linear model to fit the training data. The resulting weight vector is stored in the MultipleLinearRegression instance.

Returns:
(undocumented)
• #### predictVectors

public static <T extends Vector> Object predictVectors()
• #### parameters

public static ParameterMap parameters()
• #### fit

public static <Training> void fit(DataSet<Training> training,
ParameterMap fitParameters,
FitOperation<Self,Training> fitOperation)
• #### fit$default$2

public static <Training> ParameterMap fit$default$2()
• #### predict

public static <Testing,Prediction> DataSet<Prediction> predict(DataSet<Testing> testing,
ParameterMap predictParameters,
PredictDataSetOperation<Self,Testing,Prediction> predictor)
• #### evaluate

public static <Testing,PredictionValue> DataSet<scala.Tuple2<PredictionValue,PredictionValue>> evaluate(DataSet<Testing> testing,
ParameterMap evaluateParameters,
EvaluateDataSetOperation<Self,Testing,PredictionValue> evaluator)
• #### predict$default$2

public static <Testing,Prediction> ParameterMap predict$default$2()
• #### evaluate$default$2

public static <Testing,PredictionValue> ParameterMap evaluate$default$2()
• #### weightsOption

public scala.Option<DataSet<WeightVector>> weightsOption()
• #### setIterations

public MultipleLinearRegression setIterations(int iterations)
• #### setStepsize

public MultipleLinearRegression setStepsize(double stepsize)
• #### setConvergenceThreshold

public MultipleLinearRegression setConvergenceThreshold(double convergenceThreshold)
• #### setLearningRateMethod

public MultipleLinearRegression setLearningRateMethod(LearningRateMethod.LearningRateMethodTrait learningRateMethod)
• #### squaredResidualSum

public DataSet<Object> squaredResidualSum(DataSet<LabeledVector> input)