SVM (flink 1.2-SNAPSHOT API)

java.lang.Object
- org.apache.flink.ml.classification.SVM

All Implemented Interfaces:

WithParameters, Estimator<SVM>, Predictor<SVM>
```
public class SVM
extends Object
implements Predictor<SVM>
```
Implements a soft-margin SVM using the communication-efficient distributed dual coordinate ascent algorithm (CoCoA) with a hinge-loss function.
It can be used for binary classification problems, with the labels set to +1.0 to indicate a positive example and -1.0 to indicate a negative example.
The algorithm solves the following minimization problem:
$$\min_{w \in \mathbb{R}^d} \frac{\lambda}{2} \|w\|^2 + \frac{1}{n} \sum_{i=1}^{n} l_i(w^T x_i)$$
with $w$ being the weight vector, $\lambda$ being the regularization constant, $x_i \in \mathbb{R}^d$ being the data points and $l_i$ being the convex loss functions, which can also depend on the labels $y_i \in \mathbb{R}$. In the current implementation the regularizer is the 2-norm and the loss functions are the hinge-loss functions:
$$l_i(w^T x_i) = \max(0, 1 - y_i \, w^T x_i)$$
With these choices, the problem definition is equivalent to an SVM with a soft margin, so the algorithm trains a soft-margin SVM.
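As a minimal sketch of the objective above, the following plain-Java helper evaluates the hinge loss and the regularized objective for a given weight vector. The class `SvmObjective` and its method names are hypothetical and are not part of the Flink API; the code only restates the formula.

```java
// Hypothetical helper, not part of org.apache.flink.ml: evaluates the
// soft-margin SVM objective described in the text for dense vectors.
final class SvmObjective {

    // Plain dot product of two vectors of equal length.
    static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int j = 0; j < a.length; j++) {
            sum += a[j] * b[j];
        }
        return sum;
    }

    // Hinge loss max(0, 1 - y * w^T x) for a label y in {+1.0, -1.0}
    // and a raw value w^T x.
    static double hingeLoss(double label, double rawValue) {
        return Math.max(0.0, 1.0 - label * rawValue);
    }

    // lambda/2 * ||w||^2 + 1/n * sum_i hingeLoss(y_i, w^T x_i)
    static double objective(double[] w, double[][] x, double[] y, double lambda) {
        double lossSum = 0.0;
        for (int i = 0; i < x.length; i++) {
            lossSum += hingeLoss(y[i], dot(w, x[i]));
        }
        return 0.5 * lambda * dot(w, w) + lossSum / x.length;
    }
}
```

A correctly classified point with margin at least 1 contributes no loss, while points inside the margin or on the wrong side contribute linearly.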
The minimization problem is solved by applying stochastic dual coordinate ascent (SDCA). To make the algorithm efficient in a distributed setting, the CoCoA algorithm runs several iterations of SDCA locally on a data block before merging the local updates into a valid global state. This state is redistributed to the data partitions, where the next round of local SDCA iterations is executed. The numbers of outer iterations and local SDCA iterations control the overall network cost, because network communication is required only once per outer iteration. The local SDCA iterations are embarrassingly parallel once the individual data partitions have been distributed across the cluster.
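The outer/local iteration structure described above can be illustrated with a single-process simulation. This is a rough sketch under stated assumptions, not the Flink implementation: the class `CocoaSketch` is hypothetical, blocks are assigned round-robin, the blocks are processed sequentially rather than in parallel, the local step is the standard closed-form SDCA update for the hinge loss, and local updates are assumed to be merged with a factor of `stepsize / blocks`. The parameter names mirror the setters documented on this page.

```java
import java.util.Random;

// Hypothetical single-process sketch of CoCoA with local SDCA steps.
final class CocoaSketch {

    static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int j = 0; j < a.length; j++) {
            sum += a[j] * b[j];
        }
        return sum;
    }

    static double[] train(double[][] x, double[] y, int blocks, int iterations,
                          int localIterations, double lambda, double stepsize, long seed) {
        int n = x.length;
        int d = x[0].length;
        double[] w = new double[d];
        // alpha[i] holds y_i times the dual variable of example i, kept in [0, 1].
        double[] alpha = new double[n];
        Random rng = new Random(seed);
        // Assumption: merged updates are scaled by stepsize / blocks.
        double scaling = stepsize / blocks;

        for (int it = 0; it < iterations; it++) {            // outer iterations
            double[] deltaW = new double[d];
            double[] deltaAlpha = new double[n];
            for (int k = 0; k < blocks && k < n; k++) {       // one pass per data block
                double[] localW = w.clone();                  // block starts from the global state
                int blockSize = (n - k + blocks - 1) / blocks; // examples i with i % blocks == k
                for (int h = 0; h < localIterations; h++) {   // local SDCA iterations
                    int i = k + blocks * rng.nextInt(blockSize);
                    double normSq = dot(x[i], x[i]);
                    if (normSq == 0.0) {
                        continue;                             // a zero vector cannot change w
                    }
                    double a = alpha[i] + deltaAlpha[i];
                    double unclipped = a + (1.0 - y[i] * dot(localW, x[i])) * lambda * n / normSq;
                    double aNew = Math.max(0.0, Math.min(1.0, unclipped));
                    double step = (aNew - a) * y[i] / (lambda * n);
                    for (int j = 0; j < d; j++) {
                        localW[j] += step * x[i][j];
                        deltaW[j] += step * x[i][j];
                    }
                    deltaAlpha[i] += aNew - a;
                }
            }
            // Merge: the only point where blocks would need to communicate.
            for (int j = 0; j < d; j++) {
                w[j] += scaling * deltaW[j];
            }
            for (int i = 0; i < n; i++) {
                alpha[i] += scaling * deltaAlpha[i];
            }
        }
        return w;
    }
}
```

In this sketch, raising `localIterations` does more work between merges, while raising `iterations` adds merge rounds, which is where the network cost would arise in a distributed run.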
Further details of the algorithm can be found here.

Nested Class Summary

Nested Classes
Modifier and Type	Class and Description
`static class`	`SVM.Blocks$`
`static class`	`SVM.Iterations$`
`static class`	`SVM.LocalIterations$`
`static class`	`SVM.OutputDecisionFunction$`
`static class`	`SVM.Regularization$`
`static class`	`SVM.Seed$`
`static class`	`SVM.Stepsize$`
`static class`	`SVM.ThresholdValue$`

Constructor Summary

Constructors
Constructor and Description

`SVM()`

Method Summary

Modifier and Type	Method and Description
`static SVM`	`apply()`
`static Object`	`fitSVM()` `FitOperation` which trains a soft-margin SVM based on the given training data set.
`static <T extends Vector> Object`	`predictVectors()` Provides the operation that makes the predictions for individual examples.
`SVM`	`setBlocks(int blocks)` Sets the number of data blocks/partitions
`SVM`	`setIterations(int iterations)` Sets the number of outer iterations
`SVM`	`setLocalIterations(int localIterations)` Sets the number of local SDCA iterations
`SVM`	`setOutputDecisionFunction(boolean outputDecisionFunction)` Sets whether the predictions should return the raw decision function value or the thresholded binary value.
`SVM`	`setRegularization(double regularization)` Sets the regularization constant
`SVM`	`setSeed(long seed)` Sets the seed value for the random number generator
`SVM`	`setStepsize(double stepsize)` Sets the stepsize for the weight vector updates
`SVM`	`setThreshold(double threshold)` Sets the threshold above which elements are classified as positive.
`static String`	`WEIGHT_VECTOR()`
`scala.Option<DataSet<DenseVector>>`	`weightsOption()` Stores the learned weight vector after the fit operation

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface org.apache.flink.ml.pipeline.Predictor
evaluate, predict

Methods inherited from interface org.apache.flink.ml.pipeline.Estimator
fit

Methods inherited from interface org.apache.flink.ml.common.WithParameters
parameters

- Constructor Detail
  - SVM
```
public SVM()
```
- Method Detail
  - WEIGHT_VECTOR
```
public static String WEIGHT_VECTOR()
```
  - apply
```
public static SVM apply()
```
  - predictVectors
```
public static <T extends Vector> Object predictVectors()
```
    Provides the operation that makes the predictions for individual examples.
    
    Returns:
    
    A PredictOperation, through which it is possible to predict a value, given a feature vector
  - fitSVM
```
public static Object fitSVM()
```
    FitOperation which trains a soft-margin SVM based on the given training data set.
  - weightsOption
```
public scala.Option<DataSet<DenseVector>> weightsOption()
```
    Stores the learned weight vector after the fit operation
  - setBlocks
```
public SVM setBlocks(int blocks)
```
    Sets the number of data blocks/partitions
    
    Parameters:
    
    blocks -
    
    Returns:
    
    itself
  - setIterations
```
public SVM setIterations(int iterations)
```
    Sets the number of outer iterations
    
    Parameters:
    
    iterations -
    
    Returns:
    
    itself
  - setLocalIterations
```
public SVM setLocalIterations(int localIterations)
```
    Sets the number of local SDCA iterations
    
    Parameters:
    
    localIterations -
    
    Returns:
    
    itself
  - setRegularization
```
public SVM setRegularization(double regularization)
```
    Sets the regularization constant
    
    Parameters:
    
    regularization -
    
    Returns:
    
    itself
  - setStepsize
```
public SVM setStepsize(double stepsize)
```
    Sets the stepsize for the weight vector updates
    
    Parameters:
    
    stepsize -
    
    Returns:
    
    itself
  - setSeed
```
public SVM setSeed(long seed)
```
    Sets the seed value for the random number generator
    
    Parameters:
    
    seed -
    
    Returns:
    
    itself
  - setThreshold
```
public SVM setThreshold(double threshold)
```
    Sets the threshold above which elements are classified as positive.
    The predict and evaluate functions will return +1.0 for items with a decision function value above this threshold, and -1.0 for items below it.
    
    Parameters:
    
    threshold -
    
    Returns:
    
    itself
  - setOutputDecisionFunction
```
public SVM setOutputDecisionFunction(boolean outputDecisionFunction)
```
    Sets whether the predictions should return the raw decision function value or the thresholded binary value.
    When this is set to true, predict and evaluate return the raw decision function value, which is proportional to the signed distance from the separating hyperplane. When it is set to false, they return thresholded (+1.0, -1.0) values.
    
    Parameters:
    
    outputDecisionFunction - When set to true, predict and evaluate return the raw decision function values. When set to false, they return the thresholded binary values (+1.0, -1.0).
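
The interplay of `setThreshold` and `setOutputDecisionFunction` can be sketched in plain Java as follows. The class `SvmPrediction` is hypothetical and not part of the Flink API. It assumes that the raw decision value is the dot product of the weight vector and the feature vector, and that a value exactly equal to the threshold is classified as -1.0; the documentation above only specifies the behavior strictly above and below the threshold.

```java
// Hypothetical helper, not part of org.apache.flink.ml: mimics how a
// prediction could be derived from a learned weight vector.
final class SvmPrediction {

    static double predict(double[] w, double[] x, double threshold,
                          boolean outputDecisionFunction) {
        // Raw decision function value w^T x.
        double raw = 0.0;
        for (int j = 0; j < w.length; j++) {
            raw += w[j] * x[j];
        }
        if (outputDecisionFunction) {
            return raw;                       // caller wants the raw value
        }
        // Assumption: ties at the threshold fall to the negative class.
        return raw > threshold ? 1.0 : -1.0;
    }
}
```

Returning the raw value is useful when the caller wants to rank examples or choose a threshold after training.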
