This documentation is for an unreleased version of Apache Flink Machine Learning Library. We recommend you use the latest stable version.
Data Types
Flink ML supports all data types supported by the Flink Table API, as well as the data types listed in the sections below.
Vector
Flink ML provides support for vectors of double values. A Vector in Flink ML can be either dense (DenseVector) or sparse (SparseVector), depending on how users create it according to the vector's sparsity. Each vector is initialized with a fixed size, and users may get or set the double value at any 0-based index in the vector.
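To make the dense/sparse distinction concrete, here is a minimal pure-Python sketch of the sparse layout (a size plus parallel arrays of indices and values). It does not use Flink ML: `sparse_get` and `sparse_to_dense` are hypothetical helpers introduced only for illustration, and the actual DenseVector and SparseVector classes may be implemented differently.

```python
from bisect import bisect_left

# Illustrative only: this is not Flink ML code. A dense vector stores every
# entry, while a sparse vector stores its size plus the sorted indices and
# values of the entries that were explicitly set.


def sparse_get(size, indices, values, i):
    """Return the value at 0-based index i of a sparse vector."""
    if not 0 <= i < size:
        raise IndexError(f"index {i} out of range for size {size}")
    pos = bisect_left(indices, i)  # binary search in the sorted index array
    if pos < len(indices) and indices[pos] == i:
        return values[pos]
    return 0.0  # indices that are not stored are implicitly zero


def sparse_to_dense(size, indices, values):
    """Expand a sparse vector into a dense list of floats."""
    dense = [0.0] * size
    for i, v in zip(indices, values):
        dense[i] = float(v)
    return dense


# Same data as the Java example in this section:
# size 4, with values stored at indices 0, 2 and 3.
size, indices, values = 4, [0, 2, 3], [0.1, 0.3, 0.4]
print(sparse_get(size, indices, values, 1))
print(sparse_to_dense(size, indices, values))
```

The sparse form pays off when most entries are zero, since only the stored entries take up memory; a dense vector is simpler and faster to index when most entries are non-zero.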
Flink ML also has a class named Vectors that provides utility methods for instantiating vectors.
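import org.apache.flink.ml.linalg.SparseVector;
import org.apache.flink.ml.linalg.Vectors;

// Create a sparse vector of size 4 with values at indices 0, 2 and 3.
int n = 4;
int[] indices = new int[] {0, 2, 3};
double[] values = new double[] {0.1, 0.3, 0.4};
SparseVector vector = Vectors.sparse(n, indices, values);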
# Create a dense vector of 64-bit floats from a Python list or from individual numbers.
>>> Vectors.dense([1, 2, 3])
DenseVector([1.0, 2.0, 3.0])
>>> Vectors.dense(1.0, 2.0)
DenseVector([1.0, 2.0])
# Create a sparse vector, using either a dict, a list of (index, value) pairs, or two separate
# arrays of indices and values.
>>> Vectors.sparse(4, {1: 1.0, 3: 5.5})
SparseVector(4, {1: 1.0, 3: 5.5})
>>> Vectors.sparse(4, [(1, 1.0), (3, 5.5)])
SparseVector(4, {1: 1.0, 3: 5.5})
>>> Vectors.sparse(4, [1, 3], [1.0, 5.5])
SparseVector(4, {1: 1.0, 3: 5.5})
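The three sparse input forms above all describe the same vector. As a rough sketch of why they are equivalent, the following pure-Python function reduces each form to one pair of sorted index and value lists. `normalize_sparse_args` is a hypothetical helper written for this illustration; it is not part of Flink ML and makes no claim about how `Vectors.sparse` handles its arguments internally.

```python
def normalize_sparse_args(size, *args):
    """Return (indices, values), sorted by index, for any of the three forms."""
    if len(args) == 1:
        entries = args[0]
        # A dict maps index -> value; otherwise expect (index, value) pairs.
        pairs = list(entries.items()) if isinstance(entries, dict) else list(entries)
    elif len(args) == 2:
        if len(args[0]) != len(args[1]):
            raise ValueError("indices and values must have the same length")
        pairs = list(zip(args[0], args[1]))
    else:
        raise TypeError("expected a dict, a list of pairs, or indices and values")

    pairs.sort(key=lambda pair: pair[0])
    indices = [int(i) for i, _ in pairs]
    values = [float(v) for _, v in pairs]

    # After sorting, checking the first and last index covers the whole range.
    if indices and not (0 <= indices[0] and indices[-1] < size):
        raise ValueError(f"index out of range for size {size}")
    return indices, values


forms = [
    ({1: 1.0, 3: 5.5},),            # dict
    ([(1, 1.0), (3, 5.5)],),        # list of (index, value) pairs
    ([1, 3], [1.0, 5.5]),           # separate index and value arrays
]
results = [normalize_sparse_args(4, *form) for form in forms]
print(results[0])
```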