Data Types


Flink ML supports all data types supported by the Flink Table API, as well as the data types listed in the sections below.

Vector

Flink ML provides support for vectors of double values. A Vector in Flink ML can be either dense (DenseVector) or sparse (SparseVector), depending on how users create it according to the vector’s sparsity. Each vector is initialized with a fixed size, and users may get or set the double value at any 0-based index in the vector.
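To make the dense/sparse distinction concrete, the following is a minimal sketch of the two storage layouts in plain Python. The classes SketchDenseVector and SketchSparseVector and the helper _check_index are hypothetical stand-ins written for illustration; they are not Flink ML's DenseVector or SparseVector, and the real classes may store and validate data differently.

```python
from bisect import bisect_left


def _check_index(i, size):
    # Both layouts use 0-based indices in the range [0, size).
    if not 0 <= i < size:
        raise IndexError(f"index {i} out of range for size {size}")


class SketchDenseVector:
    """Hypothetical stand-in: stores every entry, zeros included."""

    def __init__(self, values):
        self.values = [float(v) for v in values]

    def size(self):
        return len(self.values)

    def get(self, i):
        _check_index(i, self.size())
        return self.values[i]

    def set(self, i, value):
        _check_index(i, self.size())
        self.values[i] = float(value)


class SketchSparseVector:
    """Hypothetical stand-in: stores only explicit entries, sorted by index."""

    def __init__(self, n, indices, values):
        if len(indices) != len(values):
            raise ValueError("indices and values must have the same length")
        for i in indices:
            _check_index(i, n)
        pairs = sorted(zip(indices, values))
        self.n = n  # the logical size is fixed at construction
        self.indices = [i for i, _ in pairs]
        self.values = [float(v) for _, v in pairs]

    def size(self):
        return self.n

    def get(self, i):
        _check_index(i, self.n)
        pos = bisect_left(self.indices, i)
        if pos < len(self.indices) and self.indices[pos] == i:
            return self.values[pos]
        return 0.0  # entries that are not stored are implicitly zero

    def set(self, i, value):
        _check_index(i, self.n)
        pos = bisect_left(self.indices, i)
        if pos < len(self.indices) and self.indices[pos] == i:
            self.values[pos] = float(value)
        else:
            # Keep the index list sorted so lookups can use binary search.
            self.indices.insert(pos, i)
            self.values.insert(pos, float(value))
```

In this sketch, a dense vector of size n always holds n doubles, while a sparse vector holds one (index, value) pair per stored entry, which tends to pay off when most entries are zero.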

Flink ML also provides a class named Vectors with utility methods for instantiating vectors.

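Java:

import org.apache.flink.ml.linalg.SparseVector;
import org.apache.flink.ml.linalg.Vectors;

// Create a sparse vector of size 4 with stored values at indices 0, 2 and 3.
int n = 4;
int[] indices = new int[] {0, 2, 3};
double[] values = new double[] {0.1, 0.3, 0.4};

SparseVector vector = Vectors.sparse(n, indices, values);

Python:

# Create a dense vector of 64-bit floats from a Python list or from individual numbers.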
>>> Vectors.dense([1, 2, 3])
DenseVector([1.0, 2.0, 3.0])
>>> Vectors.dense(1.0, 2.0)
DenseVector([1.0, 2.0])

# Create a sparse vector, using either a dict, a list of (index, value) pairs, or two separate
# arrays of indices and values.

>>> Vectors.sparse(4, {1: 1.0, 3: 5.5})
SparseVector(4, {1: 1.0, 3: 5.5})
>>> Vectors.sparse(4, [(1, 1.0), (3, 5.5)])
SparseVector(4, {1: 1.0, 3: 5.5})
>>> Vectors.sparse(4, [1, 3], [1.0, 5.5])
SparseVector(4, {1: 1.0, 3: 5.5})