@PublicEvolving public interface SupportsComputedColumnPushDown
Computed columns add additional columns to the table's schema. They are defined by logical expressions that reference other physically existing columns.
An example in SQL looks like:
CREATE TABLE t (str STRING, ts AS PARSE_TIMESTAMP(str), i INT) // `ts` is a computed column
By default, if this interface is not implemented, computed columns are added to the physically produced row in a subsequent operation after the source.
However, it might be beneficial to perform the computation as early as possible in order to be close to the actual data generation. Especially in cases where computed columns are used for generating watermarks, a source must push down the computation as deep as possible such that the computation can happen within a source's data partition.
This interface provides a SupportsComputedColumnPushDown.ComputedColumnConverter that needs to be applied to every row during runtime.
Note: The final output data type emitted by a source changes from the physically produced data type to the full data type of the table's schema. For the example above, this means:
ROW<str STRING, i INT> // before conversion
ROW<str STRING, ts TIMESTAMP(3), i INT> // after conversion
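The before/after row shapes above can be illustrated with a minimal sketch. It does not use the Flink API: `RowConverter` is a hypothetical stand-in for the pushed converter, rows are modeled as plain `Object[]`, and `LocalDateTime.parse` is assumed in place of the `PARSE_TIMESTAMP` function from the DDL.

```java
import java.time.LocalDateTime;
import java.util.Arrays;
import java.util.function.Function;

public class ComputedColumnSketch {

    /** Hypothetical stand-in for the pushed converter: physical row in, full row out. */
    interface RowConverter extends Function<Object[], Object[]> {}

    /** Builds a converter for the schema (str STRING, ts AS <parse of str>, i INT). */
    static RowConverter tsConverter() {
        return physical -> {
            String str = (String) physical[0];
            Integer i = (Integer) physical[1];
            // Assumption: ISO-8601 parsing stands in for PARSE_TIMESTAMP.
            LocalDateTime ts = LocalDateTime.parse(str);
            // The computed column is inserted at its schema position, between str and i.
            return new Object[] {str, ts, i};
        };
    }

    public static void main(String[] args) {
        Object[] physical = {"2021-03-04T05:06:07", 42}; // ROW<str, i>
        Object[] full = tsConverter().apply(physical);    // ROW<str, ts, i>
        System.out.println(Arrays.toString(full));
    }
}
```

In a real source the converter would be applied inside the reader for each partition, so that a watermark derived from `ts` sees rows in per-partition order.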
Note: If a source implements SupportsProjectionPushDown, the projection must be applied to the physical data in the first step. The computed column push-down (already aware of the projection) will then use the projected physical data and insert computed columns into the result. In the example below, the projections [i, d] are derived from the DDL (c requires i) and the query (d and c are required). The pushed converter will rely on this order and will process [i, d] to produce [d, c]:
CREATE TABLE t (i INT, s STRING, c AS i + 2, d DOUBLE);
SELECT d, c FROM t;
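The two-step order described in this note can be sketched as follows. This is an illustration under assumptions, not Flink code: rows are plain `Object[]`, the external system is assumed to store the physical fields in the order (i, s, d), and `project` and `convert` are hypothetical helper names.

```java
import java.util.Arrays;

public class ProjectedConverterSketch {

    /** Step 1: apply the projection [i, d] to the physical row (i, s, d). */
    static Object[] project(Object[] physicalRow) {
        return new Object[] {physicalRow[0], physicalRow[2]};
    }

    /** Step 2: consume [i, d] in exactly that order and emit [d, c], with c = i + 2. */
    static Object[] convert(Object[] projected) {
        int i = (Integer) projected[0];
        double d = (Double) projected[1];
        return new Object[] {d, i + 2};
    }

    public static void main(String[] args) {
        Object[] physical = {1, "unused", 2.5};
        Object[] result = convert(project(physical));
        System.out.println(Arrays.toString(result));
    }
}
```

The converter only works because the projection hands it the fields in the agreed order; swapping `i` and `d` in step 1 would fail with a `ClassCastException` in this sketch.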
|Modifier and Type
|Interface and Description
SupportsComputedColumnPushDown.ComputedColumnConverter: Generates and adds computed columns to RowData if necessary.
|Modifier and Type
|Method and Description
void applyComputedColumn(SupportsComputedColumnPushDown.ComputedColumnConverter converter, DataType outputDataType)
Provides a converter that converts the RowData containing the physical fields of the external system into a new RowData with pushed-down computed columns.
Note: Use the passed data type, rather than the physically produced data type, to describe the final output data type when creating TypeInformation. If the source implements SupportsProjectionPushDown, the projection is already considered in both the converter and the given output data type.
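A rough sketch of how a source might react to this call is shown below. All types are simplified stand-ins rather than Flink classes: `SketchSource` is hypothetical, the converter is modeled as a `UnaryOperator<Object[]>`, and the data type is represented by a plain `String`.

```java
import java.util.Arrays;
import java.util.function.UnaryOperator;

public class ApplyComputedColumnSketch {

    /** Hypothetical source; it mirrors the shape of the method above, not the Flink API. */
    static class SketchSource {
        // Without push-down, rows are emitted unchanged.
        private UnaryOperator<Object[]> converter = UnaryOperator.identity();
        private String producedType;

        SketchSource(String physicalType) {
            this.producedType = physicalType;
        }

        /** Stores the converter and switches the produced type to the passed one. */
        void applyComputedColumn(UnaryOperator<Object[]> converter, String outputDataType) {
            this.converter = converter;
            this.producedType = outputDataType;
        }

        String producedType() {
            return producedType;
        }

        /** Runtime path: every physical row goes through the converter. */
        Object[] emit(Object[] physicalRow) {
            return converter.apply(physicalRow);
        }
    }

    public static void main(String[] args) {
        SketchSource source = new SketchSource("ROW<str STRING, i INT>");
        // Assumption: the computed column here is simply the length of str.
        source.applyComputedColumn(
                row -> new Object[] {row[0], ((String) row[0]).length(), row[1]},
                "ROW<str STRING, len INT, i INT>");
        System.out.println(source.producedType());
        System.out.println(Arrays.toString(source.emit(new Object[] {"abc", 7})));
    }
}
```

The point of the sketch is that the source describes its output with the type it was handed, so the declared type and the converted rows cannot drift apart.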
Copyright © 2014–2021 The Apache Software Foundation. All rights reserved.