@PublicEvolving public static final class Schema.Builder extends Object
Enclosing class: Schema
Modifier and Type | Method | Description
---|---|---
Schema | build() | Returns an instance of an unresolved Schema.
Schema.Builder | column(String columnName, AbstractDataType<?> dataType) | Declares a physical column that is appended to this schema.
Schema.Builder | column(String columnName, String serializableTypeString) | Declares a physical column that is appended to this schema.
Schema.Builder | columnByExpression(String columnName, Expression expression) | Declares a computed column that is appended to this schema.
Schema.Builder | columnByExpression(String columnName, String sqlExpression) | Declares a computed column that is appended to this schema.
Schema.Builder | columnByMetadata(String columnName, AbstractDataType<?> dataType) | Declares a metadata column that is appended to this schema.
Schema.Builder | columnByMetadata(String columnName, AbstractDataType<?> dataType, boolean isVirtual) | Declares a metadata column that is appended to this schema.
Schema.Builder | columnByMetadata(String columnName, AbstractDataType<?> dataType, String metadataKey) | Declares a metadata column that is appended to this schema.
Schema.Builder | columnByMetadata(String columnName, AbstractDataType<?> dataType, String metadataKey, boolean isVirtual) | Declares a metadata column that is appended to this schema.
Schema.Builder | columnByMetadata(String columnName, String serializableTypeString) | Declares a metadata column that is appended to this schema.
Schema.Builder | columnByMetadata(String columnName, String serializableTypeString, String metadataKey, boolean isVirtual) | Declares a metadata column that is appended to this schema.
Schema.Builder | fromColumns(List<Schema.UnresolvedColumn> unresolvedColumns) | Adopts all columns from the given list.
Schema.Builder | fromFields(List<String> fieldNames, List<? extends AbstractDataType<?>> fieldDataTypes) | Adopts the given field names and field data types as physical columns of the schema.
Schema.Builder | fromFields(String[] fieldNames, AbstractDataType<?>[] fieldDataTypes) | Adopts the given field names and field data types as physical columns of the schema.
Schema.Builder | fromResolvedSchema(ResolvedSchema resolvedSchema) | Adopts all members from the given resolved schema.
Schema.Builder | fromRowDataType(DataType dataType) | Adopts all fields of the given row as physical columns of the schema.
Schema.Builder | fromSchema(Schema unresolvedSchema) | Adopts all members from the given unresolved schema.
Schema.Builder | primaryKey(List<String> columnNames) | Declares a primary key constraint for a set of given columns.
Schema.Builder | primaryKey(String... columnNames) | Declares a primary key constraint for a set of given columns.
Schema.Builder | primaryKeyNamed(String constraintName, List<String> columnNames) | Declares a primary key constraint for a set of given columns.
Schema.Builder | primaryKeyNamed(String constraintName, String... columnNames) | Declares a primary key constraint for a set of given columns.
Schema.Builder | watermark(String columnName, Expression watermarkExpression) | Declares that the given column should serve as an event-time (i.e. rowtime) attribute and specifies a corresponding watermark strategy as an expression.
Schema.Builder | watermark(String columnName, String sqlExpression) | Declares that the given column should serve as an event-time (i.e. rowtime) attribute and specifies a corresponding watermark strategy as an expression.
Schema.Builder | withComment(String comment) | Apply comment to the previous column.
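Taken together, a typical builder chain mixes physical, computed, and metadata columns with a watermark and a primary key. A minimal sketch, assuming flink-table-api-java on the classpath; the table fields and the "timestamp" metadata key are invented for illustration:

```java
import org.apache.flink.table.api.DataTypes;
import org.apache.flink.table.api.Schema;

import static org.apache.flink.table.api.Expressions.$;
import static org.apache.flink.table.api.Expressions.lit;

public class SchemaOverview {
    public static void main(String[] args) {
        // Physical columns define the payload; the other column kinds are layered on top.
        Schema schema = Schema.newBuilder()
                .column("user_id", DataTypes.BIGINT().notNull())                  // physical
                .column("payload", DataTypes.STRING())                            // physical
                .columnByMetadata("ts", DataTypes.TIMESTAMP_LTZ(3), "timestamp")  // metadata
                .columnByExpression("proc", "PROCTIME()")                         // computed
                .watermark("ts", $("ts").minus(lit(5).seconds()))                 // event time
                .primaryKey("user_id")                                            // constraint
                .build();
        System.out.println(schema);
    }
}
```

The result of `build()` is still unresolved: names, types, and constraints are only validated later, when the schema is resolved against a catalog or connector.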
public Schema.Builder fromSchema(Schema unresolvedSchema)
public Schema.Builder fromResolvedSchema(ResolvedSchema resolvedSchema)
public Schema.Builder fromRowDataType(DataType dataType)
public Schema.Builder fromFields(String[] fieldNames, AbstractDataType<?>[] fieldDataTypes)
public Schema.Builder fromFields(List<String> fieldNames, List<? extends AbstractDataType<?>> fieldDataTypes)
public Schema.Builder fromColumns(List<Schema.UnresolvedColumn> unresolvedColumns)
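As a sketch of the adoption methods above (assuming the Flink Table API on the classpath; names are illustrative), fromFields copies name/type pairs in as physical columns and can be freely combined with further builder calls:

```java
import java.util.Arrays;

import org.apache.flink.table.api.DataTypes;
import org.apache.flink.table.api.Schema;

public class FromFieldsExample {
    public static void main(String[] args) {
        // List-based overload; the array-based overload behaves identically.
        Schema schema = Schema.newBuilder()
                .fromFields(
                        Arrays.asList("id", "name"),
                        Arrays.asList(DataTypes.BIGINT(), DataTypes.STRING()))
                .column("extra", DataTypes.INT()) // appended after the adopted fields
                .build();
        System.out.println(schema.getColumns().size()); // 3
    }
}
```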
public Schema.Builder column(String columnName, AbstractDataType<?> dataType)
Physical columns are regular columns known from databases. They define the names, the types, and the order of fields in the physical data. Thus, physical columns represent the payload that is read from and written to an external system. Connectors and formats use these columns (in the defined order) to configure themselves. Other kinds of columns can be declared between physical columns but will not influence the final physical schema.
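A minimal sketch of physical columns (assuming the Flink Table API; the field names are invented). Note that declaration order is the physical order that connectors and formats see:

```java
import org.apache.flink.table.api.DataTypes;
import org.apache.flink.table.api.Schema;

public class PhysicalColumns {
    public static void main(String[] args) {
        Schema schema = Schema.newBuilder()
                .column("order_id", DataTypes.BIGINT())     // first physical field
                .column("price", DataTypes.DECIMAL(10, 2))  // second physical field
                .column("currency", DataTypes.STRING())     // third physical field
                .build();
        System.out.println(schema);
    }
}
```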
Parameters:
columnName - column name
dataType - data type of the column

public Schema.Builder column(String columnName, String serializableTypeString)
See column(String, AbstractDataType) for a detailed explanation.
This method uses a type string that can be easily persisted in a durable catalog.
Parameters:
columnName - column name
serializableTypeString - data type of the column as a serializable string
See Also:
LogicalType.asSerializableString()
public Schema.Builder columnByExpression(String columnName, Expression expression)
Computed columns are virtual columns that are generated by evaluating an expression that can reference other columns declared in the same table. Both physical columns and metadata columns can be accessed. The column itself is not physically stored within the table. The column’s data type is derived automatically from the given expression and does not have to be declared manually.
Computed columns are commonly used for defining time attributes. For example, the computed column can be used if the original field is not TIMESTAMP(3) type or is nested in a JSON string.
Any scalar expression can be used for in-memory/temporary tables. However, currently, only SQL expressions can be persisted in a catalog. User-defined functions (also defined in different catalogs) are supported.
Example: .columnByExpression("ts", $("json_obj").get("ts").cast(TIMESTAMP(3)))
Parameters:
columnName - column name
expression - computation of the column

public Schema.Builder columnByExpression(String columnName, String sqlExpression)
See columnByExpression(String, Expression)
for a detailed explanation.
This method uses a SQL expression that can be easily persisted in a durable catalog.
Example: .columnByExpression("ts", "CAST(json_obj.ts AS TIMESTAMP(3))")
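Both variants can appear in the same builder chain. A sketch, assuming the Flink Table API on the classpath and an invented nested row field:

```java
import org.apache.flink.table.api.DataTypes;
import org.apache.flink.table.api.Schema;

import static org.apache.flink.table.api.Expressions.$;

public class ComputedColumns {
    public static void main(String[] args) {
        Schema schema = Schema.newBuilder()
                .column("json_obj", DataTypes.ROW(
                        DataTypes.FIELD("ts", DataTypes.STRING())))
                // Expression variant: the column type is derived from the expression.
                .columnByExpression("ts", $("json_obj").get("ts").cast(DataTypes.TIMESTAMP(3)))
                // SQL-string variant: the form that can be persisted in a catalog.
                .columnByExpression("ts_sql", "CAST(json_obj.ts AS TIMESTAMP(3))")
                .build();
        System.out.println(schema);
    }
}
```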
Parameters:
columnName - column name
sqlExpression - computation of the column using SQL

public Schema.Builder columnByMetadata(String columnName, AbstractDataType<?> dataType)
Metadata columns provide access to connector- and/or format-specific fields for every row of a table. For example, a metadata column can be used to read and write the timestamp from and to Kafka records for time-based operations. The connector and format documentation lists the available metadata fields for every component.
Every metadata field is identified by a string-based key and has a documented data type. For convenience, the runtime will perform an explicit cast if the data type of the column differs from the data type of the metadata field. Of course, this requires that the two data types are compatible.
Note: This method assumes that the metadata key is equal to the column name and the metadata column can be used for both reading and writing.
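A minimal sketch of this overload (assuming the Flink Table API on the classpath; "timestamp" stands in for a connector-documented metadata key, here doubling as the column name):

```java
import org.apache.flink.table.api.DataTypes;
import org.apache.flink.table.api.Schema;

public class MetadataColumns {
    public static void main(String[] args) {
        Schema schema = Schema.newBuilder()
                .column("payload", DataTypes.STRING())
                // Column name doubles as the metadata key; readable and writable.
                .columnByMetadata("timestamp", DataTypes.TIMESTAMP_LTZ(3))
                .build();
        System.out.println(schema);
    }
}
```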
Parameters:
columnName - column name
dataType - data type of the column

public Schema.Builder columnByMetadata(String columnName, String serializableTypeString)
See columnByMetadata(String, AbstractDataType) for a detailed explanation.
This method uses a type string that can be easily persisted in a durable catalog.
Parameters:
columnName - column name
serializableTypeString - data type of the column

public Schema.Builder columnByMetadata(String columnName, AbstractDataType<?> dataType, boolean isVirtual)
Metadata columns provide access to connector- and/or format-specific fields for every row of a table. For example, a metadata column can be used to read and write the timestamp from and to Kafka records for time-based operations. The connector and format documentation lists the available metadata fields for every component.
Every metadata field is identified by a string-based key and has a documented data type. For convenience, the runtime will perform an explicit cast if the data type of the column differs from the data type of the metadata field. Of course, this requires that the two data types are compatible.
By default, a metadata column can be used for both reading and writing. However, in many cases an external system provides more read-only metadata fields than writable fields. Therefore, it is possible to exclude metadata columns from persisting by setting the isVirtual flag to true.
Note: This method assumes that the metadata key is equal to the column name.
Parameters:
columnName - column name
dataType - data type of the column
isVirtual - whether the column should be persisted or not

public Schema.Builder columnByMetadata(String columnName, AbstractDataType<?> dataType, @Nullable String metadataKey)
Metadata columns provide access to connector- and/or format-specific fields for every row of a table. For example, a metadata column can be used to read and write the timestamp from and to Kafka records for time-based operations. The connector and format documentation lists the available metadata fields for every component.
Every metadata field is identified by a string-based key and has a documented data type. The metadata key can be omitted if the column name should be used as the identifying metadata key. For convenience, the runtime will perform an explicit cast if the data type of the column differs from the data type of the metadata field. Of course, this requires that the two data types are compatible.
Note: This method assumes that a metadata column can be used for both reading and writing.
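A sketch of the explicit-metadata-key overloads (assuming the Flink Table API on the classpath; the "timestamp" and "partition" keys stand in for whatever the concrete connector documents):

```java
import org.apache.flink.table.api.DataTypes;
import org.apache.flink.table.api.Schema;

public class MetadataKeyColumns {
    public static void main(String[] args) {
        Schema schema = Schema.newBuilder()
                .column("payload", DataTypes.STRING())
                // Column "event_time" maps to the connector's "timestamp" metadata key.
                .columnByMetadata("event_time", DataTypes.TIMESTAMP_LTZ(3), "timestamp")
                // Virtual column: readable, but excluded when writing to a sink.
                .columnByMetadata("part", DataTypes.INT(), "partition", true)
                .build();
        System.out.println(schema);
    }
}
```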
Parameters:
columnName - column name
dataType - data type of the column
metadataKey - identifying metadata key; if null, the column name will be used as the metadata key

public Schema.Builder columnByMetadata(String columnName, AbstractDataType<?> dataType, @Nullable String metadataKey, boolean isVirtual)
Metadata columns provide access to connector- and/or format-specific fields for every row of a table. For example, a metadata column can be used to read and write the timestamp from and to Kafka records for time-based operations. The connector and format documentation lists the available metadata fields for every component.
Every metadata field is identified by a string-based key and has a documented data type. The metadata key can be omitted if the column name should be used as the identifying metadata key. For convenience, the runtime will perform an explicit cast if the data type of the column differs from the data type of the metadata field. Of course, this requires that the two data types are compatible.
By default, a metadata column can be used for both reading and writing. However, in many cases an external system provides more read-only metadata fields than writable fields. Therefore, it is possible to exclude metadata columns from persisting by setting the isVirtual flag to true.
Parameters:
columnName - column name
dataType - data type of the column
metadataKey - identifying metadata key; if null, the column name will be used as the metadata key
isVirtual - whether the column should be persisted or not

public Schema.Builder columnByMetadata(String columnName, String serializableTypeString, @Nullable String metadataKey, boolean isVirtual)
See columnByMetadata(String, AbstractDataType, String, boolean) for a detailed explanation.
This method uses a type string that can be easily persisted in a durable catalog.
Parameters:
columnName - column name
serializableTypeString - data type of the column
metadataKey - identifying metadata key; if null, the column name will be used as the metadata key
isVirtual - whether the column should be persisted or not

public Schema.Builder withComment(@Nullable String comment)
public Schema.Builder watermark(String columnName, Expression watermarkExpression)
The column must be of type TIMESTAMP(3) or TIMESTAMP_LTZ(3) and be a top-level column in the schema. It may be a computed column.
The watermark generation expression is evaluated by the framework for every record during runtime. The framework will periodically emit the largest generated watermark. If the current watermark is still identical to the previous one, or is null, or the value of the returned watermark is smaller than that of the last emitted one, then no new watermark will be emitted. A watermark is emitted in an interval defined by the configuration.
Any scalar expression can be used for declaring a watermark strategy for in-memory/temporary tables. However, currently, only SQL expressions can be persisted in a catalog. The expression's return data type must be TIMESTAMP(3). User-defined functions (also defined in different catalogs) are supported.
Example: .watermark("ts", $("ts").minus(lit(5).seconds()))
Parameters:
columnName - the column name used as a rowtime attribute
watermarkExpression - the expression used for watermark generation

public Schema.Builder watermark(String columnName, String sqlExpression)
See watermark(String, Expression)
for a detailed explanation.
This method uses a SQL expression that can be easily persisted in a durable catalog.
Example: .watermark("ts", "ts - INTERVAL '5' SECOND")
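Both watermark variants can be sketched as follows (assuming the Flink Table API on the classpath; the column name "ts" is illustrative):

```java
import org.apache.flink.table.api.DataTypes;
import org.apache.flink.table.api.Schema;

import static org.apache.flink.table.api.Expressions.$;
import static org.apache.flink.table.api.Expressions.lit;

public class WatermarkExample {
    public static void main(String[] args) {
        // Expression variant: tolerate 5 seconds of out-of-orderness.
        Schema expr = Schema.newBuilder()
                .column("ts", DataTypes.TIMESTAMP(3))
                .watermark("ts", $("ts").minus(lit(5).seconds()))
                .build();

        // SQL-string variant, suitable for persisting in a catalog.
        Schema sql = Schema.newBuilder()
                .column("ts", DataTypes.TIMESTAMP(3))
                .watermark("ts", "ts - INTERVAL '5' SECOND")
                .build();
        System.out.println(expr + "\n" + sql);
    }
}
```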
public Schema.Builder primaryKey(String... columnNames)
The primary key will be assigned a random name.
Parameters:
columnNames - columns that form a unique primary key

public Schema.Builder primaryKey(List<String> columnNames)
The primary key will be assigned a generated name in the format PK_col1_col2.
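The primary key variants can be sketched as follows (assuming the Flink Table API on the classpath; column and constraint names are invented). Primary key columns should be declared NOT NULL:

```java
import java.util.Arrays;

import org.apache.flink.table.api.DataTypes;
import org.apache.flink.table.api.Schema;

public class PrimaryKeyExample {
    public static void main(String[] args) {
        // Generated constraint name in the PK_col1_col2 format.
        Schema generated = Schema.newBuilder()
                .column("order_id", DataTypes.BIGINT().notNull())
                .column("line_no", DataTypes.INT().notNull())
                .primaryKey(Arrays.asList("order_id", "line_no"))
                .build();

        // Explicit constraint name that can be referenced later.
        Schema named = Schema.newBuilder()
                .column("order_id", DataTypes.BIGINT().notNull())
                .primaryKeyNamed("pk_orders", "order_id")
                .build();
        System.out.println(generated + "\n" + named);
    }
}
```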
Parameters:
columnNames - columns that form a unique primary key

public Schema.Builder primaryKeyNamed(String constraintName, String... columnNames)
Parameters:
constraintName - name for the primary key; can be used to reference the constraint
columnNames - columns that form a unique primary key

public Schema.Builder primaryKeyNamed(String constraintName, List<String> columnNames)
Parameters:
constraintName - name for the primary key; can be used to reference the constraint
columnNames - columns that form a unique primary key

Copyright © 2014–2024 The Apache Software Foundation. All rights reserved.