Descriptors#

TableDescriptor#

Describes a CatalogTable representing a source or sink.

TableDescriptor is a template for creating a CatalogTable instance. It closely resembles the “CREATE TABLE” SQL DDL statement, containing schema, connector options, and other characteristics. Since tables in Flink are typically backed by external systems, the descriptor describes how a connector (and possibly its format) are configured.

This can be used to register a table in the Table API, see create_temporary_table() in TableEnvironment.

TableDescriptor.for_connector(connector)

Creates a new Builder for a table using the given connector.

TableDescriptor.get_schema()

TableDescriptor.get_options()

TableDescriptor.get_partition_keys()

TableDescriptor.get_comment()

TableDescriptor.Builder.schema(schema)

Define the schema of the TableDescriptor.

TableDescriptor.Builder.option(key, value)

Sets the given option on the table.

TableDescriptor.Builder.format(format[, ...])

Defines the format to be used for this table.

TableDescriptor.Builder.partitioned_by(...)

Define which columns this table is partitioned by.

TableDescriptor.Builder.comment(comment)

Define the comment for this table.

TableDescriptor.Builder.build()

Returns an immutable instance of TableDescriptor.

FormatDescriptor#

Describes a Format and its options for use with TableDescriptor.

Formats are responsible for encoding and decoding data in table connectors. Note that not every connector has a format, while others may have multiple formats (e.g. the Kafka connector has separate formats for keys and values). Common formats are “json”, “csv”, “avro”, etc.

FormatDescriptor.for_format(format)

Creates a new Builder describing a format with the given format identifier.

FormatDescriptor.get_format()

FormatDescriptor.get_options()

FormatDescriptor.Builder.option(key, value)

Sets the given option on the format.

FormatDescriptor.Builder.build()

Returns an immutable instance of FormatDescriptor.

Schema#

Schema of a table or view.

A schema represents the schema part of a {@code CREATE TABLE (schema) WITH (options)} DDL statement in SQL. It defines columns of different kind, constraints, time attributes, and watermark strategies. It is possible to reference objects (such as functions or types) across different catalogs.

This class is used in the API and catalogs to define an unresolved schema that will be translated to ResolvedSchema. Some methods of this class perform basic validation, however, the main validation happens during the resolution. Thus, an unresolved schema can be incomplete and might be enriched or merged with a different schema at a later stage.

Since an instance of this class is unresolved, it should not be directly persisted. The str() shows only a summary of the contained objects.

Schema.Builder.from_schema(unresolved_schema)

Adopts all members from the given unresolved schema.

Schema.Builder.from_row_data_type(data_type)

Adopts all fields of the given row as physical columns of the schema.

Schema.Builder.from_fields(field_names, ...)

Adopts the given field names and field data types as physical columns of the schema.

Schema.Builder.column(column_name, data_type)

Declares a physical column that is appended to this schema.

Schema.Builder.column_by_expression(...)

Declares a computed column that is appended to this schema.

Schema.Builder.column_by_metadata(...[, ...])

Declares a metadata column that is appended to this schema.

Schema.Builder.watermark(column_name, ...)

Declares that the given column should serve as an event-time (i.e.

Schema.Builder.primary_key(*column_names)

Declares a primary key constraint for a set of given columns.

Schema.Builder.primary_key_named(...)

Declares a primary key constraint for a set of given columns.

Schema.Builder.build()

Returns an instance of an unresolved Schema.

TableSchema#

A table schema that represents a table’s structure with field names and data types.

TableSchema.copy()

Returns a deep copy of the table schema.

TableSchema.get_field_data_types()

Returns all field data types as a list.

TableSchema.get_field_data_type(field)

Returns the specified data type for the given field index or field name.

TableSchema.get_field_count()

Returns the number of fields.

TableSchema.get_field_names()

Returns all field names as a list.

TableSchema.get_field_name(field_index)

Returns the specified name for the given field index.

TableSchema.to_row_data_type()

Converts a table schema into a (nested) data type describing a pyflink.table.types.DataTypes.ROW().

TableSchema.Builder.field(name, data_type)

Add a field with name and data type.

TableSchema.Builder.build()

Returns a TableSchema instance.

ChangelogMode#

The set of changes contained in a changelog.

ChangelogMode.insert_only()

ChangelogMode.upsert()

ChangelogMode.all()