pyflink.table.schema.Schema.Builder.watermark#
- Builder.watermark(column_name: str, watermark_expr: Union[str, pyflink.table.expression.Expression]) pyflink.table.schema.Schema.Builder #
Declares that the given column should serve as an event-time (i.e. rowtime) attribute and specifies a corresponding watermark strategy as an expression.
The column must be of type {@code TIMESTAMP(3)} or {@code TIMESTAMP_LTZ(3)} and be a top-level column in the schema. It may be a computed column.
The watermark generation expression is evaluated by the framework for every record during runtime. The framework will periodically emit the largest generated watermark. If the current watermark is still identical to the previous one, or is null, or the value of the returned watermark is smaller than that of the last emitted one, then no new watermark will be emitted. A watermark is emitted in an interval defined by the configuration.
Any scalar expression can be used for declaring a watermark strategy for in-memory/temporary tables. However, currently, only SQL expressions can be persisted in a catalog. The expression’s return data type must be {@code TIMESTAMP(3)}. User-defined functions (also defined in different catalogs) are supported.
Example:
>>> Schema.new_builder().watermark("ts", "ts - INTERVAL '5' SECOND")
- Parameters
column_name – The column name used as a rowtime attribute
watermark_expr – The expression used for watermark generation