Table#
A Table object is the core abstraction of the Table API. Similar to how the DataStream API has DataStream, the Table API is built around Table.
A Table object describes a pipeline of data transformations. It does not contain the data itself. Instead, it describes how to read data from a table source and how to eventually write data to a table sink. The declared pipeline can be printed, optimized, and eventually executed in a cluster. The pipeline can work with bounded or unbounded streams, which enables both streaming and batch scenarios.
By this definition, a Table object can be considered a view in SQL terms.
The initial Table object is constructed by a TableEnvironment. For example, from_path() obtains a table from a catalog. Every Table object has a schema that is available through get_schema(). A Table object is always associated with its original table environment during programming.
Every transformation (e.g. select() or filter()) on a Table object leads to a new Table object.
Use execute() to execute the pipeline and retrieve the transformed data locally during development. Otherwise, use execute_insert() to write the data into a table sink.
Many methods of this class take one or more Expression objects as parameters.
For fluent definition of expressions and easier readability, we recommend adding a star import:
Example:
>>> from pyflink.table.expressions import *
Check the documentation for more programming-language-specific APIs.
The following example shows how to work with a Table object.
Example:
>>> from pyflink.table import EnvironmentSettings, TableEnvironment
>>> from pyflink.table.expressions import *
>>> env_settings = EnvironmentSettings.in_streaming_mode()
>>> t_env = TableEnvironment.create(env_settings)
>>> table = t_env.from_path("my_table").select(col("colA").trim(), col("colB") + 12)
>>> table.execute().print()
|
Adds additional columns. |
|
Adds additional columns. |
|
Performs a global aggregate operation with an aggregate function. |
|
Renames the fields of the expression result. |
Removes duplicate values and returns only distinct (different) values. |
|
|
Drops existing columns. |
|
Drops existing columns. |
Collects the contents of the current table to the local client. |
|
|
|
|
Returns the AST of this table and the execution plan. |
|
Limits a (possibly sorted) result to the first n rows. |
|
Filters out elements that don't pass the filter predicate. |
|
Performs a global flat_aggregate without group_by. |
|
Performs a flatMap operation with a user-defined table function. |
|
Joins two Table objects. |
Returns the schema of this table. |
|
|
Groups the elements on some grouping keys. |
|
Intersects two Table objects. |
|
Intersects two Table objects. |
|
Joins two Table objects. |
|
Joins this Table with a user-defined TableFunction. |
|
Joins two Table objects. |
|
Joins this Table with a user-defined TableFunction. |
|
Limits a (possibly sorted) result to the first n rows. |
|
Performs a map operation with a user-defined scalar function. |
|
Minus of two Table objects. |
|
Minus of two Table objects. |
|
Limits a (possibly sorted) result from an offset position. |
|
Sorts the given Table. |
|
Defines over-windows on the records of a table. |
Prints the schema of this table to the console in a tree format. |
|
|
Renames existing columns. |
|
Joins two Table objects. |
|
Performs a selection operation. |
Converts the table to a pandas DataFrame. |
|
|
Unions two Table objects. |
|
Unions two Table objects. |
|
Filters out elements that don't pass the filter predicate. |
|
Defines a group window on the records of a table. |
GroupedTable#
A table that has been grouped on a set of grouping keys.
|
Performs a selection operation on a grouped table. |
|
Performs an aggregate operation with an aggregate function. |
Performs a flat_aggregate operation on a grouped table. |
GroupWindowedTable#
A table that has been windowed for GroupWindow.
|
Groups the elements by a mandatory window and one or more optional grouping attributes. |
WindowGroupedTable#
A table that has been windowed and grouped for GroupWindow.
|
Performs a selection operation on a window grouped table. |
Performs an aggregate operation on a window grouped table. |
OverWindowedTable#
A table that has been windowed for OverWindow.
Unlike group windows, which are specified in the GROUP BY clause, over windows do not collapse rows. Instead, over window aggregates compute an aggregate for each input row over a range of its neighboring rows.
|
Performs a selection operation on an over-windowed table. |
AggregatedTable#
A table on which an aggregate function has been applied.
|
Performs a selection operation after an aggregate operation. |
FlatAggregateTable#
A table that performs flatAggregate on a Table, a GroupedTable, or a WindowGroupedTable.
|
Performs a selection operation on a FlatAggregateTable. |