.. ################################################################################
    Licensed to the Apache Software Foundation (ASF) under one or more
    contributor license agreements.  See the NOTICE file distributed with
    this work for additional information regarding copyright ownership.
    The ASF licenses this file to you under the Apache License, Version 2.0
    (the "License"); you may not use this file except in compliance with
    the License.  You may obtain a copy of the License at

        http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software
    distributed under the License is distributed on an "AS IS" BASIS,
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License.
   ################################################################################

=====
Table
=====

Table
=====

A :class:`~pyflink.table.Table` object is the core abstraction of the Table API.
Similar to how the DataStream API has DataStream, the Table API is built around
:class:`~pyflink.table.Table`.

A :class:`~pyflink.table.Table` object describes a pipeline of data transformations. It does not
contain the data itself in any way. Instead, it describes how to read data from a table source,
and how to eventually write data to a table sink. The declared pipeline can be printed,
optimized, and eventually executed in a cluster. The pipeline can work with bounded or unbounded
streams, which enables both streaming and batch scenarios.

By the definition above, a :class:`~pyflink.table.Table` object can be considered a view in SQL
terms.

The initial :class:`~pyflink.table.Table` object is constructed by a
:class:`~pyflink.table.TableEnvironment`. For example,
:func:`~pyflink.table.TableEnvironment.from_path` obtains a table from a catalog. Every
:class:`~pyflink.table.Table` object has a schema that is available through
:func:`~pyflink.table.Table.get_schema`. A :class:`~pyflink.table.Table` object is always
associated with its original table environment during programming.

Every transformation (e.g. :func:`~pyflink.table.Table.select` or
:func:`~pyflink.table.Table.filter`) on a :class:`~pyflink.table.Table` object leads to a new
:class:`~pyflink.table.Table` object.

Use :func:`~pyflink.table.Table.execute` to execute the pipeline and retrieve the transformed
data locally during development. Otherwise, use :func:`~pyflink.table.Table.execute_insert` to
write the data into a table sink.

Many methods of this class take one or more :class:`~pyflink.table.Expression` objects as
parameters. For fluent definition of expressions and easier readability, we recommend adding a
star import:

Example:
::

    >>> from pyflink.table.expressions import *

Check the documentation for more programming language specific APIs.

The following example shows how to work with a :class:`~pyflink.table.Table` object.

Example:
::

    >>> from pyflink.table import EnvironmentSettings, TableEnvironment
    >>> from pyflink.table.expressions import *
    >>> env_settings = EnvironmentSettings.in_streaming_mode()
    >>> t_env = TableEnvironment.create(env_settings)
    >>> table = t_env.from_path("my_table").select(col("colA").trim(), col("colB") + 12)
    >>> table.execute().print()
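
The sketch below complements the example above by using :func:`~pyflink.table.Table.execute_insert`
to write into a table sink. It is only a minimal illustration: the ``my_sink`` table, its columns,
and the DDL based on the ``print`` connector are assumptions made for this example.

Example:
::

    >>> from pyflink.table import EnvironmentSettings, TableEnvironment
    >>> env_settings = EnvironmentSettings.in_streaming_mode()
    >>> t_env = TableEnvironment.create(env_settings)
    >>> # Hypothetical sink table backed by the built-in 'print' connector.
    >>> t_env.execute_sql(
    ...     "CREATE TABLE my_sink (word STRING, cnt BIGINT) "
    ...     "WITH ('connector' = 'print')")
    >>> table = t_env.from_elements([('flink', 1), ('table', 2)], ['word', 'cnt'])
    >>> # execute_insert submits the pipeline and returns a TableResult;
    >>> # wait() blocks until the insert job finishes.
    >>> table.execute_insert("my_sink").wait()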
.. currentmodule:: pyflink.table

.. autosummary::
    :toctree: api/

    Table.add_columns
    Table.add_or_replace_columns
    Table.aggregate
    Table.alias
    Table.distinct
    Table.drop_columns
    Table.execute
    Table.execute_insert
    Table.explain
    Table.fetch
    Table.filter
    Table.flat_aggregate
    Table.flat_map
    Table.full_outer_join
    Table.get_schema
    Table.group_by
    Table.intersect
    Table.intersect_all
    Table.join
    Table.join_lateral
    Table.left_outer_join
    Table.left_outer_join_lateral
    Table.limit
    Table.map
    Table.minus
    Table.minus_all
    Table.offset
    Table.order_by
    Table.over_window
    Table.print_schema
    Table.rename_columns
    Table.right_outer_join
    Table.select
    Table.to_pandas
    Table.union
    Table.union_all
    Table.where
    Table.window


GroupedTable
============

A table that has been grouped on a set of grouping keys.

.. currentmodule:: pyflink.table

.. autosummary::
    :toctree: api/

    GroupedTable.select
    GroupedTable.aggregate
    GroupedTable.flat_aggregate


GroupWindowedTable
==================

A table that has been windowed for a :class:`~pyflink.table.window.GroupWindow`.

.. currentmodule:: pyflink.table

.. autosummary::
    :toctree: api/

    GroupWindowedTable.group_by


WindowGroupedTable
==================

A table that has been windowed and grouped for a :class:`~pyflink.table.window.GroupWindow`.

.. currentmodule:: pyflink.table

.. autosummary::
    :toctree: api/

    WindowGroupedTable.select
    WindowGroupedTable.aggregate


OverWindowedTable
=================

A table that has been windowed for an :class:`~pyflink.table.window.OverWindow`.

Unlike group windows, which are specified in the GROUP BY clause, over windows do not collapse
rows. Instead, over window aggregates compute an aggregate for each input row over a range of
its neighboring rows.

.. currentmodule:: pyflink.table

.. autosummary::
    :toctree: api/

    OverWindowedTable.select


AggregatedTable
===============

A table on which an aggregate function has been applied.

.. currentmodule:: pyflink.table.table

.. autosummary::
    :toctree: api/

    AggregatedTable.select


FlatAggregateTable
==================

A table that performs ``flat_aggregate`` on a :class:`~pyflink.table.Table`, a
:class:`~pyflink.table.GroupedTable` or a :class:`~pyflink.table.WindowGroupedTable`.

.. currentmodule:: pyflink.table.table

.. autosummary::
    :toctree: api/

    FlatAggregateTable.select
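
The intermediate classes above are normally not constructed directly; they are produced by
chaining calls on a :class:`~pyflink.table.Table`. The following minimal sketch shows the
chain from :class:`~pyflink.table.Table` to :class:`~pyflink.table.GroupedTable` and back to
:class:`~pyflink.table.Table` using a built-in aggregate; the ``orders`` table and its columns
are made up for illustration. :func:`~pyflink.table.GroupedTable.flat_aggregate` follows the
same chaining pattern but additionally requires a user-defined table aggregate function, which
is omitted here.

Example:
::

    >>> from pyflink.table import EnvironmentSettings, TableEnvironment
    >>> from pyflink.table.expressions import col
    >>> env_settings = EnvironmentSettings.in_batch_mode()
    >>> t_env = TableEnvironment.create(env_settings)
    >>> orders = t_env.from_elements(
    ...     [('Alice', 2), ('Bob', 3), ('Alice', 5)], ['name', 'amount'])
    >>> # group_by returns a GroupedTable; select on it yields a regular Table again
    >>> orders.group_by(col('name')) \
    ...     .select(col('name'), col('amount').sum.alias('total')) \
    ...     .execute().print()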