Class StreamTableEnvironmentImpl

    • Method Detail

      • registerFunction

        public <T> void registerFunction​(String name,
                                         TableFunction<T> tableFunction)
        Description copied from interface: StreamTableEnvironment
        Registers a TableFunction under a unique name in the TableEnvironment's catalog. Registered functions can be referenced in Table API and SQL queries.
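
        A minimal usage sketch (the Split function as well as the table and column names below are hypothetical and only illustrate the registration pattern):

             // a simple table function that splits a comma-separated string into rows
             public static class Split extends TableFunction<String> {
                 public void eval(String input) {
                     for (String token : input.split(",")) {
                         collect(token);
                     }
                 }
             }

             // register the function and reference it in SQL via LATERAL TABLE
             tableEnv.registerFunction("split", new Split());
             tableEnv.sqlQuery(
                 "SELECT word FROM MyTable, LATERAL TABLE(split(sentence)) AS T(word)");
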
        Specified by:
        registerFunction in interface StreamTableEnvironment
        Type Parameters:
        T - The type of the output row.
        Parameters:
        name - The name under which the function is registered.
        tableFunction - The TableFunction to register.
      • registerFunction

        public <T, ACC> void registerFunction(String name,
                                              AggregateFunction<T, ACC> aggregateFunction)
        Description copied from interface: StreamTableEnvironment
        Registers an AggregateFunction under a unique name in the TableEnvironment's catalog. Registered functions can be referenced in Table API and SQL queries.
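
        A minimal usage sketch (the long_sum function and the Orders table are hypothetical; a production aggregate would typically also implement merge and retract methods):

             // a simple aggregate that sums long values using a mutable accumulator
             public static class LongSum extends AggregateFunction<Long, long[]> {
                 @Override
                 public long[] createAccumulator() {
                     return new long[] {0L};
                 }

                 // called by the runtime for every input row
                 public void accumulate(long[] acc, Long value) {
                     if (value != null) {
                         acc[0] += value;
                     }
                 }

                 @Override
                 public Long getValue(long[] acc) {
                     return acc[0];
                 }
             }

             // register the function and reference it in SQL
             tableEnv.registerFunction("long_sum", new LongSum());
             tableEnv.sqlQuery("SELECT user_id, long_sum(amount) FROM Orders GROUP BY user_id");
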
        Specified by:
        registerFunction in interface StreamTableEnvironment
        Type Parameters:
        T - The type of the output value.
        ACC - The type of aggregate accumulator.
        Parameters:
        name - The name under which the function is registered.
        aggregateFunction - The AggregateFunction to register.
      • registerFunction

        public <T, ACC> void registerFunction(String name,
                                              TableAggregateFunction<T, ACC> tableAggregateFunction)
        Description copied from interface: StreamTableEnvironment
        Registers a TableAggregateFunction under a unique name in the TableEnvironment's catalog. Registered functions can only be referenced in Table API.
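
        A minimal usage sketch in the Table API (Top2 stands for a hypothetical TableAggregateFunction that emits the two largest values per group; $ and call are assumed to be statically imported from Expressions):

             // register the table aggregate and apply it via flatAggregate
             tableEnv.registerFunction("top2", new Top2());

             tableEnv.from("Orders")
                 .groupBy($("user_id"))
                 .flatAggregate(call("top2", $("amount")).as("amount", "rnk"))
                 .select($("user_id"), $("amount"), $("rnk"));
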
        Specified by:
        registerFunction in interface StreamTableEnvironment
        Type Parameters:
        T - The type of the output value.
        ACC - The type of aggregate accumulator.
        Parameters:
        name - The name under which the function is registered.
        tableAggregateFunction - The TableAggregateFunction to register.
      • fromDataStream

        public <T> Table fromDataStream​(DataStream<T> dataStream,
                                        Schema schema)
        Description copied from interface: StreamTableEnvironment
        Converts the given DataStream into a Table.

        Column names and types of the Table are automatically derived from the TypeInformation of the DataStream. If the outermost record's TypeInformation is a CompositeType, it will be flattened in the first level. TypeInformation that cannot be represented as one of the listed DataTypes will be treated as a black-box DataTypes.RAW(Class, TypeSerializer) type. Thus, composite nested fields will not be accessible.

        Since the DataStream API does not support changelog processing natively, this method assumes append-only/insert-only semantics during the stream-to-table conversion. Records of class Row must describe RowKind.INSERT changes.

        By default, the stream record's timestamp and watermarks are not propagated to downstream table operations unless explicitly declared in the input schema.

        This method allows declaring a Schema for the resulting table. The declaration is similar to a CREATE TABLE DDL in SQL and can be used to:

        • enrich or overwrite automatically derived columns with a custom DataType
        • reorder columns
        • add computed or metadata columns next to the physical columns
        • access a stream record's timestamp
        • declare a watermark strategy or propagate the DataStream watermarks

        It is possible to declare a schema without physical/regular columns. In this case, those columns will be automatically derived and implicitly put at the beginning of the schema declaration.

        The following examples illustrate common schema declarations and their semantics:

             // given a DataStream of Tuple2 < String , BigDecimal >
        
             // === EXAMPLE 1 ===
        
             // no physical columns defined, they will be derived automatically,
             // e.g. BigDecimal becomes DECIMAL(38, 18)
        
             Schema.newBuilder()
                 .columnByExpression("c1", "f1 + 42")
                 .columnByExpression("c2", "f1 - 1")
                 .build()
        
             // equal to: CREATE TABLE (f0 STRING, f1 DECIMAL(38, 18), c1 AS f1 + 42, c2 AS f1 - 1)
        
             // === EXAMPLE 2 ===
        
             // physical columns defined, input fields and columns will be mapped by name,
             // columns are reordered and their data type overwritten,
             // all columns must be defined to show up in the final table's schema
        
             Schema.newBuilder()
                 .column("f1", "DECIMAL(10, 2)")
                 .columnByExpression("c", "f1 - 1")
                 .column("f0", "STRING")
                 .build()
        
             // equal to: CREATE TABLE (f1 DECIMAL(10, 2), c AS f1 - 1, f0 STRING)
        
             // === EXAMPLE 3 ===
        
             // timestamp and watermarks can be added from the DataStream API,
             // physical columns will be derived automatically
        
             Schema.newBuilder()
                 .columnByMetadata("rowtime", "TIMESTAMP_LTZ(3)") // extract timestamp into a column
                 .watermark("rowtime", "SOURCE_WATERMARK()")  // declare watermarks propagation
                 .build()
        
             // equal to:
             //     CREATE TABLE (
             //        f0 STRING,
             //        f1 DECIMAL(38, 18),
             //        rowtime TIMESTAMP_LTZ(3) METADATA,
             //        WATERMARK FOR rowtime AS SOURCE_WATERMARK()
             //     )
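
        Putting it together, a call for EXAMPLE 3 could look like the following sketch (it is assumed that the input stream already carries timestamps and watermarks, e.g. assigned by its source):

             DataStream<Tuple2<String, BigDecimal>> stream = ...;

             Table table =
                 tableEnv.fromDataStream(
                     stream,
                     Schema.newBuilder()
                         .columnByMetadata("rowtime", "TIMESTAMP_LTZ(3)")
                         .watermark("rowtime", "SOURCE_WATERMARK()")
                         .build());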
         
        Specified by:
        fromDataStream in interface StreamTableEnvironment
        Type Parameters:
        T - The external type of the DataStream.
        Parameters:
        dataStream - The DataStream to be converted.
        schema - The customized schema for the final table.
        Returns:
        The converted Table.
        See Also:
        StreamTableEnvironment.fromChangelogStream(DataStream, Schema)
      • toDataStream

        public <T> DataStream<T> toDataStream​(Table table,
                                              AbstractDataType<?> targetDataType)
        Description copied from interface: StreamTableEnvironment
        Converts the given Table into a DataStream of the given DataType.

        The given DataType is used to configure the table runtime to convert columns and internal data structures to the desired representation. The following example shows how to convert the table columns into the fields of a POJO type.

             // given a Table of (name STRING, age INT)
        
             public static class MyPojo {
                 public String name;
                 public Integer age;
        
                 // default constructor for DataStream API
                 public MyPojo() {}
        
                 // fully assigning constructor for field order in Table API
                 public MyPojo(String name, Integer age) {
                     this.name = name;
                     this.age = age;
                 }
             }
        
             tableEnv.toDataStream(table, DataTypes.of(MyPojo.class));
         

        Since the DataStream API does not support changelog processing natively, this method assumes append-only/insert-only semantics during the table-to-stream conversion. Updating tables are not supported by this method and will produce an exception.

        Note that the type system of the table ecosystem is richer than the one of the DataStream API. The table runtime will make sure to properly serialize the output records to the first operator of the DataStream API. Afterwards, the Types semantics of the DataStream API need to be considered.

        If the input table contains a single rowtime column, it will be propagated into a stream record's timestamp. Watermarks will be propagated as well.

        Specified by:
        toDataStream in interface StreamTableEnvironment
        Type Parameters:
        T - The external type of the DataStream records.
        Parameters:
        table - The Table to convert. It must be insert-only.
        targetDataType - The DataType that decides about the final external representation in DataStream records.
        Returns:
        The converted DataStream.
        See Also:
        StreamTableEnvironment.toDataStream(Table), StreamTableEnvironment.toChangelogStream(Table, Schema)
      • toChangelogStream

        public DataStream<Row> toChangelogStream​(Table table)
        Description copied from interface: StreamTableEnvironment
        Converts the given Table into a DataStream of changelog entries.

        Compared to StreamTableEnvironment.toDataStream(Table), this method produces instances of Row and sets the RowKind flag that is contained in every record during runtime. The runtime behavior is similar to that of a DynamicTableSink.

        This method can emit a changelog containing all kinds of changes (enumerated in RowKind) that the given updating table requires as the default ChangelogMode. Use StreamTableEnvironment.toChangelogStream(Table, Schema, ChangelogMode) to limit the kinds of changes (e.g. for upsert mode).

        Note that the type system of the table ecosystem is richer than the one of the DataStream API. The table runtime will make sure to properly serialize the output records to the first operator of the DataStream API. Afterwards, the Types semantics of the DataStream API need to be considered.

        If the input table contains a single rowtime column, it will be propagated into a stream record's timestamp. Watermarks will be propagated as well.
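
        A minimal usage sketch, assuming an updating table produced by an aggregation (the Orders table and column names are illustrative):

             // an updating table, e.g. a count per user
             Table counts = tableEnv.sqlQuery(
                 "SELECT user_id, COUNT(*) AS cnt FROM Orders GROUP BY user_id");

             // every emitted Row carries a RowKind flag such as +I, -U, +U, or -D
             DataStream<Row> changelog = tableEnv.toChangelogStream(counts);
             changelog.print();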

        Specified by:
        toChangelogStream in interface StreamTableEnvironment
        Parameters:
        table - The Table to convert. It can be updating or insert-only.
        Returns:
        The converted changelog stream of Row.
      • toChangelogStream

        public DataStream<Row> toChangelogStream​(Table table,
                                                 Schema targetSchema)
        Description copied from interface: StreamTableEnvironment
        Converts the given Table into a DataStream of changelog entries.

        Compared to StreamTableEnvironment.toDataStream(Table), this method produces instances of Row and sets the RowKind flag that is contained in every record during runtime. The runtime behavior is similar to that of a DynamicTableSink.

        This method can emit a changelog containing all kinds of changes (enumerated in RowKind) that the given updating table requires as the default ChangelogMode. Use StreamTableEnvironment.toChangelogStream(Table, Schema, ChangelogMode) to limit the kinds of changes (e.g. for upsert mode).

        The given Schema is used to configure the table runtime to convert columns and internal data structures to the desired representation. The following example shows how to convert a table column into a POJO type.

             // given a Table of (id BIGINT, payload ROW < name STRING , age INT >)
        
             public static class MyPojo {
                 public String name;
                 public Integer age;
        
                 // default constructor for DataStream API
                 public MyPojo() {}
        
                 // fully assigning constructor for field order in Table API
                 public MyPojo(String name, Integer age) {
                     this.name = name;
                     this.age = age;
                 }
             }
        
             tableEnv.toChangelogStream(
                 table,
                 Schema.newBuilder()
                     .column("id", DataTypes.BIGINT())
                     .column("payload", DataTypes.of(MyPojo.class)) // force an implicit conversion
                     .build());
         

        Note that the type system of the table ecosystem is richer than the one of the DataStream API. The table runtime will make sure to properly serialize the output records to the first operator of the DataStream API. Afterwards, the Types semantics of the DataStream API need to be considered.

        If the input table contains a single rowtime column, it will be propagated into a stream record's timestamp. Watermarks will be propagated as well.

        If the rowtime should not be a concrete field in the final Row anymore, or the schema should be symmetrical for both StreamTableEnvironment.fromChangelogStream(DataStream) and StreamTableEnvironment.toChangelogStream(Table), the rowtime can also be declared as a metadata column that will be propagated into a stream record's timestamp. It is possible to declare a schema without physical/regular columns. In this case, those columns will be automatically derived and implicitly put at the beginning of the schema declaration.

        The following examples illustrate common schema declarations and their semantics:

             // given a Table of (id INT, name STRING, my_rowtime TIMESTAMP_LTZ(3))
        
             // === EXAMPLE 1 ===
        
             // no physical columns defined, they will be derived automatically,
             // the last derived physical column will be skipped in favor of the metadata column
        
             Schema.newBuilder()
                 .columnByMetadata("rowtime", "TIMESTAMP_LTZ(3)")
                 .build()
        
             // equal to: CREATE TABLE (id INT, name STRING, rowtime TIMESTAMP_LTZ(3) METADATA)
        
             // === EXAMPLE 2 ===
        
             // physical columns defined, all columns must be defined
        
             Schema.newBuilder()
                 .column("id", "INT")
                 .column("name", "STRING")
                 .columnByMetadata("rowtime", "TIMESTAMP_LTZ(3)")
                 .build()
        
             // equal to: CREATE TABLE (id INT, name STRING, rowtime TIMESTAMP_LTZ(3) METADATA)
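
        A corresponding call for EXAMPLE 1 could look like the following sketch:

             DataStream<Row> stream =
                 tableEnv.toChangelogStream(
                     table,
                     Schema.newBuilder()
                         .columnByMetadata("rowtime", "TIMESTAMP_LTZ(3)")
                         .build());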
         
        Specified by:
        toChangelogStream in interface StreamTableEnvironment
        Parameters:
        table - The Table to convert. It can be updating or insert-only.
        targetSchema - The Schema that decides about the final external representation in DataStream records.
        Returns:
        The converted changelog stream of Row.
      • toChangelogStream

        public DataStream<Row> toChangelogStream​(Table table,
                                                 Schema targetSchema,
                                                 ChangelogMode changelogMode)
        Description copied from interface: StreamTableEnvironment
        Converts the given Table into a DataStream of changelog entries.

        Compared to StreamTableEnvironment.toDataStream(Table), this method produces instances of Row and sets the RowKind flag that is contained in every record during runtime. The runtime behavior is similar to that of a DynamicTableSink.

        This method requires an explicitly declared ChangelogMode. For example, use ChangelogMode.upsert() if the stream will not contain RowKind.UPDATE_BEFORE, or ChangelogMode.insertOnly() for non-updating streams.

        Note that the type system of the table ecosystem is richer than the one of the DataStream API. The table runtime will make sure to properly serialize the output records to the first operator of the DataStream API. Afterwards, the Types semantics of the DataStream API need to be considered.

        If the input table contains a single rowtime column, it will be propagated into a stream record's timestamp. Watermarks will be propagated as well. However, it is also possible to write out the rowtime as a metadata column. See StreamTableEnvironment.toChangelogStream(Table, Schema) for more information and examples on how to declare a Schema.
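
        A minimal usage sketch for upsert mode (the column names and primary key are illustrative; in upsert mode the resulting changelog contains no RowKind.UPDATE_BEFORE messages):

             DataStream<Row> stream =
                 tableEnv.toChangelogStream(
                     table,
                     Schema.newBuilder()
                         .column("user_id", "BIGINT")
                         .column("cnt", "BIGINT")
                         .primaryKey("user_id")
                         .build(),
                     ChangelogMode.upsert());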

        Specified by:
        toChangelogStream in interface StreamTableEnvironment
        Parameters:
        table - The Table to convert. It can be updating or insert-only.
        targetSchema - The Schema that decides about the final external representation in DataStream records.
        changelogMode - The required kinds of changes in the result changelog. An exception will be thrown if the given updating table cannot be represented in this changelog mode.
        Returns:
        The converted changelog stream of Row.
      • fromDataStream

        public <T> Table fromDataStream​(DataStream<T> dataStream,
                                        Expression... fields)
        Description copied from interface: StreamTableEnvironment
        Converts the given DataStream into a Table with specified field names.

        There are two modes for mapping original fields to the fields of the Table:

        1. Reference input fields by name: All fields in the schema definition are referenced by name (and possibly renamed using an alias (as)). Moreover, we can define proctime and rowtime attributes at arbitrary positions using arbitrary names (except those that exist in the result schema). In this mode, fields can be reordered and projected out. This mode can be used for any input type, including POJOs.

        Example:

        
         DataStream<Tuple2<String, Long>> stream = ...
         Table table = tableEnv.fromDataStream(
            stream,
            $("f1"), // reorder and use the original field
            $("rowtime").rowtime(), // extract the internally attached timestamp into an event-time
                                    // attribute named 'rowtime'
            $("f0").as("name") // reorder and give the original field a better name
         );
         

        2. Reference input fields by position: In this mode, fields are simply renamed. Event-time attributes can replace the field on their position in the input data (if it is of correct type) or be appended at the end. Proctime attributes must be appended at the end. This mode can only be used if the input type has a defined field order (tuple, case class, Row) and none of the fields references a field of the input type.

        Example:

        
         DataStream<Tuple2<String, Long>> stream = ...
         Table table = tableEnv.fromDataStream(
            stream,
            $("a"), // rename the first field to 'a'
            $("b"), // rename the second field to 'b'
            $("rowtime").rowtime() // extract the internally attached timestamp into an event-time
                                   // attribute named 'rowtime'
         );
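
        In both modes, the $(...) expressions used above come from the Expressions class and are typically imported statically:

          import static org.apache.flink.table.api.Expressions.$;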
         
        Specified by:
        fromDataStream in interface StreamTableEnvironment
        Type Parameters:
        T - The type of the DataStream.
        Parameters:
        dataStream - The DataStream to be converted.
        fields - The fields expressions to map original fields of the DataStream to the fields of the Table.
        Returns:
        The converted Table.
      • registerDataStream

        public <T> void registerDataStream​(String name,
                                           DataStream<T> dataStream)
        Description copied from interface: StreamTableEnvironment
        Creates a view from the given DataStream. Registered views can be referenced in SQL queries.

        The field names of the Table are automatically derived from the type of the DataStream.

        The view is registered in the namespace of the current catalog and database. To register the view in a different catalog use StreamTableEnvironment.createTemporaryView(String, DataStream).

        Temporary objects can shadow permanent ones. If a permanent object in a given path exists, it will be inaccessible in the current session. To make the permanent object available again you can drop the corresponding temporary object.
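
        A minimal usage sketch (the view name, stream type, and query are illustrative):

             DataStream<Tuple2<String, Long>> stream = ...;

             // register the stream as a view; Tuple2 fields become columns f0 and f1
             tableEnv.registerDataStream("Clicks", stream);
             Table result = tableEnv.sqlQuery("SELECT f0, f1 FROM Clicks WHERE f1 > 10");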

        Specified by:
        registerDataStream in interface StreamTableEnvironment
        Type Parameters:
        T - The type of the DataStream to register.
        Parameters:
        name - The name under which the DataStream is registered in the catalog.
        dataStream - The DataStream to register.
      • createTemporaryView

        public <T> void createTemporaryView​(String path,
                                            DataStream<T> dataStream,
                                            Expression... fields)
        Description copied from interface: StreamTableEnvironment
        Creates a view from the given DataStream in a given path with specified field names. Registered views can be referenced in SQL queries.

        There are two modes for mapping original fields to the fields of the View:

        1. Reference input fields by name: All fields in the schema definition are referenced by name (and possibly renamed using an alias (as)). Moreover, we can define proctime and rowtime attributes at arbitrary positions using arbitrary names (except those that exist in the result schema). In this mode, fields can be reordered and projected out. This mode can be used for any input type, including POJOs.

        Example:

        
         DataStream<Tuple2<String, Long>> stream = ...
         tableEnv.createTemporaryView(
            "cat.db.myTable",
            stream,
            $("f1"), // reorder and use the original field
            $("rowtime").rowtime(), // extract the internally attached timestamp into an event-time
                                    // attribute named 'rowtime'
            $("f0").as("name") // reorder and give the original field a better name
         );
         

        2. Reference input fields by position: In this mode, fields are simply renamed. Event-time attributes can replace the field on their position in the input data (if it is of correct type) or be appended at the end. Proctime attributes must be appended at the end. This mode can only be used if the input type has a defined field order (tuple, case class, Row) and none of the fields references a field of the input type.

        Example:

        
         DataStream<Tuple2<String, Long>> stream = ...
         tableEnv.createTemporaryView(
            "cat.db.myTable",
            stream,
            $("a"), // rename the first field to 'a'
            $("b"), // rename the second field to 'b'
            $("rowtime").rowtime() // adds an event-time attribute named 'rowtime'
         );
         

        Temporary objects can shadow permanent ones. If a permanent object in a given path exists, it will be inaccessible in the current session. To make the permanent object available again you can drop the corresponding temporary object.
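
        For example, dropping the temporary view created above would make a permanent object registered under the same path accessible again:

             tableEnv.dropTemporaryView("cat.db.myTable");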

        Specified by:
        createTemporaryView in interface StreamTableEnvironment
        Type Parameters:
        T - The type of the DataStream.
        Parameters:
        path - The path under which the DataStream is created. See also the TableEnvironment class description for the format of the path.
        dataStream - The DataStream out of which to create the view.
        fields - The fields expressions to map original fields of the DataStream to the fields of the View.