pyflink.table.table_environment.StreamTableEnvironment.from_pandas#
- StreamTableEnvironment.from_pandas(pdf, schema: Optional[Union[pyflink.table.types.RowType, List[str], Tuple[str], List[pyflink.table.types.DataType], Tuple[pyflink.table.types.DataType]]] = None, splits_num: int = 1) pyflink.table.table.Table #
Creates a table from a pandas DataFrame.
Example:
>>> pdf = pd.DataFrame(np.random.rand(1000, 2)) # use the second parameter to specify custom field names >>> table_env.from_pandas(pdf, ["a", "b"]) # use the second parameter to specify custom field types >>> table_env.from_pandas(pdf, [DataTypes.DOUBLE(), DataTypes.DOUBLE()])) # use the second parameter to specify custom table schema >>> table_env.from_pandas(pdf, ... DataTypes.ROW([DataTypes.FIELD("a", DataTypes.DOUBLE()), ... DataTypes.FIELD("b", DataTypes.DOUBLE())]))
- Parameters
pdf – The pandas DataFrame.
schema – The schema of the converted table.
splits_num – The number of splits the given Pandas DataFrame will be split into. It determines the number of parallel source tasks. If not specified, the default parallelism will be used.
- Returns
The result table.
New in version 1.11.0.