This documentation is for an unreleased version of Apache Flink Table Store. We recommend you use the latest stable version.
Overview #
File Systems for Unified Engine #
Apache Flink Table Store uses the same pluggable file systems as Apache Flink. Users can follow the standard plugin mechanism to configure the plugin structure when using Flink as the compute engine. However, for other engines such as Spark or Hive, the opt jars provided by Flink may cause class conflicts and cannot be used directly. Since it is inconvenient for users to resolve such conflicts themselves, Flink Table Store provides self-contained, engine-unified FileSystem pluggable jars so that users can query tables from the Spark/Hive side.
Supported FileSystem #
| FileSystem | URI Scheme | Pluggable | Description |
|---|---|---|---|
| Local File System | file:// | N | Built-in Support |
| Aliyun OSS | oss:// | Y | Tested on Spark 3.3 and Hive 3.1 |
Build #
After building Flink Table Store from the source code, you can find the shaded jars under `./flink-table-store-filesystem/flink-table-store-filesystem-${filesystem}/target/flink-table-store-filesystem-${filesystem}-0.3-SNAPSHOT.jar`.
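The build itself is a standard Maven build; a minimal sketch, assuming a checkout of the source repository and a local Maven installation:

```shell
# Build all modules from the repository root, skipping tests for speed
mvn clean install -DskipTests
```

After the build finishes, the shaded file system jars appear under the target directories named above.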
Common Configurations #
After building, users need to pick the required file system jar and configure the required file system parameters by adding the command/configuration prefix tablestore. (the exact form of the prefix varies per engine, as shown below).
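To make the prefix convention concrete, here is a small hypothetical sketch (not Table Store's actual implementation): keys carrying the tablestore. prefix are collected and stripped down to the raw file system options before being handed to the underlying file system client. Note that Spark wraps this prefix further, e.g. spark.sql.catalog.tablestore.fs.oss.endpoint.

```python
def strip_tablestore_prefix(conf: dict) -> dict:
    """Collect options carrying the 'tablestore.' prefix and return
    them with the prefix removed, ready to hand to the underlying
    file system. Hypothetical sketch, not the real implementation."""
    prefix = "tablestore."
    return {
        key[len(prefix):]: value
        for key, value in conf.items()
        if key.startswith(prefix)
    }

# Example: session variables as they would be SET on the Hive side
hive_conf = {
    "tablestore.fs.oss.endpoint": "oss-cn-hangzhou.aliyuncs.com",
    "tablestore.fs.oss.accessKey": "xxx",
    "hive.execution.engine": "mr",  # unrelated key, filtered out
}
print(strip_tablestore_prefix(hive_conf))
# {'fs.oss.endpoint': 'oss-cn-hangzhou.aliyuncs.com', 'fs.oss.accessKey': 'xxx'}
```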
For example, suppose users want to set up a Flink job that uses OSS as the underlying file system, and also want to read the tables from the Spark/Hive side.
- On the Flink side, configure flink-conf.yaml like:

  ```yaml
  fs.oss.endpoint: oss-cn-hangzhou.aliyuncs.com
  fs.oss.accessKey: xxx
  fs.oss.accessSecret: yyy
  ```
- On the Spark side, place flink-table-store-filesystem-oss-0.3-SNAPSHOT.jar together with flink-table-store-spark-0.3-SNAPSHOT.jar under Spark's jars directory, and start like:

  Spark Shell

  ```bash
  spark-shell \
    --conf spark.datasource.tablestore.fs.oss.endpoint=oss-cn-hangzhou.aliyuncs.com \
    --conf spark.datasource.tablestore.fs.oss.accessKey=xxx \
    --conf spark.datasource.tablestore.fs.oss.accessSecret=yyy
  ```

  Spark SQL

  ```bash
  spark-sql \
    --conf spark.sql.catalog.tablestore=org.apache.flink.table.store.spark.SparkCatalog \
    --conf spark.sql.catalog.tablestore.warehouse=oss://<bucket-name>/ \
    --conf spark.sql.catalog.tablestore.fs.oss.endpoint=oss-cn-hangzhou.aliyuncs.com \
    --conf spark.sql.catalog.tablestore.fs.oss.accessKey=xxx \
    --conf spark.sql.catalog.tablestore.fs.oss.accessSecret=yyy
  ```
- On the Hive side, place flink-table-store-filesystem-oss-0.3-SNAPSHOT.jar together with flink-table-store-hive-connector-0.3-SNAPSHOT.jar under Hive's auxlib directory, and start like:

  Hive Catalog

  ```sql
  SET tablestore.fs.oss.endpoint=oss-cn-hangzhou.aliyuncs.com;
  SET tablestore.fs.oss.accessKey=xxx;
  SET tablestore.fs.oss.accessSecret=yyy;

  CREATE EXTERNAL TABLE external_test_table
  STORED BY 'org.apache.flink.table.store.hive.TableStoreHiveStorageHandler'
  LOCATION 'oss://<bucket-name>/<db-name>.db/<table-name>';
  ```
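With the jars and session variables in place, the external table can then be queried from Hive as usual; for example, using the table created above:

```sql
SELECT * FROM external_test_table;
```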