S3
Build
To build from source code, either download the source of a release or clone the git repository.
Build the shaded jar with the following command:
mvn clean install -DskipTests
You can find the shaded jar at ./flink-table-store-filesystems/flink-table-store-s3/target/flink-table-store-s3-0.4-SNAPSHOT.jar.
Usage
Put flink-table-store-s3-0.4-SNAPSHOT.jar into the lib directory of your Flink home, and create a catalog:
CREATE CATALOG my_catalog WITH (
'type' = 'table-store',
'warehouse' = 's3://path/to/warehouse',
's3.endpoint' = 'your-endpoint-hostname',
's3.access-key' = 'xxx',
's3.secret-key' = 'yyy'
);
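Once the catalog exists, tables created in it are stored under the S3 warehouse path. The following is a minimal sketch of using the catalog from the Flink SQL client; the table name word_count and its columns are hypothetical and only serve as an illustration.

```sql
-- Switch to the S3-backed catalog created above.
USE CATALOG my_catalog;

-- Hypothetical table; its files are written under the 'warehouse' path on S3.
CREATE TABLE word_count (
    word STRING PRIMARY KEY NOT ENFORCED,
    cnt BIGINT
);

-- Write a few rows, then read them back to confirm S3 access works.
INSERT INTO word_count VALUES ('hello', 1), ('world', 2);

SELECT * FROM word_count;
```

If the INSERT or SELECT fails with an access or connection error, the endpoint and key options of the catalog are the first things to check.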
Place flink-table-store-s3-0.4-SNAPSHOT.jar together with flink-table-store-spark-0.4-SNAPSHOT.jar under Spark's jars directory, and start spark-sql as follows:
spark-sql \
--conf spark.sql.catalog.tablestore=org.apache.flink.table.store.spark.SparkCatalog \
--conf spark.sql.catalog.tablestore.warehouse=s3://<bucket>/<path> \
--conf spark.sql.catalog.tablestore.s3.endpoint=your-endpoint-hostname \
--conf spark.sql.catalog.tablestore.s3.access-key=xxx \
--conf spark.sql.catalog.tablestore.s3.secret-key=yyy
NOTE: You need to ensure that the Hive metastore can access S3.
Place flink-table-store-s3-0.4-SNAPSHOT.jar together with flink-table-store-hive-connector-0.4-SNAPSHOT.jar under Hive's auxlib directory, and set the S3 options in your Hive session:
SET tablestore.s3.endpoint=your-endpoint-hostname;
SET tablestore.s3.access-key=xxx;
SET tablestore.s3.secret-key=yyy;
Then read the table from the Hive metastore. The table can be created by Flink or Spark; see Catalog with Hive Metastore.
SELECT * FROM test_table;
SELECT COUNT(1) FROM test_table;
Place flink-table-store-s3-0.4-SNAPSHOT.jar together with flink-table-store-trino-0.4-SNAPSHOT.jar under the plugin/tablestore directory of your Trino installation.
Add the following options to etc/catalog/tablestore.properties:
s3.endpoint=your-endpoint-hostname
s3.access-key=xxx
s3.secret-key=yyy
S3 Compliant Object Stores
The S3 filesystem also supports S3-compliant object stores such as IBM's Cloud Object Storage and MinIO. To use one, set the endpoint to the address of your object store service.
s3.endpoint: your-endpoint-hostname
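As an illustration, a Flink catalog backed by a MinIO deployment might be declared as in the sketch below. The catalog name, bucket name, and endpoint address are assumptions (a MinIO server reachable on its default API port 9000 on the local machine); substitute the values for your own object store.

```sql
-- Sketch only: minio_catalog, my-bucket and the endpoint address are hypothetical.
CREATE CATALOG minio_catalog WITH (
    'type' = 'table-store',
    'warehouse' = 's3://my-bucket/warehouse',
    -- Point the S3 filesystem at the S3-compliant service instead of AWS.
    's3.endpoint' = 'http://localhost:9000',
    's3.access-key' = 'xxx',
    's3.secret-key' = 'yyy'
);
```

The same three s3.* options apply to the Spark, Hive, and Trino setups described under Usage, using the prefix that each engine expects.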