S3 #

Download #

Download the Flink Table Store shaded jar for Spark, Hive and Trino.

Usage #

For Flink, prepare the S3 jar, then configure flink-conf.yaml like

s3.endpoint: your-endpoint-hostname
s3.access-key: xxx
s3.secret-key: yyy
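
With the configuration above in place, a Table Store catalog backed by S3 can be created from the Flink SQL client. The catalog and bucket names below are placeholders; 'type' = 'table-store' is the Flink Table Store catalog type:

```sql
-- my_catalog and my-bucket are example names
CREATE CATALOG my_catalog WITH (
  'type' = 'table-store',
  'warehouse' = 's3://my-bucket/warehouse'
);

USE CATALOG my_catalog;
```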

For Spark, place flink-table-store-s3-0.3.0.jar together with flink-table-store-spark-0.3.0.jar under Spark’s jars directory, then start spark-sql like

spark-sql \
  --conf spark.sql.catalog.tablestore=org.apache.flink.table.store.spark.SparkCatalog \
  --conf spark.sql.catalog.tablestore.warehouse=s3://<bucket>/<endpoint> \
  --conf spark.sql.catalog.tablestore.s3.endpoint=your-endpoint-hostname \
  --conf spark.sql.catalog.tablestore.s3.access-key=xxx \
  --conf spark.sql.catalog.tablestore.s3.secret-key=yyy
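
Once spark-sql starts, the tablestore catalog configured above can be used like any other Spark catalog. The database and table names here are illustrative:

```sql
-- switch to the catalog registered via spark.sql.catalog.tablestore
USE tablestore;
USE default;
SELECT * FROM test_table;
```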

NOTE: You need to ensure that the Hive metastore can access S3.

For Hive, place flink-table-store-s3-0.3.0.jar together with flink-table-store-hive-connector-0.3.0.jar under Hive’s auxlib directory, and start like

SET tablestore.s3.endpoint=your-endpoint-hostname;
SET tablestore.s3.access-key=xxx;
SET tablestore.s3.secret-key=yyy;

Then read tables from the Hive metastore; the tables can be created by Flink or Spark, see Catalog with Hive Metastore.

SELECT * FROM test_table;
SELECT COUNT(1) FROM test_table;

For Trino, place flink-table-store-s3-0.3.0.jar together with flink-table-store-trino-0.3.0.jar under the plugin/tablestore directory.

Add the following options to etc/catalog/tablestore.properties.

s3.endpoint=your-endpoint-hostname
s3.access-key=xxx
s3.secret-key=yyy
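
After restarting Trino, tables in the S3 warehouse can be queried through the configured catalog. The schema and table names below are examples:

```sql
-- tablestore is the catalog name from etc/catalog/tablestore.properties
SELECT * FROM tablestore.default.test_table;
```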

S3 Compliant Object Stores #

The S3 Filesystem also supports using S3 compliant object stores such as IBM’s Cloud Object Storage and MinIO. Simply configure your endpoint to point to the provider of the object store service.

s3.endpoint: your-endpoint-hostname
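
For example, a local MinIO instance is typically reached over plain HTTP and often needs path-style access enabled. This is a sketch; the s3.path.style.access option and the endpoint value are assumptions to verify against your S3 filesystem plugin and deployment:

```yaml
s3.endpoint: http://localhost:9000
s3.path.style.access: true   # commonly required by MinIO; verify for your setup
s3.access-key: xxx
s3.secret-key: yyy
```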