This documentation is for an unreleased version of Apache Flink Table Store. We recommend you use the latest stable version.

Creating Catalogs #

Table Store catalogs currently support two types of metastores:

  • filesystem metastore (default), which stores both metadata and table files in filesystems.
  • hive metastore, which additionally stores metadata in the Hive metastore, so users can access the tables directly from Hive.

See CatalogOptions for detailed options when creating a catalog.

Creating a Catalog with Filesystem Metastore #

The following Flink SQL registers and uses a Table Store catalog named my_catalog. Metadata and table files are stored under hdfs://path/to/warehouse.

CREATE CATALOG my_catalog WITH (
    'type' = 'table-store',
    'warehouse' = 'hdfs://path/to/warehouse'
);

USE CATALOG my_catalog;
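
A catalog registered through SQL typically has to be registered again in each new SQL client session, so it can help to keep the statements in a file. The following is a minimal sketch rather than an official workflow: the function name write_catalog_init_sql and the file name are hypothetical, and it assumes a Flink SQL client that accepts an initialization file through the -i option.

```shell
# Write the catalog DDL shown above to a file so it can be replayed later.
# write_catalog_init_sql is an illustrative helper, not part of Table Store.
write_catalog_init_sql() {
    warehouse="$1"   # warehouse path, e.g. hdfs://path/to/warehouse
    out="$2"         # file that receives the SQL statements
    cat > "$out" <<EOF
CREATE CATALOG my_catalog WITH (
    'type' = 'table-store',
    'warehouse' = '$warehouse'
);

USE CATALOG my_catalog;
EOF
}

# Example (assumes the SQL client supports -i for an initialization file):
#   write_catalog_init_sql 'hdfs://path/to/warehouse' ./init-catalog.sql
#   ./bin/sql-client.sh -i ./init-catalog.sql
```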

For Spark3, the following shell command registers a Table Store catalog named tablestore. Metadata and table files are stored under hdfs://path/to/warehouse.

spark-sql ... \
    --conf spark.sql.catalog.tablestore=org.apache.flink.table.store.spark.SparkCatalog \
    --conf spark.sql.catalog.tablestore.warehouse=hdfs://path/to/warehouse

After spark-sql is started, you can switch to the default database of the tablestore catalog with the following SQL.

USE tablestore.default;

Creating a Catalog with Hive Metastore #

With the Table Store Hive catalog, changes to the catalog directly affect the corresponding Hive metastore. Tables created in such a catalog can also be accessed directly from Hive.

Preparing Table Store Hive Catalog Jar File #

You are using an unreleased version of Table Store, so you need to build the bundled jar manually from the source code.

To build from source code, either download the source of a release or clone the git repository.

Build the bundled jar with the command that matches your Hive version.

Version             Command
Hive 3.1            mvn clean install -Dmaven.test.skip=true -Phive-3.1
Hive 2.3            mvn clean install -Dmaven.test.skip=true
Hive 2.2            mvn clean install -Dmaven.test.skip=true -Phive-2.2
Hive 2.1            mvn clean install -Dmaven.test.skip=true -Phive-2.1
Hive 2.1 CDH 6.3    mvn clean install -Dmaven.test.skip=true -Phive-2.1-cdh-6.3

You can find the Hive catalog jar in ./flink-table-store-hive/flink-table-store-hive-catalog/target/flink-table-store-hive-catalog-0.4-SNAPSHOT.jar.
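
The mapping from Hive version to build command can be captured in a small helper, which may be convenient in build scripts. This is only a sketch: the function name hive_build_command and the version labels it accepts are hypothetical, while the commands themselves are the ones listed above.

```shell
# Print the Maven build command for a given Hive version label.
# hive_build_command and its labels are illustrative, not part of Table Store.
hive_build_command() {
    base="mvn clean install -Dmaven.test.skip=true"
    case "$1" in
        3.1)         echo "$base -Phive-3.1" ;;
        2.3)         echo "$base" ;;   # Hive 2.3 needs no extra profile
        2.2)         echo "$base -Phive-2.2" ;;
        2.1)         echo "$base -Phive-2.1" ;;
        2.1-cdh-6.3) echo "$base -Phive-2.1-cdh-6.3" ;;
        *)           echo "unsupported Hive version: $1" >&2; return 1 ;;
    esac
}

# Example: print the command for Hive 3.1, then run it from the source root.
#   hive_build_command 3.1
```

Printing the command instead of running it keeps the helper safe to source, and an unknown version fails loudly rather than silently falling back to the default profile.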

Registering Hive Catalog #

To enable Table Store Hive catalog support in Flink, you can pick one of the following two methods.

  • Copy the Table Store Hive catalog jar file into the lib directory of your Flink installation. Note that this must be done before starting your Flink cluster.
  • If you’re using Flink’s SQL client, append --jar /path/to/flink-table-store-hive-catalog-0.4-SNAPSHOT.jar to the command that starts the SQL client.
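
The first method can be sketched as a small shell helper. The function name install_catalog_jar is hypothetical, and both arguments (the jar path and the Flink installation directory) are assumptions about the local setup.

```shell
# Copy the catalog jar into the lib directory of a Flink installation.
# install_catalog_jar is an illustrative helper, not part of Flink or Table Store.
install_catalog_jar() {
    jar="$1"          # path to the Hive catalog jar built earlier
    flink_home="$2"   # root of the Flink installation
    if [ ! -f "$jar" ]; then
        echo "jar not found: $jar" >&2
        return 1
    fi
    if [ ! -d "$flink_home/lib" ]; then
        echo "no lib directory under: $flink_home" >&2
        return 1
    fi
    cp "$jar" "$flink_home/lib/"
}

# Example (run before starting the Flink cluster):
#   install_catalog_jar /path/to/flink-table-store-hive-catalog-0.4-SNAPSHOT.jar "$FLINK_HOME"
```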

The following Flink SQL registers and uses a Table Store Hive catalog named my_hive. Metadata and table files are stored under hdfs://path/to/warehouse, and metadata is additionally stored in Hive metastore.

CREATE CATALOG my_hive WITH (
    'type' = 'table-store',
    'metastore' = 'hive',
    'uri' = 'thrift://<hive-metastore-host-name>:<port>',
    'warehouse' = 'hdfs://path/to/warehouse'
);

USE CATALOG my_hive;

To enable Table Store Hive catalog support in Spark3, append the path of the Table Store Hive catalog jar file to the --jars argument when starting Spark.

The following shell command registers a Table Store Hive catalog named tablestore. Metadata and table files are stored under hdfs://path/to/warehouse, and metadata is additionally stored in Hive metastore.

spark-sql ... \
    --conf spark.sql.catalog.tablestore=org.apache.flink.table.store.spark.SparkCatalog \
    --conf spark.sql.catalog.tablestore.warehouse=hdfs://path/to/warehouse \
    --conf spark.sql.catalog.tablestore.metastore=hive \
    --conf spark.sql.catalog.tablestore.uri=thrift://<hive-metastore-host-name>:<port>

After spark-sql is started, you can switch to the default database of the tablestore catalog with the following SQL.

USE tablestore.default;
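
The Spark options for the filesystem and Hive variants differ only in the metastore and uri settings, so they can be generated by one helper. The sketch below is illustrative: the function name tablestore_spark_conf is hypothetical, and it only prints the options shown on this page rather than launching Spark.

```shell
# Print the spark-sql options for a Table Store catalog, one per line.
# Arguments: catalog name, warehouse path, optional Hive metastore URI.
# tablestore_spark_conf is an illustrative helper, not part of Table Store.
tablestore_spark_conf() {
    name="$1"
    warehouse="$2"
    uri="${3:-}"
    prefix="spark.sql.catalog.$name"
    echo "--conf $prefix=org.apache.flink.table.store.spark.SparkCatalog"
    echo "--conf $prefix.warehouse=$warehouse"
    if [ -n "$uri" ]; then
        # A URI selects the Hive metastore in addition to the warehouse.
        echo "--conf $prefix.metastore=hive"
        echo "--conf $prefix.uri=$uri"
    fi
}

# Example (values must not contain spaces, since the output is word-split):
#   spark-sql --jars /path/to/flink-table-store-hive-catalog-0.4-SNAPSHOT.jar \
#       $(tablestore_spark_conf tablestore hdfs://path/to/warehouse 'thrift://<hive-metastore-host-name>:<port>')
```

Omitting the third argument yields the two options used for the filesystem metastore.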