Spark2 #
This documentation is a guide for using Table Store in Spark2.
Version #
Table Store supports Spark 2.4 and later versions.
Preparing Table Store Jar File #
You are using an unreleased version of Table Store, so you need to manually build the bundled jar from the source code. To build from source, either download the source of a release or clone the git repository.
Build the bundled jar with the following command:
mvn clean install -DskipTests
You can find the bundled jar in ./flink-table-store-spark/flink-table-store-spark-2/target/flink-table-store-spark-2-0.4-SNAPSHOT.jar.
Quick Start #
If you are using HDFS, make sure that the environment variable HADOOP_HOME or HADOOP_CONF_DIR is set.
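If you want a quick sanity check, the following spark-shell snippet (purely illustrative, not required by Table Store) prints whether either variable is visible to the JVM:

// Illustrative check: print the Hadoop-related environment variables as seen by the JVM.
Seq("HADOOP_HOME", "HADOOP_CONF_DIR").foreach { v =>
  println(s"$v=${sys.env.getOrElse(v, "<not set>")}")
}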
Step 1: Prepare Test Data
Table Store currently only supports reading tables through Spark2. To create a Table Store table with records, please follow our Flink quick start guide.
After the guide, all table files should be stored under the path /tmp/table_store, or the warehouse path you’ve specified.
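As a quick check, you can list the table directory from spark-shell to confirm the files exist. This sketch assumes the default local warehouse path and the word_count table from the quick start guide:

import java.io.File

// List the files of the word_count table created by the Flink quick start guide.
// Adjust the path if you specified a different warehouse location.
new File("/tmp/table_store/default.db/word_count")
  .listFiles()
  .foreach(f => println(f.getName))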
Step 2: Specify Table Store Jar File
You can append the path to the Table Store jar file to the --jars argument when starting spark-shell.
spark-shell ... --jars /path/to/flink-table-store-spark-2-0.4-SNAPSHOT.jar
Alternatively, you can copy flink-table-store-spark-2-0.4-SNAPSHOT.jar under spark/jars in your Spark installation directory.
Step 3: Query Table
Table Store with Spark 2.4 does not support DDL. You can use the Dataset reader and register the Dataset as a temporary table. In spark-shell:
val dataset = spark.read.format("tablestore").load("file:/tmp/table_store/default.db/word_count")
dataset.createOrReplaceTempView("word_count")
spark.sql("SELECT * FROM word_count").show()
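The registered view supports any Spark SQL query. The same data can also be queried directly through the Dataset API; the column names word and cnt below are an assumption based on the word_count table from the Flink quick start guide, so adjust them to your table’s schema:

// Query through the Dataset API instead of SQL.
// Assumes the quick start schema: word (string) and cnt (bigint).
import org.apache.spark.sql.functions.col

dataset
  .select(col("word"), col("cnt"))
  .filter(col("cnt") > 1)
  .show()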