Format: Serialization Schema Format: Deserialization Schema
The Apache Orc format allows reading and writing Orc data.
To set up the Orc format, the following table provides dependency information both for projects using a build automation tool (such as Maven or SBT) and for the SQL Client with SQL JAR bundles.
Maven dependency | SQL Client JAR |
---|---|
flink-orc_2.11 | Download |
Here is an example of creating a table using the Filesystem connector and the Orc format.
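A minimal sketch of such a table definition follows. The table name, column names, partition column, and path are hypothetical and only serve as illustration; the `'connector'` and `'format'` values are the parts that select the Filesystem connector and the Orc format.

```sql
-- Hypothetical table: names, columns, and path are placeholders.
CREATE TABLE user_behavior (
  user_id BIGINT,
  item_id BIGINT,
  category_id BIGINT,
  behavior STRING,
  ts TIMESTAMP(3),
  dt STRING
) PARTITIONED BY (dt) WITH (
  'connector' = 'filesystem',     -- use the Filesystem connector
  'path' = '/tmp/user_behavior',  -- placeholder location of the Orc files
  'format' = 'orc'                -- read and write the data as Orc
);
```

All columns in this sketch use types that appear in the type mapping table further down.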
Option | Required | Default | Type | Description |
---|---|---|---|---|
format | required | (none) | String | Specify which format to use; here it should be 'orc'. |
The Orc format also supports the table properties defined by Orc (see Table properties).
For example, you can configure orc.compress=SNAPPY to enable Snappy compression.
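As a sketch, and assuming that Orc table properties are passed as additional keys in the `WITH` clause of the table definition, enabling Snappy compression might look like this. The table name, columns, and path are hypothetical.

```sql
-- Hypothetical table: names, columns, and path are placeholders.
CREATE TABLE user_behavior_snappy (
  user_id BIGINT,
  behavior STRING
) WITH (
  'connector' = 'filesystem',
  'path' = '/tmp/user_behavior_snappy',
  'format' = 'orc',
  'orc.compress' = 'SNAPPY'  -- Orc table property, assumed to be forwarded to the Orc writer
);
```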
The Orc format type mapping is compatible with Apache Hive. The following table lists the type mapping from Flink types to Orc types.
Flink Data Type | Orc physical type | Orc logical type |
---|---|---|
CHAR | bytes | CHAR |
VARCHAR | bytes | VARCHAR |
STRING | bytes | STRING |
BOOLEAN | long | BOOLEAN |
BYTES | bytes | BINARY |
DECIMAL | decimal | DECIMAL |
TINYINT | long | BYTE |
SMALLINT | long | SHORT |
INT | long | INT |
BIGINT | long | LONG |
FLOAT | double | FLOAT |
DOUBLE | double | DOUBLE |
DATE | long | DATE |
TIMESTAMP | timestamp | TIMESTAMP |
Attention: The composite data types ARRAY, MAP, and ROW are not supported.