JSON Format

Format: Serialization Schema
Format: Deserialization Schema

The JSON format lets you read and write JSON data based on a JSON schema. Currently, the JSON schema is derived from the table schema.

The JSON format supports append-only streams, unless you are using a connector that explicitly supports retract streams and/or upsert streams, such as the Upsert Kafka connector. If you need to write retract streams and/or upsert streams, consider the CDC JSON formats such as Debezium JSON and Canal JSON.
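
As a rough illustration of the upsert case, the following sketch declares an Upsert Kafka table that uses JSON for both the message key and the message value. The table name, columns, topic, and broker address are hypothetical placeholders chosen for this example.

```sql
-- Hypothetical table: name, columns, topic, and broker address are placeholders.
CREATE TABLE pageviews_per_region (
  region STRING,
  view_count BIGINT,
  -- Upsert Kafka requires a primary key; it is written as the Kafka message key.
  PRIMARY KEY (region) NOT ENFORCED
) WITH (
  'connector' = 'upsert-kafka',
  'topic' = 'pageviews_per_region',
  'properties.bootstrap.servers' = 'localhost:9092',
  'key.format' = 'json',
  'value.format' = 'json'
);
```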

Dependencies

To use the JSON format, the following dependencies are required, both for projects using a build automation tool (such as Maven or SBT) and for the SQL Client with SQL JAR bundles.

Maven dependency	SQL Client
`<dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-json</artifactId> <version>1.20.0</version> </dependency>`	Built-in

How to create a table with JSON format

The following example creates a table using the Kafka connector and the JSON format.

CREATE TABLE user_behavior (
  user_id BIGINT,
  item_id BIGINT,
  category_id BIGINT,
  behavior STRING,
  ts TIMESTAMP(3)
) WITH (
 'connector' = 'kafka',
 'topic' = 'user_behavior',
 'properties.bootstrap.servers' = 'localhost:9092',
 'properties.group.id' = 'testGroup',
 'format' = 'json',
 'json.fail-on-missing-field' = 'false',
 'json.ignore-parse-errors' = 'true'
)
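
To make the mapping concrete, here is a sketch of what messages on the `user_behavior` topic might look like, followed by a simple query over the table defined above. The sample records and the query are invented for illustration only.

```sql
-- Illustrative messages on the topic (invented values, one JSON object per message):
--   {"user_id": 1001, "item_id": 2002, "category_id": 3003, "behavior": "buy", "ts": "2020-12-30 12:13:14.123"}
--   {"user_id": 1002, "item_id": 2002, "category_id": 3003, "ts": "2020-12-30 12:13:15.456"}
-- The second message has no "behavior" key. Because 'json.fail-on-missing-field'
-- is 'false', the job does not fail and the column is read as NULL.
-- The "ts" values use the default 'SQL' timestamp format.

SELECT user_id, item_id, ts
FROM user_behavior
WHERE behavior = 'buy';
```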

Format Options

Option	Required	Forwarded	Default	Type	Description
format	required	no	(none)	String	Specifies which format to use; here it should be `'json'`.
json.fail-on-missing-field	optional	no	false	Boolean	Whether to fail if a field is missing.
json.ignore-parse-errors	optional	no	false	Boolean	Skip fields and rows with parse errors instead of failing. Fields are set to null in case of errors.
json.timestamp-format.standard	optional	yes	`'SQL'`	String	Specifies the input and output timestamp format for the `TIMESTAMP` and `TIMESTAMP_LTZ` types. Currently supported values are `'SQL'` and `'ISO-8601'`. Option `'SQL'` parses input TIMESTAMP values in "yyyy-MM-dd HH:mm:ss.s{precision}" format, e.g. "2020-12-30 12:13:14.123", parses input TIMESTAMP_LTZ values in "yyyy-MM-dd HH:mm:ss.s{precision}'Z'" format, e.g. "2020-12-30 12:13:14.123Z", and outputs timestamps in the same format. Option `'ISO-8601'` parses input TIMESTAMP values in "yyyy-MM-ddTHH:mm:ss.s{precision}" format, e.g. "2020-12-30T12:13:14.123", parses input TIMESTAMP_LTZ values in "yyyy-MM-ddTHH:mm:ss.s{precision}'Z'" format, e.g. "2020-12-30T12:13:14.123Z", and outputs timestamps in the same format.
json.map-null-key.mode	optional	yes	`'FAIL'`	String	Specifies the handling mode when serializing null keys for map data. Currently supported values are `'FAIL'`, `'DROP'` and `'LITERAL'`. Option `'FAIL'` throws an exception when encountering a map with a null key. Option `'DROP'` drops null-key entries from map data. Option `'LITERAL'` replaces the null key with a string literal, which is defined by the `json.map-null-key.literal` option.
json.map-null-key.literal	optional	yes	`'null'`	String	Specifies the string literal that replaces a null key when `'json.map-null-key.mode'` is `'LITERAL'`.
json.encode.decimal-as-plain-number	optional	yes	false	Boolean	Encode all decimals as plain numbers instead of possible scientific notation. By default, decimals may be written using scientific notation. For example, `0.000000027` is encoded as `2.7E-8` by default, and is written as `0.000000027` if this option is set to true.
json.encode.ignore-null-fields	optional	yes	false	Boolean	Encode only non-null fields. By default, all fields will be included.
decode.json-parser.enabled	optional		true	Boolean	Whether to use the Jackson `JsonParser` to decode JSON. `JsonParser` is the Jackson JSON streaming API for reading JSON data. It is much faster and consumes less memory than the previous `JsonNode` approach. `JsonParser` also supports nested projection pushdown when reading data. This option is enabled by default. You can disable it and fall back to the previous `JsonNode` approach if you encounter any incompatibility issues.
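
The encoding-related options above can be combined on a single table. The sketch below shows one possible sink definition; the table name, columns, topic, broker address, and the `'unknown'` literal are hypothetical choices made for this example, not recommended values.

```sql
-- Hypothetical sink table; names, topic, and broker address are placeholders.
CREATE TABLE metrics_sink (
  metric_name STRING,
  metric_value DECIMAL(20, 10),
  labels MAP<STRING, STRING>,
  note STRING,
  event_time TIMESTAMP(3)
) WITH (
  'connector' = 'kafka',
  'topic' = 'metrics',
  'properties.bootstrap.servers' = 'localhost:9092',
  'format' = 'json',
  -- Use the "yyyy-MM-ddTHH:mm:ss.s{precision}" style instead of the default 'SQL' style.
  'json.timestamp-format.standard' = 'ISO-8601',
  -- Replace a null map key with the literal below instead of failing.
  'json.map-null-key.mode' = 'LITERAL',
  'json.map-null-key.literal' = 'unknown',
  -- Write decimals in plain notation rather than scientific notation.
  'json.encode.decimal-as-plain-number' = 'true',
  -- Leave out fields whose value is NULL.
  'json.encode.ignore-null-fields' = 'true'
);
```

Setting `'json.map-null-key.mode'` to `'DROP'` instead would discard null-key entries, in which case `'json.map-null-key.literal'` is not needed.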

Data Type Mapping

Currently, the JSON schema is always derived from the table schema. Explicitly defining a JSON schema is not supported yet.

The Flink JSON format uses the Jackson databind API to parse and generate JSON strings.

The following table lists the type mapping from Flink type to JSON type.

Flink SQL type	JSON type
`CHAR / VARCHAR / STRING`	`string`
`BOOLEAN`	`boolean`
`BINARY / VARBINARY`	`string with encoding: base64`
`DECIMAL`	`number`
`TINYINT`	`number`
`SMALLINT`	`number`
`INT`	`number`
`BIGINT`	`number`
`FLOAT`	`number`
`DOUBLE`	`number`
`DATE`	`string with format: date`
`TIME`	`string with format: time`
`TIMESTAMP`	`string with format: date-time`
`TIMESTAMP_WITH_LOCAL_TIME_ZONE`	`string with format: date-time (with UTC time zone)`
`INTERVAL`	`number`
`ARRAY`	`array`
`MAP / MULTISET`	`object`
`ROW`	`object`
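
As a sketch of how the mapping above plays out for nested and non-numeric types, the following hypothetical table annotates each column with the JSON type it corresponds to. The table name, columns, topic, broker address, and the sample record are all invented for illustration.

```sql
-- Hypothetical table; names, topic, and broker address are placeholders.
CREATE TABLE orders (
  order_id BIGINT,                         -- JSON number
  customer ROW<name STRING, vip BOOLEAN>,  -- JSON object
  tags ARRAY<STRING>,                      -- JSON array
  attributes MAP<STRING, STRING>,          -- JSON object
  signature VARBINARY(64),                 -- JSON string, base64-encoded
  order_date DATE,                         -- JSON string with date format
  order_time TIMESTAMP(3)                  -- JSON string with date-time format
) WITH (
  'connector' = 'kafka',
  'topic' = 'orders',
  'properties.bootstrap.servers' = 'localhost:9092',
  'format' = 'json'
);

-- A record shaped like this would match the schema (invented values,
-- default 'SQL' timestamp format):
-- {"order_id": 1, "customer": {"name": "Alice", "vip": true}, "tags": ["gift"],
--  "attributes": {"channel": "web"}, "signature": "aGVsbG8=",
--  "order_date": "2020-12-30", "order_time": "2020-12-30 12:13:14.123"}
```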
