Paimon Pipeline Connector

The Paimon Pipeline connector can be used as the Data Sink of a pipeline to write data to Paimon. This document describes how to set up the connector.

What can the connector do?

  • Create tables automatically if they do not exist
  • Schema change synchronization
  • Data synchronization

How to create a Pipeline

A pipeline that reads data from MySQL and writes it to Paimon can be defined as follows:

source:
  type: mysql
  name: MySQL Source
  hostname: 127.0.0.1
  port: 3306
  username: admin
  password: pass
  tables: adb.\.*, bdb.user_table_[0-9]+, [app|web].order_\.*
  server-id: 5401-5404

sink:
  type: paimon
  name: Paimon Sink
  catalog.properties.metastore: filesystem
  catalog.properties.warehouse: /path/warehouse

pipeline:
  name: MySQL to Paimon Pipeline
  parallelism: 2

Pipeline Connector Options

| Option | Required | Default | Type | Description |
|---|---|---|---|---|
| type | required | (none) | String | Specifies which connector to use; must be 'paimon' here. |
| name | optional | (none) | String | The name of the sink. |
| catalog.properties.metastore | required | (none) | String | Metastore of the Paimon catalog. Supported values are filesystem and hive. |
| catalog.properties.warehouse | optional | (none) | String | The warehouse root path of the catalog. |
| catalog.properties.uri | optional | (none) | String | URI of the metastore server. |
| commit.user | optional | "admin" | String | User name for committing data files. |
| partition.key | optional | (none) | String | Partition keys for each partitioned table. Partition keys can be set for multiple tables: tables are separated by ';', and the partition keys of one table are separated by ','. For example, the partition keys of two tables can be set with 'testdb.table1:id1,id2;testdb.table2:name'. |
| catalog.properties.* | optional | (none) | String | Passes Paimon catalog options to the pipeline. See Paimon catalog options. |
| table.properties.* | optional | (none) | String | Passes Paimon table options to the pipeline. See Paimon table options. |
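
As an illustration of how these options combine, the sink block below is a minimal sketch that uses the hive metastore, sets partition keys for two tables, and forwards one table option. The metastore address thrift://localhost:9083, the warehouse path, the table and column names, and the bucket value are placeholders chosen for this example, not values prescribed by the connector.

```yaml
sink:
  type: paimon
  name: Paimon Sink
  # Use a Hive metastore instead of the filesystem metastore
  catalog.properties.metastore: hive
  # Placeholder address of the metastore server (assumption for this sketch)
  catalog.properties.uri: thrift://localhost:9083
  # Placeholder warehouse root path (assumption for this sketch)
  catalog.properties.warehouse: /path/warehouse
  # testdb.table1 is partitioned by id1 and id2, testdb.table2 by name
  partition.key: testdb.table1:id1,id2;testdb.table2:name
  # Forwarded to Paimon as the table option "bucket" (illustrative value)
  table.properties.bucket: 4
```

Any key under catalog.properties.* or table.properties.* is handed to Paimon with the prefix removed, so the valid keys and values are those listed in the Paimon catalog and table option references.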

Usage Notes

  • Only Paimon primary key tables are supported, so the source table must have a primary key.

  • Exactly-once semantics are not supported. The connector combines at-least-once delivery with primary key tables to make writes idempotent.

Data Type Mapping

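| CDC type | Paimon type | Note |
|---|---|---|
| TINYINT | TINYINT | |
| SMALLINT | SMALLINT | |
| INT | INT | |
| BIGINT | BIGINT | |
| FLOAT | FLOAT | |
| DOUBLE | DOUBLE | |
| DECIMAL(p, s) | DECIMAL(p, s) | |
| BOOLEAN | BOOLEAN | |
| DATE | DATE | |
| TIMESTAMP | TIMESTAMP | |
| TIMESTAMP_LTZ | TIMESTAMP_LTZ | |
| CHAR(n) | CHAR(n) | |
| VARCHAR(n) | VARCHAR(n) | |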
