Welcome to Flink CDC 🎉 #
Flink CDC is a streaming data integration tool that aims to provide users with a more robust API. It allows users to elegantly describe their ETL pipeline logic via YAML, and it automatically generates customized Flink operators and submits the job for them. Flink CDC prioritizes optimizing the task submission process and offers enhanced functionalities such as schema evolution, data transformation, full database synchronization, and exactly-once semantics.
Deeply integrated with and powered by Apache Flink, Flink CDC provides:
- ✅ End-to-end data integration framework
- ✅ API for data integration users to build jobs easily
- ✅ Multi-table support in Source / Sink
- ✅ Synchronization of entire databases
- ✅ Schema evolution capability
How to Use Flink CDC #
Flink CDC provides a YAML-formatted user API that is better suited to data integration scenarios. Here’s an example YAML file defining a data pipeline that ingests real-time changes from MySQL and synchronizes them to Apache Doris:
source:
  type: mysql
  hostname: localhost
  port: 3306
  username: root
  password: 123456
  tables: app_db.\.*
  server-id: 5400-5404
  server-time-zone: UTC

sink:
  type: doris
  fenodes: 127.0.0.1:8030
  username: root
  password: ""
  table.create.properties.light_schema_change: true
  table.create.properties.replication_num: 1

pipeline:
  name: Sync MySQL Database to Doris
  parallelism: 2
By submitting the YAML file with flink-cdc.sh, a Flink job will be compiled and deployed to a designated Flink cluster. Please refer to Core Concept for full documentation of all supported functionalities of a pipeline.
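For example, assuming the pipeline definition above is saved as mysql-to-doris.yaml (a hypothetical file name) and the FLINK_HOME environment variable points to a Flink distribution, a sketch of the submission looks like this:

  # Compiles the YAML above into a Flink job and deploys it to the cluster
  # referenced by FLINK_HOME; run from the unpacked Flink CDC distribution.
  ./bin/flink-cdc.sh mysql-to-doris.yaml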
Write Your First Flink CDC Pipeline #
Explore the Flink CDC documentation to get hands-on with your first real-time data integration pipeline:
Quickstart #
Check out the quickstart guide to learn how to establish a Flink CDC pipeline:
Understand Core Concepts #
Get familiar with the core concepts introduced in Flink CDC and try to build more complex pipelines:
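For example, a more complex pipeline can add transform and route blocks alongside the source, sink, and pipeline sections shown earlier. The sketch below assumes a Flink CDC version that supports these blocks; the table and column names are purely illustrative:

  transform:
    # Keep every original column (\*) and derive an upper-cased copy of one of them.
    - source-table: app_db.orders
      projection: \*, UPPER(product_name) AS product_name_upper
      filter: price > 0

  route:
    # Write changes from app_db.orders into a differently named table on the sink side.
    - source-table: app_db.orders
      sink-table: ods_db.ods_orders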
Submit Pipeline to Flink Cluster #
Learn how to submit the pipeline to a Flink cluster running in different deployment modes:
Development and Contribution #
If you want to connect Flink CDC to your customized external system, or contribute to the framework itself, these sections could be helpful:
- Understand Flink CDC APIs to develop your own Flink CDC connector
- Learn how to contribute to Flink CDC
- Check out licenses used by Flink CDC