Apache Paimon, originally launched as Flink Table Store (FTS) in January 2022, was developed by the Apache Flink community to combine Flink’s real-time streaming capabilities with the Lakehouse architecture. The goal was to enable real-time data flow in data lakes, offering users a unified experience for both real-time and batch processing. This vision laid the groundwork for a next-generation platform for seamless stream-batch integration.
On March 12, 2023, the project entered the Apache Software Foundation (ASF) incubator and was renamed Apache Paimon. Over time, it developed into a platform that supports both streaming and batch data workflows with highly efficient, real-time analytics. Paimon’s progress and innovation helped it establish a strong foothold in the realm of real-time data lakes.
During its incubation, Apache Paimon addressed several challenges through technical innovation, delivering the following advancements:
Key Capabilities of Apache Paimon
Stream and Batch Support: Handles stream updates, batch reads/writes, and change log generation in a unified platform.
Data Lake Capabilities: Offers low-cost, highly reliable, and scalable metadata, with all the advantages of data lake storage.
Versatile Merge Engines: Provides flexible record update methods, allowing you to choose between retaining the last record, performing partial updates, or aggregating records.
Change Log Generation: Generates accurate and complete change logs from any data source, simplifying stream analytics.
Rich Table Types: In addition to primary key tables, Paimon supports append-only tables, offering ordered stream reads as an alternative to message queues.
Schema Evolution: Fully supports schema evolution, allowing you to rename columns and reorder them as needed.
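To make the merge-engine idea concrete, here is a minimal Python sketch of the three record-update strategies named above: keeping the last record, partial updates, and aggregation. The function and data shapes are illustrative only, not Paimon's actual implementation.

```python
def merge_records(records, engine="deduplicate"):
    """Toy illustration of Paimon-style merge engines (not the real implementation).

    records: list of (primary_key, {column: value}) pairs in write order.
    engine:  "deduplicate"    -> keep only the last record per key
             "partial-update" -> overlay non-None columns onto earlier values
             "aggregation"    -> sum column values per key
    """
    merged = {}
    for key, row in records:
        if key not in merged or engine == "deduplicate":
            merged[key] = dict(row)                       # last record wins
        elif engine == "partial-update":
            merged[key].update({c: v for c, v in row.items() if v is not None})
        elif engine == "aggregation":
            for c, v in row.items():
                merged[key][c] = merged[key].get(c, 0) + v
    return merged

writes = [(1, {"price": 10}), (1, {"price": 5})]
print(merge_records(writes, "deduplicate"))  # {1: {'price': 5}}
print(merge_records(writes, "aggregation"))  # {1: {'price': 15}}
```

The same stream of writes yields very different tables depending on the chosen engine, which is why Paimon lets you pick the merge behavior per table.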
Apache Paimon brings a new approach to lake storage by combining LSM (Log-Structured Merge) trees with columnar formats like ORC and Parquet. This combination allows Paimon to support large-scale real-time updates within data lakes. On disk, Paimon organizes data files into an LSM tree of sorted runs.
Read/Write Capabilities: Paimon offers flexibility in how data is read and written, supporting both streaming and batch access to the same tables.
Internal Structure: At its core, Paimon stores data in columnar files within a file system or object storage and uses an LSM tree structure to manage large-scale updates and ensure high-performance queries.
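As a rough mental model of how an LSM tree serves reads, the following Python sketch merges several sorted runs and lets the newest version of each key win. This is a conceptual sketch only; Paimon's real file layout, levels, and compaction are more involved.

```python
import heapq

def lsm_read(sorted_runs):
    """Scan an LSM tree: merge sorted runs, newest version of each key wins.

    sorted_runs: newest-first list of runs; each run is a list of
    (key, value) pairs sorted by key. Conceptual sketch only.
    """
    seen, result = set(), []
    # Tag entries with their run index so, among equal keys, entries from
    # newer runs (lower index) come out of the merge first.
    tagged = [[(k, i, v) for k, v in run] for i, run in enumerate(sorted_runs)]
    for key, _, value in heapq.merge(*tagged):
        if key not in seen:              # first hit per key = newest version
            seen.add(key)
            result.append((key, value))
    return result

runs = [[(2, "b2"), (3, "c")],           # newest run
        [(1, "a"), (2, "b")]]            # older run
print(lsm_read(runs))                    # [(1, 'a'), (2, 'b2'), (3, 'c')]
```

Because each run is already sorted, reads are a cheap merge rather than a full sort, which is what lets LSM structures absorb heavy write traffic while staying queryable.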
In streaming engines like Apache Flink, you typically juggle several types of connectors, each with its own read and write semantics.
Paimon simplifies this by providing a table abstraction that behaves much like a traditional database table: you write data into it and read it back in either streaming or batch mode.
A snapshot captures the state of a table at a specific moment. You can use the latest snapshot to access the most current data, or you can explore previous states through earlier snapshots using time travel.
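The snapshot mechanism can be pictured with a small Python sketch: each commit produces a new immutable table state, and a read either takes the latest state or time-travels to an earlier snapshot id. The class and method names here are illustrative, not Paimon's API.

```python
class SnapshotTable:
    """Toy table with snapshot-based time travel (illustrative only)."""

    def __init__(self):
        self._snapshots = []                  # snapshot id i -> table state

    def commit(self, rows):
        state = dict(self._snapshots[-1]) if self._snapshots else {}
        state.update(rows)                    # upsert by primary key
        self._snapshots.append(state)
        return len(self._snapshots) - 1       # id of the new snapshot

    def read(self, snapshot_id=None):
        """Read the latest state, or time-travel to an earlier snapshot."""
        if not self._snapshots:
            return {}
        sid = len(self._snapshots) - 1 if snapshot_id is None else snapshot_id
        return dict(self._snapshots[sid])

t = SnapshotTable()
t.commit({1: "a"})                   # snapshot 0
t.commit({1: "a2", 2: "b"})          # snapshot 1
print(t.read())                      # {1: 'a2', 2: 'b'}
print(t.read(snapshot_id=0))         # {1: 'a'}
```

Because old states are never mutated in place, readers pinned to an earlier snapshot see a consistent view even while new commits land.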
Paimon uses the same partitioning strategy as Apache Hive, dividing tables into sections based on specific column values (e.g., date, city, department). This partitioning helps in managing and querying data more efficiently. If a table has a primary key, the partition key must be a subset of the primary key.
Buckets
Tables, whether partitioned or not, are further divided into buckets. Buckets are determined by hashing one or more columns, and they help organize the data for faster queries. If you don’t specify a bucket key, Paimon uses the primary key (if defined) or the entire record as the bucket key. The number of buckets controls the level of parallelism in processing, but too many buckets can lead to small files and slower reads. Ideally, each bucket should contain around 1GB of data.
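Partitioning and bucketing together decide where a record lands. The Python sketch below shows the idea: a Hive-style partition directory built from the partition columns, plus a hash of the bucket key modulo the bucket count. The hash function and layout here are illustrative; Paimon's own differ.

```python
import zlib

def route(row, partition_keys, bucket_key, num_buckets):
    """Map a record to (partition directory, bucket id). Conceptual sketch."""
    partition = "/".join(f"{k}={row[k]}" for k in partition_keys)
    bucket = zlib.crc32(str(row[bucket_key]).encode()) % num_buckets
    return partition, bucket

row = {"dt": "2024-01-01", "city": "Beijing", "order_id": 42}
partition, bucket = route(row, ["dt", "city"], "order_id", 4)
print(partition)   # dt=2024-01-01/city=Beijing
print(bucket)      # a stable bucket id in [0, 4)
```

Two properties matter: the same bucket key always maps to the same bucket, which is what makes per-bucket sequential writes possible, and the bucket count caps the useful parallelism, which is why oversizing it produces many small files.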
Paimon uses a two-phase commit protocol to ensure that a batch of records is consistently written to the table. Each commit creates up to two snapshots. If multiple writers are modifying a table at the same time, as long as they’re working on different buckets, their changes are processed in sequence. If they modify the same bucket, snapshot isolation is maintained, meaning that while the final table may reflect a mix of changes, no data is lost.
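A stripped-down Python sketch can illustrate the optimistic publish step: a writer prepares its data files first, then appends a snapshot, rebasing and retrying if another commit landed in the meantime. This is a simplified single-process model of the idea, not Paimon's actual two-phase commit protocol.

```python
class Committer:
    """Toy optimistic snapshot publisher (simplified model, not Paimon's protocol)."""

    def __init__(self):
        self.snapshots = []                   # committed snapshot metadata

    def latest_id(self):
        return len(self.snapshots) - 1        # -1 when the table is empty

    def commit(self, files, based_on):
        # Phase 1 (prepare) is assumed done: `files` are fully written.
        # Phase 2 (publish): check-and-append; rebase on conflict.
        while self.latest_id() != based_on:
            based_on = self.latest_id()       # another commit landed: rebase
            # A real writer would re-check for bucket-level conflicts here.
        self.snapshots.append({"files": files})
        return self.latest_id()

table = Committer()
s1 = table.commit(["bucket-0/data-1.orc"], based_on=-1)
s2 = table.commit(["bucket-1/data-2.orc"], based_on=-1)  # conflicts, rebases
print(s1, s2)  # 0 1
```

The key point is that data files are written before any metadata is published, so a failed or conflicting commit leaves only unreferenced files behind and never a half-visible snapshot.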
Paimon currently focuses on three key use cases:
CDC Data Ingestion into Data Lakes
Paimon is optimized to make this process simpler and more efficient. With one-click ingestion, you can bring entire databases into the data lake, dramatically reducing the complexity of the architecture. It supports real-time updates and fast querying at a low cost. Additionally, it offers flexible update options, allowing specific columns or different types of aggregate updates to be applied.
Building Streaming Pipelines
Paimon can be used to build end-to-end streaming pipelines, with changelog generation, streaming reads and writes, and ordered append-only tables as the key building blocks.
Ultra-Fast OLAP Queries
While the first two use cases ensure real-time data flow, Paimon also supports high-speed OLAP queries for analyzing stored data. By combining Z-Order and indexing, Paimon enables fast data analysis. Its ecosystem supports various query engines like Flink, Spark, StarRocks, and Trino, all of which can efficiently query data stored in Paimon.
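Z-Order clustering keeps rows that are close on several columns close on disk. The sketch below shows the underlying bit-interleaving, a standard technique; Paimon's actual clustering and indexing machinery is more elaborate.

```python
def z_order(x, y, bits=16):
    """Interleave the bits of two non-negative ints into a Z-order key.

    Sorting rows by this key clusters points that are near each other
    in (x, y), so range filters on either column prune files well.
    """
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)        # x bits at even positions
        z |= ((y >> i) & 1) << (2 * i + 1)    # y bits at odd positions
    return z

print(z_order(0b11, 0b00))  # 5  (binary 0101)
print(z_order(0b00, 0b11))  # 10 (binary 1010)
```

Sorting by a plain (x, y) key only helps filters on x; the interleaved key spreads both columns' bits through the sort order, so min/max statistics per file stay tight for filters on either column.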
StarRocks + Apache Paimon supports several key use cases. Notably, by setting sql_dialect = "trino", users can run Trino SQL natively on StarRocks; this feature has already been applied in production environments with partners and customers.
The performance of StarRocks + Apache Paimon was evaluated with the TPC-H benchmark on 100 GB of data, using a test environment of one master node and three core nodes, each with 16 vCPUs and 64 GB of memory. Each TPC-H query was run three times and the average execution time recorded, with no pre-analysis or warm-up; the data was stored on HDFS.
Results: The combination of StarRocks’ optimized kernel and Apache Paimon’s data format provided significant performance gains, with the analysis capability of StarRocks + Apache Paimon delivering up to 15 times faster query performance compared to previous versions.
For more details, please read "Using Apache Paimon + StarRocks for High-Speed Batch and Streaming Lakehouse Analysis", which is based on Riyu Wang's presentation at FFA (Flink Forward Asia) 2023.
Ele.me, a leading local life service platform, focuses on online food delivery, new retail, instant logistics, and restaurant supply chain services. It has transformed food delivery into a mainstream dining option in China, covering 670 cities and over 1,000 counties, with 3.4 million online restaurants and 260 million users. To support this vast operation, Ele.me has evolved its data infrastructure, leveraging Flink, Paimon, and StarRocks to drive real-time data applications.
Ele.me's real-time data warehouse supports a range of business-critical functions.
To tackle the challenges of scaling its real-time data applications, Ele.me strategically integrated the following technologies:
Flink handles real-time data processing, enabling complex event and stream processing for a wide range of data sources. It supports high-velocity data streams, ensuring low-latency performance, essential for making real-time business decisions.
Paimon is the core of Ele.me’s streaming lakehouse architecture, enabling efficient storage for real-time data processing. It supports real-time read and write operations, keeping analytics data up to date.
StarRocks is integrated as the core OLAP engine, providing high-performance analytical capabilities on top of Paimon data.
The integration of Flink, Paimon, and StarRocks has led to substantial improvements in Ele.me’s data infrastructure.
This blog covered the essential aspects of Apache Paimon, including its history, architecture, and key features. Apache Paimon stands out in data streaming and storage thanks to its high-throughput writes and low-latency queries. Its compatibility with common computing engines such as Flink, Spark, StarRocks, and Trino, as well as managed services like Alibaba Cloud E-MapReduce (EMR), further enhances its utility. Apache Paimon significantly simplifies large-scale, real-time data operations, and readers are encouraged to explore and experiment with it to unlock its full potential in real-time data processing and analytics.