Trino/Presto Alternative

Why users are migrating from Trino and Presto to StarRocks

Trino and Presto are arguably the most popular open source engines for data lakehouse queries. Compared to predecessors like Hive, Trino and Presto can reduce query latencies from tens of minutes to tens of seconds. While these performance improvements were impressive years ago, they are no longer enough for modern analytics work. Today, Trino and Presto users struggle with interactive query scenarios where latency needs to be in the sub-second range to support ad-hoc queries, operational analytics, and user-facing analytics.

This is just the tip of the iceberg when it comes to Trino and Presto's limitations. Other major challenges include:

 

High Query Latency
The query performance of Trino and Presto is limited by their Java-based query engines. Unlike C++-based engines, they can't take full advantage of the vectorized execution capabilities of modern CPUs. This makes Trino and Presto a poor fit for interactive analytics scenarios.

 

No Real-Time Analytics
Trino and Presto were designed as batch analytics engines. Streaming data must be ingested into the data lake in batches, so customers are forced to bring on an additional platform for real-time analytics.

 

Limited High-Concurrency Support
Even when Trino and Presto can deliver acceptable query performance, they struggle to sustain that performance as concurrency scales. Users often need to maintain another system for highly concurrent use cases, which increases maintenance requirements dramatically.

Because of these limitations, many Trino and Presto users have started migrating to StarRocks. With StarRocks, these former Trino and Presto users see significant query performance gains without the limitations of Trino and Presto holding them back.

StarRocks vs. Trino and Presto

14.6x Greater Performance with StarRocks' Native Table
3.3x Greater Performance With StarRocks on Data Lakes
10,000 QPS High Performance Even With 1,000s of Concurrent Users

Query external data sources.
No ingestion needed.

In addition to more efficient analysis of local data, StarRocks can serve as the query engine for data stored in data lakes built on Apache Hive, Apache Iceberg, Apache Hudi, and Delta Lake.

With StarRocks' external catalog, users can query external data sources seamlessly with zero migration, analyzing data from different storage systems such as HDFS and Amazon S3, in various file formats such as Parquet, ORC, and CSV.
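As a rough illustration of what registering an external catalog involves, the Python sketch below assembles a `CREATE EXTERNAL CATALOG` statement for a Hive metastore. The helper `build_hive_catalog_ddl`, the catalog name, and the metastore URI are hypothetical, and the property keys (`type`, `hive.metastore.uris`) are given as we understand StarRocks' Hive catalog properties, so check them against the documentation for your version.

```python
def build_hive_catalog_ddl(catalog_name: str, metastore_uri: str) -> str:
    """Return a CREATE EXTERNAL CATALOG statement for a Hive metastore.

    Hypothetical helper for illustration: it only builds a string and
    does not connect to StarRocks.
    """
    # Reject names and URIs that could break out of the statement.
    if not catalog_name.isidentifier():
        raise ValueError(f"unsafe catalog name: {catalog_name!r}")
    if '"' in metastore_uri:
        raise ValueError("metastore URI must not contain double quotes")

    # Assumed property keys; verify against your StarRocks version.
    properties = {
        "type": "hive",
        "hive.metastore.uris": metastore_uri,
    }
    props_sql = ",\n".join(f'    "{k}" = "{v}"' for k, v in properties.items())
    return f"CREATE EXTERNAL CATALOG {catalog_name}\nPROPERTIES (\n{props_sql}\n);"


# Placeholder values, not real endpoints.
ddl = build_hive_catalog_ddl("hive_lake", "thrift://metastore.example.internal:9083")
print(ddl)
```

Once a catalog like this exists, its tables would typically be addressed with a three-part name of the form catalog.database.table, with no data copied into StarRocks.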

 

Deliver unparalleled performance in any scenario

StarRocks offers more than 3x greater performance when querying data on the data lake compared to Trino and Presto. StarRocks achieves this thanks to its unique vectorized execution engine, written in C++ to make full use of the SIMD instructions in modern CPUs. And because StarRocks includes an optimized native storage engine, you're able to unify data lake analytics, low-latency queries, and highly concurrent workloads in one database.

 

Work with the freshest data, even on your data lake

Perform analytics on the freshest data possible, even on your data lake, without the need for any data migration. Data lake users don't have to set up a separate data pipeline for real-time analytics. Streaming data from sources such as Apache Kafka is ingested into StarRocks and made available for analytics in real time. StarRocks' storage engine also uses the delete-and-insert pattern, which allows for efficient partial update and upsert operations.
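To make the delete-and-insert idea concrete, here is a toy in-memory model in Python. It is a sketch of the general pattern only, not StarRocks' storage engine: the class `DeleteAndInsertTable` and its methods are hypothetical names, and a real engine would persist the data and the primary-key index on disk.

```python
class DeleteAndInsertTable:
    """Toy primary-key table: an upsert flags the old row as deleted and appends a new one."""

    def __init__(self):
        self.rows = []      # append-only row storage
        self.deleted = []   # deleted[i] is True when rows[i] has been superseded
        self.pk_index = {}  # primary key -> position of the live row

    def upsert(self, key, values):
        """Insert a new row, or partially update the live row with the same key."""
        old_pos = self.pk_index.get(key)
        if old_pos is None:
            new_row = dict(values)
        else:
            # Partial update: keep columns the caller did not supply.
            new_row = {**self.rows[old_pos], **values}
            self.deleted[old_pos] = True  # "delete" the old version
        self.rows.append(new_row)         # "insert" the new version
        self.deleted.append(False)
        self.pk_index[key] = len(self.rows) - 1

    def scan(self):
        """Return live rows only, skipping rows flagged as deleted."""
        return [row for row, dead in zip(self.rows, self.deleted) if not dead]


table = DeleteAndInsertTable()
table.upsert(1, {"id": 1, "status": "new", "amount": 10})
table.upsert(2, {"id": 2, "status": "new", "amount": 25})
table.upsert(1, {"status": "paid"})  # partial update of the row with key 1
print(table.scan())
```

The design choice this illustrates is that the cost of reconciling versions is paid at write time: a scan only has to skip flagged rows rather than merge several versions of the same key.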

 

Accelerate your analytics with intelligent materialized views

With its breakthrough Intelligent Materialized View (IMV) technology, StarRocks can transparently accelerate queries and simplify data pipelines. IMVs are automatically refreshed to guarantee data consistency, and queries are automatically rewritten to leverage IMVs. Expensive ETL jobs can also be replaced with IMVs to simplify data pipelines.

Compare Trino and Presto to StarRocks

Designed for the analytics needs of modern enterprises, StarRocks delivers the capabilities and performance they require. Trino and Presto can't say the same.

Trino | Presto

Java-based query engine
No vectorized query execution
No real-time analytics
Supports a limited number of concurrent users
No support for local storage, data lake queries only
Rudimentary support for materialized views
Single point of failure at the coordinator node

StarRocks

C++ based high-performance query engine
Fully vectorized query execution
Supports batch and real-time analytics
High concurrency with 10,000+ QPS
Supports both data lake queries and local storage
Intelligent materialized views with real-time updates
MPP architecture with no single point of failure

Talk to an engineer

Have questions about CelerData and StarRocks? You can connect with our team of solutions architects and experienced engineers who can answer all of your questions and even offer a personalized demo aligned with your specific needs and analytics scenarios.