Trino/Presto Alternative

Why users are migrating from Trino and Presto to StarRocks

Trino and Presto are arguably the most popular open-source engines for data lakehouse queries. Compared to predecessors like Hive, Trino and Presto can reduce query latencies from tens of minutes to tens of seconds. While these performance improvements were impressive years ago, they are not enough for modern analytics work. Today, Trino and Presto users struggle with interactive query scenarios where latency needs to be in the sub-second range to support ad-hoc queries, operational analytics, and user-facing analytics.

This is just the tip of the iceberg when it comes to Trino and Presto's limitations. Other major challenges include:

HIGH QUERY LATENCY

The query performance of Trino and Presto is limited by their Java-based query engines. Unlike C++ based engines, these engines can't take full advantage of the vectorized execution capabilities of modern CPUs. This makes it difficult for Trino and Presto to handle interactive analytics scenarios.

NO REAL-TIME ANALYTICS

Trino and Presto were designed as batch analytics engines. Streaming data must be ingested into the data lake in batches before it can be queried, so customers are forced to bring on an additional platform for real-time analytics.

LIMITED HIGH-CONCURRENCY SUPPORT

Even when Trino and Presto can deliver acceptable query performance, they struggle to sustain that performance as concurrency scales. Users often need to maintain another system for highly concurrent use cases. This increases maintenance requirements dramatically.

StarRocks vs. Trino and Presto

14.6x

GREATER PERFORMANCE WITH STARROCKS' NATIVE TABLE

5.54x

GREATER PERFORMANCE WITH STARROCKS ON DATA LAKES

10,000 QPS

HIGH PERFORMANCE EVEN WITH 1,000S OF CONCURRENT USERS

Query external data sources.
No ingestion needed.

In addition to analyzing local data more efficiently, StarRocks can work as the query engine for data stored in data lakes, in table formats such as Apache Hive, Apache Iceberg, Apache Hudi, and Delta Lake.

With StarRocks' external catalog, users can query external data sources seamlessly with zero migration, analyzing data from different storage systems such as HDFS and Amazon S3, in various file formats such as Parquet, ORC, and CSV.
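As a rough sketch of what this looks like in practice, the Python snippet below assembles the SQL a client might send to register a Hive external catalog and then query a lake table through it. The `CREATE EXTERNAL CATALOG` statement and the `type` and `hive.metastore.uris` properties are written as assumptions based on the StarRocks documentation and should be checked against the version you run. The catalog name, metastore address, database, and table are hypothetical.

```python
def create_catalog_sql(name: str, properties: dict[str, str]) -> str:
    """Build a CREATE EXTERNAL CATALOG statement from a property map."""
    props = ",\n".join(f'    "{k}" = "{v}"' for k, v in properties.items())
    return f"CREATE EXTERNAL CATALOG {name}\nPROPERTIES (\n{props}\n);"


def select_sql(catalog: str, database: str, table: str, limit: int = 10) -> str:
    """Query a lake table in place via its catalog.database.table name."""
    return f"SELECT * FROM {catalog}.{database}.{table} LIMIT {limit};"


# Hypothetical names and endpoint, for illustration only.
# Property keys are assumptions; verify them against the StarRocks docs.
catalog_ddl = create_catalog_sql(
    "hive_lake",
    {
        "type": "hive",
        "hive.metastore.uris": "thrift://metastore.example.internal:9083",
    },
)
query = select_sql("hive_lake", "sales", "orders")

print(catalog_ddl)
print(query)
```

The point of the sketch is that only metadata is registered: the query addresses the table where it already lives, and no load job appears anywhere in the flow.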

Why more Presto and Trino users are switching to StarRocks

The benefits of StarRocks reach beyond querying external data sources. Here are some additional great reasons to switch:

Deliver unparalleled performance in any scenario

StarRocks offers 3x greater performance when querying data on the data lake compared to Trino and Presto. StarRocks achieves this thanks to its vectorized execution engine, written in C++ to make full use of the SIMD instructions in modern CPUs. And because StarRocks includes an optimized native storage engine, you can unify data lake analytics, low-latency workloads, and highly concurrent workloads in one database.

Work with the freshest data, even on your data lake

Perform analytics on the freshest data possible, even on your data lake, without the need for any data migration. Data lake users don't have to set up a separate data pipeline for real-time analytics. Streaming data from sources such as Apache Kafka is ingested into StarRocks and made available for analytics in real time. StarRocks' storage engine also uses the delete-and-insert pattern, which allows for efficient partial update and upsert operations.
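To make the delete-and-insert idea concrete, here is a minimal in-memory sketch in Python. It is not StarRocks' storage engine: the `PrimaryKeyTable` class, its fields, and the sample rows are all hypothetical, and a real engine would persist the index and delete markers rather than hold them in Python objects. It only illustrates the pattern of marking the old row version as deleted and appending the new one, so reads never have to merge versions.

```python
class PrimaryKeyTable:
    """Toy append-only table with a primary-key index and a delete set."""

    def __init__(self) -> None:
        self.rows: list[dict] = []          # append-only row storage
        self.index: dict[object, int] = {}  # primary key -> position of live row
        self.deleted: set[int] = set()      # positions of superseded rows

    def upsert(self, key, **columns) -> None:
        """Insert a row, or replace the live row with the same key."""
        old_pos = self.index.get(key)
        merged = {"id": key}
        if old_pos is not None:
            # Partial update: carry over columns the caller did not supply.
            merged.update(self.rows[old_pos])
            self.deleted.add(old_pos)       # "delete": mark the old version
        merged.update(columns)
        merged["id"] = key                  # the key column is never overwritten
        self.rows.append(merged)            # "insert": append the new version
        self.index[key] = len(self.rows) - 1

    def scan(self) -> list[dict]:
        """Return live rows only, skipping positions marked as deleted."""
        return [r for i, r in enumerate(self.rows) if i not in self.deleted]


table = PrimaryKeyTable()
table.upsert(1, status="created", amount=40)
table.upsert(2, status="created", amount=15)
table.upsert(1, status="paid")  # partial update: amount is carried over

live_rows = table.scan()
print(live_rows)
```

In this sketch the cost of resolving versions is paid once at write time, which is why a scan can simply skip marked positions.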

Accelerate your analytics with intelligent materialized views

With its Intelligent Materialized View (IMV) technology, StarRocks can transparently accelerate queries and simplify data pipelines. IMVs are automatically refreshed to keep them consistent with the underlying data, and queries are automatically rewritten to use them. Expensive ETL jobs can also be replaced with IMVs to simplify data pipelines.
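The toy Python example below illustrates the general idea behind transparent rewrite against a materialized view. It is a conceptual sketch, not how StarRocks implements IMVs: the `orders` data and the `build_mv` and `total_for_region` helpers are hypothetical. The query is the same in both calls, but when a precomputed aggregate exists, it is answered from that aggregate instead of scanning the base rows.

```python
from collections import defaultdict

# Hypothetical base table: one row per order.
orders = [
    {"region": "EU", "amount": 40},
    {"region": "US", "amount": 15},
    {"region": "EU", "amount": 25},
]


def build_mv(rows):
    """Precompute SUM(amount) GROUP BY region, as a refresh would."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


def total_for_region(region, rows, mv=None):
    """Answer SUM(amount) WHERE region = ?; use the view when one exists."""
    if mv is not None:
        return mv.get(region, 0)  # rewritten: one lookup in the view
    return sum(r["amount"] for r in rows if r["region"] == region)  # full scan


mv = build_mv(orders)
from_base = total_for_region("EU", orders)
from_mv = total_for_region("EU", orders, mv)
print(from_base, from_mv)
```

Both paths return the same answer; the difference is only in how much data is touched, which is the effect a rewrite against a materialized view aims for.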

Compare Trino and Presto to StarRocks

Designed for the analytics needs of modern enterprises, StarRocks delivers the capabilities and performance those needs demand. Trino and Presto can't say the same.

Comparison	Trino and Presto	StarRocks
Query engine	Java-based query engine	C++ based high-performance query engine
Query execution	No vectorized query execution	Fully vectorized query execution
Real-time analytics	No real-time analytics	Supports batch and real-time analytics
Concurrency support	Supports a limited number of concurrent users	High concurrency with 10,000+ QPS
Data lake queries and local storage support	No support for local storage, data lake queries only	Supports both data lake queries and local storage
Materialized view support	Rudimentary support for materialized views	Intelligent materialized views with real-time updates
Point of failure	Single point of failure at the coordinator node	MPP architecture with no single point of failure