100K events per second | sub-500ms query latency at TP99 | 98.33% faster aggregation
About Intuit
Intuit is a financial software company that serves more than 100 million consumers, small businesses, and self-employed customers through products like TurboTax (tax filing), QuickBooks (accounting and payroll), and Credit Karma (personal finance and credit management).
To support these products, the Intuit Persistence Service (IPS) platform provides a unified data foundation for client developers. The IPS team abstracts away system-specific complexities, enabling developers to focus on building real-time ML models, identifying trends, and running what-if scenarios. Rather than each product team managing its own persistence layer, IPS delivers scalable, reliable data services that power mission-critical features.
During peak tax season, IPS processes more than 140 billion transactions with bursts exceeding 100,000 transactions per second. As demand grew for OLAP and real-time clickstream use cases, the IPS team moved away from Apache Druid and adopted a modern OLAP database that could meet end-to-end SLAs as low as four seconds.
Challenges: Real-time analytical processing at massive scale
As Intuit's products evolved to deliver AI-driven experiences, the IPS team had to support real-time analytical processing at massive scale.
Intuit's customer support relies on ML models that analyze the customer's last 30 in-product clicks to power two experiences:
- Reactive models: Enable Intuit Assist (their generative AI chatbot) to provide contextual help when customers reach out
- Proactive models: Detect when customers are struggling and surface information before they ask
The requirement was non-negotiable: end-to-end data freshness of 4 seconds or less. Beyond that threshold, model accuracy degrades exponentially, leading to irrelevant recommendations and eroded customer trust.
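To make the requirement concrete, here is a minimal Python sketch of a per-customer window of the last 30 clicks with a check against the 4-second freshness budget. The `ClickWindow` class, its method names, and the event fields are illustrative assumptions, not Intuit's implementation; in the real platform this state lives in the OLAP store rather than in process memory.

```python
from collections import defaultdict, deque

WINDOW_SIZE = 30          # last 30 in-product clicks, per the text
FRESHNESS_BUDGET_S = 4.0  # end-to-end freshness requirement, per the text


class ClickWindow:
    """Hypothetical helper: keeps the most recent clicks per customer."""

    def __init__(self, size=WINDOW_SIZE):
        self._clicks = defaultdict(lambda: deque(maxlen=size))

    def record(self, customer_id, element, event_time_s, ingested_at_s):
        # deque(maxlen=...) drops the oldest click once the window is full.
        self._clicks[customer_id].append((element, event_time_s, ingested_at_s))

    def window(self, customer_id):
        """Clicked elements, oldest first, capped at the window size."""
        return [element for element, _, _ in self._clicks[customer_id]]

    def worst_lag_s(self, customer_id):
        """Largest gap between a click happening and it becoming queryable."""
        return max(
            (ingested - event for _, event, ingested in self._clicks[customer_id]),
            default=0.0,
        )

    def within_budget(self, customer_id):
        return self.worst_lag_s(customer_id) <= FRESHNESS_BUDGET_S
```

A model-serving path could call `within_budget` before scoring and fall back to a generic response when the window is stale.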
Marketing teams also needed real-time aggregates across billions of records with two types of complexity:
- Composition use cases: Determining total employee count across QuickBooks Online and QB Payroll required joining events from multiple real-time data streams
- Stateful processing: Tracking metrics like active customer counts meant comparing current and previous state for each incoming event
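The stateful case can be sketched in a few lines of Python. This is an illustration under assumptions: the `ActiveCustomerCounter` name and the `"active"` / `"inactive"` status values are hypothetical, and an in-memory dict stands in for a table keyed by customer ID.

```python
class ActiveCustomerCounter:
    """Hypothetical sketch: maintain an active-customer count from status events."""

    def __init__(self):
        self._status = {}  # customer_id -> last known status (latest event wins)
        self.active = 0

    def apply(self, customer_id, status):
        """Upsert one event and adjust the count by comparing old and new state."""
        previous = self._status.get(customer_id)
        self._status[customer_id] = status
        was_active = previous == "active"
        is_active = status == "active"
        # +1 when a customer becomes active, -1 when one stops, 0 otherwise.
        self.active += int(is_active) - int(was_active)
        return self.active
```

The key point is that each event needs the previous row for the same key, which is why efficient upserts matter: a repeated "active" event for the same customer must not be counted twice.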
Intuit's existing Apache Druid deployment couldn't meet these demands:
- 40% more expensive for standard traffic, 150% more during peak periods
- Poor multi-table join performance forced complex denormalization workarounds
- Operationally complex architecture drained engineering resources
The IPS team evaluated alternatives, but each had disqualifying limitations:
- ClickHouse: Multi-table joins weren't performant enough, upserts required workarounds, and horizontal scaling demanded custom operational tooling
- DuckDB: Single-node architecture was an operational non-starter
- Pinot: Couldn't match required query performance
The IPS team needed a solution delivering sub-4-second end-to-end latency, handling real-time upserts across billions of records, sustaining 100,000+ events per second during peaks, and performing multi-table joins efficiently—all while maintaining the "paved road" philosophy that made IPS successful for transactional workloads.
Solution: Why Intuit Adopted StarRocks for Real-Time OLAP
After evaluating ClickHouse, DuckDB, Pinot, and their existing Druid deployment, Intuit selected StarRocks based on critical capabilities:
- Multi-table join performance at scale: Essential for real-time composition across product domains
- Native real-time upsert support: Enabled stateful processing with primary key tables without custom workarounds
- Horizontal and vertical scaling: Operational requirement for handling unpredictable peak loads
- Standard SQL interface: Reduced adoption friction for teams already proficient in SQL, alongside features such as asynchronous materialized views, table partitioning, and bucketing
The new and improved IPS OLAP platform consists of three key layers:
- Automated Ingestion Layer: Apache Flink pipelines automatically provisioned from developer configurations, supporting real-time streaming from Kafka, historical backfills, and zero-downtime data replays straight into StarRocks, with ingestion optimization and query acceleration configurations built-in.
- StarRocks Shared-Data Architecture: Compute and storage separation backed by AWS S3, with multiple data warehouses for resource isolation and resource groups to prevent noisy neighbor problems on multi-tenant clusters.
- Query REST Layer with Guard Rails: Lightweight abstraction in front of StarRocks (10-20ms overhead) that validates queries, enforces proper use of partitioning and indexes, provides attribute-based access control, and enables automatic query profiling.
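As a rough illustration of what a query guard rail might check, the Python sketch below rejects non-SELECT statements, a couple of risky patterns, and queries that do not filter on the partition column. The function name, the blocked patterns, and the regex-based matching are all assumptions made for this example; a production layer would more likely use a real SQL parser, and nothing here reflects the actual IPS rules.

```python
import re

# Hypothetical deny-list; the real rules are not described in the text.
BLOCKED_PATTERNS = [
    re.compile(r"\bselect\s+\*", re.IGNORECASE),     # unbounded column scans
    re.compile(r"\bcross\s+join\b", re.IGNORECASE),  # accidental cartesian products
]


def validate_query(sql, partition_column):
    """Return a list of guard-rail violations; an empty list means the query may run."""
    problems = []
    statement = sql.strip().rstrip(";")
    if not statement.lower().startswith("select"):
        problems.append("only SELECT statements are allowed")
    if ";" in statement:
        problems.append("multiple statements are not allowed")
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(statement):
            problems.append(f"blocked pattern: {pattern.pattern}")
    # Require the partition column to appear somewhere after WHERE.
    where = re.search(r"\bwhere\b(.*)", statement, re.IGNORECASE | re.DOTALL)
    column = rf"\b{re.escape(partition_column)}\b"
    if not where or not re.search(column, where.group(1), re.IGNORECASE):
        problems.append(f"missing filter on partition column {partition_column!r}")
    return problems
```

Returning every violation at once, rather than failing on the first, gives the calling developer a complete list to fix in one pass.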
IPS developers define entities in a simple schema language. The platform handles DDL generation, deployment, monitoring, and operational concerns while securing the datastore with StarRocks features such as query timeouts, a SQL blacklist for risky patterns, and resource groups for multi-tenant isolation.
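The schema language itself is not shown, so the following Python sketch only illustrates the general idea of turning an entity definition into DDL. The `Field` type and `generate_ddl` function are hypothetical, and the emitted statement follows the general shape of a StarRocks primary key table (`PRIMARY KEY` plus `DISTRIBUTED BY HASH`); check the exact clauses against the StarRocks documentation before relying on it.

```python
from typing import NamedTuple


class Field(NamedTuple):
    """Hypothetical entity field: column name, SQL type, and key flag."""
    name: str
    sql_type: str
    key: bool = False


def generate_ddl(entity, fields):
    """Render a CREATE TABLE statement for an upsertable entity."""
    keys = [f for f in fields if f.key]
    if not keys:
        raise ValueError("an upsertable entity needs at least one key field")
    # Key columns are listed first and marked NOT NULL.
    ordered = keys + [f for f in fields if not f.key]
    columns = ",\n".join(
        f"    {f.name} {f.sql_type}{' NOT NULL' if f.key else ''}" for f in ordered
    )
    key_list = ", ".join(f.name for f in keys)
    return (
        f"CREATE TABLE {entity} (\n{columns}\n)\n"
        f"PRIMARY KEY ({key_list})\n"
        f"DISTRIBUTED BY HASH ({key_list});"
    )
```

For example, `generate_ddl("customer_state", [Field("status", "VARCHAR(16)"), Field("customer_id", "BIGINT", key=True)])` yields a table keyed and distributed on `customer_id`, with the key column moved to the front.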
Results: From performance boost to a new real-time analytics approach
The shift to StarRocks didn't just translate to a performance improvement—it completely changed how Intuit approached real-time analytics:
- Eliminated Complex Workarounds: By replacing Druid with StarRocks, Intuit simplified its architecture by eliminating denormalization pipelines and workarounds required for multi-table joins.
- Simplified Operations: The unified platform reduced operational burden by consolidating OLAP infrastructure. Instead of managing complex Druid clusters, teams now work with a streamlined system that abstracts complexity while maintaining performance.
- Meeting Mission-Critical SLAs: The 4-second end-to-end requirement that was previously unattainable became consistently achievable, enabling both reactive customer support and proactive assistance without performance degradation.
Furthermore, it allowed for dramatic performance gains across the board:
- Real-time ML predictions: 2-second total latency (1-second data freshness + sub-500ms query latency at TP99) while handling 100K+ events per second
- Marketing aggregations on small datasets (up to 2B records): 35 minutes to 1 minute
- Medium datasets (up to 5B records): 2 hours to 2.5 minutes
- Large datasets (>10B records): 5 hours to 5 minutes
- Composition queries (multi-domain joins): not available (N/A) to 30 seconds
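The aggregation figures above can be checked with simple arithmetic. The short Python sketch below computes the reduction in run time and the speedup factor for each dataset size; the function names and the `RESULTS` table are just scaffolding for this example. The large-dataset case works out to 98.33%, which appears to be the basis for the headline number.

```python
def reduction_pct(before_min, after_min):
    """Percentage reduction in run time, rounded to two decimals."""
    return round((1 - after_min / before_min) * 100, 2)


def speedup(before_min, after_min):
    """How many times faster the new run time is."""
    return before_min / after_min


# (before, after) in minutes, taken from the list above.
RESULTS = {
    "small (up to 2B records)": (35, 1),
    "medium (up to 5B records)": (120, 2.5),
    "large (>10B records)": (300, 5),
}

for label, (before, after) in RESULTS.items():
    print(f"{label}: {reduction_pct(before, after)}% less time, "
          f"{speedup(before, after):.0f}x faster")
```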
What's Next for Intuit
After experiencing initial success with StarRocks as their OLAP engine, Intuit plans to further expand its capabilities:
- Read replicas to support development, exploration, and reporting without affecting production.
- Unified analytics by connecting directly to the data lake for real-time and historical insights.
- Stronger data consistency with multi-table transactions ensuring reliable, atomic updates.
To hear the story directly from the Intuit team, watch Gaurav Doon and Daniel Russotto’s keynote from StarRocks Summit 2025, where they share how Intuit powers real-time OLAP with StarRocks—covering large-scale adoption, the architecture behind customer-facing analytics, lessons in performance, scalability and reliability, and how simplifying infrastructure delivers faster insights.