Eightfold.ai needed sub-second, multi-tenant analytics—today for humans, tomorrow for AI agents. By migrating from Redshift to StarRocks, they doubled performance and cut costs in half.
Modern SaaS users expect live, in-product analytics. Tomorrow, AI agents will ask for the same data continuously, at machine speed. Eightfold.ai—an AI-driven talent platform—walked through how their Redshift warehouse hit concurrency and latency limits under these demands, and why moving to StarRocks unlocked sub-second dashboards today and a clean runway for agentic analytics.
What Eightfold.ai Does
Eightfold operates in talent acquisition, talent management, and workforce planning. When you open a large enterprise’s careers site, there’s a good chance Eightfold is the engine behind it. That means:
- Clickstream events from career sites and applications land in Eightfold first.
- Deep-learning models enrich these events with people and job context.
- Customers expect interactive dashboards inside the product, not static reports, at sub-second latency.
Under the hood, the data is classic in shape yet demanding for the underlying engine:
- Fact tables: billions of clickstream rows, continuously ingested.
- Dimension tables: large, people-centric dimensions (profiles, employees, roles).
- Schema: data organized in star schemas, so joins drive most analytics (a representative query is sketched after this list).
- Tenancy: strict isolation required across many large customers sharing the same platform.
- Data skew: tenant sizes and activity levels vary widely; a few very large or busy tenants can dominate shared slices.
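To make that concrete, a typical in-product query joins the clickstream fact table to people and job dimensions, scoped to one tenant. A minimal sketch; the table and column names are illustrative assumptions, not Eightfold’s actual schema:

```sql
-- Hypothetical star-schema query: weekly application funnel by job family for one tenant.
-- All table and column names are illustrative assumptions.
SELECT
    j.job_family,
    count(*)                               AS events,
    count(DISTINCT p.profile_id)           AS unique_candidates,
    sum(if(e.event_type = 'apply', 1, 0))  AS applications
FROM click_events  AS e                                  -- fact: billions of rows
JOIN dim_jobs      AS j ON e.job_id = j.job_id           -- job dimension
JOIN dim_profiles  AS p ON e.profile_id = p.profile_id   -- people dimension
WHERE e.tenant_id = 42                                   -- strict tenant isolation
  AND e.event_time >= DATE_SUB(now(), INTERVAL 7 DAY)
GROUP BY j.job_family
ORDER BY applications DESC;
```

Queries like this must come back in well under a second while many tenants issue their own variants concurrently, which is what makes join performance the central requirement.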
Why In-Product Analytics Became the Bar
Internal-BI latency doesn’t fly for a customer-facing user experience. Product teams want data that updates in seconds, with stable p95s during traffic spikes. Eightfold also sees a near-term shift to chat-style, “ask anything” analytics, where a manager types a question and gets a live answer. That future multiplies query traffic and makes concurrency a first-order design constraint.
Where Redshift Started to Creak
Those product expectations quickly exposed Redshift’s limits in Eightfold’s multi-tenant, join-heavy environment. Under peak traffic and star-schema workloads, the same issues kept surfacing:
- Single leader bottleneck. One coordinator plans all queries, creating a hard ceiling on true concurrency; queues appear quickly during spikes.
- No tenant-aware partitioning. There’s no native way to keep a tenant’s working set confined to a small subset of nodes. Hot customers can spill across the cluster and degrade the performance of other tenants.
- Concurrency scaling is expensive and slow. Extra clusters take time to spin up and cap out around ten replicas, so “instant headroom” is neither instant nor cheap.
- Serverless loses EBS benefits. Serverless scale events don’t preserve warm EBS caches, so the next query pays the object-store I/O penalty, which leads to significantly degraded performance.
Workarounds added operational overhead without removing the architectural ceiling, and that ceiling set the bar for what the next engine had to surpass.
Non-Negotiables for the Next Engine
Eightfold started with a non-negotiable requirement: star-schema joins had to be first-class. Their workload joins large fact tables to large dimension tables. They evaluated engines like ClickHouse, Druid, and Pinot, which are good at fast scans and single-table queries, but the limited join support of those engines would have pushed Eightfold toward heavy denormalization: expensive in itself, and a design that hinders flexibility and confines users to predefined query patterns. With that context, the next engine had to prove:
- On-the-fly multi-table queries at scale. Run big fact-to-fact and fact-to-dimension joins at interactive latency: no forced denormalization, no pre-aggregating everything, no confinement to predefined queries.
- Leaderless scale-out for planning and execution. Add coordinators and workers horizontally so that concurrency scales with no single node becoming the ceiling.
- Shared-data design with an effective cache. Keep the source of truth in S3 and serve from SSD to keep p95 query latency stable.
- Operational simplicity. Minimal moving parts, Kubernetes-friendly, and compatible with existing SQL/ETL dialects, so migration isn’t a nightmare.
Given those requirements, StarRocks was the only option that met all of them without compromising on joins.
- MPP execution (no single coordinator): Multiple frontends (FEs) plan in parallel while compute nodes (CNs) scale out to run distributed joins and aggregations in memory, delivering low-latency, on-the-fly multi-table queries at scale without precomputation.
- Cost-based optimization keeps joins fast: The cost-based optimizer chooses broadcast, shuffle, or colocate execution per query, applies runtime filters and partition pruning to cut data movement, and runs fully vectorized C++ operators, so ad-hoc joins stay fast at scale (the join strategies are sketched after this list).
- Tenant-aware data distribution: Partition by tenant and spread tablets across multiple CNs to avoid hot-node bottlenecks and maximize parallelism.
- Object-storage durability, local-disk performance: S3 stays the source of truth while CNs cache hot data; asynchronous, partition-aware materialized views accelerate the few heavy queries without undermining data ingestion.
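StarRocks picks the join strategy automatically, but its bracketed join hints make the optimizer’s options concrete. A minimal sketch reusing the hypothetical tables from earlier; the hints are shown only for illustration, since in production the optimizer chooses on its own:

```sql
-- Broadcast: replicate a small dimension table to every node that holds fact data.
SELECT count(*)
FROM click_events AS e JOIN [BROADCAST] dim_jobs AS j ON e.job_id = j.job_id;

-- Shuffle: redistribute both sides by the join key, for large-to-large joins.
SELECT count(*)
FROM click_events AS e JOIN [SHUFFLE] dim_profiles AS p ON e.profile_id = p.profile_id;

-- Colocate joins need no hint: tables created with a matching "colocate_with" group
-- and the same bucketing key join locally, with no data movement at all.
```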
Architectural Designs
- Partition by tenant to alleviate data skew. Eightfold maps each tenant to a partition whose tablets are replicated across BE nodes, avoiding both single-node hotspots and whole-cluster storms (see the DDL sketch after this list).
- Keep data on object storage; cache for speed. The single source of truth stays in S3; data is served from EBS/SSD cache to maximize performance.
- Accelerate joins with async MVs. Use StarRocks asynchronous materialized views where latency requirements are tightest; partition-wise refresh keeps overhead low.
- Separate read/write, and read/read. With a multi-warehouse design, write warehouses handle compaction and vacuuming, while read warehouses give each tenant the resources it needs to hit its SLA goals.
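A minimal DDL sketch of how the first three designs can map onto StarRocks, assuming a shared-data cluster; the bucket path, table names, bucketing keys, and refresh interval are illustrative assumptions, not Eightfold’s configuration:

```sql
-- Source of truth on S3 (shared-data mode); bucket, region, and endpoint are placeholders.
CREATE STORAGE VOLUME s3_volume
TYPE = S3
LOCATIONS = ('s3://example-bucket/starrocks/')
PROPERTIES (
    'aws.s3.region' = 'us-west-2',
    'aws.s3.endpoint' = 'https://s3.us-west-2.amazonaws.com',
    'aws.s3.use_instance_profile' = 'true'
);

-- Tenant-partitioned fact table: each tenant gets its own partition, whose tablets
-- are spread across nodes, with the local data cache serving hot data from disk.
CREATE TABLE click_events (
    tenant_id   BIGINT   NOT NULL,
    event_time  DATETIME NOT NULL,
    event_type  VARCHAR(64),
    profile_id  BIGINT,
    job_id      BIGINT
)
DUPLICATE KEY (tenant_id, event_time)
PARTITION BY (tenant_id)              -- expression partitioning: one partition per tenant
DISTRIBUTED BY HASH (profile_id)      -- spread each partition's tablets across nodes
PROPERTIES (
    'storage_volume'   = 's3_volume',
    'datacache.enable' = 'true'
);

-- Async, partitioned materialized view for a hot aggregate; partition-wise refresh
-- recomputes only the tenant partitions whose base data changed.
CREATE MATERIALIZED VIEW daily_events_by_tenant
PARTITION BY tenant_id
DISTRIBUTED BY HASH (tenant_id)
REFRESH ASYNC EVERY (INTERVAL 10 MINUTE)
AS
SELECT tenant_id, date_trunc('day', event_time) AS day, event_type, count(*) AS events
FROM click_events
GROUP BY tenant_id, date_trunc('day', event_time), event_type;
```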
Results
Taken together, the reasons for choosing StarRocks (first-class joins, leaderless scale, tenant-aware data distribution, and intelligent caching) weren’t just architectural preferences; they changed outcomes. Eightfold achieved the following results after moving core workloads from Redshift to StarRocks:
- Latency: at least a 2× improvement. What used to feel “interactive” on a good day became consistently sub-second.
- Cost: roughly a 2× reduction in total cost at scale, achieved by avoiding warm-up clusters and letting the cache do the heavy lifting.
- Focus: engineers spend more time on product features and less on building pipelines and firefighting.
What This Unlocks Next: Agentic Analytics
After migrating their workload from Redshift to StarRocks, Eightfold can treat high-QPS customer-facing analytics as solved and shift focus to building agents. In Eightfold's view, agents are the new “programming model”: they break down tasks, run the necessary queries, and trigger the next steps automatically. This changes the workload profile—more queries than humans, often in bursts—making concurrency, isolation, and governance first-order requirements.
Eightfold’s next steps:
- Natural-language analytics: Customers ask questions in plain language and get live answers, not static reports. The same pipeline can produce downloadable summaries when needed.
- Centralized permission boundary: Keep row/column policies in an application-layer service as the single source of truth; scope queries before they reach the warehouse (sketched below).
- OLAP as the base layer: With scalable joins and low latency in place, OLAP becomes the stable foundation; agents orchestrate business workflows and decision automation on top.
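To make the permission boundary concrete, here is a hedged sketch of the scoping pattern under assumed names and policies: the agent drafts a tenant-agnostic query, and the application-layer service rewrites it before it ever reaches StarRocks.

```sql
-- 1) Query as an agent might draft it for "how many candidates applied this week?"
--    (same hypothetical schema as the earlier examples)
SELECT count(DISTINCT profile_id) AS applicants
FROM click_events
WHERE event_type = 'apply';

-- 2) The same query after the application-layer permission service scopes it:
SELECT count(DISTINCT profile_id) AS applicants
FROM click_events
WHERE event_type = 'apply'
  AND tenant_id = 42                                   -- row policy: caller's tenant only
  AND event_time >= DATE_SUB(now(), INTERVAL 7 DAY);   -- policy-enforced time bound
```

Because the scope is injected before the query reaches the warehouse, every agent, dashboard, and ad-hoc caller passes through one enforcement point.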
Want to explore what StarRocks can do for your analytics workloads? Join the StarRocks Slack to connect with the community.