Three years ago, when my partner, Andy Ye, and I were planning for the StarRocks project, both of us had been working in the big data field for more than a decade. We had always wished we could build a powerful analytics system that could satisfy the constantly changing, quickly evolving requirements of business users.
But we just couldn't find a solution that fit the bill. The realities we saw were:
- Existing analytics systems could deliver query response times in the tens of seconds at best, and only on very expensive hardware. Enterprise customers need sub-second query latency at a much lower cost for their ideal analytics experience.
- Under those past systems (and even most of today's), data freshness was measured in hours. But enterprises require second-level data freshness to respond quickly to changes in business conditions.
- Different analytics systems were being used for different scenarios. This made it difficult for enterprises to simplify their architectures while still being able to analyze offline data and real-time data simultaneously.
The question we had been thinking about was: how can we make enterprise analytics simpler and more efficient so that we can shorten the time from raw data to business value?
That question sparked a journey that would lead to us starting the StarRocks project.
Since the inception of the project three years ago, we have hit several major milestones. We built a fully native vectorized execution engine and a brand-new cost-based optimizer. We achieved performance levels 3 to 5 times greater than other popular analytics products. And we have made it possible for enterprises to finally unify offline and real-time analytics on their data lake.
In the last three years, StarRocks has been adopted by hundreds of enterprises, and offers a sub-second query and real-time analytics experience that was never possible before. It has delivered tremendous business value to its users, and we are so proud of what the project and the community have accomplished.
Last week, we founded CelerData to continue to grow and build upon the StarRocks project and provide a new generation of enterprise-grade data analytics solutions to businesses around the world. "Celer" means fast in Latin, and reflects our commitment to helping organizations improve the efficiency of their data analytics work, and push the limits of what's possible when it comes to performance.
CelerData will focus on three key priorities:
- Build StarRocks into the next generation of data lakehouse analysis engines. Users will not need to import data into StarRocks, and will be able to use StarRocks directly to query the data in their data lake. Without needing to build an expensive data warehouse, users can achieve sub-second query speeds, analyze offline and real-time data in a unified way, and get full use out of their data lakehouse architecture.
- Make StarRocks more cloud native. By moving StarRocks to a new architecture that separates storage from compute, StarRocks will become more elastic, and enterprises will be able to adjust their use of computing resources at any time as business load changes, further reducing their costs.
- Continue to invest heavily in building the StarRocks community. Since we opened up our code in 2021, the StarRocks community has grown rapidly. At present, StarRocks has accumulated nearly 10,000 community members engaged in data analytics-related work around the world. We will continue to invest resources to attract people with an interest in data analytics to join the community. These investments will help community members make better use of StarRocks in their businesses, create more opportunities for them to contribute to the StarRocks project, and let them grow with StarRocks.
We firmly believe the future of today's enterprises lies in being data-driven and having the ability to embrace digital transformation. CelerData strives to become the engine for these data-driven enterprises, to help unleash the potential of data engineers and operational personnel, and to make enterprise analytics widely available.
Creating value for customers: that's why CelerData exists!