
    Analytics has always played a critical role in business operations. Powered by advancements in technology, today's enterprises are relying on analytics more than ever in order to gain (or keep) a competitive advantage. While many discussions have been dedicated to the acquisition, transformation, and delivery of data, the actual servicing of data has been largely ignored. This article will help you better understand analytics platforms.

     

    What Is an Analytics Platform?

    Analytics platforms, the platforms on which data is stored and insight is served, play a critical role in the overall analytical user experience. In this article, we will look at the latest trends in analytics workloads, how these trends affect the selection criteria of analytics platforms, and how we can gradually upgrade our own analytics platforms.

    But first, to better guide the conversation, let's define what an analytics platform is.

    Analytics platforms are the last step in the data pipeline before data is consumed, either by human beings or by applications.

    An analytics platform consists of two components: the data storage layer and the data servicing layer.

     

    • The data storage layer is where data is stored. It can be directly attached storage, a distributed file system, or a cloud storage platform such as AWS S3.

    • The data servicing layer is where queries are executed and results are sent back to the clients.
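
    As a rough illustration of this two-layer split, the sketch below models each layer as a small Python class. The class and method names (StorageLayer, ServicingLayer, total_by) are hypothetical and exist only for this example; a real platform would use a file system or object store for storage and a full query engine for servicing.

```python
from dataclasses import dataclass, field


@dataclass
class StorageLayer:
    """Hypothetical stand-in for attached disks, a distributed file system, or object storage."""
    tables: dict = field(default_factory=dict)

    def write(self, table, rows):
        # Append rows to the named table, creating it on first write.
        self.tables.setdefault(table, []).extend(rows)

    def scan(self, table):
        # Return a copy of all rows so callers cannot mutate stored data.
        return list(self.tables.get(table, []))


@dataclass
class ServicingLayer:
    """Hypothetical query executor: reads from storage and returns results to the client."""
    storage: StorageLayer

    def total_by(self, table, key, value):
        # Group rows by `key` and sum the `value` column.
        totals = {}
        for row in self.storage.scan(table):
            totals[row[key]] = totals.get(row[key], 0) + row[value]
        return totals


storage = StorageLayer()
storage.write("orders", [
    {"region": "east", "amount": 10},
    {"region": "west", "amount": 5},
    {"region": "east", "amount": 7},
])
engine = ServicingLayer(storage)
result = engine.total_by("orders", "region", "amount")
print(result)
```

    Keeping the two classes separate mirrors the point of the definition: the storage side can be swapped out without changing how clients ask questions of the servicing side.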

     

    In the past, an analytics platform was typically a data warehouse that included a storage engine and a query engine. Some companies would also add a data mart layer for better data servicing performance.

    Along with the Big Data era came the Hadoop system, where data was stored in a distributed file system, and data servicing was handled by MapReduce, Spark, or SQL engines such as Hive, Impala, and Spark SQL.

    The past 10 years have seen the rise of cloud computing. In the cloud, data is stored in cloud object storage, while the servicing layer is provided either by a cloud data warehouse (e.g. Snowflake) or a query engine such as Presto.

     

    The Shifting Role of Analytics in Enterprises

    The role analytics plays in today's enterprises is undergoing gradual but significant changes. What used to be a highly strategic function is now becoming mandatory for all aspects of company operations. These changes show up as four shifts in the role of analytics inside enterprises:

     

    Strategic to Operational

    Analytics used to be associated with strategic decision making. People often think about major investments, the allocation of critical resources, or a change in product direction when they hear the word analytics. But analytics now plays an increasingly important role in daily operations. Examples include personalized recommendations for customers while they are on the website and dynamic adjustment of fleet configuration to optimize production and inventory levels.

     

    Batch Analytics to Real-time Analytics

    When enterprises are looking to improve their strategic decision making for the next quarter, year, or even 3+ years, analytics based on historical data up to the latest week is perfectly fine. In this scenario, data is ingested into the analytics platform in regular batches, hence the name batch analytics. But batch analytics is not fast enough for the operational analytics discussed in the previous section. To make the best operational decisions possible, we need the latest information. Analytics platforms need to ingest data from stream sources and incorporate the latest events into the analytics results immediately.
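
    The difference can be sketched in a few lines of Python. This is a minimal illustration under assumed names (batch_totals, StreamingTotals, and the sample click events are all hypothetical), not how any particular platform implements ingestion: the batch function only reflects events present at the last scheduled load, while the streaming class folds every event into the result as it arrives.

```python
from collections import defaultdict


def batch_totals(events):
    """Batch style: recompute totals from whatever was loaded in the last scheduled run."""
    totals = defaultdict(int)
    for event in events:
        totals[event["page"]] += event["clicks"]
    return dict(totals)


class StreamingTotals:
    """Streaming style: fold each event into the running result as it arrives."""

    def __init__(self):
        self._totals = defaultdict(int)

    def ingest(self, event):
        self._totals[event["page"]] += event["clicks"]

    def query(self):
        # Queries always see every event ingested so far.
        return dict(self._totals)


events = [
    {"page": "home", "clicks": 3},
    {"page": "pricing", "clicks": 1},
    {"page": "home", "clicks": 2},  # arrives after the last batch load
]

# The last batch load only saw the first two events.
batch_result = batch_totals(events[:2])

# The streaming path has ingested all three.
live = StreamingTotals()
for event in events:
    live.ingest(event)
live_result = live.query()

print("batch:", batch_result)
print("live: ", live_result)
```

    Until the next batch load runs, the batch result is missing the most recent event, which is exactly the gap that matters for operational decisions.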

     

    Internal Users to External Users

    In the last couple of years, we've seen data products become mainstream. SaaS companies sit on top of a mountain of data, and they have the opportunity to offer analytics as a value-added service to their customers. Analytics platforms not only need to support internal operational folks, but they also need to service external users. Imagine the stress that would be put on an analytics platform when a social media app decides to open up analytics to all its advertisers!

     

    Standalone to Embedded

    Standalone analytics refers to a dedicated application that delivers resources like dashboards and reports. Today's applications demand analytics capabilities be embedded into the application. This requires analytics platforms to guarantee instantaneous query response times no matter how much data needs to be processed or how many users need to be served.

     

    Benefits of Analytics Platforms

    As the role of analytics has shifted in the enterprise, so too has its value to businesses. An entire ecosystem of software and technology has emerged to help data teams and businesses get more value out of their data. Infrastructure and data engineers are often left working overtime to keep this system running smoothly.

    Simplifying this complexity is a major selling point of modern analytics platforms, which can do everything from reducing the need for multiple tools to unifying the entire analytics process in one streamlined package. Simplification also has the added benefits of reduced maintenance work, improved security, and better performance. That last point is key. While some of these benefits arguably save you money on operational costs, analytics performance can have a direct impact on your revenue.

    The reality is that, for most enterprises, adopting an analytics platform is now a matter of when, not if.

     

    Choosing an Analytics Platform

    Even if you can appreciate the importance of analytics and the role it plays for your business, getting started with a new platform is easier said than done. Determining your selection criteria, setting up your evaluation, and rolling out the solution you settle on can all take plenty of time and labor hours. This makes it critical to understand ahead of time what to look for in a platform.

     

    Challenges in the Process

    When choosing an analytics platform, it's common to come up with a laundry list of features and capabilities you and your team are looking for. This can easily spiral out of control. How do you prioritize these features? How much time do you spend evaluating them across platforms? Are there any dealbreakers? Answering questions like these can be distracting and add weeks or months to your evaluation process, and that's assuming they don't just block things entirely.

    That's why it's important to know ahead of time how to separate critical capabilities from nice-to-have functionality. We'll delve deeper into what this means in future articles, but for now here's what you must, at a minimum, consider when choosing an analytics platform:

     

    Performance - Performance is undoubtedly the most important aspect of any analytics platform. Both user productivity and the ROI of your platform are closely tied to it.

     

    Timeliness - The timeliness of your data is a decisive factor in its effectiveness. This is especially true for operational analytics scenarios.

     

    Scalability - Modern analytics platforms need to support growing data volumes as well as a growing user base.

     

    Operational Efficiency - Most analytics platforms involve several different hardware and software modules. If the system's architecture is too complex, system administration is going to cost you a fortune.

     

    Cost-Effectiveness - Whether you are running on premises or in the cloud, reducing your cluster footprint makes your CFO happy, keeps your operational costs down, and is, in general, good for the environment.

     

    With these points in mind, you're ready to start evaluating analytics platforms.

     

    How CelerData Can Help

    Fortunately, our CelerData solutions engineers deal with enterprises struggling with these capabilities every day, and we've pulled together decades of experience to bring you some helpful guidance to make selecting your next analytics platform as smooth as possible. You'll be able to read about that in our next article.

    In the meantime, if you'd like to get a jumpstart on your analytics platform search, we'd recommend learning a bit more about CelerData.

    CelerData is a unified analytics platform built by the original creators of the open source StarRocks project. It helps businesses tap into the analytics performance StarRocks is famous for, accompanied by enterprise-scale features and support.

    When it comes to deployment, CelerData provides two options:

     

    CelerData Enterprise - which is deployed in your data center or private VPC in the cloud.

    CelerData Cloud - which is a SaaS cloud offering managed by CelerData.

     

    We encourage you to take a look at these solutions to see which is right for your business. If you have any questions, please reach out to one of our engineers here.
