
Materialized View
What is a Materialized View?
A materialized view is a database object that contains the results of a query. It may be a local copy of data located remotely, or a subset of the rows and/or columns of a table, or a summary using an aggregate function. The name materialized views can be broken down into two parts - materialization and views:
Materialization involves caching intermediate results of a query using additional storage space. This process accelerates the computation by reusing precomputed materialized results.
Views emphasize logical semantics, providing a unified schema for the upper layer, shielding complexity, and simplifying permission management.
For example, let's say you run an online store, and you have a database with millions of rows containing sales data. You often need to know the total sales for each product category. Instead of executing a complex query every time, you can create a materialized view that calculates the total sales for each category in advance. When you need the information, the database quickly reads the pre-aggregated data from the materialized view, saving time and resources.
Basic Concepts
-
Base Table: The original table specified when creating a materialized view (e.g., sales data table).
-
Materialized Table: A new table that is created based on the base table (e.g., total sales per product category).
-
Materialized View: The user-visible materialized view, which may include both the materialized table and incremental data.
-
Materialized View Task: A task that defines the processing from the base table to the materialized view table.
- Query Acceleration: materialized views can speed up queries by using precomputed data instead of scanning the base table and performing calculations on the fly (e.g., calculating total sales per product category).

- Data Modeling: Materialized views can generate a wide table by joining two or more tables and scheduling periodic updates, allowing direct querying on the generated materialized view (e.g., joining sales data with customer information).

Benefits of Materialized Views
-
Improved query performance: Materialized views simplify the process of optimizing performance in applications by relying on them for the intermediate results you use most frequently (e.g., faster access to total sales per product category).
-
Efficient handling of large datasets: Materialized views speed up queries by shifting the computational work to write time instead of query time (e.g., pre-calculating total sales per product category).
-
Greater consistency: Materialized views promote consistency by embedding business logic within data structures, ensuring its constant reuse (e.g., always calculating sales data in the same way).
Understand Materialized Views in StarRocks
Materialized View Use Cases in StarRocks
-
Transparent Acceleration: acceleration scenarios typically involve improving query performance without modifying user SQL (e.g., making sales data queries faster without changing the underlying SQL code).
-
Single-table Acceleration: pre-aggregation, cube optimization, and adjusting sorting columns (e.g., optimizing sales data for faster access to most popular products).
-
Multi-table Acceleration: de-normalization, multi-stream join, and ETL processing (e.g., joining sales data with customer and product tables for more comprehensive analysis).
-
ETL Processing:
-
Data Warehouse Layering: StarRocks supports ETL processing through its materialized views, enhancing data analysis across four data warehouse layers: ODS, DWD, DWS, and ADS (e.g., organizing data from raw inputs to aggregated summaries).
-
Scheduled Tasks: StarRocks schedules tasks related to materialized views through manual, periodic automatic, and stream-triggered scheduling (e.g., refreshing materialized views daily, weekly, or on-demand).
-
Data Lineage: by defining relationships between base tables and materialized views (MV), it's possible to organize multiple tables into a data map. Tasks, organized through the materialized view definitions, form a Directed Acyclic Graph (DAG) that reveals the flow of data updates (e.g., tracking how data moves through various tables and transformations).
-
Data Import: materialized views can encapsulate all data imports by converting all imports from external tables to internal tables. This approach simplifies and streamlines the data import process, making it more efficient and easier to manage (e.g., seamlessly integrating data from external sources).
-