Apache Iceberg is fast becoming the standard for open data, and Snowflake Horizon Catalog provides the governance foundation that makes this ecosystem enterprise-ready.
By integrating CelerData Cloud with Snowflake Horizon Catalog, Snowflake customers can now enable high-performance, customer-facing analytics directly on governed Iceberg data.
Governance Meets Performance
Snowflake Horizon Catalog serves as the unified governance layer for the open ecosystem. It ensures that your Iceberg tables, whether managed by Snowflake or stored externally, are secure, compliant, and easy to manage.
CelerData Cloud, built on the open-source StarRocks project, acts as a specialized execution engine for high-demand analytical queries on top of Snowflake Horizon Catalog.
This integration allows you to run sub-second, high-concurrency analytics directly on your Snowflake-governed data. Instead of copying data into a separate system for performance, you can use a single source of truth for diverse workloads, keeping your data strictly governed while still delivering real-time responsiveness.
Bringing Warehouse Performance to Apache Iceberg Lakehouse
With governance unified in Horizon Catalog, the challenge is performance at scale. CelerData Cloud addresses this by applying a purpose-built analytical execution engine and a set of lakehouse-specific optimizations to Apache Iceberg:
- Intelligent, out-of-the-box caching: CelerData Cloud automatically caches both metadata and data across memory and disk. This multi-layer caching strategy effectively masks object storage latency, enabling consistently fast dashboard queries even under high concurrency.
- Robust query planning with smart assumptions: Apache Iceberg tables often lack complete or perfectly maintained statistics. CelerData’s cost-based optimizer is designed to work within this reality, combining available statistics with intelligent heuristics to generate efficient execution plans and maintain stable performance even when the query joins 10+ tables.
- Targeted optimizations: CelerData brings proven execution techniques, such as late materialization and global low-cardinality dictionaries, to open file formats. These optimizations accelerate complex multi-table queries directly on Snowflake-managed Iceberg tables, without requiring proprietary storage.
Connecting CelerData Cloud to Snowflake Horizon Catalog
Because CelerData supports the standard Iceberg REST Catalog protocol, connecting to Snowflake Horizon is straightforward and secure.
Below is an example of how to configure a connection that authenticates with a Programmatic Access Token (PAT). Note that this configuration uses the standard REST interface and requires no proprietary plugins:
CREATE EXTERNAL CATALOG snowflake_catalog
PROPERTIES (
"type" = "iceberg",
"iceberg.catalog.type" = "rest",
-- The Snowflake Polaris/Horizon API endpoint
"iceberg.catalog.uri" = "https://<accountidentifier>.snowflakecomputing.com/polaris/api/catalog",
-- Warehouse and Security Configuration
"iceberg.catalog.warehouse" = "<database_name>",
"iceberg.catalog.security" = "oauth2",
-- Credential Management via PAT
"iceberg.catalog.oauth2.credential" = "<your_PAT_token>",
"iceberg.catalog.oauth2.scope" = "session:role:<role>",
-- Region and S3 optimizations
"aws.s3.region" = "<region>",
"iceberg.catalog.vended-credentials-enabled" = "true",
"iceberg.catalog.token-exchange-enabled" = "false"
);
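Once the catalog is registered, a quick way to confirm the connection works is to browse it and run a small query. The statements below are a minimal sketch, assuming the catalog name snowflake_catalog from the example above; the database sales_db, the table orders, and its columns are hypothetical placeholders, so substitute names that exist in your own Snowflake account.

```sql
-- List the databases (Iceberg namespaces) exposed through the catalog.
SHOW DATABASES FROM snowflake_catalog;

-- Make the catalog and a database the defaults for this session.
-- sales_db is a hypothetical database name.
SET CATALOG snowflake_catalog;
USE sales_db;

-- Run a small aggregation against a hypothetical Iceberg table.
SELECT order_status, COUNT(*) AS order_count
FROM orders
GROUP BY order_status
ORDER BY order_count DESC
LIMIT 10;

-- The same table can also be addressed by its fully qualified name
-- (catalog.database.table) without switching the session catalog.
-- EXPLAIN prints the plan the optimizer chose instead of running the query.
EXPLAIN
SELECT order_status, COUNT(*) AS order_count
FROM snowflake_catalog.sales_db.orders
GROUP BY order_status;
```

If SHOW DATABASES returns an empty list or an authorization error, the role named in the oauth2 scope property is a reasonable first thing to check, since it determines which objects the token is allowed to see.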
Conclusion
By combining Snowflake Horizon Catalog and CelerData Cloud, customers can govern their Apache Iceberg data while leveraging a purpose-built execution engine for high-performance analytics.
This integration supports open, interoperable architectures that allow organizations to scale customer-facing workloads with confidence.
To learn more about CelerData Cloud, you can explore a 30-day free trial at cloud.celerdata.com.
