Did you know that databases often “cheat”? Even with a scalable query engine and a smart optimizer, many real-world queries would be too slow if the engine read all the data, so the engine rewrites your query to use a pre-materialized result. B-tree indexes made the first relational databases possible, and there are now many flavors of materialization, from explicit materialized views to OLAP-style caching and spatial indexes. Materialization is more relevant than ever in today’s heterogeneous, distributed systems.
If you are evaluating data engines, we describe which materialization features to look for in your next engine. If you are implementing an engine, we describe the features that Apache Calcite provides to design, maintain, and use materializations.
Julian Hyde is an expert in query optimization, database internals, in-memory analytics, and streaming. He is the founder of Apache Calcite, an open-source query planning framework that powers many database and streaming SQL engines, including Apache Beam, Flink and Hive. He was the original developer of the Mondrian OLAP engine, and was formerly Chief Architect at SQLstream. He is an architect at Looker.