Not Your Father's Data Lakehouse: Building with Trino and Iceberg

Monica Miller | Senior Developer Advocate | Starburst
Jack Klamer | Software Engineer, Data Mesh Enthusiast | Starburst

The data lakehouse architecture has taken the analytics world by storm, bringing critical data warehouse capabilities to the data lake. Building one means choosing two key components: a query engine and a table format. In this workshop, Jack Klamer and Monica Miller will show you how to build and manage an open data lakehouse using open-source technologies such as Trino and Apache Iceberg to support your growing analytics needs. Trino is an open-source, highly parallel, distributed query engine built from the ground up at Facebook for efficient, low-latency analytics. Iceberg is an open-source, high-performance table format that enables an engine like Trino to run data warehouse SQL operations such as UPDATE, DELETE, and MERGE on the data lakehouse. Jack and Monica will help you configure and build a sample data lakehouse, transform your data, highlight key Iceberg functionality, and produce a final output ready for downstream consumers.

Monica is a former data engineer turned developer advocate who now works to improve the lives of other data engineers by creating informational resources, speaking at conferences, and writing about her experiences in the data space. As a data engineer, she spent most of her time developing and supporting data pipelines for both near-real-time analytics and batch processing.

Jack is a data-focused software engineer with a background in many types of data and data technologies. He is driven by a core belief that we can do so much better with our data technology, and he hopes to help get us there.