Technical Talks

WTF is an Analytics Lake: Building an Open Data Service Layer with Arrow, DuckDB and Semantic Layer

Ryan Dolley | VP of Product Strategy | GoodData
Jan Soubusta | Distinguished Engineer | GoodData

Have you implemented a bunch of data services that interact with both clients and each other? Has maintenance become a huge burden? Do you write boilerplate too often? Is performance a concern?

In this workshop, we present our project "FlexQuery", which can help you find the light at the end of the tunnel.
With just a few lines of Python code and no boilerplate, you can build a new data service.
It fits well into the whole ecosystem powered by Arrow and Arrow Flight.

Moreover, we provide a base set of data services for you:
- Data Source Connectors: Bridging data sources and your analytics environment seamlessly.
- Multi-Tier Cache Storage: High-performance cache storage spanning memory, disk, and object stores.
- Cache Enabled Query Execution: Real-time insights from cached data powered by DuckDB.
- Post-SQL Transformations: Precise data shaping with the versatility of Pandas or Polars.
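
The multi-tier cache idea from the list above can be sketched with the standard library alone. This is an assumption-laden illustration, not FlexQuery's implementation: it covers only a memory tier and a disk tier (no object store), evicts the oldest-inserted memory entry, and the names `TieredCache` and `demo` are hypothetical.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


class TieredCache:
    """Two-tier cache sketch: a small memory dict in front of files on disk."""

    def __init__(self, disk_dir, memory_capacity=2):
        self.disk_dir = Path(disk_dir)
        self.disk_dir.mkdir(parents=True, exist_ok=True)
        self.memory_capacity = memory_capacity
        self.memory = {}  # dicts keep insertion order, oldest entry first

    def _path(self, key):
        # Hash the key so any string maps to a safe file name.
        return self.disk_dir / (hashlib.sha256(key.encode()).hexdigest() + ".pkl")

    def _remember(self, key, value):
        # Re-insert so the key becomes the newest, then evict the oldest.
        self.memory.pop(key, None)
        self.memory[key] = value
        while len(self.memory) > self.memory_capacity:
            del self.memory[next(iter(self.memory))]

    def put(self, key, value):
        # Write through to disk so the entry survives memory eviction.
        self._path(key).write_bytes(pickle.dumps(value))
        self._remember(key, value)

    def get(self, key):
        """Return (value, tier) where tier is 'memory', 'disk' or 'miss'."""
        if key in self.memory:
            return self.memory[key], "memory"
        path = self._path(key)
        if path.exists():
            value = pickle.loads(path.read_bytes())
            self._remember(key, value)  # promote to the memory tier
            return value, "disk"
        return None, "miss"


def demo():
    """Store three results with room for two in memory, then read them back."""
    with tempfile.TemporaryDirectory() as tmp:
        cache = TieredCache(tmp, memory_capacity=2)
        for name in ("q1", "q2", "q3"):
            cache.put(name, {"rows": name})
        return [cache.get(k)[1] for k in ("q3", "q1", "q1", "missing")]
```

In `demo`, the first read of `q1` falls through to disk because `q1` was evicted from memory, and the second read is served from memory after promotion. A real service would store Arrow data rather than pickled objects and would need a smarter eviction policy.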

Finally, we give you complete deployment flexibility: start really small and scale out, and deploy anywhere (bare metal, VMs, Kubernetes).

But we don't stop there.
Our ongoing work introduces a new service that proposes an optimal cache design based on physical statistics, our innovative semantic model, and user activity.
In the future, we plan to fine-tune a large language model and let it design the caches even better.

And we plan to make this open source for all to benefit. Join our workshop to learn more!

Ryan Dolley
VP of Product Strategy | GoodData

Ryan Dolley is Vice President of Product Strategy at GoodData and co-host of the Super Data Brothers Show. He has 13 years of experience in the analytics and business intelligence industry as an engineer, consultant and product executive. He's an avid dungeon master, chicken farmer, husband and father of three children.

Jan Soubusta
Distinguished Engineer | GoodData

Jan is a Distinguished Engineer at GoodData. He is a full-stack engineer, but in his current role he focuses primarily on back-end (micro)services and data sources. He also validates product requirements and technical designs, and mentors other engineers. Before joining GoodData, Jan worked at CSOB, Ceska Sporitelna, Unicorn and Tesco.