Technical Talks

Adi Polak

Vice President of Developer Experience | Treeverse

Building Efficient ML Pipelines and Responsible AI Solutions

Building end-to-end data science solutions is a complex task that goes beyond simply winning prizes on Kaggle. Applying advanced machine learning techniques to real-world scenarios requires rigorous cleaning, preparation and feature engineering of the data before we even get to discussions of algorithms. We then need to test various ML models, explore diverse configurations, and finally, productize our results. Doing this for massive quantities of data creates challenges at scale that become more complex when we also factor in the implications of bias and fairness in data representation.

In this talk, we’ll architect a production-grade ML pipeline using feature engineering, model training and management tools from Apache Spark. We’ll see demos that showcase these pipelines in action using Microsoft Azure services such as Azure Databricks, Event Hubs and Cognitive Services. Finally, we’ll explore relevant research, tools and best practices that can be used to craft responsible AI solutions, with a focus on issues like bias and fairness in data representation.

Adi is an open-source technologist who believes in communities and is passionate about building a better world through open collaboration. As Vice President of Developer Experience at Treeverse, Adi helps build lakeFS, a Git-like interface for the data lakehouse. In her work, she draws on her extensive industry research and engineering experience to educate teams and help them design, architect, and build cost-effective data systems and machine learning pipelines that emphasize scalability, expertise, and business goals. Adi is a frequent worldwide presenter and the author of O'Reilly's upcoming book, "Machine Learning With Apache Spark." Adi is also a proud Beacon for Databricks! Previously, she was a senior manager for Azure at Microsoft, where she focused on building advanced analytics systems and modern architectures.

When Adi isn’t building data pipelines or thinking up new software architectures, you can find her on the local cultural scene or at the beach.