Compliant Data Management and Machine Learning at Scale

Daniel Whitenack | Pachyderm

ABOUT THE TALK

Data scientists, machine learning engineers, and researchers are under increasing pressure to explain how they process and manage user data. In particular, the EU's General Data Protection Regulation (GDPR), which takes effect this year, is forcing organizations to rethink their data management and processing strategies.
 
In this talk, we will demonstrate a data management and processing methodology/framework that is helping organizations deploy compliant workflows on top of Kubernetes. The framework, based on the open source Pachyderm project, gives data scientists automatic tracking of changes to data and of every piece of data and processing that leads to a particular result. This, along with access control strategies and anonymization (both of which will also be discussed in the talk), gives organizations a framework that is easy to manage, scalable for AI/ML workflows, and compliant.

Daniel Whitenack

Data Scientist & Lead Developer Advocate | Pachyderm

Daniel is a Data Scientist and Lead Developer Advocate at Pachyderm. Previously, he worked as a Data Engineer at Ardan Labs and as a Data Science Mentor at Thinkful. He holds a Ph.D. in Mathematical/Computational Physics.
