Recent years have seen widespread adoption of machine learning in industry and academia, impacting diverse areas from advertising to personalized medicine. As more and more areas adopt machine learning and data science techniques, the question arises of how much expertise is needed to successfully apply machine learning, data science and statistics. Not every company can afford a data science team, and if you are doing your PhD in biology, no one can expect you to have PhD-level expertise in computer science and statistics.
This talk will summarize recent progress in automating machine learning and give an overview of the tools currently available. It will also point out areas where the ecosystem needs to improve in order to allow wider access to inference using data science techniques.
The talk will first describe recent progress in the commodification of machine learning, as witnessed by a wide array of open source packages and commercial solutions. Then I will discuss the setting of automating supervised learning, and recent progress in automatic model selection and meta-learning. Finally, I will point out some open problems regarding assumptions, and the limitations of what can be automated.
Andreas Mueller is a lecturer at the Data Science Institute at Columbia University and author of the O'Reilly book "Introduction to Machine Learning with Python", which describes a practical approach to machine learning with Python and scikit-learn. He is one of the core developers of the scikit-learn machine learning library, and has been co-maintaining it for several years. He is also a Software Carpentry instructor. In the past, he worked at the NYU Center for Data Science on open source and open science, and as a Machine Learning Scientist at Amazon.