Today, with ever more data at their fingertips, machine learning experts seem to have no shortage of opportunities to build ever better models. Research has shown repeatedly that both the volume and the quality of the training data are what differentiate good models from the highest-performing ones. But with an ever-increasing volume of data, and with the steady rise of data-hungry algorithms such as deep neural networks, it is becoming challenging for data scientists to get the volume of labels they need, at the speed they need, within their budgetary and time constraints.
To address this “Big Data labeling crisis”, most data labeling companies offer solutions based on semi-automation, where a machine learning algorithm predicts labels and the pre-labeled data is then sent to an annotator, who reviews the results and validates their accuracy. A radically different approach to this problem focuses on labeling “smarter” rather than labeling faster.
Instead of labeling all of the data, it is usually possible to reach the same model accuracy by labeling just a fraction of it, as long as the most informative rows are labeled. Active Learning allows data scientists to train their models while building and labeling their training sets at the same time, with the aim of getting the best results from the minimum number of labels.
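As a rough illustration of the idea, the sketch below runs a pool-based Active Learning loop with uncertainty sampling, one common query strategy and not necessarily the one presented in the talk. Everything in it is an assumption made for the example: the synthetic two-cluster data, the tiny logistic regression, and the names `train`, `predict_proba`, `active_learning`, `pool`, and `oracle` are all hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(rows, labels, epochs=200, lr=0.1):
    # Batch gradient descent for a 2-feature logistic regression.
    w0, w1, b = 0.0, 0.0, 0.0
    n = len(rows)
    for _ in range(epochs):
        g0 = g1 = gb = 0.0
        for (x0, x1), y in zip(rows, labels):
            err = sigmoid(w0 * x0 + w1 * x1 + b) - y
            g0 += err * x0
            g1 += err * x1
            gb += err
        w0 -= lr * g0 / n
        w1 -= lr * g1 / n
        b -= lr * gb / n
    return w0, w1, b


def predict_proba(model, row):
    # Probability that the row belongs to class 1.
    w0, w1, b = model
    return sigmoid(w0 * row[0] + w1 * row[1] + b)


def active_learning(pool, oracle, seed_idx, rounds=10, batch=2):
    # `oracle` stands in for the human annotator: it is only consulted
    # for rows that the loop has decided to label.
    labeled = list(seed_idx)
    for _ in range(rounds):
        model = train([pool[i] for i in labeled], [oracle[i] for i in labeled])
        taken = set(labeled)
        unlabeled = [i for i in range(len(pool)) if i not in taken]
        # Uncertainty sampling: query the rows whose predicted
        # probability is closest to 0.5.
        unlabeled.sort(key=lambda i: abs(predict_proba(model, pool[i]) - 0.5))
        labeled.extend(unlabeled[:batch])
    # Retrain once more so the returned model uses every label collected.
    model = train([pool[i] for i in labeled], [oracle[i] for i in labeled])
    return model, labeled


# Synthetic pool (an assumption for the example): two Gaussian clusters.
rng = random.Random(0)
pool, oracle = [], []
for i in range(200):
    y = i % 2
    center = 2.0 if y else -2.0
    pool.append((rng.gauss(center, 1.0), rng.gauss(center, 1.0)))
    oracle.append(y)

# Start from one labeled row per class (rows 0 and 1).
model, labeled = active_learning(pool, oracle, seed_idx=[0, 1])
accuracy = sum(
    (predict_proba(model, x) >= 0.5) == bool(y) for x, y in zip(pool, oracle)
) / len(pool)
print(f"labeled {len(labeled)} of {len(pool)} rows, pool accuracy {accuracy:.2f}")
```

The loop labels only 22 of the 200 rows: the two seeds plus two queries in each of ten rounds. In practice, the model, the query strategy, and the stopping rule would all depend on the dataset and the labeling budget.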
Jennifer Prendki is the founder and CEO of Alectio. The company is the direct product of her beliefs that good models can only be built with good data, and that the brute-force approach of blindly using ever larger training sets is the reason why the barrier to entry into AI is so high. Prior to starting Alectio, Jennifer was the VP of Machine Learning at Figure Eight, the company that pioneered data labeling; Chief Data Scientist at Atlassian; and Senior Manager of Data Science in the Search team at Walmart Labs. She has spent most of her career creating a data-driven culture wherever she went, succeeding in sometimes highly skeptical environments. She is particularly skilled at building and scaling high-performance machine learning teams and is known for enjoying a good challenge. Trained as a particle physicist (she holds a PhD in particle physics from Sorbonne University), she likes to use her analytical mind not only when building complex models but also as part of her leadership philosophy. She is pragmatic yet detail-oriented. Jennifer also takes great pleasure in addressing both technical and nontechnical audiences at conferences and seminars and is passionate about attracting more women to careers in STEM.
Data Council, PO Box 2087, Wilson, WY 83014, USA - Phone: +1 (415) 800-4938 - Email: community (at) datacouncil.ai