ML model evaluation traditionally involves checking the model's performance metrics. While this gives us some insight, is it the whole story?
When we set out to thoroughly evaluate our model, we're aiming to validate its expected behavior in real life. Thus, we'd want to verify many additional aspects, including its robustness, performance on weaker data segments, ability to generalize to unseen data, behavior in edge cases, and more.
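As a small illustration of one of these aspects, the sketch below compares a model's accuracy on each data segment against its overall accuracy to surface weak segments. This is a hypothetical helper written for this abstract, not the Deepchecks API; the `drop` threshold and segment labels are assumptions.

```python
# Minimal sketch: find data segments where accuracy falls well below
# the overall accuracy, which a single aggregate metric would hide.

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def weak_segments(y_true, y_pred, segments, drop=0.1):
    """Return segments whose accuracy is more than `drop` below overall."""
    overall = accuracy(y_true, y_pred)
    result = {}
    for seg in set(segments):
        idx = [i for i, s in enumerate(segments) if s == seg]
        seg_acc = accuracy([y_true[i] for i in idx], [y_pred[i] for i in idx])
        if seg_acc < overall - drop:
            result[seg] = seg_acc
    return result

# Toy data: the model looks decent overall (62.5% accuracy),
# but segment "B" is far weaker (25%).
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 1, 0, 1, 0]
segs   = ["A", "A", "A", "A", "B", "B", "B", "B"]
print(weak_segments(y_true, y_pred, segs))  # → {'B': 0.25}
```

Running checks like this per segment, rather than only on the full dataset, is one way the overall metric's "whole story" gets filled in.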
This talk will explore a broad approach to model evaluation that gives a clearer picture of these aspects. Successfully incorporating these ideas into our ML development process can prevent costly failures in deployed ML models by uncovering problems much earlier in the pipeline.
Shir is the co-founder and CTO of Deepchecks, an MLOps startup for continuous validation of ML models and data. Previously, Shir worked at the Prime Minister's Office and at Unit 8200, conducting and leading research on various challenges related to Machine Learning and Cybersecurity. Shir has a B.Sc. in Physics from the Hebrew University, which she obtained as part of the Talpiot excellence program, and an M.Sc. in Electrical Engineering from Tel Aviv University. Shir was selected as a featured honoree in the Forbes Europe 30 Under 30 class of 2021.