Modern machine learning can be incredibly powerful, but the reality is that even the best systems are far from perfect. Unlike traditional software with deterministic outcomes, ML services operate in probabilities and can return different results over time.
To demonstrate this, I will walk through how our small startup tested four commercial ML systems for gender bias. With limited resources, we had to scale our black-box testing through practical automation. By building tools that let us rapidly generate new data sets, run queries, and version-control the results, we were able to find large categories of images with gender labelling errors.
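As a rough illustration of the "query, then version-control the results" idea, here is a minimal Python sketch. It is not the tooling from the talk: the `classify` callable stands in for whichever commercial service is under test, and the function names and snapshot file layout are assumptions made for this example. Results are written as key-sorted JSON so that committing a snapshot to version control yields a readable diff whenever a service changes its answer.

```python
import json
from pathlib import Path
from typing import Callable, Dict, Iterable, Tuple


def run_batch(image_ids: Iterable[str], classify: Callable[[str], str]) -> Dict[str, str]:
    """Query a black-box classifier once per image and collect its labels.

    `classify` is a hypothetical wrapper around the ML service being tested.
    """
    return {image_id: classify(image_id) for image_id in image_ids}


def save_snapshot(results: Dict[str, str], path: str) -> None:
    """Write results as key-sorted JSON so repeated runs diff cleanly."""
    text = json.dumps(results, indent=2, sort_keys=True) + "\n"
    Path(path).write_text(text, encoding="utf-8")


def load_snapshot(path: str) -> Dict[str, str]:
    """Read back a snapshot written by save_snapshot."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def changed_labels(old: Dict[str, str], new: Dict[str, str]) -> Dict[str, Tuple[str, str]]:
    """Return the images whose label differs between two runs, as (old, new) pairs."""
    shared = sorted(old.keys() & new.keys())
    return {k: (old[k], new[k]) for k in shared if old[k] != new[k]}
```

Comparing a fresh run against a committed snapshot with `changed_labels` surfaces the images where a service returned a different label over time, which is the kind of drift a one-off manual check tends to miss.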
Edwin is co-founder of TinyData, where he is focused on enabling all businesses to have access to best-in-class data stacks. Edwin has built data products at companies both small and large. He previously founded CastTV (a video data company that was acquired by Tribune) and FileFish (an enterprise data company that was acquired by Oracle). As the EVP of Tribune’s data business, he led the team that produced video data for many of the major cable, satellite, and technology companies in the world.