When building a data pipeline, we need to decide whether to strictly validate incoming data and discard anything we don't support, or to be flexible and accept anything so we can analyze it later. In this talk, I'll discuss the compromise we've reached at Bluecore, where we record both the "raw" data, so we can recover from bugs or mistakes, and the strictly validated data. I'll talk about why we think that validating up front is the better choice when building data-intensive applications.
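As a rough illustration of that compromise (not Bluecore's actual implementation), the sketch below always stores the raw bytes and then applies strict validation, setting aside anything that fails. The `PurchaseEvent` type, its fields, and the in-memory lists standing in for storage are all hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class PurchaseEvent:
    """Hypothetical validated record; the fields are illustrative only."""
    customer_id: str
    amount_cents: int


def validate(payload: dict) -> PurchaseEvent:
    """Strictly validate a decoded payload; raise ValueError on anything unexpected."""
    unknown = set(payload) - {"customer_id", "amount_cents"}
    if unknown:
        raise ValueError(f"unsupported fields: {sorted(unknown)}")
    customer_id = payload.get("customer_id")
    amount_cents = payload.get("amount_cents")
    if not isinstance(customer_id, str) or not customer_id:
        raise ValueError("customer_id must be a non-empty string")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(amount_cents, bool) or not isinstance(amount_cents, int) or amount_cents < 0:
        raise ValueError("amount_cents must be a non-negative integer")
    return PurchaseEvent(customer_id, amount_cents)


def ingest(raw: bytes, raw_log: list, validated: list, rejected: list) -> None:
    """Record the raw bytes unconditionally, then keep only strictly valid data."""
    raw_log.append(raw)  # the raw copy lets us reprocess after fixing a bug
    try:
        payload = json.loads(raw)  # decode errors are ValueError subclasses
        if not isinstance(payload, dict):
            raise ValueError("payload must be a JSON object")
        validated.append(validate(payload))
    except ValueError as err:
        rejected.append((raw, str(err)))
```

Because every input lands in `raw_log` regardless of the outcome, a mistake in `validate` can be corrected later and the raw records replayed, while downstream consumers only ever see data that passed validation.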

Evan Jones

Software Engineer | Bluecore

Evan Jones is a software engineer at Bluecore in New York. He previously fixed interesting bugs at Twitter and taught a database class at Columbia as an adjunct. Evan was a co-founder and CTO of Mitro, a password manager for groups and organizations. Before that, he earned a Ph.D. from MIT, researching distributed OLTP databases. Even earlier, he worked at Google in New York for a bit more than a year, and he was a graduate and undergraduate student at the University of Waterloo.

Experience talks like this and many more at San Francisco 2019