The talk will give an overview of how the scrappy data engineering team at TripleLift evolved its data pipeline to keep up with the company's rapid growth; the pipeline currently processes tens of billions of events a day. Emphasis will be placed on the major turning points and decisions that forced us to tackle the same problems in new ways, due both to new scales of data and to growing business requirements. The talk will cover the following data technologies and how they were used and modified over the years: Kafka, Redshift, Secor, Spark, Spark Streaming, VoltDB, and Druid.
Dan is currently the VP of Engineering at TripleLift and was responsible for introducing many of the problems this talk will cover. Before TripleLift, he launched a small startup that was acquired by an advertising agency, and he had a few stints as a quantitative engineer, a precursor to data science. He's currently trying to keep up with the overwhelming number of open source data engineering tools.