A central challenge of current stream processors is navigating the trade-off between performance and consistency. Emitting results too early achieves low latency at the cost of missing late arrivals; treading too carefully knocks us back into the world of batch processing. The few stream processors that handle this well restrict themselves to relatively simple computations (think MapReduce).
Timely Dataflow changes that. By rethinking how time should be represented in a distributed system, it achieves lower latencies and strong consistency, while allowing its users to express even complex cyclic computations. Originally developed at Microsoft Research, Timely has been refined over the last few years by ETH Zurich's Systems Group under the supervision of one of its creators, Frank McSherry.
In this talk, we give an introduction to the Timely stack, showing what makes it special and how we use it in our professional work. Using a real-world use case as a working example, we will guide you through Timely's underlying dataflow model and its unique approach to progress tracking, and give intuition for why it is able to outperform even specialized systems in the wild. Along the way, we highlight some of the more advanced aspects of the Timely ecosystem and how your organization can use it to supercharge its data-driven architectures.
Malte Sandstede is a partner at Clockworks and a graduate student at TU Munich. At his company, he supports organizations in building powerful data-driven systems. Currently, he researches online analysis of distributed dataflows with the Timely Dataflow computational model at ETH Zurich's Systems Group. Malte is interested in data-centric functional and relational programming. His aim is to create simple and effective programs, and in doing so to unveil technology's inherent beauty.
Niko Göbel is a partner at Clockworks, where he helps design and deliver scalable data-driven systems for international clients. He is interested in ideas that significantly lower the cost of doing useful things with data, and of software development in general. His research focuses on streaming query engines and online event processing.