Twitter is all about real-time at scale. To achieve real-time performance, Twitter has developed, deployed and open-sourced Heron, the next-generation cloud streaming engine. The amount of data that needs to be processed in Twitter’s data centers changes significantly due to expected and unexpected global events. For example, during the Super Bowl, there are spikes of tweets that all need to be processed in real time. Similarly, unexpected events such as natural disasters can generate very large volumes of data.
In this talk we will describe how Twitter and Microsoft have been collaborating to transform Heron into a truly elastic system that can support dynamic load changes. We'll present how we adapted several components of the system, such as the scheduler and resource manager, so that they seamlessly support elasticity. We will also present our plans for future scaling work that current and prospective Heron contributors could tackle.
Ashvin is currently a Senior Research Engineer at Microsoft, where he primarily works on streaming systems and contributes to the Twitter Heron project. He specializes in developing large-scale distributed systems and has more than 10 years of work experience. Prior to joining Microsoft, Ashvin worked at VMware, Yahoo, and Mojo Networks. He holds an M.Tech in Computer Science from IIT Kanpur, India.
Avrilia is a Senior Scientist at Microsoft's Cloud and Information Services Lab, where her research is focused on scalable real-time stream processing systems. She is also an active contributor to Heron, collaborating with Twitter. Prior to her current role, she was a research scientist at IBM Research working on SQL-on-Hadoop systems. She holds a PhD in data management from the University of Wisconsin-Madison.
Data Council, PO Box 2087, Wilson, WY 83014, USA - Phone: +1 (415) 800-4938 - Email: community (at) datacouncil.ai