Technical Talks

Nishith Agarwal
Senior Software Engineer | Uber
Ethan Guo Yihua
Software Engineer | Uber

Powering Uber's global network analytics pipelines in near real-time with Apache Hudi (Incubating) Delta Streamer

Measuring and aggregating network request duration percentiles and error rates is essential for Uber teams to monitor the reliability of network libraries and catch performance issues at global scale, across 600+ cities and 4,500+ mobile carriers. In this talk, we will discuss how we used the Hudi Delta Streamer to build Spark pipelines that incrementally pull from data sources and generate network metrics, powering near real-time network dashboards.

Hudi is an open-source data format that manages storage of large analytical datasets and provides efficient upsert and incremental processing primitives. We will describe how we designed incremental updates for different types of network metrics using Hudi upserts, and how this design reduces ingestion and processing latency by taking advantage of incremental pulls. We will also share our experience launching and running near real-time network analytics pipelines at Uber.
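
To make the idea of incremental pulls combined with upserts concrete, here is a minimal sketch in plain Python. It does not use the Hudi or Spark APIs and is not the pipeline described in the talk. The event fields, the bucket boundaries, and the names `Metric`, `incremental_pull`, `upsert`, and `run_once` are all hypothetical, chosen for illustration. The sketch assumes that error rates can be kept as additive counters and that percentiles can be approximated from a fixed-bucket histogram, so each run only needs to merge events newer than the last checkpoint.

```python
from dataclasses import dataclass, field

# Hypothetical latency bucket upper bounds, in milliseconds.
BUCKETS_MS = (50, 100, 250, 500, 1000, float("inf"))


@dataclass
class Metric:
    """Mergeable per-key state: counters plus a latency histogram."""
    requests: int = 0
    errors: int = 0
    histogram: list = field(default_factory=lambda: [0] * len(BUCKETS_MS))

    def add(self, duration_ms, is_error):
        self.requests += 1
        self.errors += int(is_error)
        # Count the request in the first bucket whose bound covers it.
        for i, bound in enumerate(BUCKETS_MS):
            if duration_ms <= bound:
                self.histogram[i] += 1
                break

    def error_rate(self):
        return self.errors / self.requests if self.requests else 0.0

    def percentile_bucket(self, q):
        """Upper bound of the bucket holding the q-th quantile (0 < q <= 1)."""
        target = q * self.requests
        seen = 0
        for bound, count in zip(BUCKETS_MS, self.histogram):
            seen += count
            if seen >= target:
                return bound
        return BUCKETS_MS[-1]


def incremental_pull(source, last_commit):
    """Return only the events committed after the checkpoint."""
    return [e for e in source if e["commit"] > last_commit]


def upsert(table, events):
    """Update the metric for each key, inserting the key if it is new."""
    for e in events:
        key = (e["city"], e["carrier"])
        table.setdefault(key, Metric()).add(e["duration_ms"], e["is_error"])


def run_once(table, source, last_commit):
    """One pipeline run: pull new events, upsert, return the new checkpoint."""
    new_events = incremental_pull(source, last_commit)
    upsert(table, new_events)
    return max((e["commit"] for e in new_events), default=last_commit)
```

Calling `run_once` repeatedly with the checkpoint it returned processes each event once, so a run with no new commits leaves the table unchanged. A real pipeline would persist both the table and the checkpoint rather than hold them in memory.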

Nishith Agarwal

Senior Software Engineer

Uber

Nishith Agarwal is a senior software engineer at Uber, where he works on the Hudi project and the Hadoop platform at large. His interests lie in large-scale distributed and data systems.

Ethan Guo Yihua

Software Engineer

Uber

Ethan is a software engineer at Uber. He previously worked at Google as a software engineering intern. Ethan received his Ph.D. and master's degrees from the University of Michigan.