Scaling up your pandas workflows with Modin

Devin Petersohn | Co-Founder and CTO | Ponder

pandas is one of the most commonly used data science libraries in Python, with a convenient set of APIs that help data scientists prepare, analyze, and explore their data. Despite its widespread adoption, pandas can run into severe memory and performance problems even on moderately large datasets. We present Modin, a fast, scalable drop-in replacement for pandas. By changing a single line of code, users can speed up their pandas workflows with Modin, whether on a laptop or in a cluster. Modin has over 6.6k GitHub stars and 1.7 million downloads, and it is deployed at many data-centric organizations to accelerate dataframe workflows.
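
The single-line change mentioned above is the import statement: Modin exposes its pandas-compatible API as `modin.pandas`. The sketch below is a minimal illustration, not code from the talk. The `USE_MODIN` environment variable and the `total_by_group` helper are hypothetical names introduced here so the same script can run with either library, and it assumes pandas is installed (and Modin with one of its execution engines if the toggle is set).

```python
import os

# Hypothetical toggle (not part of Modin): set USE_MODIN=1 to switch libraries.
if os.environ.get("USE_MODIN") == "1":
    import modin.pandas as pd  # the single changed line
else:
    import pandas as pd  # the original import


def total_by_group(records):
    """Sum `value` per `group`; the body is identical under either import."""
    df = pd.DataFrame(records)
    totals = df.groupby("group")["value"].sum()
    # Convert to plain Python types so the result is library-independent.
    return {key: int(val) for key, val in totals.items()}


result = total_by_group({"group": ["a", "b", "a", "b"], "value": [1, 2, 3, 4]})
print(result)
```

Everything after the import is ordinary pandas code, which is what "drop-in replacement" means in practice. Any speedup depends on the dataset size and the available cores, and a toy input like this one will not show it.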

For more details, see: https://github.com/modin-project/modin

Devin Petersohn
Co-Founder and CTO | Ponder

Devin Petersohn is the lead developer of Modin and the co-founder and CTO of Ponder. Devin recently completed his Ph.D. at UC Berkeley's RISELab, where he did research on distributed systems for data science. As part of this work, he created Modin, a system for enabling scalable interactive data science.
