Apache Arrow: A Cross-language Development Platform for In-memory Data

Technical Talks

This talk discusses the Apache Arrow project and its uses for high-performance analytics and system interoperability. Data processing systems have historically been full-stack systems featuring memory management, IO, file format adapters, a runtime memory format, an in-memory query engine, and front-end user interfaces. Many of these components are fully "bespoke" or "custom", in part due to a lack of open standards for many of the pieces.

Apache Arrow was created by a diverse group of open source data system developers to define open standards and community-maintained libraries for high performance in-memory data processing. Since the beginning of 2016, we have been building a cross-language development platform for data processing to help create systems that are faster, more scalable, and more interoperable.

I discuss current development initiatives and the future roadmap as they relate to the data science and data engineering worlds.

Wes McKinney

Principal Architect | Posit, PBC

Wes McKinney is an open source software developer and entrepreneur focusing on data processing tools and systems. He created the Python pandas and Ibis projects, and co-created Apache Arrow. He is a Member of the Apache Software Foundation and a PMC member of Apache Parquet. He is currently a Principal Architect at Posit, PBC and a co-founder of Voltron Data.
