Apache Arrow: A Cross-language Development Platform for In-memory Data

Wes McKinney | Founder | Ursa Labs, Member Apache Software Foundation

This talk discusses the Apache Arrow project and its uses for high-performance analytics and system interoperability. Data processing systems have historically been full-stack systems featuring memory management, IO, file format adapters, a runtime memory format, an in-memory query engine, and front-end user interfaces. Many of these components are fully "bespoke" or "custom", in part due to a lack of open standards for many of the pieces.

Apache Arrow was created by a diverse group of open source data system developers to define open standards and community-maintained libraries for high performance in-memory data processing. Since the beginning of 2016, we have been building a cross-language development platform for data processing to help create systems that are faster, more scalable, and more interoperable.

I discuss the current development initiatives and future roadmap as they relate to the data science and data engineering worlds.

Wes McKinney
Founder | Ursa Labs, Member Apache Software Foundation

Wes McKinney is an open source software developer focusing on data processing tools. He created the Python pandas project and has been a major contributor to many other OSS projects. He is a Member of the Apache Software Foundation and a project PMC member for Apache Arrow and Apache Parquet. He is the director of Ursa Labs, an innovation lab for open source data science tools powered by Apache Arrow.