Accelerating Single-cell Bioinformatics with N-dimensional Arrays in the Cloud

Ryan Williams | Icahn School of Medicine at Mount Sinai


Single-cell sequencing generates a new kind of genomic data, and with it new storage and compute challenges. I'll talk about recent work parallelizing analysis of this data using a variety of distributed backends (Apache Spark, Dask, Pywren, Apache Beam). I'll also discuss the Zarr format for storing and working with N-dimensional arrays, which several scientific domains have recently gravitated toward in response to the challenges of using HDF5 in parallel and in the cloud.


Ryan Williams

Software Engineer | Icahn School of Medicine at Mount Sinai

Ryan is a software developer at the Icahn School of Medicine at Mount Sinai, focused on open-source tools for distributed genomic and single-cell analyses in the cloud.
