Technical Talks
Accelerating Single-cell Bioinformatics with N-dimensional Arrays in the Cloud
Single-cell sequencing generates a new kind of genomic data, and with it new storage and compute challenges. I'll talk about recent work parallelizing the analysis of this data using a variety of distributed backends (Apache Spark, Dask, PyWren, Apache Beam). I'll also discuss Zarr, a format for storing and working with N-dimensional arrays that several scientific domains have recently gravitated toward in response to the challenges of using HDF5 in parallel and in the cloud.
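To give a rough sense of why a chunked array layout suits parallel and cloud workloads, here is a minimal sketch using only the Python standard library. It is not Zarr's API: the `store` dict, the `matrix/<i>.0` key names, and the helpers `write_chunk` and `read_row` are all hypothetical stand-ins, loosely modeled on the idea of keeping each chunk of an array under its own key in a key-value store.

```python
from array import array
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy store: a dict standing in for an object store such as a cloud bucket.
store = {}

ROWS, COLS = 6, 4   # illustrative matrix shape (e.g. cells x genes)
CHUNK_ROWS = 2      # each chunk holds CHUNK_ROWS complete rows


def write_chunk(chunk_index):
    """Build one chunk and write it under its own key, independently of other chunks."""
    start = chunk_index * CHUNK_ROWS
    values = array(
        "d",
        (float(r * COLS + c)
         for r in range(start, start + CHUNK_ROWS)
         for c in range(COLS)),
    )
    store[f"matrix/{chunk_index}.0"] = values.tobytes()


def read_row(row):
    """Fetch only the chunk that contains `row`, then decode that single row."""
    chunk_index, offset = divmod(row, CHUNK_ROWS)
    values = array("d")
    values.frombytes(store[f"matrix/{chunk_index}.0"])
    return list(values[offset * COLS:(offset + 1) * COLS])


# Each worker writes a different key, so no coordination on a shared file is needed.
n_chunks = ROWS // CHUNK_ROWS
with ThreadPoolExecutor(max_workers=n_chunks) as pool:
    list(pool.map(write_chunk, range(n_chunks)))

print(sorted(store))
print(read_row(3))
```

Because every chunk lives under a separate key, writers never touch the same object and a reader fetches only the chunk it needs. A real workload would swap the dict for actual object storage and the thread pool for one of the distributed backends named above.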