Technical Talks

Saif Ur-Rehman
Data Engineering Lead | Basecamp Research

Engineering Earth's Largest Biological Data Pipeline

video
Video will be populated after the conference

ABOUT THE TALK
  • Lightning Talks

Basecamp Research is on a mission to map the unknown biological world, addressing the fact that over 99.9% of life on Earth remains undiscovered. This session presents a biological data pipeline that surpasses all publicly available scientific data collected over the past century. By creating a comprehensive digital twin of Earth's life, the team is developing next-generation biological foundation models with applications spanning pharmaceutical research, deep learning, and scientific discovery. Attendees will explore how a global biological data supply chain, spanning five continents, is generating billions of biological labels and producing state-of-the-art AI models that outperform models from Google, DeepMind, and Genentech.

Saif Ur-Rehman

Data Engineering Lead

Basecamp Research

Dr. Saif Ur-Rehman is Principal Data Engineer at Basecamp Research, where he leads a team developing strategic data pipelines and bioinformatics systems that have facilitated the production of state-of-the-art models for protein folding, annotation, and generation. His work focuses on constructing datasets for model training and fine-tuning while leveraging computational biology to build sustainable solutions. His expertise spans cancer biology and machine learning. After earning his PhD in Computational Biology from the University of St Andrews, he developed this expertise through postdoctoral work at the Institute of Cancer Research and bioinformatics roles at EMBL-EBI and Genomics England.