Data Council

Data Council Talks

Creating an Extensible Big Data Platform
Reza Shiftehfar | Uber
DBT: Powerful, Open Source Data Transformations
Connor McArthur | Fishtown Analytics / DBT
Architecting the Data Lake: How to Ensure that Your ETL Pipelines Deliver High Quality Data
Paul Lappas | Intermix
OASIS – Data Analysis Platform for Enterprise
Keiji Yoshida | LINE Corporation
Building Highly Reliable Data Pipelines at Datadog
Quentin Francois | Datadog
Building a Feature Platform to Scale Machine Learning
Willem Pienaar | Cleric
Creating a Data Engineering Culture
Jesse Anderson | Big Data Institute
Big Data Platform-as-a-Service for Cross-Media Monitoring
Liana Napalkova | Eurecat Technology Center
Apache Arrow: A Cross-language Development Platform for In-memory Data
Wes McKinney | Posit
Computer Vision in Space
Marco Bressan | Satellogic
Easy Access to Data with Presto
Iker Martinez de Apellaniz | Schibsted Classified Media
Event-Driven Data Architecture at Letgo
Ricardo Fanjul | Letgo
Automation of Machine Learning Workflows at BBVA
Jose Serrano | BBVA Data & Analytics
Flink SQL in Action
Timo Walther | Confluent
Data Processing with Apache Beam: Towards Portability and Beyond
Maximilian Michels | Google
Using R in a Mid-Sized Data Analysis Scenario
Richard Brosi | Yara Digital Labs
Driving in Dataland
Carlos Herrera | Cabify
Marquez: A Metadata Service for Data Abstraction, Data Lineage, and Event-based Triggers
Willy Lulciuc | Datakin
Presto: Fast SQL-on-Anything
Kamil Bajda-Pawlikowski | Starburst
Extract - Tiered Transform - Load (ETTL): A pipeline for a modular, scalable, and observable Internal Analytics platform
Jean-Mathieu Saponaro | Datadog
Using Embeddings to Understand the Evolution of Data Science Skill Sets
Maryam Jahanshahi | TapRecruit
Data Pipeline Frameworks: The Dream and the Reality
Mark Weiss | Beeswax
An Update on Scikit-learn
Andreas Mueller | Data Science Institute
Active Learning: Why Smart Labeling is the Future of Data Annotation
Jennifer Prendki | Alectio
Scaling Personalization via Machine-Learned Assortment Optimization
Ethan Rosenthal | Runway
Hindsight Bias: How to Deal with Label Leakage at Scale
Till Bergmann | Salesforce
Oops I did it Again -- Adapting a Pop Music Identifier to Find Syndicated Content in Talk Radio
Allison King | Cortico
Artwork Personalization at Netflix
Tony Jebara | Netflix
PyTorch 1.0 - The Platform for Accelerating AI Research to Production
Jeff Smith | Facebook AI Research
Scalability is Quantifiable: The Universal Scalability Law
Baron Schwartz | VividCortex
Content Based Recommendations: Using Word Embeddings to Automate Related Content Generation at BuzzFeed
Carolyn Huangci | BuzzFeed
Building a Modern Machine Learning Platform on Kubernetes
Saurabh Bajaj | Lyft
Evolving Stitch Fix's Data Platform for Data Lineage
Neelesh Srinivas Salian | Stitch Fix
The Literate Programmer: Cargo Cult Open Source
Wes Chow | Cortico
Accelerating Single-cell Bioinformatics with N-dimensional Arrays in the Cloud
Ryan Williams | Icahn School of Medicine at Mount Sinai
The Software Architecture of WayUp's Job Recommender System
Harlan Harris | WayUp
Automating Modeling Pipelines
William Nelson | Intent Media Inc
Building Data Tools that Work
Benn Stancil | ThoughtSpot
Analyzing Data in the Cloud: Is True Privacy and Security Possible?
Raghu Murthy | Datacoral
Stream Processing Design Patterns
Andreas Markmann | Capital One
Scale Processes, Not People: How Data Teams Do More With Less By Adopting Software Engineering Best Practices
Thomas La Piana | GitLab
AI farming: 100x the yield with a data team of 1
Sam Swift | Bowery Farming
The Highs and Lows of Building an Adtech Data Pipeline
Dan Goldin | TripleLift
The Customer as The Unit of Analysis: Models, Metrics and a Multitude of Uses
Brian Bloniarz | Second Measure
Technical Founders Panel
William Falcon | Facebook / NYU
Computer Vision AI to Disrupt Digital Advertising
Joy Tang | Markable AI
Running Effective Machine Learning Teams: Common Issues, Challenges and Solutions
Gideon Mendels | Comet.ml
Optimizing Time to Data through Streams and Data Abstraction
Nicolas Joseph | Datalogue
Engineering Lessons Learned by Data Scientists in Growing MalwareScore from Kaggle Competition to Trusted Antivirus Solution
Phil Roth | Endgame
The Unreasonable Deceptiveness of Bad Data
Rigel Swavely | Clarifai
Fixing the Big Data Development Cycle with SQL
Justin Coffey | Criteo Labs
Building a Knowledge Graph Platform from Billions of Unstructured Online Sources Using AI
Aditya Jami | Meltwater
Three Tips for Better Predictive Modeling
Stephanie Yang | Foursquare
AI Challenges in Customer Care Automation
Sameer Yami | Linc Global
Building a Music Analytics Pipeline at Pandora
Brian Femiano | Pandora
Fast Data apps with Alpakka Kafka connector and Akka Streams
Sean Glover | Lightbend
Causal Data Science
Adam Kelleher | Barclays Investment Bank
Machine Learning from Development to Production at Instacart
Montana Low | Instacart
Cloud Data Warehouse Benchmark: Redshift vs Snowflake vs BigQuery
George Fraser | Fivetran
Democratizing Data with the Clover Transform Framework
Chris Hartfield | Clover Health
Functional Data Engineering - A Set of Best Practices
Max Beauchemin | Preset
Uber’s Data Journey: 100+PB with Minute Latency
Reza Shiftehfar | Uber
Scaling a Relational Database for the Cloud-age
Sumedh Pathak | Citus Data
Lazy Beats Smart and Fast
Julian Hyde | Google
Effective Management of High Volume Numeric Data with Histograms
Fred Moyer | Circonus
From Flat Files to Deconstructed Database: The Evolution and Future of the Big Data Ecosystem
Julien Le Dem | Datadog
A Trillion Rows Per Second as a Foundation for Interactive Analytics
Eric Hanson | MemSQL
What the heck is an In-Memory Data Grid?
Addison Huddy | Pivotal
Data Access for Data Science
Jacques Nadeau | Dremio
A Multi-Armed Bandit Framework for Recommendations at Netflix
Jaya Kawale | Netflix
Fast & Effective: Natural Language Understanding
Mike Conover | Workday
Weld: Accelerating Data Science by 100x
Shoumik Palkar | Stanford University Infolab
AutoML: The Assembly Line of Machine Learning
Mayukh Bhaowal | Salesforce
Enabling Full Stack Data Scientists at Stitch Fix
Juliet Hougland | Stitch Fix
Safely Streamlining Healthcare Policy Management using Ideas from Structured Natural Language Processing (SNLP)
Asif Khalak | Collective Health
Democratizing Metric Definition and Discovery at Airbnb
Lauren Chircus | Airbnb
Define Once, Evaluate Anywhere: Building Repeatable and Correct Features at Stripe
Kelley Rivoire | Stripe
Hazardous Models and Risk Mitigation in Real Estate
David Lundgren | Opendoor
The Design of Systems for Real-time Prediction Serving
Joseph Gonzalez | RunLLM & UC Berkeley
Data Science: Past, Present, Future
Shubha Nabar | Salesforce
VC Panel Talk
Lisha Li | Amplify Partners
Macrobase: A Search Engine for Fast Data Streams
Sahaana Suri | Stanford University
TimescaleDB: Rearchitecting a SQL database for time-series data
Mike Freedman | TimescaleDB
Production Analytics With a Distributed Column Store
Sam Stokes | Honeycomb
Don’t optimize my queries, optimize my data!
Julian Hyde | Google
Worse Case Scenario in the Database
Marianne Bellotti | United States Digital Service
The Challenges of Distributing Postgres: A Citus Story
Ozgun Erdogan | Citus Data
The Statistics of Dirty Data
Sanjay Krishnan | UC Berkeley
Easy, Scalable, Fault-tolerant Stream Processing with Structured Streaming in Apache Spark
Burak Yavuz | Databricks
Using Apache Arrow, Calcite and Parquet to build a Relational Cache
Jacques Nadeau | Dremio
How Spotify Distills Terabytes of Raw Data into Meaningful Music Recommendations
Gandalf Hernandez | Spotify
Building a Recursive BigQuery Mapper
Darren McCleary | The New York Times
Building ETL Infrastructure that Analysts Love
Christian Romming | ETLeap
Using Apache Spark for processing trillions of records each day at Datadog
Vadim Semenov | Datadog
Productizing structural models
James Savage | Schmidt Futures
Lessons in hiring data scientists
Genevieve Smith | Insight
Using Causal Forests for Subpopulation Identification in Randomized Clinical Trials
James Faghmous | Icahn School of Medicine at Mt. Sinai
You Won't Believe How We Optimize our Headlines
Lucy Wang | BuzzFeed
Deploying Data Science for Distribution of The New York Times
Anne Bauer | The New York Times
Zip codes and other lies your map told you
Peter Lenz | Near
Automating machine learning
Andreas Mueller | Data Science Institute
Building automated support at Kickstarter
Jeffrey Doker | Kickstarter
Privacy Techniques for Data Science
Jim Klucar | Immuta
The Future of Data Science in the Media
Haile Owusu | Mashable
Composable Interfaces for Parallel Processing in Apache Spark & Weld
Matei Zaharia | Databricks
Data Science @ Pinterest
Mohammad Shahangian | Pinterest
Data Science in the Enterprise
Sean Anderson | Vectara
Practical Solutions for Annoying Machine Learning Problems
Alyssa Frazee | Stripe
Beyond 50,000 Partitions: How Heroku Pushes the Limits of Kafka at Scale
Jeff Chao | Heroku
Scaling Up Spark at Facebook – a 60TB Production Use Case
Shuojie Wang | Facebook
How Superset and Druid Power Real-Time Analytics at Airbnb
Max Beauchemin | Preset
Hoodie: An Open Source Incremental Processing Framework From Uber
Vinoth Chandar | Onehouse
Data for the 99%
Benn Stancil | ThoughtSpot
Payment Fraud in Digital Currency
Soups Ranjan | Revolut
The Limitations of Big Data in Predictive Analytics
Jennifer Prendki | Alectio
Practical Lessons for Building Machine Learning Models in Production
Sharath Rao | Instacart
An Introduction to Big Data's Unsung Hero: The Log
Liz Bennett | Stitch Fix
Why, When, How: Lessons Learned in Applying Deep Learning to Real-World Problems
Daniel Galron | eBay
Anomaly Detection for Data Quality and Metric Shifts at Netflix
Laura Pruitt | Netflix
Cloud-Native Stream Processing
Sid Anand | Apache Software Foundation
Parsing of Diverse Healthcare Data
Chris Hartfield | Clover Health
Interactive Exploratory Analytics with Druid
Fangjin Yang | Imply
InfluxDB Storage Engine Internals
Paul Dix | InfluxData
A Nation of Immigrants: The Data Sciences
Kenneth Sanford | Dataiku
Twitter Heron: The Path Towards Elastic Streaming
Ashvin Agrawal | Microsoft
Simulation-based Inference: Advantages Over A/B Testing in Real Estate
Nelson Ray | Opendoor
Format Wars: from VHS and Beta to Avro and Parquet
Silvia Oliveros Torres | Silicon Valley Data Science
Real-time System Computing Engines
Steffen Peter | Levyx
The Right Stuff: Lessons Learned from a Decade of Data Engineering
Ben Hamner | Kaggle
How Engineer Angels Evaluate Data-Oriented Companies
Jocelyn Goldfein | Zetta
The Future of Column-Oriented Data Processing with Arrow and Parquet
Julien Le Dem | Datadog
Computational Social Science: Exciting Progress & Future Challenges
Duncan Watts | Microsoft Research
Causal Inference in Data Science From Prediction to Causation
Amit Sharma | Microsoft Research
Reinforcement Learning for Data Scientists
Brian Farris | Bloomberg
How to Change a City with Data Science
Ben Wellington | Two Sigma
Python Data Wrangling: Preparing for the Future
Wes McKinney | Posit
Lessons Learned Optimizing NoSQL for Apache Spark
John Musser | Ford Motor Company
Unified Pipeline Architecture: The Evolution of Data Processing at Spotify
Erin Palmer | Spotify
SystemML & Spark: a Framework for Scalable Data Science Algorithm Development
Jerome Nilmeier | IBM
Stop Obsessing about Data Infrastructure
Yair Weinberger | Alooma
Apache Kafka and the Rise of Stream Processing
Guozhang Wang | Confluent
Genomic Data Analysis with Spark & Hadoop
Ryan Williams | Icahn School of Medicine at Mount Sinai
Anomaly Detection for Real-World Systems
Manojit Nandi | STEALTHbits
Building a Cloud-Native SQL Database
Alex Robinson | Cockroach Labs
To Get the Value, Ditch the Hype
Nick Ursa | The New York Times
Statistical and Computational Challenges of Real-Time News Clustering
Jeiran Jahani | Chartbeat
Predicting Chaotic Systems with Sparse Data: Lessons from Numerical Weather Prediction
David Kelly | New York University
The Trials and Tribulations of Scaling Data Science and Engineering
Ashley Miller | Datadog
Peloton: The Self-Driving Database Management System
Andy Pavlo | Carnegie Mellon University
Elastic Big Data Platform at Datadog
Doug Daniels | Datadog
Processing Geographic Data at Internet Scale
Peter Lenz | Near
Bias, Variance and Adaptive Products
George Davis | Frame.ai
Career Panel - Leveling Up in Your Career as a Data Scientist/Engineer
Nick Chamandy | Lyft
VC Panel - The Present Future of Data-Oriented Startups | DataEngConf NY '16
Matt Hartman | Betaworks
Fighting Churn with Data
Carl Gold | Zuora
How Data is Transforming Politics
Catherine Tarsney | Democratic National Committee
Scaling model training: from flexible training APIs to resource management with Kubernetes
Kelley Rivoire | Stripe
Dagster: A New Programming Model for Data Processing
Nick Schrock | Dagster Labs
Time Series Prediction with TensorFlow
Jerome Nilmeier | IBM
Ray for Reinforcement Learning
Ion Stoica | Rise Lab (UC Berkeley)
Building a Distributed Data Access Layer for Analytics on Any Cloud
Bin Fan | Alluxio Inc
Notebooks as Functions with Papermill
Matt Seal | Netflix
The history and anatomy of Apache Superset
Max Beauchemin | Preset
Machine Learning Infrastructure at an Early Stage
Spencer Barton | Branch International
Actionable and Interpretable Predictions from a Stacked Model
Austen Head | Quid
Scalability! But at What COST?
Frank McSherry | Materialize
Reducing Student Loans with Bot-Powered Humans
William Falcon | Facebook / NYU
Building Bots, Building Blocks: How Forbes Experiments, Evaluates, and Kills Data-driven Bots
Luis Capelo | Forbes
Tactical Data Engineering
Julian Hyde | Google
Scaling Data Products Under Startup Constraints: A Case Study of ML Bias Testing
Edwin Ong | TinyData
Operating Multi-Tenant Kafka Services for Developers on Heroku
Ali Hamidi | Salesforce
Building a Lean AI Startup - Lessons Learned
Paul Cothenet | MadKudu
Scaling the best healthcare to everyone, with AI
Anitha Kannan | Curai
Introducing Data Downtime: From Firefighting to Winning
Barr Moses | Monte Carlo
Explaining AI: Putting Theory into Practice
Luke Merrick | Fiddler Labs
When Testing in Production is a Good Idea
Dan Robinson | Heap
Accelerating Machine Learning with Training Data Management
Alex Ratner | Stanford University
Making Friends with Generative Models
Andrew Colombi | Tonic
Amundsen: A Data Discovery Platform From Lyft
Tao Feng | Lyft
Powering Uber's global network analytics pipelines in near real-time with Apache Hudi (Incubating) Delta Streamer
Nishith Agarwal | Uber
Introducing Switch: A Framework for Custom Data Applications
Josh Ferguson | Mode
Appifying Data Science Workflows to Create Composable, User-Friendly Data Pipeline Products
Austen Head | Quid
End-to-end Exactly-once Aggregation Over Ad Streams
Amit Ramesh | Yelp
Transfer Learning in NLP - How to Help Small Teams Account for Small Datasets
Ryan Smith | Wootric
Bighead: Airbnb's end-to-end Machine Learning Platform
Andrew Hoh | Airbnb
Balancing Broad Data Access with Usability at Scale
Austin Wilt | Slack
Real-Time Data Pipelines Made Easy with Structured Streaming in Apache Spark
Tathagata Das | Databricks
Scalable Data Ingestion Architecture Using Airflow and Spark
Johannes Leppä | Komodo Health
Spatial Data Science Methods for Improving Models
Andy Eschbacher | Carto
Building a Programming By Example (PBE) Framework in Trifacta: Lessons learned and implications for Data Analytics
Anish Doshi | Trifacta
Split Learning: A Resource Efficient Distributed Deep Learning Method without Sensitive Data Sharing
Praneeth Vepakomma | Massachusetts Institute of Technology
Building a Data-Powered Sales Intelligence Platform
Durgam Vahia | LinkedIn
Running Airflow reliably with Kubernetes
Greg Neiheisel | Astronomer
Distributed SQL Databases Deconstructed
Karthik Ranganathan | YugaByte
Swimming in the Data River, or, when “Streaming Analytics” isn’t
Gian Merlino | Imply
Building Resilient Machine Learning Pipelines with IoT Data
Hedi Razavi | Keewi
Architecting a Low-Latency Schemaless SQL Engine
Igor Canadi | Rockset
Data Security and Privacy in the Age of Machine Learning
Soups Ranjan | Revolut
Delphi: A Hybrid Approach to Forecasting a Global Marketplace
Kai Brusch | Airbnb
Building Real-Time Analytics Applications Using Apache Pinot: A Case Study of LinkedIn
Kishore Gopalakrishna | Founding Engineer
Data Modeling and Processing for a Travel Super App
Rendy Bambang Junior | Traveloka
Argo: Kubernetes Native Workflows and Pipelines
Greg Roodt | Canva
Data Architecture 101 for Your Business
Bence Faludi | Independent Consultant
Causal Inference: Making the Right Intervention
Paul Beaumont | QuantumBlack
Scaling Data Science Teams: Twitter's Perspective
Miguel Rios | Twitter
Building Data Products with Machine Learning at Zendesk
Chris Hausler | Zendesk
Revenue Maximization in the Shared Bike Business Using Network Analysis and Geospatial Mapping
Arpit Agarwal | Zoomcar
Sparklens: Understanding the Scalability Limits of Spark Applications
Ashish Dubey | Qubole
A View from Apache Flink on Evolution and Outlooks for the Modern Stateful Stream Processor
Tzu-Li (Gordon) Tai | Ververica
Presto: Optimizing Performance of SQL-on-Anything
Kamil Bajda-Pawlikowski | Starburst
Building Data Orchestration for Big Data Analytics in the Cloud
Bin Fan | Alluxio Inc
7 Habits to Build Ethical AI
Karthik Thirumalai | Teradata
Translating Source Code into Natural Language with AI
Mikhail Filippov | Quod AI
Taking Recommendation to the Masses
Le Zhang | Microsoft
Autoencoder Forest for Anomaly Detection from IoT Time Series
Yiqun Hu | SP Group
Delivering ML Models the Safe and Sane Way
David Tan | Thoughtworks
Building Data Engineering Teams
Wouter de Bie | Datadog
Murron: Reliable Logging Pipeline
Ananth Packkildurai | Slack
Evolution of Data Ingestion and Product Instrumentation at Prezi
Tamás Németh | Prezi
Data at Marfeel: Addressing Complexity at Scale with the Latest Technologies
Alessandro Pregnolato | Marfeel
Kafka Streams in Production: From Use Case to Monitoring
Alexander Kudryashov | New Relic
GDPR: Discover The Main Challenges & Considerations
Jordi Miró Bruix | Lernin Games
Blending Event Stream Processing with Machine Learning Using the Kafka Ecosystem
Andrea Spina | Radicalbit
A Federated Information Infrastructure That Works
Xavier Gumara Rigol | Adevinta
Explainable AI
Ricardo Baeza-Yates | NTENT
A Machine Learning Approach to Optimize Prices During Clearance Sales at MANGO
Carmen Herrero | MANGO
Uncertainty-Aware Food Recognition by Deep Learning
Petia Radeva | University of Barcelona
Chatbots at Nestle: Improving Performance on Intent Detection
Maria Crosas Batista | Nestlé
Talking Bayes to Business: A/B Testing Use Case
Yizhar Toren | Shopify
Machine Learning for Brain Health and Understanding at Starlab Neuroscience
Aureli Soria-Frisch | Starlab Consulting Division
Improving Search with Natural Language Processing and Deep Learning
Markus Ludwig | Scout24
Rethinking Transportation in Cities: Making Traffic Smarter Through Optimization and Location Intelligence
Miguel Alvarez | CARTO
Why We Defined a Metalanguage for SQL
Lewis Hemens | Dataform
Building a 1500-Class Listing Categorizer from Implicit User Feedback
Arnau Tibau Puig | Letgo
The Case for Metadata for Machine Learning Platforms
Joerg Schad | ArangoDB
VEA: Validating, Evolving & Anonymizing Data in Real Time
Albert Franzi Cros | Alpha Health
It's About Time: An Introduction to Timely Dataflow
Malte Sandstede | Clockworks
Stream Processing Beyond Streaming Data
Timo Walther | Confluent
Kubernetes as a Streaming Data Platform: A Federated Operator Approach
Gerard Maas | Lightbend
Intelligent Orchestration - Data's Missing Link
Sean Knapp | Ascend
Building a Scalable Real-Time Data Pipeline
Vicente Valls Rios | Delivery Hero
Emotion Recognition in Images and Text
Agata Lapedriza | UOC / MIT Media Lab
Issue with Tracking? Fail That Build!
Steve Coppin-Smith | Snowplow Analytics
A Data- and Knowledge-Driven Approach for Mental Health Policy-Making at World Health Organization
Karina Gibert | UPC
Cloud-Based Practices for Effective Data Science at Scale
Tiago Henriques | Microsoft
From Water Purification to Science Documentaries: Industrial Applications of AI, High Performance Computing, and Data Visualization
Fernando Cucchietti | Barcelona Supercomputing Center
Data Stories by Humans, for Humans
Xaquín G.V. |
News in the Age of Algorithmic Recommendation

News in the Age of Algorithmic Recommendation
Nick Rockwell | The New York Times
One Explanation Does Not Fit All: A Toolkit and Taxonomy of AI Explainability Techniques

One Explanation Does Not Fit All: A Toolkit and ...
Ronny Luss | IBM Research AI
From Batch to Streaming to Both: The Data Platform at Skyscanner

From Batch to Streaming to Both: The Data ...
Herman Schaaf | Skyscanner
A Label-Free World - Current State Of Unsupervised Deep Learning

A Label-Free World - Current State Of ...
William Falcon | Facebook / NYU
Online Learning of Website Embeddings for Accurate Prediction of User Behavior Even When Data Are Scarce

Online Learning of Website Embeddings for ...
Amelia White | Dstillery
Tailor-S: Look What You Made Me Do

Tailor-S: Look What You Made Me Do
Vadim Semenov | Datadog
Testing and Documenting Your Data Doesn't Have to Suck

Testing and Documenting Your Data Doesn't ...
Abe Gong | Great Expectations
The Materialize Incremental View Maintenance Engine

The Materialize Incremental View Maintenance ...
Frank McSherry | Materialize
A Modern Love Story: Machine Learning & The Global Sports Betting Industry

A Modern Love Story: Machine Learning & The ...
Lloyd Danzig | ICED(AI)
Empowering Customer-Facing Teams With Voice-Based AI

Empowering Customer-Facing Teams With Voice-Based ...
Yev Meyer | Guru
Accelerate Source to Signal: Data Engineering Efficiency

Accelerate Source to Signal: Data Engineering ...
Mark Etherington | Crux Informatics
Real-time SQL Stream Processing at Scale with Apache Kafka and KSQL

Real-time SQL Stream Processing at Scale with ...
Viktor Gamov | Confluent
Combating AI Bias at Scale

Combating AI Bias at Scale
Lucy Vasserman | Jigsaw
Kubernetes-Native Workflow Orchestration with Argo

Kubernetes-Native Workflow Orchestration with Argo
Kai Rikhye | Skillshare
3 Best Practices for Data Organizations: Structure, ROI, Communications

3 Best Practices for Data Organizations: ...
Barr Moses | Monte Carlo
Building an End-to-End Data Stack in Thirty Minutes

Building an End-to-End Data Stack in Thirty ...
Benn Stancil | ThoughtSpot
Messy Data and Reluctant Users - The Trouble with Healthcare Data

Messy Data and Reluctant Users - The Trouble with ...
Samantha Bail |
Building a Knowledge Graph Using Messy Real Estate Data

Building a Knowledge Graph Using Messy Real ...
John Maiden | Cherre
Intrinsic Autoregressive Models in Stan

Intrinsic Autoregressive Models in Stan
Susana Marquez | Rockefeller Foundation
Data Scientist, or the Most Dangerous Job of the 21st Century

Data Scientist, or the Most Dangerous Job of the ...
Hugo Bowne-Anderson | DataCamp
Augmented Programming

Augmented Programming
Gideon Mann | Bloomberg
Building Efficient ML Pipelines and Responsible AI Solutions

Building Efficient ML Pipelines and Responsible ...
Adi Polak | Treeverse
Statistical Aspects of Distributed Tracing

Statistical Aspects of Distributed Tracing
Joe Ross | Splunk
Reducing Flight Delays with Kubernetes and Tensorflow

Reducing Flight Delays with Kubernetes and ...
Daniel van der Ende | GoDataDriven
Creating Knowledge Graphs via a Symbiosis of Data Science and Data Engineering

Creating Knowledge Graphs via a Symbiosis of Data ...
Maureen Teyssier | Reonomy
Leveraging Compute

Leveraging Compute
Riva-Melissa Tez | Intel Corporation
Building Systems to Monitor Data and Model Health in Production Systems

Building Systems to Monitor Data and Model Health ...
Mohammed Ridwanul | Dessa
Monarch, Google’s Planet-Scale Streaming Monitoring Infrastructure

Monarch, Google’s Planet-Scale Streaming ...
George Talbot | Google
Valid Inference after Model Selection and the selectiveInference Package

Valid Inference after Model Selection and the ...
Joshua Loftus | NYU
The Observatorium - Using Machine Learning and Observability Together to Reduce Incident Impact

The Observatorium - Using Machine Learning and ...
Alex Kass | DigitalOcean
Uncovering the Potential of TensorFlow 2.0

Uncovering the Potential of TensorFlow 2.0
Jerome Nilmeier | IBM
CockroachDB: Architecture of a Geo-Distributed SQL Database

CockroachDB: Architecture of a Geo-Distributed ...
Nathan VanBenschoten | Cockroach Labs
Leveraging Stateful Functions to Power the Next Generation of Event-Driven Applications

Leveraging Stateful Functions to Power the Next ...
Seth Wiesman | Ververica
Reproducibility in Data Science

Reproducibility in Data Science
Juliana Freire | NYU
Zipline - Airbnb's Declarative Feature Engineering Framework

Zipline - Airbnb's Declarative Feature ...
Nikhil Simha | Airbnb
Dagster: Workflows for Data Science, Machine Learning, and Data Engineering

Dagster: Workflows for Data Science, Machine ...
Nick Schrock | Dagster Labs
Meet dbt: The Data Transformation Tool Used by JetBlue, GitLab, Wistia, and Away

Meet dbt: The Data Transformation Tool Used by ...
Jeremy Cohen | Fishtown Analytics
Time to Rethink Visual Data Management for Machine Learning

Time to Rethink Visual Data Management for ...
Vishakha Gupta-Cledat | ApertureData
Amundsen - From Discovering Data to Securing Data

Amundsen - From Discovering Data to Securing Data
Mark Grover | Stemma
The Unreasonable Effectiveness of Product Sense

The Unreasonable Effectiveness of Product Sense
Vitaly Gordon | Faros AI
Real-time Retrieval with Deep Learning: Benefits and Challenges

Real-time Retrieval with Deep Learning: Benefits ...
Edo Liberty | HyperCube
Data Reliability for Data Lakes

Data Reliability for Data Lakes
Michael Armbrust | Databricks
Agile AI: From Research, Production to Customer Adoption

Agile AI: From Research, Production to Customer ...
Zineb Laraki | Salesforce
Flyte: Cloud Native Machine Learning & Data Processing Platform

Flyte: Cloud Native Machine Learning & Data ...
Haytham Abuelfutuh | Lyft
Pitfalls and Challenges of ML-Powered Applications

Pitfalls and Challenges of ML-Powered Applications
Emmanuel Ameisen | Stripe
Data Lineage with Apache Airflow

Data Lineage with Apache Airflow
Willy Lulciuc | Datakin
Lessons Developing Conversational AI Virtual Agents

Lessons Developing Conversational AI Virtual ...
Mitul Tiwari | Stealth
MESA: Building a Personalized Messaging System at Netflix

MESA: Building a Personalized Messaging System at ...
Grace Huang | Netflix
Federated Learning and Analytics at Google and Beyond

Federated Learning and Analytics at Google and ...
Peter Kairouz | Google
Responsible AI – Model Interpretability and Fairness

Responsible AI – Model Interpretability and ...
Mehrnoosh Sameki | Microsoft
Building High Performance Recommender Systems with Feature Stores

Building High Performance Recommender Systems ...
Danny Chiao | Tecton
Data Apps: Data warehouse as a platform

Data Apps: Data warehouse as a platform
Kashish Gupta | Hightouch
How to Create a Shared, Open Standard for Data Quality

How to Create a Shared, Open Standard for Data ...
Abe Gong | Great Expectations
Materialize+dbt: Streaming for the Modern Data Stack

Materialize+dbt: Streaming for the Modern Data ...
Jessica Laughlin | Materialize
The Data Practitioners Guide to Data Discovery

The Data Practitioners Guide to Data Discovery
Shirshanka Das | Acryl Data
Using GIT as a NoSQL Database for Fine-Grained Control Over the Data Pipeline

Using GIT as a NoSQL Database for Fine-Grained ...
Hung Dang | Y42
Scaling up your pandas workflows with Modin

Scaling up your pandas workflows with Modin
Devin Petersohn | Ponder
Getting ROI from Experimentation: How AB Experimentation plays out in Organizations

Getting ROI from Experimentation: How AB ...
Chad Sanderson | Gable.ai
Making Humans and Code GPU-Capable at Mailchimp

Making Humans and Code GPU-Capable at Mailchimp
Emily Curtin | Intuit Mailchimp
Beyond Linear Notebooks: Implementing Reactivity with IPython

Beyond Linear Notebooks: Implementing Reactivity ...
Caitlin Colgrove | Hex
The Modern Stack for ML Infrastructure

The Modern Stack for ML Infrastructure
Ville Tuulos | Outerbounds
Lakehouse or Warehouse - Where Should You Live?

Lakehouse or Warehouse - Where Should You Live?
Vinoth Chandar | Onehouse
Data Reliability Engineering: A New Approach to Data Quality

Data Reliability Engineering: A New Approach to ...
Egor Gryaznov | Bigeye
Apache Flink Adoption at Shopify

Apache Flink Adoption at Shopify
Yaroslav Tkachenko | Shopify
Ray: A Framework for Scaling and Distributing Python & ML Applications

Ray: A Framework for Scaling and Distributing ...
Jules Damji | Anyscale
The Return of the OLAP Cube

The Return of the OLAP Cube
Benn Stancil | ThoughtSpot
The Future of Business Intelligence

The Future of Business Intelligence
Max Beauchemin | Preset
Data Lineage with Apache Airflow using OpenLineage

Data Lineage with Apache Airflow using OpenLineage
Willy Lulciuc | Datakin
Exploring Data Through Natural-Language & Conversational Analytics

Exploring Data Through Natural-Language & ...
Anand Ranganathan | Unscrambl
Towards Human-AI Teaming: Intelligence Ecosystems to Tackle High-Stakes Use Cases

Towards Human-AI Teaming: Intelligence Ecosystems ...
Clodéric Mars | AIR
The Life-Changing Magic of Data Governance

The Life-Changing Magic of Data Governance
Paula Griffin | ZipRecruiter
AI Monitoring and Explainability - the Critical, Hidden Connection

AI Monitoring and Explainability - the Critical, ...
Anupam Datta | TruEra
Rethinking Orchestration as Reconciliation: Software-Defined Assets in Dagster

Rethinking Orchestration as Reconciliation: ...
Sandy Ryza | Elementl
Type-Safe Data Processing and Machine Learning Pipelines with Flyte and Pandera

Type-Safe Data Processing and Machine Learning ...
Niels Bantilan | Union
Why You Shouldn’t Care About Iceberg

Why You Shouldn’t Care About Iceberg
Ryan Blue | Databricks
Arguments Against Hand Labeling

Arguments Against Hand Labeling
Shayan Mohanty | Watchful
Data and AI in Construction

Data and AI in Construction
Alvaro Soto | Procore Technologies
How to Teach an Old Dog New Tricks: Modernizing a Business with Data Products

How to Teach an Old Dog New Tricks: Modernizing a ...
Julie Hollek | Mozilla
A Quick Tour of the Orchest OSS project

A Quick Tour of the Orchest OSS project
Rick Lamers | Orchest
How to Finally Get Self-Service ... By Ditching the Star Schema

How to Finally Get Self-Service ... By Ditching ...
Ahmed Elsamadisi | Narrator
Privacy Plus Utility: Preserving Data Insights with State-of-the-Art Privacy Protection

Privacy Plus Utility: Preserving Data Insights ...
Will Thompson | Privacy Dynamics
The Case for Declarative Machine Learning

The Case for Declarative Machine Learning
Tristan Zajonc | Continual
It's The Data, Stupid!

It's The Data, Stupid!
Peter Gao | Aquarium
Iggy: Build better models with location data

Iggy: Build better models with location data
Anirudh Shah | Iggy
Designing for life or death: building data products for biopharma operations with Fathom

Designing for life or death: building data ...
Clare Gollnick | Fathom
High-performance Model Serving in Python, using BentoML

High-performance Model Serving in Python, using ...
Chaoyu Yang | BentoML
Building Responsible AI: Best Practices Across the Product Development Lifecycle

Building Responsible AI: Best Practices Across ...
Susannah Shattuck | Credo AI
How 200+ Leaders Made Business Data Work Harder

How 200+ Leaders Made Business Data Work Harder
Jesika Haria | LogicLoop
Data in, privacy out: data streams for the privacy age with STRM Privacy

Data in, privacy out: data streams for the ...
Pim Nauts | STRM Privacy
Dashboards, dashboards everywhere: how to fix analytics collaboration

Dashboards, dashboards everywhere: how to fix ...
Robert Yi | Hyperquery
Rikai: a new data format for analytics on unstructured data at scale

Rikai: a new data format for analytics on ...
Chang She | LanceDB
Just Because Your Data Is Unstructured Doesn’t Mean it Should be Onerous

Just Because Your Data Is Unstructured Doesn’t ...
Vishakha Gupta-Cledat | ApertureData
Data Discovery: Getting More From Your Metadata

Data Discovery: Getting More From Your Metadata
Shinji Kim | Select Star
Observability at the long tail: why sampling production data doesn’t work for rare events

Observability at the long tail: why sampling ...
Bernease Herman | WhyLabs
Don't Let Your ML Models Decay!

Don't Let Your ML Models Decay!
Bastien Boutonnet | Soda
What is a Dataset? Emerging Core Concepts in the Modern Data Stack

What is a Dataset? Emerging Core Concepts in the ...
Nick Schrock | Dagster Labs
Enterprise Data Science Comes of Age

Enterprise Data Science Comes of Age
Peter Wang | Anaconda
The Next Big Opportunities in Data Infrastructure

The Next Big Opportunities in Data Infrastructure
Oana Olteanu | SignalFire
Get Ready for ML! Level Up Your Data Lake with Delta and lakeFS

Get Ready for ML! Level Up Your Data Lake with ...
Adi Polak | Treeverse
The Sky's the Limit -- RE: The Next 10 Years of Data Infra on Clouds

The Sky's the Limit -- RE: The Next 10 Years of ...
Mingsheng Hong | Bluesky Data
DevOps for Machine Learning and other half-Truths: Processes and tools for the ML lifecycle

DevOps for Machine Learning and other half-Truths ...
Diego Oppenheimer | DataRobot
Notebooks Got Your Modern Data Back

Notebooks Got Your Modern Data Back
Elizabeth Dlha | Deepnote
Ten years of building open source standards: From Parquet to Arrow to OpenLineage

Ten years of building open source standards: From ...
Julien Le Dem | Datadog
Creating the Right Developer Community for Your Company

Creating the Right Developer Community for Your ...
Wesley Faulkner | AWS
A deep dive into the dbt manifest

A deep dive into the dbt manifest
Aaron Richter
Generative AI and the Natural Language Interface for Data

Generative AI and the Natural Language Interface ...
Sarah Nagy | Seek AI
Creating our own Kubernetes and Docker to run our data infrastructure

Creating our own Kubernetes and Docker to run our ...
Erik Bernhardsson | Modal Labs
Extinguishing the Garbage Fire of ML Testing

Extinguishing the Garbage Fire of ML Testing
Emily Curtin | Intuit Mailchimp
Innovating on Software Development

Innovating on Software Development
Hamel Husain | Parlance Labs
Cubing and Metrics in SQL, oh my!

Cubing and Metrics in SQL, oh my!
Julian Hyde | Google
Hot Takes and Tragic Mistakes: How (not) to Integrate Data People in Your App Dev Team Workflows

Hot Takes and Tragic Mistakes: How (not) to ...
Noelle Saldana |
Behind the Curtain: What it Takes to Support the World's Most Popular Open Source Communities

Behind the Curtain: What it Takes to Support the ...
Katrina Riehl | NumFOCUS
A New Era of Applied AI: How to Accelerate Enterprise Adoption of AI for Business Impact

A New Era of Applied AI: How to Accelerate ...
Gaurav Rao | AtScale
Writing Unit Tests for Data Science Code

Writing Unit Tests for Data Science Code
Dr. Nile Wilson | Microsoft
MLOps for League of Legends - Heimerdinger Toolbelt

MLOps for League of Legends - Heimerdinger ...
Ian Schweer | Riot Games
Why People Started Testing Their Models and Data in CI / CD Pipelines

Why People Started Testing Their Models and Data ...
Shir Chorev | Deepchecks
Creating Self Service, High Velocity Data Cultures

Creating Self Service, High Velocity Data Cultures
DeVaris Brown | Meroxa
Malloy - An Experimental Language for Data

Malloy - An Experimental Language for Data
|
Conversation Simulator: A Real Life Case Leveraging OpenAI's API

Conversation Simulator: A Real Life Case ...
Maddie Schults | Crisis Text Line
Building a better world with AI, one architectural drawing at a time

Building a better world with AI, one ...
Jean-Pierre Trou | mbue
Designing and Building Metric Trees

Designing and Building Metric Trees
Abhi Sivasailam | Levers Labs
How to Be a 10x Analyst

How to Be a 10x Analyst
Robert Yi | Hyperquery
Building a Business Review Program from Scratch

Building a Business Review Program from Scratch
Katie Bauer | GlossGenius
The Missing Manual: Everything you need to know about Snowflake optimization

The Missing Manual: Everything you need to know ...
Niall Woodward | SELECT
From 1 to IPO: Growing the Data Team and Data Culture at GitLab

From 1 to IPO: Growing the Data Team and Data ...
Taylor Murphy | Meltano
Hierarchical Forecasting in Python

Hierarchical Forecasting in Python
Max Mergenthaler | Nixtla
Evolving AI Laws and the Imperative to Build Safe, Compliant, and Risk-proof AI

Evolving AI Laws and the Imperative to Build ...
Ayush Patel | Censius
Generative AI for Product Builders

Generative AI for Product Builders
Tristan Zajonc | Continual
Generative AI for Search

Generative AI for Search
D. Sivakumar | Tonita
How Vercel Builds Dozens of Metrics from One Heterogenous Table

How Vercel Builds Dozens of Metrics from One ...
Thomas Mickley-Doyle | Vercel
Data Contracts: Accountable Data Quality

Data Contracts: Accountable Data Quality
Chad Sanderson | Gable.ai
DataOps for Business Intelligence: How

DataOps for Business Intelligence: How ...
Dan Eisenberg | Hashboard (formerly Glean)
Everything I Know About Data Science I Learned From Model Railroading

Everything I Know About Data Science I Learned ...
Peter Lenz | Near
Publishing Jupyter Notebooks with Quarto

Publishing Jupyter Notebooks with Quarto
J.J. Allaire | Posit / RStudio
Feed The Alligators With the Lights On: How Data Engineers Can See Who Really Uses Data

Feed The Alligators With the Lights On: How Data ...
Mark Grover | Stemma
Data Warehouses are Gilded Cages. What Comes Next?

Data Warehouses are Gilded Cages. What Comes Next?
Tino Tereshko | MotherDuck
Scalable and Sustainable Feature Engineering with Hamilton

Scalable and Sustainable Feature Engineering with ...
|
The state of cross-company data exchange

The state of cross-company data exchange
Pardis Noorzad | General Folders
Extreme Self-Service: Turning Data Consumers into Data Constructors

Extreme Self-Service: Turning Data Consumers into ...
Alice Leach | Whatnot
ML in Production – What Does “Production” Even Mean?

ML in Production – What Does “Production” Even ...
Dean Pleban | Dagshub
Data Contracts in the Modern Data Stack

Data Contracts in the Modern Data Stack
Zachary Klein | Whatnot
Scaling Experimentation to 20 Billion Users

Scaling Experimentation to 20 Billion Users
Timothy Chan | Statsig
Data Products Aren't Just for Data Teams!

Data Products Aren't Just for Data Teams!
Katie Hindson | Lightdash
Modern Data Management - How to Set Your Data Team Up for Success

Modern Data Management - How to Set Your Data ...
Alec Bialosky | Select Star
Continuous Data Pipeline for Real-time Benchmarking & Data Set Augmentation

Continuous Data Pipeline for Real-time ...
Ivan Aguilar | Teleskope
How to End the Long-tail of Most Data Requests?

How to End the Long-tail of Most Data Requests?
Ahmed Elsamadisi | Narrator
Incident Management for Data People

Incident Management for Data People
Kyle Kirwan | Bigeye
Automatically Fix Data Issues & Label Errors in Most ML Datasets

Automatically Fix Data Issues & Label Errors in ...
Curtis Northcutt | Cleanlab
How to Build a Streaming Database in Three Challenging Steps

How to Build a Streaming Database in Three ...
Frank McSherry | Materialize
CDC Stream Processing with Apache Flink

CDC Stream Processing with Apache Flink
Timo Walther | Confluent
Change Data Streaming Patterns With Debezium & Apache Flink

Change Data Streaming Patterns With Debezium & ...
Gunnar Morling | Decodable
Making Moves with Arrow Data: Introducing Arrow Database Connectivity (ADBC)

Making Moves with Arrow Data: Introducing Arrow ...
Matthew Topol | Voltron Data
The Fun-Sized MLOps Stack from Scratch

The Fun-Sized MLOps Stack from Scratch
Mikiko Bazeley | Featureform
HuggingFace + Ray AIR Integration: A Python Developer’s Guide to Scaling Transformers

HuggingFace + Ray AIR Integration: A Python ...
Antoni Baum | Anyscale Inc
The Story of DevRel at Snowflake: How We Got Here

The Story of DevRel at Snowflake: How We Got Here
Felipe Hoffa | Snowflake
Getting Real(-Time): When to move from Batch to Streaming (and how to do it without hiring an entirely new team)

Getting Real(-Time): When to move from Batch to ...
Zander Matheson | Bytewax
Data Product Success: Aligning with Data's Core Purpose - A Framework for Data Product Management for Increasing Adoption & User Love

Data Product Success: Aligning with Data's Core ...
Rick Saporta | Entera
How to Ensure Your Model Does Not Drift? From Human-In-The-Loop Concept to Building Fully Adaptive ML Models Using Crowdsourcing

How to Ensure Your Model Does Not Drift? From ...
Fedor Zhdanov | Toloka
The End of History? Convergence of Batch and Realtime Data Technologies

The End of History? Convergence of Batch and ...
Matt Housley | Ternary Data
Scaling Uber Metric System from Elasticsearch to Pinot

Scaling Uber Metric System from Elasticsearch to ...
Yupeng Fu | Uber
What I Don’t Want To Exist In The Data World In 5 Years

What I Don’t Want To Exist In The Data World In 5 ...
Ben Rogojan | Seattle Data Guy
Real-time Schema Discovery

Real-time Schema Discovery
Daniel Selans | Streamdal
This App Ends Tantrums: How ML, NLP, and Five Minutes of Playtime Help Parents, Caregivers, and Children Enjoy Life Together

This App Ends Tantrums: How ML, NLP, and Five ...
Mady Mantha | Happypillar
Designing for Intelligence at GitHub Next: Patterns and Practices for Making AI-powered Products

Designing for Intelligence at GitHub Next: ...
Idan Gazit | GitHub
The things I wish I knew -- What I've gotten right and wrong from startups to the White House, and the world ahead

The things I wish I knew -- What I've gotten ...
DJ Patil |
AI - The Future is Now

AI - The Future is Now
Idan Gazit | GitHub
Hot or Not? Trends & Buzzwords in Data

Hot or Not? Trends & Buzzwords in Data
Julia Schottenstein | dbt Labs
Govern Your Data Clients - the Right Way to Scale

Govern Your Data Clients - the Right Way to Scale
Yaniv Ben Hemo | Memphis
How Freewheel Processes Billions of Ad-tech Events in Real-time

How Freewheel Processes Billions of Ad-tech ...
Margi Dubal | Freewheel
How Investors Think About Data

How Investors Think About Data
Pete Soderling | Data Community Fund
Big Data is Dead

Big Data is Dead
Jordan Tigani | MotherDuck
Building a Control Plane for Data

Building a Control Plane for Data
Shirshanka Das | Acryl Data
Building an ML Experimentation Platform for Easy Reproducibility

Building an ML Experimentation Platform for Easy ...
Vino Duraisamy | Treeverse
The Road to Exceptional Data Correctness

The Road to Exceptional Data Correctness
Emma Tang |
Latency is the Mind Killer: it’s Time to Reimagine Data Interactions

Latency is the Mind Killer: it’s Time to ...
David Wilson | Hunch Tools
LLM’s & Semantic Layer: Self-serve has Entered the Chat

LLM’s & Semantic Layer: Self-serve has Entered ...
Paul Blankley | Zenlytic
How a

How a "Less is More" Approach Stems the Sprawl of ...
Mahendra Kutare | Deeptrail
Bringing Accuracy and Predictability to Software's Most Intractable Problem: Forecasting Development Timelines and On-Time Delivery

Bringing Accuracy and Predictability to ...
Bart Palmer | Cardagraph
Build AI Apps with Llama 2 in 10 Minutes with Snowflake Cortex

Build AI Apps with Llama 2 in 10 Minutes with ...
Gilberto Hernandez | Snowflake
Building a Holistic SQL Chatbot that Solves Real Problems for People in Tech and the Business

Building a Holistic SQL Chatbot that Solves Real ...
Noy Twerski | Sherloq Data
How to build a GenAI Application with Vectara - a Step-by-Step Guide

How to build a GenAI Application with Vectara - a ...
Ofer Mendelevitch | Vectara
Innovating with Open Generative AI

Innovating with Open Generative AI
Amit Sangani | Meta
Redefining Database Workloads: The Future with Modern Object Storage

Redefining Database Workloads: The Future with ...
Brenna Buuck | MinIO
Introducing lolpop: an Open Source Framework for Machine Learning Workflows

Introducing lolpop: an Open Source Framework for ...
Jordan Volz | lolpop
Using AI, Mathematics, and Statistics to Find Similar Data in Massive Data Ecosystems

Using AI, Mathematics, and Statistics to Find ...
Eric Warner | Collibra
The Future Roadmap for the Composable Data Stack

The Future Roadmap for the Composable Data Stack
Wes McKinney | Posit
Case Studies from a Methodologist on an Experimentation Platform

Case Studies from a Methodologist on an ...
Laura Cosgrove | Microsoft
Real-Time Analytics for Small Data Teams

Real-Time Analytics for Small Data Teams
Matt Helm | Materialize
Building a User-Level Targeting Platform

Building a User-Level Targeting Platform
Alex Wood-Doughty | Monocle
How Beam uses Code-based dashboards to Scale Analytics Products

How Beam uses Code-based dashboards to Scale ...
Emilio Tamez | Beam
Building a Unified Feature Platform with DuckDB and Arrow

Building a Unified Feature Platform with DuckDB ...
Michael Eastham | Tecton
Building a Flexible Data Platform for LLM Training Data

Building a Flexible Data Platform for LLM ...
Jonathan Talmi | Cohere
Building an Ecosystem for Open Foundation Models, Together

Building an Ecosystem for Open Foundation Models, ...
Ce Zhang | Together
How to Align AI Capabilities with Product Strategy so You Can Innovate

How to Align AI Capabilities with Product ...
Noelle Saldana |
Driving Revenue By Getting Your Data Outside Your Company

Driving Revenue By Getting Your Data Outside Your ...
Solomon Kahn | Delivery Layer
From Twilio to Propel: Building Real-Time Customer-Facing Analytics at Scale

From Twilio to Propel: Building Real-Time ...
Nico Acosta | Propel
Building InfluxDB 3.0 with Apache Arrow, DataFusion, Flight and Parquet

Building InfluxDB 3.0 with Apache Arrow, ...
Andrew Lamb | InfluxData
Open Data Foundations across Hudi, Iceberg and Delta

Open Data Foundations across Hudi, Iceberg and ...
Kyle Weller | Onehouse
Optimization and Contextual Bandits at Stripe

Optimization and Contextual Bandits at Stripe
Brent Cohn | Stripe
Data Mesh: The Next Stage in the Evolution from Time Share to Data Share

Data Mesh: The Next Stage in the Evolution from ...
Zhamak Dehghani | Nextdata
Cost Containment–A Critical Piece of your Data Team's ROI

Cost Containment–A Critical Piece of your Data ...
Lindsay Murphy | Hiive
Foundations for a Multi-Modal Lakehouse for AI

Foundations for a Multi-Modal Lakehouse for AI
Chang She | LanceDB
Unlocking Reliable GenAI: Strategies for Assessing LLMs in Real-World Applications

Unlocking Reliable GenAI: Strategies for ...
Dhruv Singh | HoneyHive AI
Evolving Data Pipelines at Scale

Evolving Data Pipelines at Scale
Iaroslav Zeigerman | Tobiko
Scaling Data Reliably: a Journey in Growing Through Data Pain Points

Scaling Data Reliably: a Journey in Growing ...
Miriah Peterson | MX
A 101 in Time Series Analytics with Apache Arrow, Pandas and Parquet

A 101 in Time Series Analytics with Apache ...
Zoe Steinkamp | InfluxData
Beyond Simple A/B Testing: Advanced Experimentation Tactics

Beyond Simple A/B Testing: Advanced ...
Timothy Chan | Statsig
Predictive Auto-Scaling at MongoDB

Predictive Auto-Scaling at MongoDB
A. Jesse Jiryu Davis | MongoDB
Building Responsible and Trustworthy Generative AI Products at LinkedIn

Building Responsible and Trustworthy Generative ...
Daniel Olmedilla | LinkedIn
Beyond MLOps: Building AI systems with Metaflow

Beyond MLOps: Building AI systems with Metaflow
Ville Tuulos | Outerbounds
Is Kubernetes a Database?

Is Kubernetes a Database?
Ryanne Dolan | LinkedIn
Enabling Data Centric Solutions through Modern Schema Management

Enabling Data Centric Solutions through Modern ...
Aaron Taylor | Nalej
The End of Data Governance as We Know It

The End of Data Governance as We Know It
Kirit Basu | Metaphor
Panel: Data Lineage We’ve Come a Long Way

Panel: Data Lineage We’ve Come a Long Way
Harel Shein | Datadog
Empowering Data Teams: A Step-by-Step Playbook for Leads and Managers

Empowering Data Teams: A Step-by-Step Playbook ...
C.J. Jameson | Monte Carlo
Events Sourcing with Kafka at Scale

Events Sourcing with Kafka at Scale
Alex Martin | Tinybird
Why Streaming SQL? The Semantics and Challenges of Applying SQL to Unbounded Data

Why Streaming SQL? The Semantics and Challenges ...
Micah Wylde | Arroyo Systems
Rising Tides with Radical Transparency: Why and How to Open Source Your Data Platform

Rising Tides with Radical Transparency: Why and ...
Tim Castillo | Dagster Labs
Streamlining Entry Into Streaming Analytics with JupyterHub and Apache Flink

Streamlining Entry Into Streaming Analytics with ...
Elkhan Dadashov | Apple
Unified Stream/Batch Execution with Ibis

Unified Stream/Batch Execution with Ibis
Deepyaman Datta | Voltron Data
How to Use Your Development Data to Make LLMs Code Like You and Your Team

How to Use Your Development Data to Make LLMs ...
Tyler Dunn | Continue
The Reality of Building a Modern AI Data Stack

The Reality of Building a Modern AI Data Stack
Colleen Tartow | Capital One
From Playgrounds to Production: The Evolution of AI Evaluation at Coda

From Playgrounds to Production: The Evolution of ...
Kenny Wong | Coda
Processing Trillions of Records at Okta with Mini Serverless Databases

Processing Trillions of Records at Okta with Mini ...
Jake Thomas | Okta
Data Culture 2.0: Leveraging AI to Build Human Connections and Expand Your Influence

Data Culture 2.0: Leveraging AI to Build Human ...
Celina Wong | Data Culture
LLM Observability and Evaluations

LLM Observability and Evaluations
Amber Roberts | Arize AI
Declarative Orchestration: Why You’ve Been Thinking About the Wrong DAG the Entire Time

Declarative Orchestration: Why You’ve Been ...
Pete Hunt | Dagster Labs
Build Faster, More Responsive Analytics with a Semantic Layer

Build Faster, More Responsive Analytics with a ...
Paco Valdez | Cube
WTF is an Analytics Lake: Building an Open Data Service Layer with Arrow, DuckDB and Semantic Layer

WTF is an Analytics Lake: Building an Open Data ...
Ryan Dolley | GoodData
Bridging the Gap Between Batch and Real-Time with Mixed-Latency Pipelines

Bridging the Gap Between Batch and Real-Time with ...
Will Goldstein | Twirl
Move Fast and Don't Break Things -- how to build a data platform that scales with your organization

Move Fast and Don't Break Things -- how to build ...
Elijah Ben Izzy | DAGWorks Inc.
Fast ReActions: Planning and Reasoning Quickly with LLMs

Fast ReActions: Planning and Reasoning Quickly ...
Ashwin Ramesh | Continual
Optimizing Time Series Data in Mixed Architectures with QuestDB

Optimizing Time Series Data in Mixed ...
Adam Cimarosti | QuestDB
Data for impact: leveraging LLMs for nonprofit social research

Data for impact: leveraging LLMs for nonprofit ...
Tobias Lunt | Development Data Lab
Give Rust a Chance

Give Rust a Chance
Slater Stich | Bain Capital
Creating a Purpose-Built Data Indexer

Creating a Purpose-Built Data Indexer
Andrew Dworschak | Yakoa
WTF Are We Doing?

WTF Are We Doing?
Benn Stancil | ThoughtSpot
Data Culture as a Product

Data Culture as a Product
Abhi Sivasailam | Levers Labs
Beyond Kafka: Cutting Costs and Complexity with WarpStream and S3

Beyond Kafka: Cutting Costs and Complexity with ...
Ryan Worl | WarpStream
Let's Play: Flappybird (+ some Data Engineering)

Let's Play: Flappybird (+ some Data Engineering)
David Margulies | Tinybird
What Makes for an Effective Data Practitioner in 2024?

What Makes for an Effective Data Practitioner in ...
Marck Vaisman | Microsoft
Tackling I/O Challenges in Modern Data Lakes

Tackling I/O Challenges in Modern Data Lakes
Hope Wang | Alluxio
CloudQuery: Open-Source High-Performance Cross-Language ELT Framework Powered by Apache Arrow

CloudQuery: Open-Source High-Performance ...
Yevgeny Pats | CloudQuery
The Future of Data Engineering in a Post-AI World

The Future of Data Engineering in a Post-AI World
Michelle Ufford Winters | eBay
Building a Reliable, Secure and Efficient Event Ingestion Pipeline

Building a Reliable, Secure and Efficient Event ...
Suman Karumuri | Airbnb
What does it take to build a Postgres specialized data movement tool?

What does it take to build a Postgres specialized ...
Sai Srirampur | ClickHouse
Panel: Insights from Founders Shaping the Industry's Future

Panel: Insights from Founders Shaping the ...
George Fraser | Fivetran
Not Your Father's Data Lakehouse: Building with Trino and Iceberg

Not Your Father's Data Lakehouse: Building with ...
Monica Miller | Starburst
How Developers Should Think About the Emerging AI Stack

How Developers Should Think About the Emerging AI ...
Ce Zhang | Together
Why it Takes Billions: Navigating the AI landscape with OpenAI, Google, Nvidia, and Everyone Else with Billions to Spare

Why it Takes Billions: Navigating the AI ...
DJ Patil
OttoBot: Productionizing LLM Models

OttoBot: Productionizing LLM Models
Lukas Biewald | Weights and Biases
Working Outside the Box: the Journey to FigJam AI

Working Outside the Box: the Journey to FigJam AI
Dan Mejia | Figma
From Silo to Scale: How Data Infrastructures Evolved to Bring More Data to More People

From Silo to Scale: How Data Infrastructures ...
Bhaskar Ghosh | 8VC
GenAI and Datacomp: Creating the Largest Public Multimodal Dataset in Academia

GenAI and Datacomp: Creating the Largest Public ...
Alex Dimakis | The University of Texas at Austin
Creating a Competitive Advantage in the Age of Intelligence as a Service

Creating a Competitive Advantage in the Age of ...
Miguel Paredes
How Data Teams Can Contribute to Data Privacy

How Data Teams Can Contribute to Data Privacy
Josh Schwartz | Phaselab
Agentic Architecture to Reduce Decision Paralysis

Agentic Architecture to Reduce Decision Paralysis
Schaun Wheeler | Aampe
Ten Years of Building Open Source Standards

Ten Years of Building Open Source Standards
Julien Le Dem | Datadog
Streaming CDC data from PostgreSQL to Snowflake, challenges and solutions

Streaming CDC data from PostgreSQL to Snowflake, ...
Alexandru Cristu | Streamkap
Charting the Lakehouse Trail: A Data Migration Adventure

Charting the Lakehouse Trail: A Data Migration ...
Erick Enriquez | InQuery
Why dbt Acquired Sdf: How A Small Team Built True SQL Comprehension

Why dbt Acquired Sdf: How A Small Team Built True ...
Elias DeFaria | SDF
Liberate Analytical Data Management with DuckDB

Liberate Analytical Data Management with DuckDB
Hannes Mühleisen | DuckDB Labs
Billion-Scale Vector Search on Object Storage

Billion-Scale Vector Search on Object Storage
Simon Eskildsen | Turbopuffer
Optimizing Iceberg Table Layouts at Scale: A Multi-Objective Approach

Optimizing Iceberg Table Layouts at Scale: A ...
Sumedh Sakdeo | LinkedIn
Orchestrating at Scale: How Instacart Manages 20M+ Daily Workflows

Orchestrating at Scale: How Instacart Manages ...
Anant Agarwal | Instacart
The Future of Data Engineering in the Agent Era

The Future of Data Engineering in the Agent Era
Madison Faulkner | NEA
AI Cram Session

AI Cram Session
Rachel Lee Nabors | Meta
RAG In 2025: State Of The Art And The Road Forward

RAG In 2025: State Of The Art And The Road Forward
Tengyu Ma | Voyage AI
What Every Data Scientist Needs To Know About GPUs

What Every Data Scientist Needs To Know About GPUs
Charles Frye | Modal Labs
The Future Of Guardrails

The Future Of Guardrails
Shreya Rajpal | Guardrails
Building Reliable Agentic AI Systems

Building Reliable Agentic AI Systems
Eno Reyes | Factory
Revolutionize AI Engineering With Autogen

Revolutionize AI Engineering With Autogen
Marck Vaisman | Microsoft
Python Over Data Lakes: Declarative Environments, Data Management And Other Things With Feathers

Python Over Data Lakes: Declarative Environments, ...
Ciro Greco | Bauplan
Going Bayes: Shifting Our Testing Methods To Reflect Our Priorities

Going Bayes: Shifting Our Testing Methods To ...
Joseph Powers | Intuit
Unlocking A/B Testing For B2B

Unlocking A/B Testing For B2B
Timothy Chan | Statsig
Failure Is A Funnel

Failure Is A Funnel
Bryan Bischof | Theory Ventures
AGI Is Already Here (But It's Not What You Think)

AGI Is Already Here (But It's Not What You Think)
Joseph Gonzalez | RunLLM & UC Berkeley
TapeAgents: A Powerful Framework For Building And Optimizing AI Agents

TapeAgents: A Powerful Framework For Building And ...
Mitul Tiwari | Stealth
Building an LLM-Powered Analytics Slack Bot at Twitch

Building an LLM-Powered Analytics Slack Bot at ...
Ethan Brown | Twitch / AWS
Instant Preview Mode: Real-Time Feedback to Make SQL Data Exploration Fly

Instant Preview Mode: Real-Time Feedback to Make ...
Hamilton Ulmer | MotherDuck
A SQL-Based Metrics Layer for DuckDB and ClickHouse

A SQL-Based Metrics Layer for DuckDB and ...
Mike Driscoll | Rill Data
Building Blocks: Advanced Semantic Data Model Layering

Building Blocks: Advanced Semantic Data Model ...
Lloyd Tabb | Meta
Multi-Modal Compute for Data Analytics

Multi-Modal Compute for Data Analytics
Ganesh Ramanarayanan | Hex
Trimming the Long Tail of Production Model Ownership at Hinge

Trimming the Long Tail of Production Model ...
Jonathan Jin | Hinge
Introducing Pixeltable: Open Source Data Infrastructure for Multimodal AI

Introducing Pixeltable: Open Source Data ...
Marcel Kornacker | Pixeltable
From Scaling to Observability: Solving Key Challenges for Distributed ML with Ray

From Scaling to Observability: Solving Key ...
Nikita Vemuri | Anyscale
Internals of SlateDB: An Embedded Key-Value Store Built on Object Storage

Internals of SlateDB: An Embedded Key-Value Store ...
Vignesh Chadramohan | DoorDash
Embedding OLAP, Everywhere: Lessons from Okta

Embedding OLAP, Everywhere: Lessons from Okta
Jake Thomas | Okta
Building InfluxDB 3 Core: A Real-Time Columnar DB and Data Processor on Object Storage

Building InfluxDB 3 Core: A Real-Time Columnar DB ...
Paul Dix | InfluxData
Converging Database Architectures: DuckDB in PostgreSQL

Converging Database Architectures: DuckDB in ...
Marco Slot | Crunchy Data
Unbundling of the Cloud Data Warehouse

Unbundling of the Cloud Data Warehouse
Tanya Bragin | ClickHouse
The Agentic Database: A New Way to Interact with Your Data

The Agentic Database: A New Way to Interact with ...
Etienne Dilocker | Weaviate
No More BS: How (and When) to Really Leverage AI

No More BS: How (and When) to Really Leverage AI
Colleen Tartow | Capital One
A Modern Data Stack in Healthcare

A Modern Data Stack in Healthcare
Anil Sadineni | 1upHealth
Write Less More: How Dagster Rebuilt Our Docs from the Ground Up

Write Less More: How Dagster Rebuilt Our Docs ...
Pedram Navid | Dagster Labs
Engineering Earth's Largest Biological Data Pipeline

Engineering Earth's Largest Biological Data ...
Saif Ur-Rehman | Basecamp Research
Putting Data to Work for Global Urban Development

Putting Data to Work for Global Urban Development
Tobias Lunt | Development Data Lab
The Deconstructed Database and the Advent of the Open Data Lake

The Deconstructed Database and the Advent of the ...
Julien Le Dem | Datadog
Data Engineering Is Not Software Engineering, Until It Is

Data Engineering Is Not Software Engineering, ...
CL Kao | Recce
Open Source Success: Learnings from 1 Billion Downloads

Open Source Success: Learnings from 1 Billion ...
Avi Press | Scarf
The Art of Data: Reimagining Creative Processes with Data Culture

The Art of Data: Reimagining Creative Processes ...
Michael Cohen | Plus Company
The Middle Ground: Balancing Batch and Real-Time Processing in a Data Lakehouse

The Middle Ground: Balancing Batch and Real-Time ...
Brenna Buuck | MinIO
Text-to-SQL Is Not the Answer: How to Effectively Use AI For Analytics

Text-to-SQL Is Not the Answer: How to Effectively ...
Dillon Morrison | Sigma Computing
Causal Inference Methods for Bridging Experiments and Strategic Impact

Causal Inference Methods for Bridging Experiments ...
Wenjing Zheng | Roblox
LLMs for Data Science

LLMs for Data Science
Hadley Wickham | Posit
AI is Going to Break Your Data Platform - Are You Ready?

AI is Going to Break Your Data Platform - Are You ...
Doron Porat | Lakeway
Scaling GenAI & Agentic Workflows for practical solutions with Zerve

Scaling GenAI & Agentic Workflows for practical ...
Dr. Greg Michaelson | Zerve AI
Building Enterprise Agentic RAG Applications with Reduced Hallucinations

Building Enterprise Agentic RAG Applications with ...
Ofer Mendelevitch | Vectara
Pydantic: An Opinionated Blueprint for the Future of GenAI Applications

Pydantic: An Opinionated Blueprint for the Future ...
Samuel Colvin | Pydantic
Eval Agents: How to Solve Error Cascades in Agents

Eval Agents: How to Solve Error Cascades in Agents
Dhruv Singh | HoneyHive AI
Why is Everyone Talking about Apache Iceberg™? (From the Original Creator of Apache Iceberg)

Why is Everyone Talking about Apache Iceberg™? ...
Ryan Blue | Databricks
Scalable Continuous Monitoring for Large-scale A/B Experimentation

Scalable Continuous Monitoring for Large-scale ...
Chenyu Qiu | Uber
Data Governance is NOT the Governance of Data!

Data Governance is NOT the Governance of Data!
Oriol Mirosa | Brooklyn Data Co
Analytics and the Dark Side of the Analytics Development Lifecycle

Analytics and the Dark Side of the Analytics ...
Ori Soen | Montara
Everything Everywhere All at Once: Object Store Native

Everything Everywhere All at Once: Object Store ...
Vishnu Vasanth | e6data
The Modern Database Debate: PostgreSQL and MongoDB

The Modern Database Debate: PostgreSQL and MongoDB
Franck Pachot | MongoDB
Bridging the AI Implementation gap: Strategies for Embedding Data Professionals with Business Units

Bridging the AI Implementation gap: Strategies ...
Josh Curl | Hightouch
How to Build Your Own Model Router

How to Build Your Own Model Router
Tomás Kofman | Not Diamond
Designing & Engineering a Viral Multi-Model AI Workflow: From Prototype to 300K Users in Two Weeks

Designing & Engineering a Viral Multi-Model AI ...
David Wilson | Hunch Tools
Go-To-Market Data Enrichment: Practical Strategies to Drive Business Value

Go-To-Market Data Enrichment: Practical ...
Nathan Sooter | 1Password
The Unofficial Guide to Apple’s Private Cloud Compute

The Unofficial Guide to Apple’s Private Cloud ...
Jonathan Mortensen | Confident Security
AI-Powered Automation: Supercharge Data-Intensive Workflows with Intelligent Agents

AI-Powered Automation: Supercharge Data-Intensive ...
Skip Everling | Kolena
Make Too Much Knowledge Just Enough. Massive Scale RAG and GraphRAG with Open Source

Make Too Much Knowledge Just Enough. Massive ...
Skyler Thomas | Cake AI
More Than Query: Future Directions of Query Languages, from SQL to Morel

More Than Query: Future Directions of Query ...
Julian Hyde | Google
OpenAI’s Responses API: A New Foundation for Building with Models & Tools

OpenAI’s Responses API: A New Foundation for ...
Nikunj Handa | OpenAI
Building LLM Applications with Llama Stack

Building LLM Applications with Llama Stack
Ashwin Bharambe | Meta
Models as Tools: My Perspective On the Matter

Models as Tools: My Perspective On the Matter
Ravin Kumar | Google DeepMind
Building a Data Foundation for Multimodal Foundation Models

Building a Data Foundation for Multimodal ...
Ethan Rosenthal | Runway
Real-Time Data Infrastructure and AI: Powering the Next Generation of Analytics

Real-Time Data Infrastructure and AI: Powering ...
Martin Casado | a16z
Introduction to Google DeepMind's Models: Gemini 2.0, Imagen 3, and Veo

Introduction to Google DeepMind's Models: Gemini ...
Paige Bailey | Google
A Local-First approach to extremely fast Streaming Visualization

A Local-First approach to extremely fast ...
Parham Parvizi | Prospective
AI Launchpad 2025: Quesma

AI Launchpad 2025: Quesma
Jacek Migdal | Quesma
AI Launchpad 2025: TopK

AI Launchpad 2025: TopK
Marek Galovic | TopK
AI Launchpad 2025: Guilde

AI Launchpad 2025: Guilde
Schuyler Brown | Guilde
AI Launchpad 2025: Tower

AI Launchpad 2025: Tower
Serhii Sokolenko | Tower
AI Launchpad 2025: Mooncake Labs

AI Launchpad 2025: Mooncake Labs
Pranav Aurora | Mooncake Labs
AI Launchpad 2025: Nao

AI Launchpad 2025: Nao
Christophe Blefari | Nao
Chaos By Design: Simulation Based Testing for AI Agents

Chaos By Design: Simulation Based Testing for AI ...
Willem Pienaar | Cleric
More Than a Vibe: AI-Driven SQL that Actually Works

More Than a Vibe: AI-Driven SQL that Actually ...
Jacob Matson | MotherDuck
The Power of Low Latency Data for AI Apps

The Power of Low Latency Data for AI Apps
Cole Bowden | Firebolt
Powering AI Workflows with Tabular Graphs

Powering AI Workflows with Tabular Graphs
Rui Lopes | DataLinks
RAGs to Riches: Engineering the Future of LLM Systems

RAGs to Riches: Engineering the Future of LLM ...
Denis Yarats | Perplexity
Guardrails for the Future: AI Safety and Responsible AI in Practice

Guardrails for the Future: AI Safety and ...
Krishnaram Kenthapadi | Oracle Health
Bringing Trillions to Reality: How SambaNova’s Memory-Centric Design Powers Agentic AI and GenAI Workflows for Enterprise Data

Bringing Trillions to Reality: How SambaNova’s ...
Sumti Jairath | SambaNova Systems
The Model is the Product

The Model is the Product
Han-chung Lee | Moody's Analytics
Data Meets Intelligence: Where the Data Infra & AI Stack Converge

Data Meets Intelligence: Where the Data Infra & ...
Naveen Rao | Databricks
Legal Agency: Building Domain-specific Agents for Enterprise

Legal Agency: Building Domain-specific Agents for ...
Niko Grupen | Harvey
The Model is Not the Product

The Model is Not the Product
Hamel Husain | Parlance Labs
Building High-Impact Data Teams in an AI-Driven World

Building High-Impact Data Teams in an AI-Driven ...
Alexa Garrison | Splice
OramaCore: A Search Database with LLMs Built-In

OramaCore: A Search Database with LLMs Built-In
Issac Roth | Orama
OAKS: Open Agentic Knowledge Stack

OAKS: Open Agentic Knowledge Stack
Alexy Khrabrov | Neo4j
Look Ma, No Data Warehouse!

Look Ma, No Data Warehouse!
George Fraser | Fivetran
AI Your Way with All-In-One Access

AI Your Way with All-In-One Access
Yusuf Ozuysal | Snowflake
From Concurrency Control to Concurrent Scheduling

From Concurrency Control to Concurrent Scheduling
Natacha Crooks | UC Berkeley
Data Mesh: Buzzword or the Right Guide Towards a Tactical Recipe to Improve Event-Data Quality?

Data Mesh: Buzzword or the Right Guide Towards a ...
Stefania Olafsdóttir | Avo
Building SOTA Search: It's Ranking All the Way Down

Building SOTA Search: It's Ranking All the Way ...
David Karam | Pi Labs
Beyond Pipelines: Introducing Pub/Sub for Tables

Beyond Pipelines: Introducing Pub/Sub for Tables
Arvind Prabhakar | Tabsdata