Technical Talks

What does it take to build a Postgres specialized data movement tool?

Sai Srirampur | CEO | PeerDB

Every datastore is unique, with its own set of features and data modeling characteristics. For example, PostgreSQL has 4 ways to ingest data, 5 ways to read data, 300+ data types and 300+ database configs. Building data movement solutions that scale therefore requires close attention to the unique design and capabilities of each datastore.

However, most existing data movement tools focus on breadth over quality of connectors. They often fail at scale due to painfully slow syncs, lack of reliability, and lack of features. These challenges are reflected in the number of companies building in-house solutions and maintaining large data engineering teams.

At PeerDB, we are building a specialized data movement tool for Postgres, the world’s most adopted open source database. We are doing so with an emphasis on the intricacies of Postgres, to provide a high-quality experience for moving data into and out of Postgres, to and from other datastores.

In this talk, I will do a deep dive into what it takes to build a Postgres-specialized data movement tool.
1. I will cover the architectural tradeoffs we made: why we chose a peer-to-peer architecture that keeps datastores at the center over a hub-and-spoke one that optimizes for breadth of connectors.
2. I will explain how we are implementing several Postgres-native and infrastructural optimizations from day one, including:
a. how we partition a Postgres table using internal tuple identifiers (CTIDs) and implement parallel snapshotting to move TBs of data in hours instead of days;
b. how we preserve data type nativity while moving specialized types such as geospatial, JSONB, etc.;
c. how we are building a Postgres-compatible SQL layer, written in Rust, to easily build and manage data pipelines.
3. To sum up, I will share what needs to go into Postgres upstream to make data movement a first-class citizen.
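
As a rough illustration of the CTID-based partitioning mentioned in the list above, the sketch below splits a table's heap pages into contiguous block ranges and builds one range query per partition, so that several workers could read the table in parallel. This is a minimal sketch and not PeerDB's actual implementation: the helper names ctid_ranges and partition_query, the table name, and the page count are hypothetical. In practice the page count could be estimated from pg_class.relpages, and all workers would need to share one exported snapshot for the partitions to be mutually consistent.

```python
from typing import List, Optional, Tuple

# A partition is a half-open range of heap block numbers [start, end).
# end=None means "to the end of the table".
BlockRange = Tuple[int, Optional[int]]


def ctid_ranges(total_pages: int, num_partitions: int) -> List[BlockRange]:
    """Split total_pages heap pages into at most num_partitions block ranges.

    Hypothetical helper: total_pages is only an estimate, so the last
    range is left open-ended to cover pages beyond the estimate.
    """
    if total_pages < 0 or num_partitions < 1:
        raise ValueError("total_pages must be >= 0 and num_partitions >= 1")
    if total_pages == 0:
        return [(0, None)]
    step = -(-total_pages // num_partitions)  # ceiling division
    starts = list(range(0, total_pages, step))
    ranges: List[BlockRange] = []
    for i, start in enumerate(starts):
        end = starts[i + 1] if i + 1 < len(starts) else None
        ranges.append((start, end))
    return ranges


def partition_query(table: str, start: int, end: Optional[int]) -> str:
    """Build a SELECT restricted to one CTID block range.

    The table name is interpolated directly for brevity; real code
    should quote identifiers properly.
    """
    lower = f"ctid >= '({start},0)'::tid"
    if end is None:
        return f"SELECT * FROM {table} WHERE {lower}"
    return f"SELECT * FROM {table} WHERE {lower} AND ctid < '({end},0)'::tid"


if __name__ == "__main__":
    # Assumed example: a 10-page table read by 3 parallel workers.
    for start, end in ctid_ranges(10, 3):
        print(partition_query("public.events", start, end))
```

Each generated query can be handed to a separate connection. Because CTIDs reflect physical row locations and can change when rows are updated or the table is rewritten, a scheme like this only makes sense for a one-time snapshot read, not as a durable row identifier.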

Sai Srirampur
CEO | PeerDB

Sai Srirampur co-founded PeerDB to build a first-class data movement tool for Postgres, the world’s most widely used open source database. PeerDB's initial focus is to provide a fast and cost-effective way to move data from Postgres to data warehouses, queues and storage.
Before PeerDB, Sai worked at Microsoft, leading solutions engineering for all Postgres services on Azure. Before that, he worked at Citus Data as an early employee and saw it through the Microsoft acquisition. Over the past 8 years, he has been an active part of the Postgres community, helping hundreds of customers, including enterprises and SMBs, adopt Postgres.