MongoDB is currently developing a predictive auto-scaling algorithm for our fleet of half a million database servers, which run in three public clouds and serve tens of thousands of customers. Each customer's demand varies dramatically over time, sometimes cyclically and sometimes not, which leads to periods of over- and under-utilization. Our algorithm predicts each customer's demand based on past patterns and recent trends, then scales resources in advance of changes in demand. When it works well, predictive auto-scaling prevents overloads, saves money for us and our customers, and reduces our carbon footprint. But if the algorithm makes a mistake, it can be very costly.
We will describe MongoDB's predictive auto-scaling algorithm, the machine learning techniques involved, which methods have worked well and which have not, and how we plan to safely deploy such a risky system.
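To make the idea of "predict, then scale in advance" concrete, here is a minimal sketch in Python. It is not MongoDB's algorithm, which the abstract does not specify. It swaps in a simple seasonal-naive forecast with a trend adjustment, and every name in it (`forecast_demand`, `choose_capacity`, `period`, `headroom`, and the tier sizes) is a hypothetical chosen for illustration.

```python
from statistics import mean


def forecast_demand(history, period, horizon):
    """Illustrative seasonal-naive forecast with a recent-trend shift.

    history: past demand samples, oldest first (hypothetical units).
    period:  number of samples in one cycle (for example, 24 hourly samples).
    horizon: number of future samples to predict.
    """
    if len(history) < 2 * period:
        raise ValueError("need at least two full periods of history")
    last_cycle = history[-period:]
    prev_cycle = history[-2 * period:-period]
    # Recent trend: change in average demand between the last two cycles.
    trend = mean(last_cycle) - mean(prev_cycle)
    # Past pattern: repeat the last cycle, shifted by the trend, floored at 0.
    return [max(0.0, last_cycle[i % period] + trend) for i in range(horizon)]


def choose_capacity(forecast, tiers, headroom=1.2):
    """Pick the smallest tier covering peak forecast demand plus headroom.

    Falls back to the largest tier if none is big enough.
    """
    needed = max(forecast) * headroom
    for tier in sorted(tiers):
        if tier >= needed:
            return tier
    return max(tiers)


if __name__ == "__main__":
    # Assumed toy data: two cycles of four samples each, with rising demand.
    history = [10, 20, 30, 10, 20, 30, 40, 20]
    forecast = forecast_demand(history, period=4, horizon=4)
    print(forecast, choose_capacity(forecast, tiers=[16, 32, 64, 128]))
```

The sketch scales to the forecast peak rather than the current load, which is what distinguishes predictive from reactive auto-scaling. The headroom factor is one simple way to limit the cost of an under-prediction, at the price of some over-provisioning.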
This talk is a co-presentation by A. Jesse Jiryu Davis, Senior Staff Research Engineer, and Matthieu Humeau, Senior Data Scientist, both at MongoDB.
A. Jesse Jiryu Davis is a distributed systems researcher at MongoDB. He lives in upstate New York.
Matthieu is a Senior Data Scientist at MongoDB, where he leads efforts to discover and build predictive models that address inefficiencies across the data platform. Previously, at AWS, he built predictive solutions to manage revenue and costs across the full product suite. Matthieu lives in NYC and holds graduate degrees in Applied Maths from Ecole Centrale Paris and Analytics from MIT.