Machine learning is being deployed in a growing number of applications that demand real-time, accurate, and robust predictions under heavy serving loads. However, most machine learning frameworks and systems address only model training, not deployment.

Clipper is an open-source, general-purpose model-serving system that addresses these challenges. Interposing between applications that consume predictions and the machine-learning models that produce predictions, Clipper simplifies the model deployment process by adopting a modular serving architecture and isolating models in their own containers, allowing them to be evaluated using the same runtime environment as that used during training.

Clipper's modular architecture provides simple mechanisms for scaling out models to meet increased throughput demands and performing fine-grained physical resource allocation for each model. Further, by abstracting models behind a uniform serving interface, Clipper allows developers to compose many machine-learning models within a single application to support increasingly common techniques such as ensemble methods, multi-armed bandit algorithms, and prediction cascades.
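To make the composition idea concrete, here is a minimal sketch of how an ensemble and a prediction cascade might be built once every model sits behind one interface. It is not Clipper's API: the `Model` type, `ensemble_predict`, `cascade_predict`, the toy models, and the 0.9 confidence threshold are all hypothetical names and values chosen for illustration.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

# Hypothetical uniform interface (an assumption, not Clipper's API):
# every model maps an input vector to a (label, confidence) pair.
Prediction = Tuple[str, float]
Model = Callable[[Sequence[float]], Prediction]


def ensemble_predict(models: List[Model], x: Sequence[float]) -> str:
    """Majority vote over the labels returned by each model."""
    votes = Counter(model(x)[0] for model in models)
    return votes.most_common(1)[0][0]


def cascade_predict(fast: Model, slow: Model, x: Sequence[float],
                    threshold: float = 0.9) -> str:
    """Use the fast model's answer unless its confidence is below threshold."""
    label, confidence = fast(x)
    if confidence >= threshold:
        return label
    return slow(x)[0]


# Toy stand-in models, purely illustrative.
def sign_model(x: Sequence[float]) -> Prediction:
    return ("pos" if sum(x) > 0 else "neg", 0.6)


def first_model(x: Sequence[float]) -> Prediction:
    return ("pos" if x[0] > 0 else "neg", 0.95)


def always_neg(x: Sequence[float]) -> Prediction:
    return ("neg", 0.5)


if __name__ == "__main__":
    x = [1.0, 2.0]
    print(ensemble_predict([sign_model, first_model, always_neg], x))
    print(cascade_predict(sign_model, always_neg, x))
```

Because both helpers depend only on the shared call signature, swapping one model for another requires no change to the application code, which is the property the uniform interface is meant to provide.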

In this talk, Joey will provide an overview of the Clipper serving system and discuss the experience of transforming a research prototype into an active, open-source system. He will then discuss some recent work on end-to-end cost-aware resource allocation and scheduling for multi-model applications.

Joseph Gonzalez

Assistant Professor & Co-Director | UC Berkeley

Joseph Gonzalez is a 6th-year PhD student in Dr. John Gilbert's CSC lab here at UCSB. His areas of interest include graph theory, scientific computing, and high-performance computing. In the past, he has collaborated with Erik Boman and others at Sandia National Laboratories on a project using linear algebraic tools to analyze large unstructured networks.
