At Skillshare, we needed a workflow platform to orchestrate and execute ETL and model training jobs. Like many startups, we faced the challenge of a small data engineering staff (n=1) and limited resources, and we wanted a solution that would be scalable, reliable, and maintainable by both our Data Science and SRE teams.
We decided to leverage our existing Kubernetes-based application infrastructure by adopting Argo, an open-source, Kubernetes-native workflow engine. Argo gives us scalable execution, versioned workflows, and consistent deployment across development, QA, and production environments while minimizing operational overhead.
In this talk, I will discuss the challenges of setting up and maintaining a workflow platform with a small data team, why we chose Argo, and our experience implementing it.
Kai Rikhye has over ten years of experience building data infrastructure and tools at early- and mid-stage startups. Currently, he is the first data engineer at Skillshare, focusing on pipelines and warehousing. He is well-versed in the technical and organizational challenges of scaling data engineering functions, and draws inspiration from software engineering and SRE best practices.