Data Council - Data Science, Machine Learning, AI, and Engineering Blog


ETL and the Question of Happiness


No one is happy with fragile ETL pipelines. But it doesn't need to be that way.

One might surmise that data "analysis" is, first and foremost, about data "access." It goes without saying that someone in the analyst's role must first obtain access to the data they wish to analyze. And with data now spread across both the inside and the outside of the enterprise (think of your on-premises data stores, plus all the cloud and SaaS vendors you're currently using), modern-day analysts face deeper challenges than ever before in obtaining access to the data they need.

And of course, techno-philosophical concepts like "democratizing access to data" do nothing at all to help one overcome the actual technical integration challenges that stand in the way of such unfettered access to one's data.

To solve this problem, DBAs and engineers have traditionally built ETL systems (ETL is an acronym for "extract, transform, load"). ETL refers to the processes and data pipelines that move data from systems A, B, C, etc. into whatever data store is designated for reporting, so that analysts and quants can get the access to data they need to perform their analysis.
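To make the three steps concrete, here is a minimal sketch of an ETL pipeline in Python. Everything in it is an illustrative assumption rather than anything from Christian's talk or Etleap's product: the CSV string stands in for a source system, an in-memory SQLite database stands in for the reporting store, and the names `SOURCE_CSV`, `payments`, and `run_pipeline` are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical source data standing in for "system A" (not from the article).
SOURCE_CSV = """user_id,amount_cents
1,1250
2,399
1,700
"""


def extract(raw_csv):
    """Extract: read rows out of the source system (here, a CSV string)."""
    return list(csv.DictReader(io.StringIO(raw_csv)))


def transform(rows):
    """Transform: cast string fields to numbers and convert cents to dollars."""
    return [(int(r["user_id"]), int(r["amount_cents"]) / 100) for r in rows]


def load(records, conn):
    """Load: write the cleaned records into the reporting store."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS payments (user_id INTEGER, amount_usd REAL)"
    )
    conn.executemany("INSERT INTO payments VALUES (?, ?)", records)
    conn.commit()


def run_pipeline(raw_csv, conn):
    """Chain the three stages: extract, then transform, then load."""
    load(transform(extract(raw_csv)), conn)


# An in-memory SQLite database plays the role of the reporting data store.
conn = sqlite3.connect(":memory:")
run_pipeline(SOURCE_CSV, conn)

# The analyst's side: query the reporting store once the data has landed.
totals = conn.execute(
    "SELECT user_id, SUM(amount_usd) FROM payments "
    "GROUP BY user_id ORDER BY user_id"
).fetchall()
```

Even in a toy like this, each stage can fail on its own (a malformed row in `transform`, a schema mismatch in `load`), which hints at why real pipelines tend to need ongoing care.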

Historically a part of the engineering domain, setting up a functional ETL system is often considered a complex task. But what if a world existed where data analysts and scientists could do ETL tasks themselves? 

Many people tend to think that the management of ETL pipelines should be kept within the confines of IT departments. However, it turns out that successfully operating one's ETL pipeline is often even more work than building it in the first place. ETL systems are traditionally fragile and prone to run-time errors, often requiring what seems like endless debugging to keep them running smoothly. 

Christian Romming, the founder of Etleap and a DataEngConf NYC speaker this year, believes that there might be a better way. Christian is a proponent of giving analysts control over their own pipelines with a set of tools that can make them more productive and thus, happier. His argument is that this, in turn, ends up being less work for the engineers as well.


Meet Christian Romming of Etleap


Christian is the founder and CEO of Etleap. Before that, he was CTO at VigLink, an ad-tech startup, where his team built and maintained complex infrastructure to support data scientists and analysts in understanding and monetizing user behavior online. He is passionate about data and distributed systems.

 


In his talk at DataEngConf NYC, Christian will look at ETL systems from the point of view of data analysts and discuss techniques, applicable to any ETL setup, that give analysts more control and data engineers fewer headaches.

Christian's goal is for listeners to come away with concrete tips they can apply to their own ETL systems to improve manageability and flexibility. As a bonus, data engineers will walk away with an appreciation for what’s important for the data analysts in their organization.

Because ultimately, happy data analysts mean happy data engineers. 

 


Data Engineering, Event Updates, Databases


Written by Pete Soderling

Pete is a software engineer, 3x founder and angel investor. As the founder of Hakka Labs and DataEngConf he loves to build community for software engineers and has some bumps and bruises to prove it. He's spoken at conferences like RSA Security, O'Reilly Strata and TEDx, helped organize QCon events, and launched data meetups around the world. He's a mentor at 500 Startups in SF but he lives in Jackson Hole, WY, where the snow is far better.
