Here's our October 2020 roundup of good reads and podcast episodes that might be relevant to you as a data professional:
Created by Berlin-based developer Jan Oberhauser in 2019, n8n presents itself as “a free and open workflow automation tool”. Think of it as a locally hosted Zapier on steroids.
Here's our September 2020 roundup of good reads and podcast episodes that might be relevant to you as a data professional:
Our founder Pete Soderling co-authored a follow-on piece to his previous post with Abe Gong, core contributor to Great Expectations, and Sarah Catanzaro, Partner at Amplify Partners, for which they interviewed the makers of some of the hottest data tools. The focus is still the same: rather than what their data tools can do, we hear about what they don't do, as a way to better understand how they fit together. From ApertureData to Xplenty, this new installment covers 21 more tools, and you can read it here.
Here's our August 2020 roundup of good reads and great podcast episodes for anyone working with data:
AI engineer and author J.T. Wolohan was recently a guest on Heroku’s Code[ish] podcast to discuss his book, “Mastering Large Datasets with Python.” Listen to the episode here or read the transcript for some practical advice on using Python to deal with massive datasets, especially in the context of machine learning.
With more than 1,300 stars on GitHub, Apache Hudi is a great open source solution for companies with large analytical datasets to quickly ingest data onto HDFS or cloud storage.
Here's our July 2020 roundup of relevant links for data professionals, from blog posts to podcast episodes:
Software Engineering Daily recently invited Apache Airflow's creator Maxime Beauchemin and Astronomer engineers Vikram Koka and Ash Berlin-Taylor to discuss the state of Airflow. Listen to the podcast episode or read the transcript to hear their comments on Airflow's use cases, its purpose, the open source ecosystem, and more.
Apache Iceberg is an open table format for very large analytic datasets. You can use it with Presto or Spark to add tables in a high-performance format that vows to work just like a SQL table.
Cube.js is an open source analytics framework meant to answer the "lack of tools for software engineers who are building production, customer-facing applications and need to embed analytics features into these applications," its co-founder and CEO Artyom Keydunov explained in a blog post.
Here's our monthly roundup of relevant links for data professionals, from blog posts and tutorials to podcast episodes:
Peter Skomoroch and Mike Loukides co-authored a very interesting post on what makes product management different in the context of AI. Drawing on the specifics of AI software development, they make a series of recommendations for a process that also takes business priorities into account. Their post ends with a list of relevant resources, so it is worth checking out.