Data Council Blog (3)

19/08/20 09:30 | by Data Council | in Data Science, Data Engineering, Open Source, Analytics

Large Datasets, Are Dashboards Dead, and More: Top 10 Links From Across the Web

Here's our August 2020 roundup of good reads and great podcast episodes for anyone working with data:

1. Processing Large Datasets with Python

AI engineer and author J.T. Wolohan was recently a guest of the Heroku’s Code[ish] podcast to discuss the contents of his book, “Mastering Large Datasets with Python.” Listen to the episode here or read the transcript for some practical advice on using Python to deal with massive datasets, especially in the context of machine learning.

07/08/20 10:15 | by Data Council | in Data Engineering, Open Source, Data Lakes

Open Source Highlight: Apache Hudi

With more than 1,300 stars on GitHub, Apache Hudi is a great open source solution for companies with large analytical datasets to quickly ingest data onto HDFS or cloud storage.

17/07/20 12:34 | by Data Council | in Data Science, Data Engineering, Data Warehousing, Machine Learning, Open Source, Analytics

Apache Airflow, Beyond Spreadsheets, and More: Top 10 Links From Across the Web

Here's our July 2020 roundup of relevant links for data professionals, from blog posts to podcast episodes:

1. The State of Airflow

Software Engineering Daily recently invited Apache Airflow's creator Maxime Beauchemin and Astronomer engineers Vikram Koka and Ash Berlin-Taylor to discuss the state of Airflow. Listen to the podcast episode or read the transcript to hear their comments on Airflow's use cases, its purpose, the open source ecosystem, and more.

17/07/20 12:16 | by Data Council | in Data Engineering, SQL, Open Source

Open Source Highlight: Apache Iceberg

Apache Iceberg is an open table format for very large analytic datasets. You can use it with Presto or Spark to add tables that use a high-performance format that vows to work just like a SQL table.

18/06/20 08:29 | by Data Council | in Data Science, Data Engineering, Data Infrastructure, Machine Learning, Analytics, Careers

AGI, Dask, Feature Stores, and More: Top 10 Links From Across the Web

Here's our June 2020 roundup of relevant links for data professionals, from blog posts to podcast episodes:

1. Self-Supervised Learning vs. AGI

"AGI does not exist — there is no such thing as general intelligence. We can talk about rat-level intelligence, cat-level intelligence, dog-level intelligence, or human-level intelligence, but not artificial general intelligence," Yann LeCun declared during an online session of the International Conference on Learning Representation (ICLR) 2020, which VentureBeat wrote about. Together with fellow Turing Award winner Yoshua Bengio, he advocated for pursuing humanlike AI through "self-supervised learning."

15/06/20 04:56 | by Anna Heim | in Data Science, Data Engineering, Analytics, Careers

Emerging Data Roles: The Analytics Engineer

Analytics Engineer: this term has started showing up in blog posts and job listings. It all happened quickly; just a couple of years ago, it wasn't a thing our friends in the data ecosystem talked about. So how did it start trending, what is it exactly, and is it here to stay? We decided to take a closer look, and here's what we found out.

03/06/20 06:39 | by Data Council | in Data Science, Data Engineering, Data Visualization, Startups, Open Source, Analytics

Open Source Highlight: Cube.js

Cube.js is an open source analytics framework meant to answer the "lack of tools for software engineers who are building production, customer-facing applications and need to embed analytics features into these applications," its co-founder and CEO Artyom Keydunov explained in a blog post.

18/05/20 10:00 | by Data Council | in Data Science, Data Engineering, Machine Learning, Open Source

What Data Tools DON’T Do, CD4ML and NoSQL: Top 10 Links from Across the Web

Here's our monthly roundup of relevant links for data professionals, from blog posts and tutorials to podcast episodes:

1. Product Management for AI

Peter Skomoroch and Mike Loukides co-authored a very interesting post on what makes product management different in the context of AI. Based on the specificities of AI software development, they make a series of recommendations for a process that also takes business priorities into account. Their post also ends with a list of relevant resources, so it is worth checking out.

15/05/20 03:01 | by Pete Soderling | in Data Science, Data Engineering, Machine Learning, Analytics

25 Hot New Data Tools and What They DON’T Do

“Wait, do tool X and tool Y work together? I thought they were competitive.”
There are dozens of new tools in the fast-growing data ecosystem today. Together, they are reshaping data work in exciting, productive and often surprising ways. The seeds of the data landscape for the next decade have been planted, and they’re growing wildly.
Turns out, cultivating a new ecosystem is messy.

14/05/20 07:37 | by Data Council | in Data Science, Data Visualization, Machine Learning, Open Source

Open Source Highlight: Streamlit

Streamlit officially launched out of beta on October 1st, 2019 with the promise to "turn Python scripts into beautiful ML tools." On the same day, Google's AI-focused venture fund Gradient Ventures announced its investment into the startup, which has since then attracted a considerable amount of attention despite its young age.

Data Council Blog

Large Datasets, Are Dashboards Dead, and More: Top 10 Links From Across the Web

1. Processing Large Datasets with Python

Open Source Highlight: Apache Hudi

Apache Airflow, Beyond Spreadsheets, and More: Top 10 Links From Across the Web

1. The State of Airflow

Open Source Highlight: Apache Iceberg

AGI, Dask, Feature Stores, and More: Top 10 Links From Across the Web

1. Self-Supervised Learning vs. AGI

Emerging Data Roles: The Analytics Engineer

Open Source Highlight: Cube.js

What Data Tools DON’T Do, CD4ML and NoSQL: Top 10 Links from Across the Web

1. Product Management for AI

25 Hot New Data Tools and What They DON’T Do

Open Source Highlight: Streamlit

Subscribe to Email Updates

Fresh Posts

Categories