Data Council Blog

22/04/20 11:35 | by Data Council | in Data Science, Data Engineering, Machine Learning, Artificial Intelligence, Learning

Data Science, Data Analytics, Data Engineering and Artificial Intelligence: 11 Online Courses You Should Check Out

With COVID-19 forcing almost one billion people around the world to shelter in place, many have turned to new activities, such as drawing, baking, gardening… or online learning. If that doesn't sound like you, don't feel guilty by any means – sometimes, surviving is enough! But if you want to learn more about data science, data engineering and artificial intelligence, we are here for you.

This is why we came up with this list of courses that can help you prepare for a future job in the data field, upgrade your existing skills, or just satisfy your personal curiosity. From free entry-level courses to full-time bootcamps, here's our selection for you to check out:

06/04/20 12:16 | by Data Council | in Data Science, Data Engineering, Data Visualization, Machine Learning

PyTorch Lightning, ksqlDB and More: Top 10 Links from Across the Web

Here are 10 recent relevant links for data professionals, from blog posts and tutorials to podcast episodes:

1. PyTorch Lightning: a gentle introduction

Former Data Council speaker Will Falcon published an interesting post on PyTorch Lightning, the lightweight PyTorch wrapper born out of his Ph.D. AI research at NYU CILVR and Facebook AI Research (FAIR). Framed as "a gentle introduction", it includes a side-by-side comparison of building a simple MNIST classifier in PyTorch and in PyTorch Lightning, to illustrate how to refactor one into the other. This is highly recommended reading if you work on AI/ML, whether as a professional researcher, as a student or in production.

26/07/19 05:40 | by Data Council | in Data Engineering, data engineer salary

Data Engineer Salaries Around The World (2019)

Your potential salary as a data engineer heavily depends on where you are based; but cost of living also varies around the world. Wondering where you can actually earn more? Let's take a closer look at the United States, Europe and Asia to compare and benchmark data engineering salaries.

08/11/18 10:05 | by Pete Soderling | in Data Science, datacoral, Data Infrastructure, Data Pipelines

Should Datacoral Power Your New Data Infrastructure?

Today's companies aim to be data-driven, but data infrastructure is time intensive and costly to build, maintain, and secure. A coral is the exoskeleton of a small marine animal that attaches and grows on almost anything. Once it starts growing, it can create large reefs, which support a diverse ecosystem of plants and animals. So what happens if you apply that philosophy to the world of data?

13/04/18 11:02 | by Pete Soderling | in Data Engineering, Event Updates, Data Visualization, ops

How Histograms Can Help Improve Your Ops Monitoring

Life comes at you fast. Data even more so ...

When the engineering team at Circonus began to feel the pain of systems at scale, there were some common observability tools that provided them with a firehose of operational time series telemetry. However, managing all that data, let alone making sense of it, was extremely difficult. And the existing tools they tried for managing time series metrics either didn't give mathematical insight or fell over at modest workloads. They needed a better solution. So they decided to look into other statistical tooling options that had proven themselves for decades in other industries.

13/04/18 07:30 | by Robert Winslow | in Data Engineering, Data Warehouse, Data Strategy

Amberdata - Featured Startup SF '18

In this blog series leading up to our SF18 conference, we invite our featured startups to tell us more about their data engineering challenges. Today, we speak with Amberdata, an early-stage company building analysis tools for blockchain infrastructure, applications, and transactions.

13/04/18 05:30 | by Robert Winslow | in Data Engineering, Data Warehouse, Data Strategy

Intermix - Featured Startup SF '18

In this blog series leading up to our SF18 conference, we invite our featured startups to tell us more about their data engineering challenges. Today, we speak with Intermix, an early-stage company building performance analytics tools for Amazon Redshift.

11/04/18 12:38 | by Pete Soderling | in Data Science, Data Engineering, Event Updates, Startups, Apache Arrow

How to "Democratize" the Responsibility for Data Quality Across your Organization

Writing endless data transformations wasn't sustainable for an engineering team handling hundreds of inputs. Here's how Clover Health enabled their business users to help.

It's rare to find an ETL system that's completely static. As organizations change and grow, they develop new business requirements. Because of this, their data pipelines must change and adapt, ultimately becoming more robust and full-featured. Yet constant development can make already brittle ETL systems seem even more fragile.

Furthermore, systems with large numbers of different types of inputs bring special challenges: building, testing and managing an exploding number of data transformations can become a daunting project for the engineering team.

The Clover Health ETL system supports hundreds of inputs and more than 500 custom transformations in production as well as a large number of custom connections between their different ETL pipelines. When hearing about the magnitude of the system, one might rightfully wonder, "how does Clover guarantee and maintain data quality across so many different inputs and transforms?"

Exploring the development trajectory of Clover's system makes for a fascinating story; their data team's successes and pitfalls offer illustrative lessons for other engineers seeking to increase the robustness of their own ETL systems.

09/04/18 12:38 | by Eric Hanson | in Data Engineering, Databases, BigQuery, memsql, SQL

Shattering the Trillion-Rows-Per-Second Barrier With MemSQL

Recently at a conference, I had the privilege of demonstrating MemSQL processing over a trillion rows per second on the latest Intel Skylake servers.

05/04/18 11:00 | by Robert Winslow | in Data Engineering, Data Warehouse, Data Strategy

NuCypher - Featured Startup SF '18

In this blog series leading up to our SF18 conference, we invite our featured startups to tell us more about their data engineering challenges. Today, we speak with NuCypher, an early-stage company building a decentralized encryption service.
