Technical Talks

Building a Recursive BigQuery Mapper

Darren McCleary | Software Engineer | The New York Times

The New York Times Crossword is a cross-platform game that users can play anywhere. A puzzler can start a game on their phone during the morning bus ride to work, ponder it on their work laptop during the day, then complete it on their tablet before bed.

During our replatforming, we migrated our data from MySQL to GCP Datastore. This was a win from an application standpoint, but we gave up the ability to execute bespoke, complex SQL queries to gain insights into our users and how they interact with our Crossword product. With ever more detailed data available and over 300,000 paying subscribers, giving up that ability was necessary in the short term, but painful.

So we wrote a Go application that replicates our data from Datastore into BigQuery orders of magnitude faster than alternative solutions. The application is recursive and scales itself to meet the needs of the data load it’s been given. Using this system, we achieved over 1.5 million streaming inserts per second. Now we can answer the questions everyone always asks about the NYT Crossword: How many people played the Mini today? Was it harder than last week's? What quantile does my time fall in? What word did solvers struggle with the most?

Darren is a Senior Software Engineer at The New York Times.