Technical Talks

Oops I did it Again -- Adapting a Pop Music Identifier to Find Syndicated Content in Talk Radio
At Cortico, we are making talk radio searchable in order to surface local voices and the range of issues and opinions being discussed across the country. That brings a host of problems, including large amounts of duplicate content, from syndicated shows to repeated commercials.
This talk will go through how we adapted the technology used to identify popular songs to automatically detect duplicate content within the roughly 4,000 hours of audio we collect per day from nearly 200 radio stations. Using audio fingerprinting, we encode and compare subsequences of audio to identify near duplicates.
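To make the idea of comparing fingerprinted subsequences concrete, here is a minimal sketch of one common song-identification approach, offset voting. It is an illustration only, not Cortico's implementation. It assumes each recording has already been reduced to a list of integer sub-fingerprints, one per audio frame. The function names, the station IDs, and the min_votes threshold are all hypothetical.

```python
from collections import Counter, defaultdict


def build_index(fingerprints):
    """Map each sub-fingerprint to every (recording_id, frame) where it occurs.

    `fingerprints` is assumed to be a dict of recording_id -> list of ints,
    one int per audio frame (the fingerprint extraction itself is not shown).
    """
    index = defaultdict(list)
    for rec_id, hashes in fingerprints.items():
        for frame, h in enumerate(hashes):
            index[h].append((rec_id, frame))
    return index


def find_duplicates(query_hashes, index, min_votes=3):
    """Return (recording_id, offset, votes) for recordings that align with the query.

    Every matching sub-fingerprint votes for a (recording, time offset) pair.
    A true duplicate piles its votes onto a single offset, while chance
    collisions scatter across many offsets and stay below `min_votes`.
    """
    votes = Counter()
    for q_frame, h in enumerate(query_hashes):
        for rec_id, frame in index.get(h, ()):
            votes[(rec_id, frame - q_frame)] += 1
    return [
        (rec_id, offset, n)
        for (rec_id, offset), n in votes.most_common()
        if n >= min_votes
    ]


# Toy data: made-up sub-fingerprints for two hypothetical stations.
fingerprints = {
    "station_a": [11, 42, 7, 93, 58, 21, 64],
    "station_b": [5, 9, 13, 17, 42, 2],
}
index = build_index(fingerprints)

# A clip whose second frame was corrupted (93 became 0) still aligns
# with station_a at frame offset 2 on the remaining three frames.
matches = find_duplicates([7, 0, 58, 21], index)
```

Because matching relies on votes at a consistent offset rather than on exact equality of the whole clip, a few corrupted frames do not prevent a match, which is what makes this style of comparison suitable for near duplicates.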
To do this at scale, we set up an ephemeral Spark cluster within Kubernetes to find duplicates once a day. From this data, we can begin to map out the space of American talk radio.

Software Engineer
Allison King
Cortico
Allison is a software engineer at Cortico, a nonprofit that helps elevate local voices and share stories from communities all over the United States. Previously, she worked at MIT Lincoln Laboratory, researching secret stuff for the Department of Defense.