Statistical and Computational Challenges of Real-Time News Clustering

Jeiran Jahani, Research Data Scientist | Chartbeat

The proliferation of online news poses a challenge for journalists, news consumers, and policymakers who wish to take the pulse of the world, because manually browsing and summarizing this ever-growing volume of data is infeasible. Real-time story clustering can address this problem, but it demands statistically robust and computationally efficient methods. Although heavily researched, how to group articles that cover the same news event as they are published remains an open question for both computational experts and media researchers.


