Computational Communication Research - Volume 8, Issue 1, 2026
Topic Classification of News Articles from URLs Alone
Author: Nick Hagar
This paper presents a novel approach to classifying news articles by topic using only their URLs, addressing growing challenges in accessing article text due to paywalls and scraping restrictions. By fine-tuning a DistilBERT transformer model on URL data alone, I demonstrate topic classification performance that matches or exceeds traditional approaches requiring article text. Across three benchmark datasets spanning multiple languages and over 660,000 articles from more than 11,000 news domains, this URL-based topic classifier achieved superior F1 scores compared to both conventional machine learning methods and existing URL-based techniques. While this method requires more computational resources than simpler topic classification approaches, it dramatically reduces data collection requirements, offering researchers a practical alternative when text access is limited. These findings suggest that news article URLs contain richer semantic information than previously recognized, opening new possibilities for large-scale news content analysis in increasingly restrictive digital environments.
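To make the idea of classifying from "URL data alone" concrete, here is a minimal sketch of how a URL might be turned into word-like tokens before being passed to a text classifier. The abstract does not describe the paper's preprocessing, so the function name `url_to_text`, the splitting rules, and the example URLs are all assumptions made for illustration, not the author's method.

```python
import re
from urllib.parse import unquote, urlparse


def url_to_text(url: str) -> str:
    """Hypothetical helper: flatten a URL into space-separated lowercase tokens.

    Assumption: the host and path carry the topical signal; the query
    string and fragment are ignored in this sketch.
    """
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    # Drop a leading "www." so only the informative part of the host remains.
    if host.startswith("www."):
        host = host[4:]
    # Decode percent-escapes so non-ASCII slugs become readable characters.
    path = unquote(parsed.path)
    raw = f"{host} {path}".lower()
    # Split on any run of non-word characters or underscores
    # (slashes, dots, hyphens, spaces), keeping Unicode letters and digits.
    tokens = [t for t in re.split(r"[\W_]+", raw) if t]
    return " ".join(tokens)


if __name__ == "__main__":
    print(url_to_text(
        "https://www.example.com/sports/2024/05/team-wins-championship-final.html"
    ))
```

In a setup like the one the abstract describes, a string of this kind (or the raw URL itself) would then be tokenized and fed to the fine-tuned model; which of the two the paper uses is not stated here.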
Most Cited
Computational observation
Authors: Mario Haim & Angela Nienierza