
Automated Journalistic Questions: A New Method for Extracting 5W1H in French

2025-05-20

Richard Khoury, Maxence Verhaverbeke, Julie A. Gramaccia


Abstract

The 5W1H questions (who, what, when, where, why, and how) are commonly used in journalism to ensure that an article describes events clearly and systematically. Answering them is a crucial prerequisite for tasks such as summarization, clustering, and news aggregation. In this paper, we design the first automated pipeline for extracting 5W1H information from French news articles. To evaluate the performance of our algorithm, we also create a corpus of 250 Quebec news articles with 5W1H answers marked by four human annotators. Our results demonstrate that our pipeline performs as well on this task as the large language model GPT-4o.
