SPOT: An Annotated French Corpus and Benchmark for Detecting Critical Interventions in Online Conversations

Abstract

We introduce SPOT (Stopping Points in Online Threads), the first annotated corpus translating the sociological concept of stopping point into a reproducible NLP task. Stopping points are ordinary critical interventions that pause or redirect online discussions through a range of forms (irony, subtle doubt or fragmentary arguments) that frameworks like counterspeech or social correction often overlook. We operationalize this concept as a binary classification task and provide reliable annotation guidelines. The corpus contains 43,305 manually annotated French Facebook comments linked to URLs flagged as false information by social media users, enriched with contextual metadata (article, post, parent comment, page or group, and source). We benchmark fine-tuned encoder models (CamemBERT) and instruction-tuned LLMs under various prompting strategies. Results show that fine-tuned encoders outperform prompted LLMs in F1 score by more than 10 percentage points, confirming the importance of supervised learning for emerging non-English social media tasks. Incorporating contextual metadata further improves encoder models' F1 scores from 0.75 to 0.78. We release the anonymized dataset, along with the annotation guidelines and code in our code repository, to foster transparency and reproducible research.
