ClaimFlow: Tracing the Evolution of Scientific Claims in NLP
Aniket Pramanick, Yufang Hou, Saif M. Mohammad, Iryna Gurevych
Abstract
Scientific papers do more than report results: they advance claims that later work supports, extends, or sometimes refutes. Yet existing methods for citation and claim analysis capture only fragments of this dialogue. In this work, we make these interactions explicit at the level of individual scientific claims. We introduce ClaimFlow, a claim-centric view of the NLP literature built from 304 ACL Anthology papers (1979-2025), manually annotated with 1,084 claims and 832 cross-paper claim relations indicating whether a citing paper supports, extends, qualifies, refutes, or references a claim as background. Using ClaimFlow, we define a new task, Claim Relation Classification, which requires models to infer the scientific stance toward a cited claim from the text and citation context. Evaluating strong neural models and large language models on this task, we report a baseline performance of 0.78 macro-F1, showing that claim-relation classification is feasible but challenging. We further apply our model to 13k NLP papers to analyze how claims evolve across decades of NLP research. Our analysis reveals that 63.5% of claims are never reused and only 11.1% are ever challenged, while widely propagated claims are more often reshaped through qualification and extension than directly confirmed or refuted. Overall, ClaimFlow offers a lens for examining how ideas shift and mature within NLP, and a foundation for assessing whether models can interpret scientific argumentation.