SOTAVerified

Fact Checking

Papers

Showing 1-50 of 669 papers

| Title | Status | Hype |
|---|---|---|
| PiMRef: Detecting and Explaining Ever-evolving Spear Phishing Emails with Knowledge Base Invariants | | 0 |
| DS@GT at CheckThat! 2025: Evaluating Context and Tokenization Strategies for Numerical Fact Verification | Code | 0 |
| Recon, Answer, Verify: Agents in Search of Truth | | 0 |
| Decide less, communicate more: On the construct validity of end-to-end fact-checking in medicine | Code | 0 |
| The Next Phase of Scientific Fact-Checking: Advanced Evidence Retrieval from Complex Structured Academic Papers | | 0 |
| Veracity: An Open-Source AI Fact-Checking System | | 0 |
| Verifying the Verifiers: Unveiling Pitfalls and Potentials in Fact Verifiers | Code | 1 |
| SPOT: Bridging Natural Language and Geospatial Search for Investigative Journalists | | 0 |
| RealFactBench: A Benchmark for Evaluating Large Language Models in Real-World Fact-Checking | Code | 0 |
| In Crowd Veritas: Leveraging Human Intelligence To Fight Misinformation | | 0 |
| ClimateViz: A Benchmark for Statistical Reasoning and Fact Verification on Scientific Charts | Code | 0 |
| Combating Misinformation in the Arab World: Challenges & Opportunities | | 0 |
| Search Arena: Analyzing Search-Augmented LLMs | Code | 2 |
| SUCEA: Reasoning-Intensive Retrieval for Adversarial Fact-checking through Claim Decomposition and Editing | Code | 0 |
| Facts are Harder Than Opinions -- A Multilingual, Comparative Analysis of LLM-Based Fact-Checking Reliability | | 0 |
| Improving Reliability and Explainability of Medical Question Answering through Atomic Fact Checking in Retrieval-Augmented LLMs | | 0 |
| Verify-in-the-Graph: Entity Disambiguation Enhancement for Complex Claim Verification with Interactive Graph Representation | | 0 |
| Community Moderation and the New Epistemology of Fact Checking on Social Media | | 0 |
| From Generation to Detection: A Multimodal Multi-Task Dataset for Benchmarking Health Misinformation | | 0 |
| Social Good or Scientific Curiosity? Uncovering the Research Framing Behind NLP Artefacts | | 0 |
| Debate-to-Detect: Reformulating Misinformation Detection as a Real-World Debate with Large Language Models | | 0 |
| Teaching with Lies: Curriculum DPO on Synthetic Negatives for Hallucination Detection | | 0 |
| Resolving Conflicting Evidence in Automated Fact-Checking: A Study on Retrieval-Augmented LLMs | Code | 0 |
| EMULATE: A Multi-Agent Framework for Determining the Veracity of Atomic Claims by Emulating Human Actions | Code | 0 |
| CUB: Benchmarking Context Utilisation Techniques for Language Models | | 0 |
| Improving the fact-checking performance of language models by relying on their entailment ability | | 0 |
| UrduFactCheck: An Agentic Fact-Checking Framework for Urdu with Evidence Boosting and Benchmarking | Code | 0 |
| MultiHal: Multilingual Dataset for Knowledge-Graph Grounded Evaluation of LLM Hallucinations | Code | 0 |
| Learning Auxiliary Tasks Improves Reference-Free Hallucination Detection in Open-Domain Long-Form Generation | | 0 |
| SemEval-2025 Task 7: Multilingual and Crosslingual Fact-Checked Claim Retrieval | | 0 |
| FACTors: A New Dataset for Studying the Fact-checking Ecosystem | Code | 0 |
| Judging the Judges: Can Large Vision-Language Models Fairly Evaluate Chart Comprehension and Reasoning? | Code | 0 |
| Communication Styles and Reader Preferences of LLM and Human Experts in Explaining Health Information | | 0 |
| SciCom Wiki: Fact-Checking and FAIR Knowledge Distribution for Scientific Videos and Podcasts | | 0 |
| Computational Fact-Checking of Online Discourse: Scoring scientific accuracy in climate change related news articles | | 0 |
| Chronocept: Instilling a Sense of Time in Machines | Code | 1 |
| TrumorGPT: Graph-Based Retrieval-Augmented Large Language Model for Fact-Checking | | 0 |
| Holmes: Automated Fact Check with Large Language Models | | 0 |
| A Generative-AI-Driven Claim Retrieval System Capable of Detecting and Retrieving Claims from Social Media Platforms in Multiple Languages | Code | 0 |
| Detecting Manipulated Contents Using Knowledge-Grounded Inference | Code | 0 |
| Pushing the boundary on Natural Language Inference | | 0 |
| Assessing the Potential of Generative Agents in Crowdsourced Fact-Checking | | 0 |
| PASS-FC: Progressive and Adaptive Search Scheme for Fact Checking of Comprehensive Claims | | 0 |
| OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens | | 0 |
| BOOST: Bootstrapping Strategy-Driven Reasoning Programs for Program-Guided Fact-Checking | | 0 |
| If an LLM Were a Character, Would It Know Its Own Story? Evaluating Lifelong Learning in LLMs | | 0 |
| Understanding Inequality of LLM Fact-Checking over Geographic Regions with Agent and Retrieval models | | 0 |
| MultiClaimNet: A Massively Multilingual Dataset of Fact-Checked Claim Clusters | | 0 |
| Fact-checking AI-generated news reports: Can LLMs catch their own lies? | | 0 |
| Can LLMs Automate Fact-Checking Article Writing? | | 0 |

Benchmark Results

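| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | monoT5-3B | nDCG@10 | 0.78 | | Unverified |
| 2 | SGPT-BE-5.8B | nDCG@10 | 0.75 | | Unverified |
| 3 | BM25+CE | nDCG@10 | 0.69 | | Unverified |
| 4 | SGPT-CE-6.1B | nDCG@10 | 0.68 | | Unverified |
| 5 | ColBERT | nDCG@10 | 0.67 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SGPT-BE-5.8B | nDCG@10 | 0.31 | | Unverified |
| 2 | monoT5-3B | nDCG@10 | 0.28 | | Unverified |
| 3 | BM25+CE | nDCG@10 | 0.25 | | Unverified |
| 4 | SGPT-CE-6.1B | nDCG@10 | 0.16 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | monoT5-3B | nDCG@10 | 0.85 | | Unverified |
| 2 | BM25+CE | nDCG@10 | 0.82 | | Unverified |
| 3 | SGPT-BE-5.8B | nDCG@10 | 0.78 | | Unverified |
| 4 | SGPT-CE-6.1B | nDCG@10 | 0.73 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | HerO | Question Only score | 0.48 | | Unverified |
| 2 | CTU AIC | Question Only score | 0.46 | | Unverified |
| 3 | InFact | Question Only score | 0.45 | | Unverified |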

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Abc | 0..5sec | 2 | | Unverified |
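
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MA-CIN | Precision | 0.26 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | FDHN | Accuracy (Test) | 0.7 | | Unverified |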