SOTAVerified

Fact Checking

Papers

Showing 126–150 of 669 papers

| Title | Status | Hype |
|---|---|---|
| PiMRef: Detecting and Explaining Ever-evolving Spear Phishing Emails with Knowledge Base Invariants | | 0 |
| DS@GT at CheckThat! 2025: Evaluating Context and Tokenization Strategies for Numerical Fact Verification | Code | 0 |
| Recon, Answer, Verify: Agents in Search of Truth | | 0 |
| Decide less, communicate more: On the construct validity of end-to-end fact-checking in medicine | Code | 0 |
| The Next Phase of Scientific Fact-Checking: Advanced Evidence Retrieval from Complex Structured Academic Papers | | 0 |
| Veracity: An Open-Source AI Fact-Checking System | | 0 |
| SPOT: Bridging Natural Language and Geospatial Search for Investigative Journalists | | 0 |
| RealFactBench: A Benchmark for Evaluating Large Language Models in Real-World Fact-Checking | Code | 0 |
| ClimateViz: A Benchmark for Statistical Reasoning and Fact Verification on Scientific Charts | Code | 0 |
| In Crowd Veritas: Leveraging Human Intelligence To Fight Misinformation | | 0 |
| SUCEA: Reasoning-Intensive Retrieval for Adversarial Fact-checking through Claim Decomposition and Editing | Code | 0 |
| Combating Misinformation in the Arab World: Challenges & Opportunities | | 0 |
| Facts are Harder Than Opinions -- A Multilingual, Comparative Analysis of LLM-Based Fact-Checking Reliability | | 0 |
| Improving Reliability and Explainability of Medical Question Answering through Atomic Fact Checking in Retrieval-Augmented LLMs | | 0 |
| Verify-in-the-Graph: Entity Disambiguation Enhancement for Complex Claim Verification with Interactive Graph Representation | | 0 |
| Community Moderation and the New Epistemology of Fact Checking on Social Media | | 0 |
| Debate-to-Detect: Reformulating Misinformation Detection as a Real-World Debate with Large Language Models | | 0 |
| Social Good or Scientific Curiosity? Uncovering the Research Framing Behind NLP Artefacts | | 0 |
| From Generation to Detection: A Multimodal Multi-Task Dataset for Benchmarking Health Misinformation | | 0 |
| Teaching with Lies: Curriculum DPO on Synthetic Negatives for Hallucination Detection | | 0 |
| Resolving Conflicting Evidence in Automated Fact-Checking: A Study on Retrieval-Augmented LLMs | Code | 0 |
| CUB: Benchmarking Context Utilisation Techniques for Language Models | | 0 |
| EMULATE: A Multi-Agent Framework for Determining the Veracity of Atomic Claims by Emulating Human Actions | Code | 0 |
| UrduFactCheck: An Agentic Fact-Checking Framework for Urdu with Evidence Boosting and Benchmarking | Code | 0 |
| Improving the fact-checking performance of language models by relying on their entailment ability | | 0 |
Page 6 of 27

Benchmark Results

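| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | monoT5-3B | nDCG@10 | 0.78 | | Unverified |
| 2 | SGPT-BE-5.8B | nDCG@10 | 0.75 | | Unverified |
| 3 | BM25+CE | nDCG@10 | 0.69 | | Unverified |
| 4 | SGPT-CE-6.1B | nDCG@10 | 0.68 | | Unverified |
| 5 | ColBERT | nDCG@10 | 0.67 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SGPT-BE-5.8B | nDCG@10 | 0.31 | | Unverified |
| 2 | monoT5-3B | nDCG@10 | 0.28 | | Unverified |
| 3 | BM25+CE | nDCG@10 | 0.25 | | Unverified |
| 4 | SGPT-CE-6.1B | nDCG@10 | 0.16 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | monoT5-3B | nDCG@10 | 0.85 | | Unverified |
| 2 | BM25+CE | nDCG@10 | 0.82 | | Unverified |
| 3 | SGPT-BE-5.8B | nDCG@10 | 0.78 | | Unverified |
| 4 | SGPT-CE-6.1B | nDCG@10 | 0.73 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | HerO | Question Only score | 0.48 | | Unverified |
| 2 | CTU AIC | Question Only score | 0.46 | | Unverified |
| 3 | InFact | Question Only score | 0.45 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Abc | 0..5sec | 2 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MA-CIN | Precision | 0.26 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | FDHN | Accuracy (Test) | 0.7 | | Unverified |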