SOTAVerified

Fact Checking

Papers

Showing 1-50 of 669 papers

Title | Status | Hype
MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents | Code | 7
Loki: An Open-Source Tool for Fact Verification | Code | 5
Semantic Operators: A Declarative Model for Rich, AI-based Data Processing | Code | 5
Medical Graph RAG: Towards Safe Medical Large Language Model via Graph Retrieval-Augmented Generation | Code | 4
Don't Ignore Dual Logic Ability of LLMs while Privatizing: A Data-Intensive Analysis in Medical Domain | Code | 4
Verdict: A Library for Scaling Judge-Time Compute | Code | 3
Search Arena: Analyzing Search-Augmented LLMs | Code | 2
SemViQA: A Semantic Question Answering System for Vietnamese Information Fact-Checking | Code | 2
ChartGemma: Visual Instruction-tuning for Chart Reasoning in the Wild | Code | 2
OpenFactCheck: Building, Benchmarking Customized Fact-Checking Systems and Evaluating the Factuality of Claims and LLMs | Code | 2
KnowHalu: Hallucination Detection via Multi-Form Knowledge Based Factual Checking | Code | 2
FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios | Code | 2
RETA-LLM: A Retrieval-Augmented Large Language Model Toolkit | Code | 2
Multimodal Automated Fact-Checking: A Survey | Code | 2
SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models | Code | 2
Atlas: Few-shot Learning with Retrieval Augmented Language Models | Code | 2
SGPT: GPT Sentence Embeddings for Semantic Search | Code | 2
Scaling Language Models: Methods, Analysis & Insights from Training Gopher | Code | 2
BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models | Code | 2
The KEEN Universe: An Ecosystem for Knowledge Graph Embeddings with a Focus on Reproducibility and Transferability | Code | 2
Verifying the Verifiers: Unveiling Pitfalls and Potentials in Fact Verifiers | Code | 1
Chronocept: Instilling a Sense of Time in Machines | Code | 1
FACT-AUDIT: An Adaptive Multi-Agent Framework for Dynamic Fact-Checking Evaluation of Large Language Models | Code | 1
BiDeV: Bilateral Defusing Verification for Complex Claim Fact-Checking | Code | 1
HintsOfTruth: A Multimodal Checkworthiness Detection Dataset with Real and Synthetic Claims | Code | 1
COVE: COntext and VEracity prediction for out-of-context images | Code | 1
DEFAME: Dynamic Evidence-based FAct-checking with Multimodal Experts | Code | 1
Truth or Mirage? Towards End-to-End Factuality Evaluation with LLM-Oasis | Code | 1
Belief in the Machine: Investigating Epistemological Blind Spots of Language Models | Code | 1
FIRE: Fact-checking with Iterative Retrieval and Verification | Code | 1
HerO at AVeriTeC: The Herd of Open Large Language Models for Verifying Real-World Claims | Code | 1
"Image, Tell me your story!" Predicting the original meta-context of visual misinformation | Code | 1
OpenFactCheck: A Unified Framework for Factuality Evaluation of LLMs | Code | 1
Flooding Spread of Manipulated Knowledge in LLM-Based Multi-Agent Communities | Code | 1
Meerkat: Audio-Visual Large Language Model for Grounding in Space and Time | Code | 1
An Enhanced Fake News Detection System With Fuzzy Deep Learning | Code | 1
MFC-Bench: Benchmarking Multimodal Fact-Checking with Large Vision-Language Models | Code | 1
Document-level Claim Extraction and Decontextualisation for Fact-Checking | Code | 1
RATT: A Thought Structure for Coherent and Correct LLM Reasoning | Code | 1
Attribute First, then Generate: Locally-attributable Grounded Text Generation | Code | 1
Heterogeneous Graph Reasoning for Fact Checking over Texts and Tables | Code | 1
LLMCheckup: Conversational Examination of Large Language Models via Interpretability Tools and Self-Explanations | Code | 1
Factcheck-Bench: Fine-Grained Evaluation Benchmark for Automatic Fact-checkers | Code | 1
ChartCheck: Explainable Fact-Checking over Real-World Chart Images | Code | 1
Massive Editing for Large Language Models via Meta Learning | Code | 1
Detecting Deepfakes Without Seeing Any | Code | 1
Lost in Translation, Found in Spans: Identifying Claims in Multilingual Social Media | Code | 1
Fake News in Sheep's Clothing: Robust Fake News Detection Against LLM-Empowered Style Attacks | Code | 1
QACHECK: A Demonstration System for Question-Guided Multi-Hop Fact-Checking | Code | 1
HealthFC: Verifying Health Claims with Evidence-Based Medical Fact-Checking | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | monoT5-3B | nDCG@10 | 0.78 | | Unverified
2 | SGPT-BE-5.8B | nDCG@10 | 0.75 | | Unverified
3 | BM25+CE | nDCG@10 | 0.69 | | Unverified
4 | SGPT-CE-6.1B | nDCG@10 | 0.68 | | Unverified
5 | ColBERT | nDCG@10 | 0.67 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | SGPT-BE-5.8B | nDCG@10 | 0.31 | | Unverified
2 | monoT5-3B | nDCG@10 | 0.28 | | Unverified
3 | BM25+CE | nDCG@10 | 0.25 | | Unverified
4 | SGPT-CE-6.1B | nDCG@10 | 0.16 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | monoT5-3B | nDCG@10 | 0.85 | | Unverified
2 | BM25+CE | nDCG@10 | 0.82 | | Unverified
3 | SGPT-BE-5.8B | nDCG@10 | 0.78 | | Unverified
4 | SGPT-CE-6.1B | nDCG@10 | 0.73 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | HerO | Question Only score | 0.48 | | Unverified
2 | CTU AIC | Question Only score | 0.46 | | Unverified
3 | InFact | Question Only score | 0.45 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | Abc | 0..5sec | 2 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | MA-CIN | Precision | 0.26 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | FDHN | Accuracy (Test) | 0.7 | | Unverified