SOTAVerified

Hallucination

Papers

Showing 1676–1700 of 1816 papers

Title | Status | Hype
SLPL SHROOM at SemEval2024 Task 06: A comprehensive study on models ability to detect hallucination | Code | 0
Incorporating Task-specific Concept Knowledge into Script Learning | Code | 0
Ever: Mitigating Hallucination in Large Language Models through Real-Time Verification and Rectification | Code | 0
SmallPlan: Leverage Small Language Models for Sequential Path Planning with Simulation-Powered, LLM-Guided Distillation | Code | 0
Improving Factuality in Large Language Models via Decoding-Time Hallucinatory and Truthful Comparators | Code | 0
PICK: Polished & Informed Candidate Scoring for Knowledge-Grounded Dialogue Systems | Code | 0
Towards Lighter and Robust Evaluation for Retrieval Augmented Generation | Code | 0
Improving Factual Error Correction by Learning to Inject Factual Errors | Code | 0
SmurfCat at SemEval-2024 Task 6: Leveraging Synthetic Data for Hallucination Detection | Code | 0
Socratic Questioning: Learn to Self-guide Multimodal Reasoning in the Wild | Code | 0
Why and How LLMs Hallucinate: Connecting the Dots with Subsequence Associations | Code | 0
Plausible May Not Be Faithful: Probing Object Hallucination in Vision-Language Pre-training | Code | 0
Sora Detector: A Unified Hallucination Detection for Large Text-to-Video Models | Code | 0
Image Denoising with Control over Deep Network Hallucination | Code | 0
Im2Flow: Motion Hallucination from Static Images for Action Recognition | Code | 0
PoLLMgraph: Unraveling Hallucinations in Large Language Models via State Transition Dynamics | Code | 0
Utilize the Flow before Stepping into the Same River Twice: Certainty Represented Knowledge Flow for Refusal-Aware Instruction Tuning | Code | 0
Error Typing for Smarter Rewards: Improving Process Reward Models with Error-Aware Hierarchical Supervision | Code | 0
A Benchmark and Robustness Study of In-Context-Learning with Large Language Models in Music Entity Detection | Code | 0
Spurious reconstruction from brain activity | Code | 0
Im2Avatar: Colorful 3D Reconstruction from a Single Image | Code | 0
Post-hoc Utterance Refining Method by Entity Mining for Faithful Knowledge Grounded Conversations | Code | 0
ERBench: An Entity-Relationship based Automatically Verifiable Hallucination Benchmark for Large Language Models | Code | 0
Behind the Magic, MERLIM: Multi-modal Evaluation Benchmark for Large Image-Language Models | Code | 0
StackRAG Agent: Improving Developer Answers with Retrieval-Augmented Generation | Code | 0

No leaderboard results yet.