SOTAVerified

Hallucination

Papers

Showing 1651-1700 of 1816 papers

Title | Status | Hype
Optimal Transport for Unsupervised Hallucination Detection in Neural Machine Translation | Code | 0
Correction with Backtracking Reduces Hallucination in Summarization | Code | 0
Order Matters in Hallucination: Reasoning Order as Benchmark and Reflexive Prompting for Large-Language-Models | Code | 0
Few-shot learning via tensor hallucination | Code | 0
Addressing Topic Granularity and Hallucination in Large Language Models for Topic Modelling | Code | 0
OTTAWA: Optimal TransporT Adaptive Word Aligner for Hallucination and Omission Translation Errors Detection | Code | 0
Benchmarking ChatGPT-4 on ACR Radiation Oncology In-Training (TXIT) Exam and Red Journal Gray Zone Cases: Potentials and Challenges for AI-Assisted Medical Education and Decision Making in Radiation Oncology | Code | 0
JourneyBench: A Challenging One-Stop Vision-Language Understanding Benchmark of Generated Images | Code | 0
Joint stereo 3D object detection and implicit surface reconstruction | Code | 0
Abstract Meaning Representation for Hospital Discharge Summarization | Code | 0
Iterative Teaching by Data Hallucination | Code | 0
PaD: Program-aided Distillation Can Teach Small Models Reasoning Better than Chain-of-thought Fine-tuning | Code | 0
Fakes of Varying Shades: How Warning Affects Human Perception and Engagement Regarding LLM Hallucinations | Code | 0
Investigating the performance of Retrieval-Augmented Generation and fine-tuning for the development of AI-driven knowledge-based systems | Code | 0
Investigating Multi-Pivot Ensembling with Massively Multilingual Machine Translation Models | Code | 0
AILS-NTUA at SemEval-2024 Task 6: Efficient model tuning for hallucination detection and analysis | Code | 0
Conversational Gold: Evaluating Personalized Conversational Search System using Gold Nuggets | Code | 0
Parse Trees Guided LLM Prompt Compression | Code | 0
Controlling Risk of Retrieval-augmented Generation: A Counterfactual Prompting Framework | Code | 0
Explaining Graph Neural Networks with Large Language Models: A Counterfactual Perspective for Molecular Property Prediction | Code | 0
Evolutionary thoughts: integration of large language models and evolutionary algorithms | Code | 0
Investigating and Mitigating Object Hallucinations in Pretrained Vision-Language (CLIP) Models | Code | 0
Integrating Chemistry Knowledge in Large Language Models via Prompt Engineering | Code | 0
Confidence Estimation for LLM-Based Dialogue State Tracking | Code | 0
Instruction Makes a Difference | Code | 0
SLPL SHROOM at SemEval2024 Task 06: A comprehensive study on models ability to detect hallucination | Code | 0
Incorporating Task-specific Concept Knowledge into Script Learning | Code | 0
Ever: Mitigating Hallucination in Large Language Models through Real-Time Verification and Rectification | Code | 0
SmallPlan: Leverage Small Language Models for Sequential Path Planning with Simulation-Powered, LLM-Guided Distillation | Code | 0
Improving Factuality in Large Language Models via Decoding-Time Hallucinatory and Truthful Comparators | Code | 0
PICK: Polished & Informed Candidate Scoring for Knowledge-Grounded Dialogue Systems | Code | 0
Towards Lighter and Robust Evaluation for Retrieval Augmented Generation | Code | 0
Improving Factual Error Correction by Learning to Inject Factual Errors | Code | 0
SmurfCat at SemEval-2024 Task 6: Leveraging Synthetic Data for Hallucination Detection | Code | 0
Socratic Questioning: Learn to Self-guide Multimodal Reasoning in the Wild | Code | 0
Why and How LLMs Hallucinate: Connecting the Dots with Subsequence Associations | Code | 0
Plausible May Not Be Faithful: Probing Object Hallucination in Vision-Language Pre-training | Code | 0
Sora Detector: A Unified Hallucination Detection for Large Text-to-Video Models | Code | 0
Image Denoising with Control over Deep Network Hallucination | Code | 0
Im2Flow: Motion Hallucination from Static Images for Action Recognition | Code | 0
PoLLMgraph: Unraveling Hallucinations in Large Language Models via State Transition Dynamics | Code | 0
Utilize the Flow before Stepping into the Same River Twice: Certainty Represented Knowledge Flow for Refusal-Aware Instruction Tuning | Code | 0
Error Typing for Smarter Rewards: Improving Process Reward Models with Error-Aware Hierarchical Supervision | Code | 0
A Benchmark and Robustness Study of In-Context-Learning with Large Language Models in Music Entity Detection | Code | 0
Spurious reconstruction from brain activity | Code | 0
Im2Avatar: Colorful 3D Reconstruction from a Single Image | Code | 0
Post-hoc Utterance Refining Method by Entity Mining for Faithful Knowledge Grounded Conversations | Code | 0
ERBench: An Entity-Relationship based Automatically Verifiable Hallucination Benchmark for Large Language Models | Code | 0
Behind the Magic, MERLIM: Multi-modal Evaluation Benchmark for Large Image-Language Models | Code | 0
StackRAG Agent: Improving Developer Answers with Retrieval-Augmented Generation | Code | 0

No leaderboard results yet.