SOTAVerified: Hallucination Papers

Showing 851–900 of 1816 papers

| Title | Status | Hype |
| --- | --- | --- |
| On Mitigating Code LLM Hallucinations with API Documentation | | 0 |
| DAHRS: Divergence-Aware Hallucination-Remediated SRL Projection | | 0 |
| Mitigating Entity-Level Hallucination in Large Language Models | Code | 0 |
| The Two Sides of the Coin: Hallucination Generation and Detection with LLMs as Evaluators for LLMs | | 0 |
| On the Universal Truthfulness Hyperplane Inside LLMs | Code | 0 |
| Lynx: An Open Source Hallucination Evaluation Model | | 0 |
| Knowledge Overshadowing Causes Amalgamated Hallucination in Large Language Models | | 0 |
| Learning with Instance-Dependent Noisy Labels by Anchor Hallucination and Hard Sample Label Correction | | 0 |
| Fuse, Reason and Verify: Geometry Problem Solving with Parsed Clauses from Diagram | | 0 |
| Lookback Lens: Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps | Code | 2 |
| GTP-4o: Modality-prompted Heterogeneous Graph Learning for Omni-modal Biomedical Representation | | 0 |
| Controllable and Reliable Knowledge-Intensive Task-Oriented Conversational Agents with Declarative Genie Worksheets | Code | 2 |
| KG-FPQ: Evaluating Factuality Hallucination in LLMs with Knowledge Graph-based False Premise Questions | Code | 0 |
| Vision-Language Models under Cultural and Inclusive Considerations | | 0 |
| Multi-Object Hallucination in Vision-Language Models | Code | 1 |
| VideoCoT: A Video Chain-of-Thought Dataset with Active Annotation Tool | | 0 |
| Enhancing Hallucination Detection through Perturbation-Based Synthetic Data Generation in System Responses | Code | 0 |
| MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation? | Code | 1 |
| Code Hallucination | | 0 |
| ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models | Code | 2 |
| Classification-Based Automatic HDL Code Generation Using LLMs | | 0 |
| Zero-shot Persuasive Chatbots with LLM-Generated Strategies and Information Retrieval | | 0 |
| Hallucination Detection: Robustly Discerning Reliable Answers in Large Language Models | | 0 |
| STOC-TOT: Stochastic Tree-of-Thought with Constrained Decoding for Complex Reasoning in Multi-Hop Question Answering | | 0 |
| Query-Guided Self-Supervised Summarization of Nursing Notes | | 0 |
| LLM Internal States Reveal Hallucination Risk Faced With a Query | Code | 0 |
| FSM: A Finite State Machine Based Zero-Shot Prompting Paradigm for Multi-Hop Question Answering | | 0 |
| MedVH: Towards Systematic Evaluation of Hallucination for Large Vision Language Models in the Medical Context | Code | 1 |
| A Comparative Study of DSL Code Generation: Fine-Tuning vs. Optimized Retrieval Augmentation | | 0 |
| Pelican: Correcting Hallucination in Vision-LLMs via Claim Decomposition and Program of Thought Verification | | 0 |
| Understanding Alignment in Multimodal LLMs: A Comprehensive Study | | 0 |
| MeMemo: On-device Retrieval Augmentation for Private and Personalized Text Generation | Code | 2 |
| The Need for Guardrails with Large Language Models in Medical Safety-Critical Settings: An Artificial Intelligence Application in the Pharmacovigilance Ecosystem | | 0 |
| LLM Uncertainty Quantification through Directional Entailment Graph and Claim Level Response Augmentation | | 0 |
| Free-text Rationale Generation under Readability Level Control | | 0 |
| Large Language Models Are Involuntary Truth-Tellers: Exploiting Fallacy Failure for Jailbreak Attacks | Code | 0 |
| FineSurE: Fine-grained Summarization Evaluation using LLMs | Code | 1 |
| Unveiling Glitches: A Deep Dive into Image Encoding Bugs within CLIP | | 0 |
| Investigating and Mitigating the Multimodal Hallucination Snowballing in Large Vision-Language Models | Code | 1 |
| BioKGBench: A Knowledge Graph Checking Benchmark of AI Agent for Biomedical Science | Code | 0 |
| GraphArena: Benchmarking Large Language Models on Graph Computational Problems | Code | 1 |
| PFME: A Modular Approach for Fine-grained Hallucination Detection and Editing of Large Language Models | | 0 |
| A Study on Effect of Reference Knowledge Choice in Generating Technical Content Relevant to SAPPhIRE Model Using Large Language Model | | 0 |
| Applying RLAIF for Code Generation with API-usage in Lightweight LLMs | | 0 |
| ToolBeHonest: A Multi-level Hallucination Diagnostic Benchmark for Tool-Augmented Large Language Models | Code | 1 |
| Handling Ontology Gaps in Semantic Parsing | Code | 0 |
| From Artificial Needles to Real Haystacks: Improving Retrieval Capabilities in LLMs by Finetuning on Synthetic Data | Code | 0 |
| Understand What LLM Needs: Dual Preference Alignment for Retrieval-Augmented Generation | Code | 2 |
| Mitigating Hallucination in Fictional Character Role-Play | Code | 0 |
| Evaluating the Quality of Hallucination Benchmarks for Large Vision-Language Models | Code | 1 |
Page 18 of 37
