SOTAVerified

Hallucination

Papers

Showing 1401-1450 of 1816 papers

Title | Status | Hype
RegaVAE: A Retrieval-Augmented Gaussian Mixture Variational Auto-Encoder for Language Modeling | Code | 1
Metric Ensembles For Hallucination Detection | - | 0
Factored Verification: Detecting and Reducing Hallucination in Summaries of Academic Papers | Code | 1
Assessing the Reliability of Large Language Model Knowledge | Code | 0
Configuration Validation with Large Language Models | - | 0
"Kelly is a Warm Person, Joseph is a Role Model": Gender Biases in LLM-Generated Reference Letters | Code | 1
Improving Large Language Models in Event Relation Logical Prediction | Code | 1
KCTS: Knowledge-Constrained Tree Search Decoding with Token-Level Hallucination Detection | Code | 1
From CLIP to DINO: Visual Encoders Shout in Multi-modal Large Language Models | Code | 2
GraphextQA: A Benchmark for Evaluating Graph-Enhanced Large Language Models | Code | 0
GameGPT: Multi-agent Collaborative Framework for Game Development | - | 0
Enhancing Text-based Knowledge Graph Completion with Zero-Shot Large Language Models: A Focus on Semantic Enhancement | Code | 1
Ferret: Refer and Ground Anything Anywhere at Any Granularity | Code | 5
OpsEval: A Comprehensive IT Operations Benchmark Suite for Large Language Models | Code | 1
A New Benchmark and Reverse Validation Method for Passage-level Hallucination Detection | Code | 0
Teaching Language Models to Hallucinate Less with Synthetic Tasks | - | 0
Towards Mitigating Hallucination in Large Language Models via Self-Reflection | - | 0
Negative Object Presence Evaluation (NOPE) to Measure Object Hallucination in Vision-Language Models | - | 0
The Troubling Emergence of Hallucination in Large Language Models -- An Extensive Definition, Quantification, and Prescriptive Remediations | - | 0
Improving the Reliability of Large Language Models by Leveraging Uncertainty-Aware In-Context Learning | - | 0
Chain of Natural Language Inference for Reducing Large Language Model Ungrounded Hallucinations | Code | 1
Evaluating Hallucinations in Chinese Large Language Models | Code | 3
FreshLLMs: Refreshing Large Language Models with Search Engine Augmentation | Code | 2
MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation | Code | 2
AGIR: Automating Cyber Threat Intelligence Reporting with Natural Language Generation | Code | 1
HallE-Control: Controlling Object Hallucination in Large Multimodal Models | Code | 1
LLM Lies: Hallucinations are not Bugs, but Features as Adversarial Examples | Code | 1
BTR: Binary Token Representations for Efficient Retrieval Augmented Language Models | Code | 1
Analyzing and Mitigating Object Hallucination in Large Vision-Language Models | Code | 1
AutoHall: Automated Hallucination Dataset Generation for Large Language Models | - | 0
Robust 3D Object Detection from LiDAR-Radar Point Clouds via Cross-Modal Feature Augmentation | Code | 1
Self-Specialization: Uncovering Latent Expertise within Large Language Models | - | 0
Hallucination Reduction in Long Input Text Summarization | Code | 0
Self-supervised Cross-view Representation Reconstruction for Change Captioning | Code | 1
Augmenting LLMs with Knowledge: A survey on hallucination prevention | - | 0
Neuro Symbolic Reasoning for Planning: Counterexample Guided Inductive Synthesis using Large Language Models and Satisfiability Solving | - | 0
Lyra: Orchestrating Dual Correction in Automated Theorem Proving | Code | 1
Aligning Large Multimodal Models with Factually Augmented RLHF | - | 0
BAMBOO: A Comprehensive Benchmark for Evaluating Long Text Modeling Capacities of Large Language Models | Code | 1
Construction of Paired Knowledge Graph-Text Datasets Informed by Cyclic Evaluation | - | 0
Exploring the Relationship between LLM Hallucinations and Prompt Linguistic Nuances: Readability, Formality, and Concreteness | - | 0
Chain-of-Verification Reduces Hallucination in Large Language Models | Code | 0
Explaining Agent Behavior with Large Language Models | - | 0
PICK: Polished & Informed Candidate Scoring for Knowledge-Grounded Dialogue Systems | Code | 0
A Multitask Training Approach to Enhance Whisper with Contextual Biasing and Open-Vocabulary Keyword Spotting | - | 0
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages | - | 0
Struc-Bench: Are Large Language Models Really Good at Generating Complex Structured Data? | Code | 1
"Merge Conflicts!" Exploring the Impacts of External Distractors to Parametric Knowledge Graphs | Code | 0
MMICL: Empowering Vision-language Model with Multi-Modal In-Context Learning | Code | 2
Cognitive Mirage: A Review of Hallucinations in Large Language Models | Code | 1

No leaderboard results yet.