SOTAVerified

Hallucination

Papers

Showing 1601-1650 of 1816 papers

Title | Status | Hype
Leveraging Pretrained Models for Automatic Summarization of Doctor-Patient Conversations | Code | 0
NormSAGE: Multi-Lingual Multi-Cultural Norm Discovery from Conversations On-the-Fly | Code | 0
DAHL: Domain-specific Automated Hallucination Evaluation of Long-Form Text through a Benchmark Dataset in Biomedicine | Code | 0
NVP-HRI: Zero Shot Natural Voice and Posture-based Human-Robot Interaction via Large Language Model | Code | 0
Learning with privileged information via adversarial discriminative modality distillation | Code | 0
Self-Rationalization in the Wild: A Large Scale Out-of-Distribution Evaluation on NLI-related tasks | Code | 0
Object Hallucination in Image Captioning | Code | 0
DAFNet: Dynamic Auxiliary Fusion for Sequential Model Editing in Large Language Models | Code | 0
Learning to Localize Objects Improves Spatial Reasoning in Visual-LLMs | Code | 0
Localizing and Mitigating Errors in Long-form Question Answering | Code | 0
Self-training Large Language Models through Knowledge Detection | Code | 0
Fine-grained Contract NER using instruction based model | Code | 0
Learning on LLM Output Signatures for gray-box LLM Behavior Analysis | Code | 0
Semantic Noise Matters for Neural Natural Language Generation | Code | 0
Learning Fine-grained Domain Generalization via Hyperbolic State Space Hallucination | Code | 0
Learning Conformal Abstention Policies for Adaptive Risk Management in Large Language and Vision-Language Models | Code | 0
Large Language Models on Wikipedia-Style Survey Generation: an Evaluation in NLP Concepts | Code | 0
Cross-modal Learning by Hallucinating Missing Modalities in RGB-D Vision | Code | 0
SemEval-2025 Task 3: Mu-SHROOM, the Multilingual Shared Task on Hallucinations and Related Observable Overgeneration Mistakes | Code | 0
Large Language Models Are Involuntary Truth-Tellers: Exploiting Fallacy Failure for Jailbreak Attacks | Code | 0
Brain-like Flexible Visual Inference by Harnessing Feedback-Feedforward Alignment | Code | 0
BordIRlines: A Dataset for Evaluating Cross-lingual Retrieval-Augmented Generation | Code | 0
On hallucinations in tomographic image reconstruction | Code | 0
OnionEval: An Unified Evaluation of Fact-conflicting Hallucination for Small-Large Language Models | Code | 0
On Large Language Models' Hallucination with Regard to Known Facts | Code | 0
Critic-Driven Decoding for Mitigating Hallucinations in Data-to-text Generation | Code | 0
Tokenization Consistency Matters for Generative Models on Extractive NLP Tasks | Code | 0
Adversarial Semantic Hallucination for Domain Generalized Semantic Segmentation | Code | 0
On-Policy Fine-grained Knowledge Feedback for Hallucination Mitigation | Code | 0
Vision-Encoders (Already) Know What They See: Mitigating Object Hallucination via Simple Fine-Grained CLIPScore | Code | 0
On the Benefits of Fine-Grained Loss Truncation: A Case Study on Factuality in Summarization | Code | 0
Crafting In-context Examples according to LMs' Parametric Knowledge | Code | 0
SH2: Self-Highlighted Hesitation Helps You Decode More Truthfully | Code | 0
Counterfactual Debating with Preset Stances for Hallucination Elimination of LLMs | Code | 0
On the Hallucination in Simultaneous Machine Translation | Code | 0
Shakespearean Sparks: The Dance of Hallucination and Creativity in LLMs' Decoding Layers | Code | 0
BioKGBench: A Knowledge Graph Checking Benchmark of AI Agent for Biomedical Science | Code | 0
Beyond Ontology in Dialogue State Tracking for Goal-Oriented Chatbot | Code | 0
SHROOM-INDElab at SemEval-2024 Task 6: Zero- and Few-Shot LLM-Based Classification for Hallucination Detection | Code | 0
Language Models Hallucinate, but May Excel at Fact Verification | Code | 0
On the Universal Truthfulness Hyperplane Inside LLMs | Code | 0
Ontology-Constrained Generation of Domain-Specific Clinical Summaries | Code | 0
SiGAN: Siamese Generative Adversarial Network for Identity-Preserving Face Hallucination | Code | 0
Fidelity-Enriched Contrastive Search: Reconciling the Faithfulness-Diversity Trade-Off in Text Generation | Code | 0
SIGMORPHON 2020 Shared Task 0: Typologically Diverse Morphological Inflection | Code | 0
Toward Reliable Biomedical Hypothesis Generation: Evaluating Truthfulness and Hallucination in Large Language Models | Code | 0
Multi-Source Knowledge Pruning for Retrieval-Augmented Generation: A Benchmark and Empirical Study | Code | 0
Benchmarking Hallucination in Large Language Models based on Unanswerable Math Word Problem | Code | 0
KG-FPQ: Evaluating Factuality Hallucination in LLMs with Knowledge Graph-based False Premise Questions | Code | 0
keepitsimple at SemEval-2025 Task 3: LLM-Uncertainty based Approach for Multilingual Hallucination Span Detection | Code | 0

No leaderboard results yet.