| Title | Date | Tags | Code |
| --- | --- | --- | --- |
| On the Universal Truthfulness Hyperplane Inside LLMs | Jul 11, 2024 | Diversity, Domain Generalization | Code Available |
| Lynx: An Open Source Hallucination Evaluation Model | Jul 11, 2024 | Hallucination, Hallucination Evaluation | Unverified |
| Knowledge Overshadowing Causes Amalgamated Hallucination in Large Language Models | Jul 10, 2024 | Hallucination, Language Modeling | Unverified |
| Learning with Instance-Dependent Noisy Labels by Anchor Hallucination and Hard Sample Label Correction | Jul 10, 2024 | Hallucination | Unverified |
| Fuse, Reason and Verify: Geometry Problem Solving with Parsed Clauses from Diagram | Jul 10, 2024 | Decoder, Geometry Problem Solving | Unverified |
| GTP-4o: Modality-prompted Heterogeneous Graph Learning for Omni-modal Biomedical Representation | Jul 8, 2024 | Benchmarking, Graph Embedding | Unverified |
| KG-FPQ: Evaluating Factuality Hallucination in LLMs with Knowledge Graph-based False Premise Questions | Jul 8, 2024 | Hallucination, Knowledge Graphs | Code Available |
| Vision-Language Models under Cultural and Inclusive Considerations | Jul 8, 2024 | Hallucination, Survey | Unverified |
| Enhancing Hallucination Detection through Perturbation-Based Synthetic Data Generation in System Responses | Jul 7, 2024 | Hallucination, Language Modeling | Code Available |
| VideoCoT: A Video Chain-of-Thought Dataset with Active Annotation Tool | Jul 7, 2024 | Active Learning, Hallucination | Unverified |
| Code Hallucination | Jul 5, 2024 | Hallucination | Unverified |
| Query-Guided Self-Supervised Summarization of Nursing Notes | Jul 4, 2024 | Abstractive Text Summarization, Domain Adaptation | Unverified |
| Zero-shot Persuasive Chatbots with LLM-Generated Strategies and Information Retrieval | Jul 4, 2024 | Chatbot, Hallucination | Unverified |
| Hallucination Detection: Robustly Discerning Reliable Answers in Large Language Models | Jul 4, 2024 | Hallucination, Question Answering | Unverified |
| STOC-TOT: Stochastic Tree-of-Thought with Constrained Decoding for Complex Reasoning in Multi-Hop Question Answering | Jul 4, 2024 | Hallucination, Multi-hop Question Answering | Unverified |
| Classification-Based Automatic HDL Code Generation Using LLMs | Jul 4, 2024 | Classification, Code Generation | Unverified |
| FSM: A Finite State Machine Based Zero-Shot Prompting Paradigm for Multi-Hop Question Answering | Jul 3, 2024 | Hallucination, Multi-hop Question Answering | Unverified |
| A Comparative Study of DSL Code Generation: Fine-Tuning vs. Optimized Retrieval Augmentation | Jul 3, 2024 | Code Generation, Hallucination | Unverified |
| LLM Internal States Reveal Hallucination Risk Faced With a Query | Jul 3, 2024 | Hallucination, Response Generation | Code Available |
| Pelican: Correcting Hallucination in Vision-LLMs via Claim Decomposition and Program of Thought Verification | Jul 2, 2024 | Claim Verification, Hallucination | Unverified |
| Understanding Alignment in Multimodal LLMs: A Comprehensive Study | Jul 2, 2024 | Hallucination | Unverified |
| Large Language Models Are Involuntary Truth-Tellers: Exploiting Fallacy Failure for Jailbreak Attacks | Jul 1, 2024 | Hallucination, Language Modeling | Code Available |
| The Need for Guardrails with Large Language Models in Medical Safety-Critical Settings: An Artificial Intelligence Application in the Pharmacovigilance Ecosystem | Jul 1, 2024 | Hallucination, Pharmacovigilance | Unverified |
| LLM Uncertainty Quantification through Directional Entailment Graph and Claim Level Response Augmentation | Jul 1, 2024 | Hallucination, Uncertainty Quantification | Unverified |
| Free-text Rationale Generation under Readability Level Control | Jul 1, 2024 | Hallucination, Text Generation | Unverified |
| Unveiling Glitches: A Deep Dive into Image Encoding Bugs within CLIP | Jun 30, 2024 | Hallucination, Image Comprehension | Unverified |
| A Study on Effect of Reference Knowledge Choice in Generating Technical Content Relevant to SAPPhIRE Model Using Large Language Model | Jun 29, 2024 | Hallucination, Language Modeling | Unverified |
| BioKGBench: A Knowledge Graph Checking Benchmark of AI Agent for Biomedical Science | Jun 29, 2024 | AI Agent, Claim Verification | Code Available |
| PFME: A Modular Approach for Fine-grained Hallucination Detection and Editing of Large Language Models | Jun 29, 2024 | Hallucination, Sentence | Unverified |
| Applying RLAIF for Code Generation with API-usage in Lightweight LLMs | Jun 28, 2024 | Code Generation, Hallucination | Unverified |
| Handling Ontology Gaps in Semantic Parsing | Jun 27, 2024 | Hallucination, Question Answering | Code Available |
| From Artificial Needles to Real Haystacks: Improving Retrieval Capabilities in LLMs by Finetuning on Synthetic Data | Jun 27, 2024 | Hallucination, Information Retrieval | Code Available |
| Mitigating Hallucination in Fictional Character Role-Play | Jun 25, 2024 | Hallucination, World Knowledge | Code Available |
| VideoHallucer: Evaluating Intrinsic and Extrinsic Hallucinations in Large Video-Language Models | Jun 24, 2024 | Hallucination, Video Understanding | Unverified |
| Prompt-Consistency Image Generation (PCIG): A Unified Framework Integrating LLMs, Knowledge Graphs, and Controllable Diffusion Models | Jun 24, 2024 | Hallucination, Image Generation | Code Available |
| Large Language Models are Skeptics: False Negative Problem of Input-conflicting Hallucination | Jun 20, 2024 | Hallucination | Unverified |
| HIGHT: Hierarchical Graph Tokenization for Molecule-Language Alignment | Jun 20, 2024 | Graph Neural Network, Hallucination | Unverified |
| Does Object Grounding Really Reduce Hallucination of Large Vision-Language Models? | Jun 20, 2024 | Caption Generation, Hallucination | Unverified |
| From Descriptive Richness to Bias: Unveiling the Dark Side of Generative Image Caption Enrichment | Jun 20, 2024 | Descriptive, Hallucination | Unverified |
| StackRAG Agent: Improving Developer Answers with Retrieval-Augmented Generation | Jun 19, 2024 | Hallucination, Retrieval | Code Available |
| What Matters in Memorizing and Recalling Facts? Multifaceted Benchmarks for Knowledge Probing in Language Models | Jun 18, 2024 | Decoder, Hallucination | Unverified |
| Detecting Errors through Ensembling Prompts (DEEP): An End-to-End LLM Framework for Detecting Factual Errors | Jun 18, 2024 | Hallucination, Language Modeling | Code Available |
| RichRAG: Crafting Rich Responses for Multi-faceted Queries in Retrieval-Augmented Generation | Jun 18, 2024 | Hallucination, RAG | Unverified |
| On-Policy Fine-grained Knowledge Feedback for Hallucination Mitigation | Jun 18, 2024 | Hallucination, Response Generation | Code Available |
| Beyond Under-Alignment: Atomic Preference Enhanced Factuality Tuning for Large Language Models | Jun 18, 2024 | Hallucination | Unverified |
| Do More Details Always Introduce More Hallucinations in LVLM-based Image Captioning? | Jun 18, 2024 | Attribute, Hallucination | Unverified |
| Counterfactual Debating with Preset Stances for Hallucination Elimination of LLMs | Jun 17, 2024 | Counterfactual, Hallucination | Code Available |
| CoMT: Chain-of-Medical-Thought Reduces Hallucination in Medical Report Generation | Jun 17, 2024 | Diagnostic, Hallucination | Unverified |
| Mitigating Large Language Model Hallucination with Faithful Finetuning | Jun 17, 2024 | Hallucination, Language Modeling | Unverified |
| InternalInspector I^2: Robust Confidence Estimation in LLMs through Internal States | Jun 17, 2024 | Benchmarking, Contrastive Learning | Unverified |