| A Perspective for Adapting Generalist AI to Specialized Medical AI Applications and Their Challenges | Oct 28, 2024 | Drug Discovery, Hallucination | —Unverified | 0 |
| A Debate-Driven Experiment on LLM Hallucinations and Accuracy | Oct 25, 2024 | Fact Checking, Hallucination | —Unverified | 0 |
| Conditional Hallucinations for Image Compression | Oct 25, 2024 | Hallucination, Image Compression | —Unverified | 0 |
| TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning | Oct 25, 2024 | EgoSchema, Hallucination | Code Available | 2 |
| Investigating the Role of Prompting and External Tools in Hallucination Rates of Large Language Models | Oct 25, 2024 | Hallucination, Prompt Engineering | —Unverified | 0 |
| MaCTG: Multi-Agent Collaborative Thought Graph for Automatic Programming | Oct 25, 2024 | Code Generation, Hallucination | —Unverified | 0 |
| Multilingual Hallucination Gaps in Large Language Models | Oct 23, 2024 | Hallucination, Text Generation | —Unverified | 0 |
| AVHBench: A Cross-Modal Hallucination Benchmark for Audio-Visual Large Language Models | Oct 23, 2024 | Hallucination | —Unverified | 0 |
| Leveraging the Domain Adaptation of Retrieval Augmented Generation Models for Question Answering and Reducing Hallucination | Oct 23, 2024 | Domain Adaptation, Hallucination | —Unverified | 0 |
| ProveRAG: Provenance-Driven Vulnerability Analysis with Automated Retrieval-Augmented LLMs | Oct 22, 2024 | Chunking, Hallucination | Code Available | 0 |
| Do Robot Snakes Dream like Electric Sheep? Investigating the Effects of Architectural Inductive Biases on Hallucination | Oct 22, 2024 | Hallucination | —Unverified | 0 |
| GeoCode-GPT: A Large Language Model for Geospatial Code Generation Tasks | Oct 22, 2024 | Code Generation, Code Summarization | —Unverified | 0 |
| Privacy-hardened and hallucination-resistant synthetic data generation with logic-solvers | Oct 22, 2024 | Generative Adversarial Network, Hallucination | —Unverified | 0 |
| IPL: Leveraging Multimodal Large Language Models for Intelligent Product Listing | Oct 22, 2024 | Hallucination, RAG | —Unverified | 0 |
| Fine-Tuning Large Language Models to Appropriately Abstain with Semantic Entropy | Oct 22, 2024 | Form, Hallucination | —Unverified | 0 |
| Navigating Noisy Feedback: Enhancing Reinforcement Learning with Error-Prone Language Models | Oct 22, 2024 | Hallucination, Language Modeling | Code Available | 0 |
| SG-FSM: A Self-Guiding Zero-Shot Prompting Paradigm for Multi-Hop Question Answering Based on Finite State Machine | Oct 22, 2024 | Hallucination, Multi-hop Question Answering | —Unverified | 0 |
| Large language models enabled multiagent ensemble method for efficient EHR data labeling | Oct 21, 2024 | Hallucination | —Unverified | 0 |
| Towards a Reliable Offline Personal AI Assistant for Long Duration Spaceflight | Oct 21, 2024 | Hallucination, Knowledge Graphs | —Unverified | 0 |
| Mitigating Hallucinations of Large Language Models in Medical Information Extraction via Contrastive Decoding | Oct 21, 2024 | Hallucination | —Unverified | 0 |
| Mitigating Object Hallucination via Concentric Causal Attention | Oct 21, 2024 | Hallucination, Object | Code Available | 2 |
| ToW: Thoughts of Words Improve Reasoning in Large Language Models | Oct 21, 2024 | Data Augmentation, Hallucination | Code Available | 0 |
| Can Knowledge Editing Really Correct Hallucinations? | Oct 21, 2024 | Hallucination, knowledge editing | Code Available | 1 |
| Reducing Hallucinations in Vision-Language Models via Latent Space Steering | Oct 21, 2024 | Hallucination | Code Available | 2 |
| NetSafe: Exploring the Topological Safety of Multi-agent Networks | Oct 21, 2024 | Hallucination, Misinformation | —Unverified | 0 |
| Learning to Generate and Evaluate Fact-checking Explanations with Transformers | Oct 21, 2024 | Fact Checking, Hallucination | —Unverified | 0 |
| A Survey of Hallucination in Large Visual Language Models | Oct 20, 2024 | Hallucination, Hallucination Evaluation | —Unverified | 0 |
| Hallucination Detox: Sensitivity Dropout (SenD) for Large Language Model Training | Oct 20, 2024 | Hallucination, Language Modeling | —Unverified | 0 |
| Explaining Graph Neural Networks with Large Language Models: A Counterfactual Perspective for Molecular Property Prediction | Oct 19, 2024 | counterfactual, Counterfactual Explanation | Code Available | 0 |
| Coarse-to-Fine Highlighting: Reducing Knowledge Hallucination in Large Language Models | Oct 19, 2024 | Hallucination, Language Modeling | —Unverified | 0 |
| Good Parenting is all you need -- Multi-agentic LLM Hallucination Mitigation | Oct 18, 2024 | All, Hallucination | —Unverified | 0 |
| ELOQ: Resources for Enhancing LLM Detection of Out-of-Scope Questions | Oct 18, 2024 | Hallucination, Natural Questions | Code Available | 0 |
| Paths-over-Graph: Knowledge Graph Empowered Large Language Model Reasoning | Oct 18, 2024 | Hallucination, Knowledge Base Question Answering | Code Available | 1 |
| ETF: An Entity Tracing Framework for Hallucination Detection in Code Summaries | Oct 17, 2024 | Code Summarization, Hallucination | —Unverified | 0 |
| From Single to Multi: How LLMs Hallucinate in Multi-Document Summarization | Oct 17, 2024 | Document Summarization, Hallucination | Code Available | 0 |
| MCQG-SRefine: Multiple Choice Question Generation and Evaluation with Iterative Self-Critique, Correction, and Comparison Feedback | Oct 17, 2024 | Fact Verification, Hallucination | Code Available | 0 |
| Mitigating Hallucinations in Large Vision-Language Models via Summary-Guided Decoding | Oct 17, 2024 | Hallucination, Object Hallucination | Code Available | 1 |
| Utilizing Large Language Models in an iterative paradigm with domain feedback for zero-shot molecule optimization | Oct 17, 2024 | Drug Discovery, Hallucination | —Unverified | 0 |
| FaithBench: A Diverse Hallucination Benchmark for Summarization by Modern LLMs | Oct 17, 2024 | Diversity, Hallucination | Code Available | 1 |
| RosePO: Aligning LLM-based Recommenders with Human Values | Oct 16, 2024 | Hallucination, Recommendation Systems | —Unverified | 0 |
| MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models | Oct 16, 2024 | Diagnostic, Hallucination | Code Available | 3 |
| On A Scale From 1 to 5: Quantifying Hallucination in Faithfulness Evaluation | Oct 16, 2024 | Hallucination, Natural Language Inference | —Unverified | 0 |
| When Not to Answer: Evaluating Prompts on GPT Models for Effective Abstention in Unanswerable Math Word Problems | Oct 16, 2024 | Hallucination, Math | —Unverified | 0 |
| Iter-AHMCL: Alleviate Hallucination for Large Language Model via Iterative Model-level Contrastive Learning | Oct 16, 2024 | Contrastive Learning, graph construction | —Unverified | 0 |
| Graph-constrained Reasoning: Faithful Reasoning on Knowledge Graphs with Large Language Models | Oct 16, 2024 | Hallucination, Knowledge Graphs | Code Available | 3 |
| What Do LLMs Need to Understand Graphs: A Survey of Parametric Representation of Graphs | Oct 16, 2024 | Drug Discovery, Graph Generation | —Unverified | 0 |
| Controlled Automatic Task-Specific Synthetic Data Generation for Hallucination Detection | Oct 16, 2024 | Hallucination, In-Context Learning | —Unverified | 0 |
| A Claim Decomposition Benchmark for Long-form Answer Verification | Oct 16, 2024 | Form, Hallucination | Code Available | 0 |
| The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio | Oct 16, 2024 | Hallucination | Code Available | 3 |
| Search Engines in an AI Era: The False Promise of Factual and Verifiable Source-Cited Responses | Oct 15, 2024 | Hallucination, Language Modeling | Code Available | 1 |