| RegaVAE: A Retrieval-Augmented Gaussian Mixture Variational Auto-Encoder for Language Modeling | Oct 16, 2023 | Hallucination, Language Modeling | Code Available | 1 |
| Metric Ensembles For Hallucination Detection | Oct 16, 2023 | Abstractive Text Summarization, Hallucination | Unverified | 0 |
| Factored Verification: Detecting and Reducing Hallucination in Summaries of Academic Papers | Oct 16, 2023 | 16k, Hallucination | Code Available | 1 |
| Assessing the Reliability of Large Language Model Knowledge | Oct 15, 2023 | Hallucination, Knowledge Probing | Code Available | 0 |
| Configuration Validation with Large Language Models | Oct 15, 2023 | Code Generation, Few-Shot Learning | Unverified | 0 |
| "Kelly is a Warm Person, Joseph is a Role Model": Gender Biases in LLM-Generated Reference Letters | Oct 13, 2023 | Benchmarking, Fairness | Code Available | 1 |
| Improving Large Language Models in Event Relation Logical Prediction | Oct 13, 2023 | counterfactual, Event Relation Extraction | Code Available | 1 |
| KCTS: Knowledge-Constrained Tree Search Decoding with Token-Level Hallucination Detection | Oct 13, 2023 | Abstractive Text Summarization, Hallucination | Code Available | 1 |
| From CLIP to DINO: Visual Encoders Shout in Multi-modal Large Language Models | Oct 13, 2023 | Hallucination, Image Captioning | Code Available | 2 |
| GraphextQA: A Benchmark for Evaluating Graph-Enhanced Large Language Models | Oct 12, 2023 | Answer Generation, Hallucination | Code Available | 0 |
| GameGPT: Multi-agent Collaborative Framework for Game Development | Oct 12, 2023 | Code Generation, Hallucination | Unverified | 0 |
| Enhancing Text-based Knowledge Graph Completion with Zero-Shot Large Language Models: A Focus on Semantic Enhancement | Oct 12, 2023 | Contrastive Learning, Data Augmentation | Code Available | 1 |
| Ferret: Refer and Ground Anything Anywhere at Any Granularity | Oct 11, 2023 | Hallucination, Language Modeling | Code Available | 5 |
| OpsEval: A Comprehensive IT Operations Benchmark Suite for Large Language Models | Oct 11, 2023 | Hallucination, In-Context Learning | Code Available | 1 |
| A New Benchmark and Reverse Validation Method for Passage-level Hallucination Detection | Oct 10, 2023 | Hallucination, Sentence | Code Available | 0 |
| Teaching Language Models to Hallucinate Less with Synthetic Tasks | Oct 10, 2023 | Abstractive Text Summarization, Hallucination | Unverified | 0 |
| Towards Mitigating Hallucination in Large Language Models via Self-Reflection | Oct 10, 2023 | Answer Generation, Hallucination | Unverified | 0 |
| Negative Object Presence Evaluation (NOPE) to Measure Object Hallucination in Vision-Language Models | Oct 9, 2023 | Hallucination, Object | Unverified | 0 |
| The Troubling Emergence of Hallucination in Large Language Models -- An Extensive Definition, Quantification, and Prescriptive Remediations | Oct 8, 2023 | Hallucination | Unverified | 0 |
| Improving the Reliability of Large Language Models by Leveraging Uncertainty-Aware In-Context Learning | Oct 7, 2023 | Hallucination, In-Context Learning | Unverified | 0 |
| Chain of Natural Language Inference for Reducing Large Language Model Ungrounded Hallucinations | Oct 6, 2023 | Hallucination, Language Modeling | Code Available | 1 |
| Evaluating Hallucinations in Chinese Large Language Models | Oct 5, 2023 | Hallucination, Question Answering | Code Available | 3 |
| FreshLLMs: Refreshing Large Language Models with Search Engine Augmentation | Oct 5, 2023 | Hallucination, World Knowledge | Code Available | 2 |
| MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation | Oct 5, 2023 | Benchmarking, Decision Making | Code Available | 2 |
| AGIR: Automating Cyber Threat Intelligence Reporting with Natural Language Generation | Oct 4, 2023 | Hallucination, Text Generation | Code Available | 1 |
| HallE-Control: Controlling Object Hallucination in Large Multimodal Models | Oct 3, 2023 | Attribute, Decoder | Code Available | 1 |
| LLM Lies: Hallucinations are not Bugs, but Features as Adversarial Examples | Oct 2, 2023 | Hallucination | Code Available | 1 |
| BTR: Binary Token Representations for Efficient Retrieval Augmented Language Models | Oct 2, 2023 | Hallucination, Retrieval | Code Available | 1 |
| Analyzing and Mitigating Object Hallucination in Large Vision-Language Models | Oct 1, 2023 | Hallucination, Hallucination Evaluation | Code Available | 1 |
| AutoHall: Automated Hallucination Dataset Generation for Large Language Models | Sep 30, 2023 | Dataset Generation, Fact Checking | Unverified | 0 |
| Robust 3D Object Detection from LiDAR-Radar Point Clouds via Cross-Modal Feature Augmentation | Sep 29, 2023 | 3D Object Detection, Attribute | Code Available | 1 |
| Self-Specialization: Uncovering Latent Expertise within Large Language Models | Sep 29, 2023 | Hallucination, Instruction Following | Unverified | 0 |
| Hallucination Reduction in Long Input Text Summarization | Sep 28, 2023 | Decoder, Hallucination | Code Available | 0 |
| Self-supervised Cross-view Representation Reconstruction for Change Captioning | Sep 28, 2023 | Caption Generation, Hallucination | Code Available | 1 |
| Augmenting LLMs with Knowledge: A survey on hallucination prevention | Sep 28, 2023 | Hallucination, Language Modeling | Unverified | 0 |
| Neuro Symbolic Reasoning for Planning: Counterexample Guided Inductive Synthesis using Large Language Models and Satisfiability Solving | Sep 28, 2023 | Hallucination, Question Answering | Unverified | 0 |
| Lyra: Orchestrating Dual Correction in Automated Theorem Proving | Sep 27, 2023 | Automated Theorem Proving, Hallucination | Code Available | 1 |
| Aligning Large Multimodal Models with Factually Augmented RLHF | Sep 25, 2023 | Hallucination, Image Captioning | Unverified | 0 |
| BAMBOO: A Comprehensive Benchmark for Evaluating Long Text Modeling Capacities of Large Language Models | Sep 23, 2023 | Code Completion, Hallucination | Code Available | 1 |
| Construction of Paired Knowledge Graph-Text Datasets Informed by Cyclic Evaluation | Sep 20, 2023 | Hallucination, Knowledge Graphs | Unverified | 0 |
| Exploring the Relationship between LLM Hallucinations and Prompt Linguistic Nuances: Readability, Formality, and Concreteness | Sep 20, 2023 | Hallucination | Unverified | 0 |
| Chain-of-Verification Reduces Hallucination in Large Language Models | Sep 20, 2023 | Hallucination, Text Generation | Code Available | 0 |
| Explaining Agent Behavior with Large Language Models | Sep 19, 2023 | counterfactual, Hallucination | Unverified | 0 |
| PICK: Polished & Informed Candidate Scoring for Knowledge-Grounded Dialogue Systems | Sep 19, 2023 | Hallucination, Language Modelling | Code Available | 0 |
| A Multitask Training Approach to Enhance Whisper with Contextual Biasing and Open-Vocabulary Keyword Spotting | Sep 18, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages | Sep 17, 2023 | Hallucination, Language Identification | Unverified | 0 |
| Struc-Bench: Are Large Language Models Really Good at Generating Complex Structured Data? | Sep 16, 2023 | Hallucination | Code Available | 1 |
| "Merge Conflicts!" Exploring the Impacts of External Distractors to Parametric Knowledge Graphs | Sep 15, 2023 | Hallucination, Knowledge Graphs | Code Available | 0 |
| MMICL: Empowering Vision-language Model with Multi-Modal In-Context Learning | Sep 14, 2023 | Hallucination, In-Context Learning | Code Available | 2 |
| Cognitive Mirage: A Review of Hallucinations in Large Language Models | Sep 13, 2023 | Hallucination, Text Generation | Code Available | 1 |