| Sources of Hallucination by Large Language Models on Inference Tasks | May 23, 2023 | Hallucination, Memorization | Code Available | 1 |
| WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia | May 23, 2023 | Chatbot, Hallucination | Code Available | 3 |
| The Knowledge Alignment Problem: Bridging Human and External Knowledge for Large Language Models | May 23, 2023 | Hallucination, Language Modeling | Code Available | 0 |
| PaD: Program-aided Distillation Can Teach Small Models Reasoning Better than Chain-of-thought Fine-tuning | May 23, 2023 | Arithmetic Reasoning, GSM8K | Code Available | 0 |
| mmT5: Modular Multilingual Pre-Training Solves Source Language Hallucinations | May 23, 2023 | Hallucination, Natural Language Understanding | Unverified | 0 |
| How Language Model Hallucinations Can Snowball | May 22, 2023 | Hallucination, Language Modeling | Code Available | 1 |
| Element-aware Summarization with Large Language Models: Expert-aligned Evaluation and Chain-of-Thought Method | May 22, 2023 | Benchmarking, Hallucination | Code Available | 1 |
| Chain-of-Knowledge: Grounding Large Language Models via Dynamic Knowledge Adapting over Heterogeneous Sources | May 22, 2023 | Hallucination, Language Modelling | Code Available | 1 |
| Scene Graph as Pivoting: Inference-time Image-free Unsupervised Multimodal Machine Translation with Visual Scene Hallucination | May 20, 2023 | Hallucination, Machine Translation | Code Available | 1 |
| HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models | May 19, 2023 | Hallucination, Hallucination Evaluation | Code Available | 2 |
| HalOmi: A Manually Annotated Benchmark for Multilingual Hallucination and Omission Detection in Machine Translation | May 19, 2023 | Hallucination, Machine Translation | Code Available | 2 |
| RCOT: Detecting and Rectifying Factual Inconsistency in Reasoning by Reversing Chain-of-Thought | May 19, 2023 | Arithmetic Reasoning, GSM8K | Unverified | 0 |
| Appraising the Potential Uses and Harms of LLMs for Medical Systematic Reviews | May 19, 2023 | Decision Making, Hallucination | Code Available | 0 |
| Evaluating Object Hallucination in Large Vision-Language Models | May 17, 2023 | Hallucination, Object | Code Available | 2 |
| Is ChatGPT a Good Causal Reasoner? A Comprehensive Evaluation | May 12, 2023 | Hallucination, In-Context Learning | Code Available | 1 |
| Meta-hallucinator: Towards Few-Shot Cross-Modality Cardiac Image Segmentation | May 11, 2023 | Cardiac Segmentation, Domain Adaptation | Unverified | 0 |
| Simple Token-Level Confidence Improves Caption Correctness | May 11, 2023 | Hallucination, Image Captioning | Unverified | 0 |
| Exploring Human-Like Translation Strategy with Large Language Models | May 6, 2023 | Hallucination, Machine Translation | Code Available | 2 |
| ChartSumm: A Comprehensive Benchmark for Automatic Chart Summarization of Long and Short Summaries | Apr 26, 2023 | Data Summarization, Hallucination | Code Available | 1 |
| Benchmarking ChatGPT-4 on ACR Radiation Oncology In-Training (TXIT) Exam and Red Journal Gray Zone Cases: Potentials and Challenges for AI-Assisted Medical Education and Decision Making in Radiation Oncology | Apr 24, 2023 | Benchmarking, Decision Making | Code Available | 0 |
| The Dark Side of ChatGPT: Legal and Ethical Challenges from Stochastic Parrots and Hallucination | Apr 21, 2023 | Hallucination | Unverified | 0 |
| Using Mobile Data and Deep Models to Assess Auditory Verbal Hallucinations | Apr 20, 2023 | Hallucination, Transfer Learning | Unverified | 0 |
| GPT-NER: Named Entity Recognition via Large Language Models | Apr 20, 2023 | Hallucination, named-entity-recognition | Code Available | 2 |
| Dual Stage Stylization Modulation for Domain Generalized Semantic Segmentation | Apr 18, 2023 | Diversity, Domain Generalization | Unverified | 0 |
| OVTrack: Open-Vocabulary Multiple Object Tracking | Apr 17, 2023 | Denoising, Hallucination | Code Available | 1 |