| Chatlaw: A Multi-Agent Collaborative Legal Assistant with Knowledge Graph Enhanced Mixture-of-Experts Large Language Model | Jun 28, 2023 | Hallucination, Knowledge Graphs | Code Available | 5 |
| Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning | Jun 26, 2023 | Hallucination, Visual Question Answering | Code Available | 2 |
| Evidence for Reduced Sensory Precision and Increased Reliance on Priors in Hallucination-Prone Individuals in a General Population Sample | Jun 24, 2023 | Hallucination | Unverified | 0 |
| IERL: Interpretable Ensemble Representation Learning -- Combining CrowdSourced Knowledge and Distributed Semantic Representations | Jun 24, 2023 | Ensemble Learning, Hallucination | Unverified | 0 |
| ToolQA: A Dataset for LLM Question Answering with External Tools | Jun 23, 2023 | Hallucination, Question Answering | Code Available | 2 |
| A Survey on Multimodal Large Language Models | Jun 23, 2023 | Hallucination, In-Context Learning | Unverified | 0 |
| Hallucination is the last thing you need | Jun 20, 2023 | Fact Checking, Hallucination | Unverified | 0 |
| Vision Transformer with Attention Map Hallucination and FFN Compaction | Jun 19, 2023 | Dimensionality Reduction, Hallucination | Unverified | 0 |
| Are Large Language Models Really Good Logical Reasoners? A Comprehensive Evaluation and Beyond | Jun 16, 2023 | Benchmarking, Evidence Selection | Code Available | 1 |
| Pushing the Limits of ChatGPT on NLP Tasks | Jun 16, 2023 | Dependency Parsing, Event Extraction | Unverified | 0 |
| Explaining Legal Concepts with Augmented Large Language Models (GPT-4) | Jun 15, 2023 | Hallucination, Information Retrieval | Unverified | 0 |
| KoLA: Carefully Benchmarking World Knowledge of Large Language Models | Jun 15, 2023 | Benchmarking, Hallucination | Code Available | 1 |
| LVLM-eHub: A Comprehensive Evaluation Benchmark for Large Vision-Language Models | Jun 15, 2023 | Hallucination, Image Captioning | Code Available | 2 |
| Aladdin: Zero-Shot Hallucination of Stylized 3D Assets from Abstract Scene Descriptions | Jun 9, 2023 | Hallucination, Language Modeling | Code Available | 1 |
| Trapping LLM Hallucinations Using Tagged Context Prompts | Jun 9, 2023 | Hallucination | Unverified | 0 |
| Defocus to focus: Photo-realistic bokeh rendering by fusing defocus and radiance priors | Jun 7, 2023 | Hallucination | Unverified | 0 |
| Efficient and Interpretable Compressive Text Summarisation with Unsupervised Dual-Agent Reinforcement Learning | Jun 6, 2023 | Hallucination, reinforcement-learning | Code Available | 0 |
| Do Language Models Know When They're Hallucinating References? | May 29, 2023 | Hallucination, Language Modeling | Code Available | 0 |
| An Investigation of Evaluation Metrics for Automated Medical Note Generation | May 27, 2023 | Graph Embedding, Hallucination | Code Available | 0 |
| AdaPlanner: Adaptive Planning from Feedback with Language Models | May 26, 2023 | Decision Making, Hallucination | Code Available | 1 |
| Getting Sick After Seeing a Doctor? Diagnosing and Mitigating Knowledge Conflicts in Event Temporal Reasoning | May 24, 2023 | counterfactual, Data Augmentation | Code Available | 0 |
| Enabling Large Language Models to Generate Text with Citations | May 24, 2023 | Hallucination, Retrieval | Code Available | 2 |
| Lawyer LLaMA Technical Report | May 24, 2023 | Articles, Hallucination | Code Available | 2 |
| Gorilla: Large Language Model Connected with Massive APIs | May 24, 2023 | Hallucination, Language Modeling | Code Available | 6 |
| RefGPT: Dialogue Generation of GPT, by GPT, and for GPT | May 24, 2023 | Dialogue Generation, Hallucination | Code Available | 1 |
| Sources of Hallucination by Large Language Models on Inference Tasks | May 23, 2023 | Hallucination, Memorization | Code Available | 1 |
| WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia | May 23, 2023 | Chatbot, Hallucination | Code Available | 3 |
| The Knowledge Alignment Problem: Bridging Human and External Knowledge for Large Language Models | May 23, 2023 | Hallucination, Language Modeling | Code Available | 0 |
| PaD: Program-aided Distillation Can Teach Small Models Reasoning Better than Chain-of-thought Fine-tuning | May 23, 2023 | Arithmetic Reasoning, GSM8K | Code Available | 0 |
| mmT5: Modular Multilingual Pre-Training Solves Source Language Hallucinations | May 23, 2023 | Hallucination, Natural Language Understanding | Unverified | 0 |
| How Language Model Hallucinations Can Snowball | May 22, 2023 | Hallucination, Language Modeling | Code Available | 1 |
| Element-aware Summarization with Large Language Models: Expert-aligned Evaluation and Chain-of-Thought Method | May 22, 2023 | Benchmarking, Hallucination | Code Available | 1 |
| Chain-of-Knowledge: Grounding Large Language Models via Dynamic Knowledge Adapting over Heterogeneous Sources | May 22, 2023 | Hallucination, Language Modelling | Code Available | 1 |
| Scene Graph as Pivoting: Inference-time Image-free Unsupervised Multimodal Machine Translation with Visual Scene Hallucination | May 20, 2023 | Hallucination, Machine Translation | Code Available | 1 |
| HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models | May 19, 2023 | Hallucination, Hallucination Evaluation | Code Available | 2 |
| HalOmi: A Manually Annotated Benchmark for Multilingual Hallucination and Omission Detection in Machine Translation | May 19, 2023 | Hallucination, Machine Translation | Code Available | 2 |
| RCOT: Detecting and Rectifying Factual Inconsistency in Reasoning by Reversing Chain-of-Thought | May 19, 2023 | Arithmetic Reasoning, GSM8K | Unverified | 0 |
| Appraising the Potential Uses and Harms of LLMs for Medical Systematic Reviews | May 19, 2023 | Decision Making, Hallucination | Code Available | 0 |
| Evaluating Object Hallucination in Large Vision-Language Models | May 17, 2023 | Hallucination, Object | Code Available | 2 |
| Is ChatGPT a Good Causal Reasoner? A Comprehensive Evaluation | May 12, 2023 | Hallucination, In-Context Learning | Code Available | 1 |
| Meta-hallucinator: Towards Few-Shot Cross-Modality Cardiac Image Segmentation | May 11, 2023 | Cardiac Segmentation, Domain Adaptation | Unverified | 0 |
| Simple Token-Level Confidence Improves Caption Correctness | May 11, 2023 | Hallucination, Image Captioning | Unverified | 0 |
| Exploring Human-Like Translation Strategy with Large Language Models | May 6, 2023 | Hallucination, Machine Translation | Code Available | 2 |
| ChartSumm: A Comprehensive Benchmark for Automatic Chart Summarization of Long and Short Summaries | Apr 26, 2023 | Data Summarization, Hallucination | Code Available | 1 |
| Benchmarking ChatGPT-4 on ACR Radiation Oncology In-Training (TXIT) Exam and Red Journal Gray Zone Cases: Potentials and Challenges for AI-Assisted Medical Education and Decision Making in Radiation Oncology | Apr 24, 2023 | Benchmarking, Decision Making | Code Available | 0 |
| The Dark Side of ChatGPT: Legal and Ethical Challenges from Stochastic Parrots and Hallucination | Apr 21, 2023 | Hallucination | Unverified | 0 |
| Using Mobile Data and Deep Models to Assess Auditory Verbal Hallucinations | Apr 20, 2023 | Hallucination, Transfer Learning | Unverified | 0 |
| GPT-NER: Named Entity Recognition via Large Language Models | Apr 20, 2023 | Hallucination, named-entity-recognition | Code Available | 2 |
| Dual Stage Stylization Modulation for Domain Generalized Semantic Segmentation | Apr 18, 2023 | Diversity, Domain Generalization | Unverified | 0 |
| OVTrack: Open-Vocabulary Multiple Object Tracking | Apr 17, 2023 | Denoising, Hallucination | Code Available | 1 |