| TimeChara: Evaluating Point-in-Time Character Hallucination of Role-Playing Large Language Models | May 28, 2024 | Hallucination | CodeCode Available | 1 |
| RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness | May 27, 2024 | HallucinationImage Captioning | CodeCode Available | 11 |
| Laboratory-Scale AI: Open-Weight Models are Competitive with ChatGPT Even in Low-Resource Settings | May 27, 2024 | Domain AdaptationGPU | —Unverified | 0 |
| Think Before You Act: A Two-Stage Framework for Mitigating Gender Bias Towards Vision-Language Tasks | May 27, 2024 | HallucinationObject Hallucination | CodeCode Available | 0 |
| GeneAgent: Self-verification Language Agent for Gene Set Knowledge Discovery using Domain Databases | May 25, 2024 | BenchmarkingHallucination | —Unverified | 0 |
| Large Language Model Pruning | May 24, 2024 | HallucinationLanguage Modeling | —Unverified | 0 |
| Enhancing Visual-Language Modality Alignment in Large Vision Language Models via Self-Improvement | May 24, 2024 | HallucinationImage Comprehension | CodeCode Available | 2 |
| CHARP: Conversation History AwaReness Probing for Knowledge-grounded Dialogue Systems | May 24, 2024 | DiagnosticHallucination | —Unverified | 0 |
| Scaling Laws for Discriminative Classification in Large Language Models | May 24, 2024 | HallucinationLanguage Modeling | —Unverified | 0 |
| DEEM: Diffusion Models Serve as the Eyes of Large Language Models for Image Perception | May 24, 2024 | Hallucination | CodeCode Available | 1 |