| MiniCPM-V: A GPT-4V Level MLLM on Your Phone | Aug 3, 2024 | Hallucination, Multiple-choice | Code Available | 12 | 5 |
| Attentive Reasoning Queries: A Systematic Method for Optimizing Instruction-Following in Large Language Models | Mar 5, 2025 | Hallucination, Instruction Following | Code Available | 11 | 5 |
| SWIFT:A Scalable lightWeight Infrastructure for Fine-Tuning | Aug 10, 2024 | Hallucination, Optical Character Recognition | Code Available | 11 | 5 |
| RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness | May 27, 2024 | Hallucination, Image Captioning | Code Available | 11 | 5 |
| O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson? | Nov 25, 2024 | Hallucination, Knowledge Distillation | Code Available | 7 | 5 |
| MoE-LLaVA: Mixture of Experts for Large Vision-Language Models | Jan 29, 2024 | Hallucination, Mixture-of-Experts | Code Available | 7 | 5 |
| Gorilla: Large Language Model Connected with Massive APIs | May 24, 2023 | Hallucination, Language Modeling | Code Available | 6 | 5 |
| RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback | Dec 1, 2023 | Hallucination, Image Captioning | Code Available | 6 | 5 |
| Uncertainty Quantification for Language Models: A Suite of Black-Box, White-Box, LLM Judge, and Ensemble Scorers | Apr 27, 2025 | Hallucination, Question Answering | Code Available | 5 | 5 |
| DeepEyes: Incentivizing "Thinking with Images" via Reinforcement Learning | May 20, 2025 | Hallucination, Mathematical Reasoning | Code Available | 5 | 5 |
| Chatlaw: A Multi-Agent Collaborative Legal Assistant with Knowledge Graph Enhanced Mixture-of-Experts Large Language Model | Jun 28, 2023 | Hallucination, Knowledge Graphs | Code Available | 5 | 5 |
| UQLM: A Python Package for Uncertainty Quantification in Large Language Models | Jul 8, 2025 | Hallucination, Uncertainty Quantification | Code Available | 5 | 5 |
| Ferret: Refer and Ground Anything Anywhere at Any Granularity | Oct 11, 2023 | Hallucination, Language Modeling | Code Available | 5 | 5 |
| Lean Copilot: Large Language Models as Copilots for Theorem Proving in Lean | Apr 18, 2024 | Automated Theorem Proving, Hallucination | Code Available | 5 | 5 |
| Weakly Supervised Detection of Hallucinations in LLM Activations | Dec 5, 2023 | Hallucination, Language Modeling | Code Available | 5 | 5 |
| G-Retriever: Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering | Feb 12, 2024 | Common Sense Reasoning, Graph Classification | Code Available | 4 | 5 |
| Hallucination of Multimodal Large Language Models: A Survey | Apr 29, 2024 | Hallucination, Survey | Code Available | 4 | 5 |
| A Survey of State of the Art Large Vision Language Models: Alignment, Benchmark, Evaluations and Challenges | Jan 4, 2025 | Fairness, Hallucination | Code Available | 4 | 5 |
| Retrieval-Augmented Generation for Large Language Models: A Survey | Dec 18, 2023 | Hallucination, RAG | Code Available | 4 | 5 |
| Multimodal Chain-of-Thought Reasoning in Language Models | Feb 2, 2023 | Hallucination, Language Modelling | Code Available | 4 | 5 |
| LettuceDetect: A Hallucination Detection Framework for RAG Applications | Feb 24, 2025 | 8k, GPU | Code Available | 4 | 5 |
| LLM-Enhanced Data Management | Feb 4, 2024 | Hallucination, Management | Code Available | 4 | 5 |
| Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models | Feb 12, 2024 | Hallucination, Object Localization | Code Available | 4 | 5 |
| Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling | Nov 1, 2023 | Hallucination, Knowledge Distillation | Code Available | 4 | 5 |
| Halu-J: Critique-Based Hallucination Judge | Jul 17, 2024 | Evidence Selection, Hallucination | Code Available | 4 | 5 |