| Unleashing the Emergent Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration | Jul 11, 2023 | Hallucination, Logic Grid Puzzle | Code Available | 4 | 5 |
| Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making | Oct 9, 2024 | Benchmarking, Decision Making | Code Available | 3 | 5 |
| PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models | Feb 12, 2024 | Answer Generation, Hallucination | Code Available | 3 | 5 |
| MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models | Oct 16, 2024 | Diagnostic, Hallucination | Code Available | 3 | 5 |
| PokeLLMon: A Human-Parity Agent for Pokemon Battles with Large Language Models | Feb 2, 2024 | Action Generation, Decision Making | Code Available | 3 | 5 |
| LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation | Aug 28, 2024 | Computational Efficiency, Hallucination | Code Available | 3 | 5 |
| CRAG -- Comprehensive RAG Benchmark | Jun 7, 2024 | Hallucination, Language Modelling | Code Available | 3 | 5 |
| KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents | Mar 5, 2024 | Hallucination, Self-Learning | Code Available | 3 | 5 |
| Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models | Mar 19, 2024 | Hallucination | Code Available | 3 | 5 |
| HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems | Nov 5, 2024 | Hallucination, RAG | Code Available | 3 | 5 |