| Prompt-to-Leaderboard | Feb 20, 2025 | ChatbotLanguage Modeling | CodeCode Available | 3 |
| Agentic Deep Graph Reasoning Yields Self-Organizing Knowledge Networks | Feb 18, 2025 | graph constructionLarge Language Model | CodeCode Available | 3 |
| Goedel-Prover: A Frontier Model for Open-Source Automated Theorem Proving | Feb 11, 2025 | Automated Theorem ProvingLarge Language Model | CodeCode Available | 3 |
| Multi-agent Architecture Search via Agentic Supernet | Feb 6, 2025 | Language ModelingLanguage Modelling | CodeCode Available | 3 |
| MedRAG: Enhancing Retrieval-augmented Generation with Knowledge Graph-Elicited Reasoning for Healthcare Copilot | Feb 6, 2025 | DiagnosticLarge Language Model | CodeCode Available | 3 |
| Partially Rewriting a Transformer in Natural Language | Jan 31, 2025 | Language ModelingLanguage Modelling | CodeCode Available | 3 |
| HERMES: A Unified Self-Driving World Model for Simultaneous 3D Scene Understanding and Generation | Jan 24, 2025 | Autonomous DrivingLanguage Modeling | CodeCode Available | 3 |
| VARGPT: Unified Understanding and Generation in a Visual Autoregressive Multimodal Large Language Model | Jan 21, 2025 | Image GenerationInstruction Following | CodeCode Available | 3 |
| Lifelong Learning of Large Language Model based Agents: A Roadmap | Jan 13, 2025 | Incremental LearningLanguage Modeling | CodeCode Available | 3 |
| Valley2: Exploring Multimodal Models with Scalable Vision-Language Design | Jan 10, 2025 | Image CaptioningLanguage Modeling | CodeCode Available | 3 |