| Scaling Synthetic Data Creation with 1,000,000,000 Personas | Jun 28, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 11 |
| NeedleBench: Can LLMs Do Retrieval and Reasoning in Information-Dense Context? | Jul 16, 2024 | 4k8k | CodeCode Available | 9 |
| LLaVA-CoT: Let Vision Language Models Reason Step-by-Step | Nov 15, 2024 | Logical ReasoningMultimodal Reasoning | CodeCode Available | 7 |
| PIKE-RAG: sPecIalized KnowledgE and Rationale Augmented Generation | Jan 20, 2025 | Language ModelingLanguage Modelling | CodeCode Available | 7 |
| Training Compute-Optimal Large Language Models | Mar 29, 2022 | AnachronismsAnalogical Similarity | CodeCode Available | 6 |
| SGLang: Efficient Execution of Structured Language Model Programs | Dec 12, 2023 | Few-Shot LearningLanguage Modeling | CodeCode Available | 6 |
| SoundMind: RL-Incentivized Logic Reasoning for Audio-Language Models | Jun 15, 2025 | Logical ReasoningReinforcement Learning (RL) | CodeCode Available | 5 |
| MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI | Nov 27, 2023 | Complex Query AnsweringLogical Reasoning | CodeCode Available | 5 |
| Training Large Language Models to Reason in a Continuous Latent Space | Dec 9, 2024 | Logical Reasoning | CodeCode Available | 5 |
| Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent | Nov 4, 2024 | Logical ReasoningMathematical Problem-Solving | CodeCode Available | 5 |
| From System 1 to System 2: A Survey of Reasoning Large Language Models | Feb 24, 2025 | Logical Reasoning | CodeCode Available | 5 |
| OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning | Dec 31, 2024 | BenchmarkingLogical Reasoning | CodeCode Available | 4 |
| R1-Onevision:An Open-Source Multimodal Large Language Model Capable of Deep Reasoning | Feb 24, 2025 | Language ModelingLanguage Modelling | CodeCode Available | 4 |
| LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL | Mar 10, 2025 | Logical ReasoningMultimodal Reasoning | CodeCode Available | 4 |
| Knowledge Fusion of Large Language Models | Jan 19, 2024 | Code GenerationCommon Sense Reasoning | CodeCode Available | 4 |
| Reasoning with Language Model Prompting: A Survey | Dec 19, 2022 | Arithmetic ReasoningCommon Sense Reasoning | CodeCode Available | 3 |
| Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models | Apr 19, 2023 | Logical Reasoning | CodeCode Available | 3 |
| GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations | Feb 19, 2024 | Card GamesLogical Reasoning | CodeCode Available | 3 |
| Advancing LLM Reasoning Generalists with Preference Trees | Apr 2, 2024 | BenchmarkingCode Generation | CodeCode Available | 3 |
| Faithful Logical Reasoning via Symbolic Chain-of-Thought | May 28, 2024 | Logical Reasoning | CodeCode Available | 3 |
| Measuring AI Ability to Complete Long Tasks | Mar 18, 2025 | Logical Reasoning | CodeCode Available | 3 |
| LLM4Drive: A Survey of Large Language Models for Autonomous Driving | Nov 2, 2023 | Autonomous DrivingFew-Shot Learning | CodeCode Available | 3 |
| A Survey on Large Language Model Acceleration based on KV Cache Management | Dec 27, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 3 |
| Scaling Language Models: Methods, Analysis & Insights from Training Gopher | Dec 8, 2021 | Abstract AlgebraAnachronisms | CodeCode Available | 2 |
| SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond | May 26, 2025 | Logical ReasoningReinforcement Learning (RL) | CodeCode Available | 2 |
| Cumulative Reasoning with Large Language Models | Aug 8, 2023 | Decision MakingLogical Reasoning | CodeCode Available | 2 |
| Ontology Embedding: A Survey of Methods, Applications and Resources | Jun 16, 2024 | Logical ReasoningOntology Embedding | CodeCode Available | 2 |
| Easy Problems That LLMs Get Wrong | May 30, 2024 | Common Sense ReasoningLogical Reasoning | CodeCode Available | 2 |
| MedAgent-Pro: Towards Evidence-based Multi-modal Medical Diagnosis via Reasoning Agentic Workflow | Mar 21, 2025 | DiagnosticLogical Reasoning | CodeCode Available | 2 |
| Nexus: A Lightweight and Scalable Multi-Agent Framework for Complex Tasks Automation | Feb 26, 2025 | Code GenerationHumanEval | CodeCode Available | 2 |
| PaLM: Scaling Language Modeling with Pathways | Apr 5, 2022 | Auto DebuggingCode Generation | CodeCode Available | 2 |
| Logic-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning | May 20, 2023 | Logical Reasoning | CodeCode Available | 2 |
| LeapVAD: A Leap in Autonomous Driving via Cognitive Perception and Dual-Process Thinking | Jan 14, 2025 | Autonomous DrivingDecision Making | CodeCode Available | 2 |
| LTNtorch: PyTorch Implementation of Logic Tensor Networks | Sep 24, 2024 | Binary ClassificationLogical Reasoning | CodeCode Available | 2 |
| LangBridge: Multilingual Reasoning Without Multilingual Supervision | Jan 19, 2024 | Code CompletionLogical Reasoning | CodeCode Available | 2 |
| Chain-of-Thought for Autonomous Driving: A Comprehensive Survey and Future Prospects | May 26, 2025 | Autonomous DrivingLogical Reasoning | CodeCode Available | 2 |
| ChartQA: A Benchmark for Question Answering about Charts with Visual and Logical Reasoning | Mar 19, 2022 | Chart Question AnsweringLogical Reasoning | CodeCode Available | 2 |
| Large Language Models are Zero-Shot Reasoners | May 24, 2022 | Arithmetic ReasoningCommon Sense Reasoning | CodeCode Available | 2 |
| MACM: Utilizing a Multi-Agent System for Condition Mining in Solving Complex Mathematical Problems | Apr 6, 2024 | Logical ReasoningMath | CodeCode Available | 2 |
| Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs | Jun 13, 2024 | Arithmetic ReasoningFact Verification | CodeCode Available | 2 |
| Flow of Reasoning:Training LLMs for Divergent Problem Solving with Minimal Examples | Jun 9, 2024 | ARCDiversity | CodeCode Available | 2 |
| Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1 | Mar 31, 2025 | Logical ReasoningMultiple-choice | CodeCode Available | 2 |
| FlashRNN: Optimizing Traditional RNNs on Modern Hardware | Dec 10, 2024 | GPULogical Reasoning | CodeCode Available | 2 |
| Let's Think Outside the Box: Exploring Leap-of-Thought in Large Language Models with Creative Humor Generation | Dec 5, 2023 | Logical Reasoning | CodeCode Available | 2 |
| Enhancing Reasoning Capabilities of LLMs via Principled Synthetic Logic Corpus | Nov 19, 2024 | Formal LogicLogical Reasoning | CodeCode Available | 2 |
| Learning from Committee: Reasoning Distillation from a Mixture of Teachers with Peer-Review | Oct 4, 2024 | Knowledge DistillationLogical Reasoning | CodeCode Available | 2 |
| Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing | Apr 3, 2025 | BenchmarkingLogical Reasoning | CodeCode Available | 2 |
| Continuously Learning, Adapting, and Improving: A Dual-Process Approach to Autonomous Driving | May 24, 2024 | Autonomous DrivingDecision Making | CodeCode Available | 2 |
| Evaluating the World Model Implicit in a Generative Model | Jun 6, 2024 | Logical Reasoningmodel | CodeCode Available | 2 |
| InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners | Apr 19, 2025 | Action GenerationLogical Reasoning | CodeCode Available | 2 |