| (G)I-DLE: Generative Inference via Distribution-preserving Logit Exclusion with KL Divergence Minimization for Constrained Decoding | Mar 23, 2025 | Logical Reasoning | Unverified | 0 |
| Enhancing Retrieval Systems with Inference-Time Logical Reasoning | Mar 22, 2025 | Computational Efficiency, Logical Reasoning | Unverified | 0 |
| MedAgent-Pro: Towards Evidence-based Multi-modal Medical Diagnosis via Reasoning Agentic Workflow | Mar 21, 2025 | Diagnostic, Logical Reasoning | Code Available | 2 |
| LaMOuR: Leveraging Language Models for Out-of-Distribution Recovery in Reinforcement Learning | Mar 21, 2025 | Code Generation, Deep Reinforcement Learning | Unverified | 0 |
| From Chaos to Order: The Atomic Reasoner Framework for Fine-grained Reasoning in Large Language Models | Mar 20, 2025 | Logical Reasoning | Unverified | 0 |
| Bridging Technology and Humanities: Evaluating the Impact of Large Language Models on Social Sciences Research with DeepSeek-R1 | Mar 20, 2025 | Large Language Model, Logical Reasoning | Unverified | 0 |
| Measuring AI Ability to Complete Long Tasks | Mar 18, 2025 | Logical Reasoning | Code Available | 3 |
| Efficient but Vulnerable: Benchmarking and Defending LLM Batch Prompting Attack | Mar 18, 2025 | 8k, Benchmarking | Unverified | 0 |
| 3DAxisPrompt: Promoting the 3D Grounding and Reasoning in GPT-4o | Mar 17, 2025 | Logical Reasoning, Prompt Engineering | Unverified | 0 |
| Reasoning is All You Need for Video Generalization: A Counterfactual Benchmark with Sub-question Evaluation | Mar 12, 2025 | All, counterfactual | Unverified | 0 |
| Towards Reasoning Era: A Survey of Long Chain-of-Thought for Reasoning Large Language Models | Mar 12, 2025 | Logical Reasoning, Survey | Unverified | 0 |
| LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL | Mar 10, 2025 | Logical Reasoning, Multimodal Reasoning | Code Available | 4 |
| Towards Superior Quantization Accuracy: A Layer-sensitive Approach | Mar 9, 2025 | Logical Reasoning, Model Compression | Unverified | 0 |
| SCoRE: Benchmarking Long-Chain Reasoning in Commonsense Scenarios | Mar 8, 2025 | Benchmarking, Diagnostic | Code Available | 0 |
| The Society of HiveMind: Multi-Agent Optimization of Foundation Model Swarms to Unlock the Potential of Collective Intelligence | Mar 7, 2025 | Logical Reasoning, World Knowledge | Unverified | 0 |
| HelpSteer3: Human-Annotated Feedback and Edit Data to Empower Inference-Time Scaling in Open-Ended General-Domain Tasks | Mar 6, 2025 | Chatbot, Logical Reasoning | Unverified | 0 |
| DB-Explore: Automated Database Exploration and Instruction Synthesis for Text-to-SQL | Mar 6, 2025 | Logical Reasoning, Natural Language Queries | Unverified | 0 |
| Three tiers of computation in transformers and in brain architectures | Mar 5, 2025 | Logical Reasoning | Code Available | 0 |
| Psy-Insight: Explainable Multi-turn Bilingual Dataset for Mental Health Counseling | Mar 5, 2025 | In-Context Learning, Logical Reasoning | Unverified | 0 |
| DeLTa: A Decoding Strategy based on Logit Trajectory Prediction Improves Factuality and Reasoning Ability | Mar 4, 2025 | GSM8K, Logical Reasoning | Code Available | 0 |
| KGCompiler: Deep Learning Compilation Optimization for Knowledge Graph Complex Logical Query Answering | Mar 4, 2025 | Knowledge Graphs, Logical Reasoning | Code Available | 0 |
| HoT: Highlighted Chain of Thought for Referencing Supporting Facts from Inputs | Mar 3, 2025 | Logical Reasoning, Reading Comprehension | Unverified | 0 |
| Order Doesn't Matter, But Reasoning Does: Training LLMs with Order-Centric Augmentation | Feb 27, 2025 | Data Augmentation, Logical Reasoning | Unverified | 0 |
| Nexus: A Lightweight and Scalable Multi-Agent Framework for Complex Tasks Automation | Feb 26, 2025 | Code Generation, HumanEval | Code Available | 2 |
| TextGames: Learning to Self-Play Text-Based Puzzle Games via Language Model Reasoning | Feb 25, 2025 | Instruction Following, Language Modeling | Code Available | 0 |