| Reversal Blessing: Thinking Backward May Outpace Thinking Forward in Multi-choice Questions | Feb 25, 2025 | Inductive Bias, Logical Reasoning | Unverified | 0 |
| Autoregressive Image Generation Guided by Chains of Thought | Feb 24, 2025 | Image Generation, Logical Reasoning | Unverified | 0 |
| Quantifying Logical Consistency in Transformers via Query-Key Alignment | Feb 24, 2025 | Logical Reasoning, valid | Unverified | 0 |
| Logic Haystacks: Probing LLMs Long-Context Logical Reasoning (Without Easily Identifiable Unrelated Padding) | Feb 24, 2025 | Logical Reasoning, Retrieval | Unverified | 0 |
| AutoLogi: Automated Generation of Logic Puzzles for Evaluating Reasoning Abilities of Large Language Models | Feb 24, 2025 | Logical Reasoning, Multiple-choice | Code Available | 1 |
| Intermediate Languages Matter: Formal Choice Drives Neurosymbolic LLM Reasoning | Feb 24, 2025 | In-Context Learning, Logical Reasoning | Unverified | 0 |
| R1-Onevision: An Open-Source Multimodal Large Language Model Capable of Deep Reasoning | Feb 24, 2025 | Language Modeling | Code Available | 4 |
| From System 1 to System 2: A Survey of Reasoning Large Language Models | Feb 24, 2025 | Logical Reasoning | Code Available | 5 |
| Empowering LLMs with Logical Reasoning: A Comprehensive Survey | Feb 21, 2025 | Logical Reasoning, Negation | Unverified | 0 |
| Identifying Features that Shape Perceived Consciousness in Large Language Model-based AI: A Quantitative Study of Human Responses | Feb 21, 2025 | Language Modeling | Unverified | 0 |
| On the logical skills of large language models: evaluations using arbitrarily complex first-order logic problems | Feb 20, 2025 | Logical Reasoning | Code Available | 0 |
| Triangulating LLM Progress through Benchmarks, Games, and Cognitive Tests | Feb 20, 2025 | Logical Reasoning, MMLU | Unverified | 0 |
| A Mousetrap: Fooling Large Reasoning Models for Jailbreak with Chain of Iterative Chaos | Feb 19, 2025 | Logical Reasoning | Unverified | 0 |
| SPPD: Self-training with Process Preference Learning Using Dynamic Value Margin | Feb 19, 2025 | GPU, Logical Reasoning | Unverified | 0 |
| HopRAG: Multi-Hop Reasoning for Logic-Aware Retrieval-Augmented Generation | Feb 18, 2025 | Logical Reasoning, RAG | Unverified | 0 |
| Inference-Time Computations for LLM Reasoning and Planning: A Benchmark and Insights | Feb 18, 2025 | Arithmetic Reasoning, Common Sense Reasoning | Unverified | 0 |
| Unveiling the Magic of Code Reasoning through Hypothesis Decomposition and Amendment | Feb 17, 2025 | Hallucination, Logical Reasoning | Code Available | 2 |
| Integrating Expert Knowledge into Logical Programs via LLMs | Feb 17, 2025 | Benchmarking, Logical Reasoning | Code Available | 0 |
| Beyond Single-Task: Robust Multi-Task Length Generalization for LLMs | Feb 17, 2025 | In-Context Learning, Logical Reasoning | Unverified | 0 |
| Exposing Numeracy Gaps: A Benchmark to Evaluate Fundamental Numerical Abilities in Large Language Models | Feb 16, 2025 | Language Modeling | Code Available | 1 |
| Quantifying the Capability Boundary of DeepSeek Models: An Application-Driven Performance Analysis | Feb 16, 2025 | Logical Reasoning, Model Selection | Unverified | 0 |
| Dialogue-based Explanations for Logical Reasoning using Structured Argumentation | Feb 16, 2025 | Logical Reasoning | Unverified | 0 |
| Logical Reasoning in Large Language Models: A Survey | Feb 13, 2025 | Logical Reasoning, Survey | Unverified | 0 |
| The Multilingual Mind: A Survey of Multilingual Reasoning in Language Models | Feb 13, 2025 | Logical Reasoning, Survey | Unverified | 0 |
| Logical forms complement probability in understanding language model (and human) performance | Feb 13, 2025 | Language Modeling | Unverified | 0 |