| Towards Superior Quantization Accuracy: A Layer-sensitive Approach | Mar 9, 2025 | Logical Reasoning, Model Compression | Unverified | 0 |
| SCoRE: Benchmarking Long-Chain Reasoning in Commonsense Scenarios | Mar 8, 2025 | Benchmarking, Diagnostic | Code Available | 0 |
| The Society of HiveMind: Multi-Agent Optimization of Foundation Model Swarms to Unlock the Potential of Collective Intelligence | Mar 7, 2025 | Logical Reasoning, World Knowledge | Unverified | 0 |
| DB-Explore: Automated Database Exploration and Instruction Synthesis for Text-to-SQL | Mar 6, 2025 | Logical Reasoning, Natural Language Queries | Unverified | 0 |
| HelpSteer3: Human-Annotated Feedback and Edit Data to Empower Inference-Time Scaling in Open-Ended General-Domain Tasks | Mar 6, 2025 | Chatbot, Logical Reasoning | Unverified | 0 |
| Three tiers of computation in transformers and in brain architectures | Mar 5, 2025 | Logical Reasoning | Code Available | 0 |
| Psy-Insight: Explainable Multi-turn Bilingual Dataset for Mental Health Counseling | Mar 5, 2025 | In-Context Learning, Logical Reasoning | Unverified | 0 |
| KGCompiler: Deep Learning Compilation Optimization for Knowledge Graph Complex Logical Query Answering | Mar 4, 2025 | Knowledge Graphs, Logical Reasoning | Unverified | 0 |
| DeLTa: A Decoding Strategy based on Logit Trajectory Prediction Improves Factuality and Reasoning Ability | Mar 4, 2025 | GSM8K, Logical Reasoning | Code Available | 0 |
| HoT: Highlighted Chain of Thought for Referencing Supporting Facts from Inputs | Mar 3, 2025 | Logical Reasoning, Reading Comprehension | Unverified | 0 |
| Order Doesn't Matter, But Reasoning Does: Training LLMs with Order-Centric Augmentation | Feb 27, 2025 | Data Augmentation, Logical Reasoning | Unverified | 0 |
| Reversal Blessing: Thinking Backward May Outpace Thinking Forward in Multi-choice Questions | Feb 25, 2025 | Inductive Bias, Logical Reasoning | Unverified | 0 |
| TextGames: Learning to Self-Play Text-Based Puzzle Games via Language Model Reasoning | Feb 25, 2025 | Instruction Following, Language Modeling | Code Available | 0 |
| Intermediate Languages Matter: Formal Choice Drives Neurosymbolic LLM Reasoning | Feb 24, 2025 | In-Context Learning, Logical Reasoning | Unverified | 0 |
| Autoregressive Image Generation Guided by Chains of Thought | Feb 24, 2025 | Image Generation, Logical Reasoning | Unverified | 0 |
| Logic Haystacks: Probing LLMs Long-Context Logical Reasoning (Without Easily Identifiable Unrelated Padding) | Feb 24, 2025 | Logical Reasoning, Retrieval | Unverified | 0 |
| Quantifying Logical Consistency in Transformers via Query-Key Alignment | Feb 24, 2025 | Logical Reasoning, valid | Unverified | 0 |
| Empowering LLMs with Logical Reasoning: A Comprehensive Survey | Feb 21, 2025 | Logical Reasoning, Negation | Unverified | 0 |
| Identifying Features that Shape Perceived Consciousness in Large Language Model-based AI: A Quantitative Study of Human Responses | Feb 21, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Triangulating LLM Progress through Benchmarks, Games, and Cognitive Tests | Feb 20, 2025 | Logical Reasoning, MMLU | Unverified | 0 |
| On the logical skills of large language models: evaluations using arbitrarily complex first-order logic problems | Feb 20, 2025 | Logical Reasoning | Code Available | 0 |
| A Mousetrap: Fooling Large Reasoning Models for Jailbreak with Chain of Iterative Chaos | Feb 19, 2025 | Logical Reasoning | Unverified | 0 |
| SPPD: Self-training with Process Preference Learning Using Dynamic Value Margin | Feb 19, 2025 | GPU, Logical Reasoning | Unverified | 0 |
| Inference-Time Computations for LLM Reasoning and Planning: A Benchmark and Insights | Feb 18, 2025 | Arithmetic Reasoning, Common Sense Reasoning | Unverified | 0 |
| HopRAG: Multi-Hop Reasoning for Logic-Aware Retrieval-Augmented Generation | Feb 18, 2025 | Logical Reasoning, RAG | Unverified | 0 |