| Towards Superior Quantization Accuracy: A Layer-sensitive Approach | Mar 9, 2025 | Logical Reasoning, Model Compression | Unverified | 0 |
| SCoRE: Benchmarking Long-Chain Reasoning in Commonsense Scenarios | Mar 8, 2025 | Benchmarking, Diagnostic | Code Available | 0 |
| The Society of HiveMind: Multi-Agent Optimization of Foundation Model Swarms to Unlock the Potential of Collective Intelligence | Mar 7, 2025 | Logical Reasoning, World Knowledge | Unverified | 0 |
| DB-Explore: Automated Database Exploration and Instruction Synthesis for Text-to-SQL | Mar 6, 2025 | Logical Reasoning, Natural Language Queries | Unverified | 0 |
| HelpSteer3: Human-Annotated Feedback and Edit Data to Empower Inference-Time Scaling in Open-Ended General-Domain Tasks | Mar 6, 2025 | Chatbot, Logical Reasoning | Unverified | 0 |
| Psy-Insight: Explainable Multi-turn Bilingual Dataset for Mental Health Counseling | Mar 5, 2025 | In-Context Learning, Logical Reasoning | Unverified | 0 |
| Three tiers of computation in transformers and in brain architectures | Mar 5, 2025 | Logical Reasoning | Code Available | 0 |
| DeLTa: A Decoding Strategy based on Logit Trajectory Prediction Improves Factuality and Reasoning Ability | Mar 4, 2025 | GSM8K, Logical Reasoning | Code Available | 0 |
| KGCompiler: Deep Learning Compilation Optimization for Knowledge Graph Complex Logical Query Answering | Mar 4, 2025 | Knowledge Graphs, Logical Reasoning | Unverified | 0 |
| HoT: Highlighted Chain of Thought for Referencing Supporting Facts from Inputs | Mar 3, 2025 | Logical Reasoning, Reading Comprehension | Unverified | 0 |
| Order Doesn't Matter, But Reasoning Does: Training LLMs with Order-Centric Augmentation | Feb 27, 2025 | Data Augmentation, Logical Reasoning | Unverified | 0 |
| Reversal Blessing: Thinking Backward May Outpace Thinking Forward in Multi-choice Questions | Feb 25, 2025 | Inductive Bias, Logical Reasoning | Unverified | 0 |
| TextGames: Learning to Self-Play Text-Based Puzzle Games via Language Model Reasoning | Feb 25, 2025 | Instruction Following, Language Modeling | Code Available | 0 |
| Logic Haystacks: Probing LLMs Long-Context Logical Reasoning (Without Easily Identifiable Unrelated Padding) | Feb 24, 2025 | Logical Reasoning, Retrieval | Unverified | 0 |
| Intermediate Languages Matter: Formal Choice Drives Neurosymbolic LLM Reasoning | Feb 24, 2025 | In-Context Learning, Logical Reasoning | Unverified | 0 |
| Autoregressive Image Generation Guided by Chains of Thought | Feb 24, 2025 | Image Generation, Logical Reasoning | Unverified | 0 |
| Quantifying Logical Consistency in Transformers via Query-Key Alignment | Feb 24, 2025 | Logical Reasoning, valid | Unverified | 0 |
| Empowering LLMs with Logical Reasoning: A Comprehensive Survey | Feb 21, 2025 | Logical Reasoning, Negation | Unverified | 0 |
| Identifying Features that Shape Perceived Consciousness in Large Language Model-based AI: A Quantitative Study of Human Responses | Feb 21, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Triangulating LLM Progress through Benchmarks, Games, and Cognitive Tests | Feb 20, 2025 | Logical Reasoning, MMLU | Unverified | 0 |
| On the logical skills of large language models: evaluations using arbitrarily complex first-order logic problems | Feb 20, 2025 | Logical Reasoning | Code Available | 0 |
| A Mousetrap: Fooling Large Reasoning Models for Jailbreak with Chain of Iterative Chaos | Feb 19, 2025 | Logical Reasoning | Unverified | 0 |
| SPPD: Self-training with Process Preference Learning Using Dynamic Value Margin | Feb 19, 2025 | GPU, Logical Reasoning | Unverified | 0 |
| Inference-Time Computations for LLM Reasoning and Planning: A Benchmark and Insights | Feb 18, 2025 | Arithmetic Reasoning, Common Sense Reasoning | Unverified | 0 |
| HopRAG: Multi-Hop Reasoning for Logic-Aware Retrieval-Augmented Generation | Feb 18, 2025 | Logical Reasoning, RAG | Unverified | 0 |
| Integrating Expert Knowledge into Logical Programs via LLMs | Feb 17, 2025 | Benchmarking, Logical Reasoning | Code Available | 0 |
| Beyond Single-Task: Robust Multi-Task Length Generalization for LLMs | Feb 17, 2025 | In-Context Learning, Logical Reasoning | Unverified | 0 |
| Dialogue-based Explanations for Logical Reasoning using Structured Argumentation | Feb 16, 2025 | Logical Reasoning | Unverified | 0 |
| Quantifying the Capability Boundary of DeepSeek Models: An Application-Driven Performance Analysis | Feb 16, 2025 | Logical Reasoning, Model Selection | Unverified | 0 |
| The Multilingual Mind : A Survey of Multilingual Reasoning in Language Models | Feb 13, 2025 | Logical Reasoning, Survey | Unverified | 0 |
| Logical Reasoning in Large Language Models: A Survey | Feb 13, 2025 | Logical Reasoning, Survey | Unverified | 0 |
| Logical Lease Litigation: Prolog and LLMs for Rental Law Compliance in New York | Feb 13, 2025 | Legal Reasoning, Logical Reasoning | Unverified | 0 |
| Logical forms complement probability in understanding language model (and human) performance | Feb 13, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| DMWM: Dual-Mind World Model with Long-Term Imagination | Feb 11, 2025 | Logical Reasoning | Unverified | 0 |
| Structural Reformation of Large Language Model Neuron Encapsulation for Divergent Information Aggregation | Feb 10, 2025 | Decision Making, Language Modeling | Unverified | 0 |
| S^2-MAD: Breaking the Token Barrier to Enhance Multi-Agent Debate Efficiency | Feb 7, 2025 | Logical Reasoning | Unverified | 0 |
| SymAgent: A Neural-Symbolic Self-Learning Agent Framework for Complex Reasoning over Knowledge Graphs | Feb 5, 2025 | Knowledge Graphs, Logical Reasoning | Unverified | 0 |
| Standard Neural Computation Alone Is Insufficient for Logical Intelligence | Feb 4, 2025 | Inductive Learning, Logical Reasoning | Unverified | 0 |
| Automating Mathematical Proof Generation Using Large Language Model Agents and Knowledge Graphs | Feb 4, 2025 | Formal Logic, Knowledge Graphs | Unverified | 0 |
| ZebraLogic: On the Scaling Limits of LLMs for Logical Reasoning | Feb 3, 2025 | Logical Reasoning | Unverified | 0 |
| Enhancing Large Language Model Efficiency via Symbolic Compression: A Formal Approach Towards Interpretability | Jan 30, 2025 | Code Generation, Language Modeling | Unverified | 0 |
| Instantiation-based Formalization of Logical Reasoning Tasks using Language Models and Logical Solvers | Jan 28, 2025 | Logical Reasoning | Unverified | 0 |
| Town Hall Debate Prompting: Enhancing Logical Reasoning in LLMs through Multi-Persona Interaction | Jan 28, 2025 | Logical Reasoning, Multiple-choice | Unverified | 0 |
| DBRouting: Routing End User Queries to Databases for Answerability | Jan 27, 2025 | Logical Reasoning, Semantic Parsing | Unverified | 0 |
| SedarEval: Automated Evaluation using Self-Adaptive Rubrics | Jan 26, 2025 | Logical Reasoning | Code Available | 0 |
| A Causality-aware Paradigm for Evaluating Creativity of Multimodal Large Language Models | Jan 25, 2025 | Logical Reasoning | Unverified | 0 |
| JustLogic: A Comprehensive Benchmark for Evaluating Deductive Reasoning in Large Language Models | Jan 24, 2025 | Logical Reasoning | Code Available | 0 |
| VERUS-LM: a Versatile Framework for Combining LLMs with Symbolic Reasoning | Jan 24, 2025 | Logical Reasoning | Unverified | 0 |
| Assessing the Alignment of FOL Closeness Metrics with Human Judgement | Jan 15, 2025 | Logical Reasoning, Sensitivity | Code Available | 0 |
| Reasoning with Graphs: Structuring Implicit Knowledge to Enhance LLMs Reasoning | Jan 14, 2025 | Logical Reasoning, Multi-hop Question Answering | Unverified | 0 |