| Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment | May 6, 2024 | Arithmetic Reasoning, Code Generation | —Unverified | 0 |
| Evaluating Gender Bias in Large Language Models via Chain-of-Thought Prompting | Jan 28, 2024 | Arithmetic Reasoning, Fact Checking | —Unverified | 0 |
| Exploring Group and Symmetry Principles in Large Language Models | Feb 9, 2024 | Arithmetic Reasoning, Negation | —Unverified | 0 |
| Fact-Consistency Evaluation of Text-to-SQL Generation for Business Intelligence Using Exaone 3.5 | Apr 30, 2025 | Arithmetic Reasoning, Text to SQL | —Unverified | 0 |
| Fine-Tuning and Prompt Optimization: Two Great Steps that Work Better Together | Jul 15, 2024 | Arithmetic Reasoning, Language Modeling | —Unverified | 0 |
| FinLMM-R1: Enhancing Financial Reasoning in LMM through Scalable Data and Reward Design | Jun 16, 2025 | Answer Generation, Arithmetic Reasoning | —Unverified | 0 |
| GaLore+: Boosting Low-Rank Adaptation for LLMs with Cross-Head Projection | Dec 15, 2024 | Arithmetic Reasoning, Text Generation | —Unverified | 0 |
| On-Policy Distillation of Language Models: Learning from Self-Generated Mistakes | Jun 23, 2023 | Arithmetic Reasoning, Knowledge Distillation | —Unverified | 0 |
| Hint Marginalization for Improved Reasoning in Large Language Models | Dec 17, 2024 | Arithmetic Reasoning | —Unverified | 0 |
| Inference-Time Computations for LLM Reasoning and Planning: A Benchmark and Insights | Feb 18, 2025 | Arithmetic Reasoning, Common Sense Reasoning | —Unverified | 0 |
| Joint Flashback Adaptation for Forgetting-Resistant Instruction Tuning | May 21, 2025 | Arithmetic Reasoning, Instruction Following | —Unverified | 0 |
| KwaiYiiMath: Technical Report | Oct 11, 2023 | Arithmetic Reasoning, GSM8K | —Unverified | 0 |
| Large Language Models are Null-Shot Learners | Jan 16, 2024 | Arithmetic Reasoning, Benchmarking | —Unverified | 0 |
| Large Language Models Can Self-Correct with Key Condition Verification | May 23, 2024 | Arithmetic Reasoning, Math | —Unverified | 0 |
| Large Language Models Can Self-Improve | Oct 20, 2022 | Arithmetic Reasoning, Common Sense Reasoning | —Unverified | 0 |
| Learning-at-Criticality in Large Language Models for Quantum Field Theory and Beyond | Jun 4, 2025 | Arithmetic Reasoning, Reinforcement Learning (RL) | —Unverified | 0 |
| Least-to-Most Prompting Enables Complex Reasoning in Large Language Models | May 21, 2022 | Arithmetic Reasoning, Math | —Unverified | 0 |
| Model Card and Evaluations for Claude Models | Jul 11, 2023 | Arithmetic Reasoning, Bug fixing | —Unverified | 0 |
| Neural-Symbolic Recursive Machine for Systematic Generalization | Oct 4, 2022 | Arithmetic Reasoning, Machine Translation | —Unverified | 0 |
| NumGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning Tasks | Apr 12, 2022 | Arithmetic Reasoning, Mathematical Reasoning | —Unverified | 0 |
| On Representational Dissociation of Language and Arithmetic in Large Language Models | Feb 17, 2025 | Arithmetic Reasoning | —Unverified | 0 |
| Making Large Language Models Better Reasoners with Step-Aware Verifier | Jun 6, 2022 | Arithmetic Reasoning, Few-Shot Learning | —Unverified | 0 |
| OpenChat: Advancing Open-source Language Models with Mixed-Quality Data | Sep 20, 2023 | Arithmetic Reasoning, Code Generation | —Unverified | 0 |
| Orca 2: Teaching Small Language Models How to Reason | Nov 18, 2023 | Arithmetic Reasoning, Common Sense Reasoning | —Unverified | 0 |
| Orca-Math: Unlocking the potential of SLMs in Grade School Math | Feb 16, 2024 | Arithmetic Reasoning, GSM8K | —Unverified | 0 |