| BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning | Jan 6, 2025 | In-Context Learning, Math | Code Available | 1 |
| InfiFusion: A Unified Framework for Enhanced Cross-Model Reasoning via LLM Fusion | Jan 6, 2025 | GSM8K, HumanEval | Unverified | 0 |
| Quantization Meets Reasoning: Exploring LLM Low-Bit Quantization Degradation for Mathematical Reasoning | Jan 6, 2025 | Math, Mathematical Reasoning | Unverified | 0 |
| Understand, Solve and Translate: Bridging the Multilingual Mathematical Reasoning Gap | Jan 5, 2025 | Math, Mathematical Reasoning | Unverified | 0 |
| Empowering Bengali Education with AI: Solving Bengali Math Word Problems through Transformer Models | Jan 5, 2025 | Math | Unverified | 0 |
| Instruction-Following Pruning for Large Language Models | Jan 3, 2025 | Instruction Following, Math | Unverified | 0 |
| A Probabilistic Model for Node Classification in Directed Graphs | Jan 3, 2025 | Math, Node Classification | Code Available | 0 |
| Recursive Decomposition of Logical Thoughts: Framework for Superior Reasoning and Knowledge Propagation in Large Language Models | Jan 3, 2025 | GSM8K, Math | Unverified | 0 |
| CoT-based Synthesizer: Enhancing LLM Performance through Answer Synthesis | Jan 3, 2025 | Math | Code Available | 1 |
| DIVE: Diversified Iterative Self-Improvement | Jan 1, 2025 | Diversity, GSM8K | Code Available | 0 |
| Experimental Demonstration of an Optical Neural PDE Solver via On-Chip PINN Training | Jan 1, 2025 | Math | Unverified | 0 |
| Rethink Delay Doppler Channels and Time-Frequency Coding | Dec 31, 2024 | Math | Unverified | 0 |
| Measuring Large Language Models Capacity to Annotate Journalistic Sourcing | Dec 30, 2024 | Benchmarking, Ethics | Unverified | 0 |
| Slow Perception: Let's Perceive Geometric Figures Step-by-step | Dec 30, 2024 | Math, Visual Reasoning | Unverified | 0 |
| Toward Adaptive Reasoning in Large Language Models with Thought Rollback | Dec 27, 2024 | Math | Code Available | 1 |
| Dynamic Skill Adaptation for Large Language Models | Dec 26, 2024 | Math | Unverified | 0 |
| CARL-GT: Evaluating Causal Reasoning Capabilities of Large Language Models | Dec 23, 2024 | Decision Making, Math | Code Available | 1 |
| StructTest: Benchmarking LLMs' Reasoning through Compositional Structured Outputs | Dec 23, 2024 | Benchmarking, Logical Reasoning | Unverified | 0 |
| Towards Intrinsic Self-Correction Enhancement in Monte Carlo Tree Search Boosted Reasoning via Iterative Preference Learning | Dec 23, 2024 | Arithmetic Reasoning, GSM8K | Unverified | 0 |
| DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought | Dec 23, 2024 | Machine Translation, Math | Code Available | 3 |
| Evaluating the Design Features of an Intelligent Tutoring System for Advanced Mathematics Learning | Dec 23, 2024 | Math | Unverified | 0 |
| Ask-Before-Detection: Identifying and Mitigating Conformity Bias in LLM-Powered Error Detector for Math Word Problem Solutions | Dec 22, 2024 | GSM8K, Math | Unverified | 0 |
| System-2 Mathematical Reasoning via Enriched Instruction Tuning | Dec 22, 2024 | ERP, GSM8K | Unverified | 0 |
| Correct implied volatility shapes and reliable pricing in the rough Heston model | Dec 20, 2024 | Math | Unverified | 0 |
| Ensembling Large Language Models with Process Reward-Guided Tree Search for Better Complex Reasoning | Dec 20, 2024 | Language Modeling, Language Modelling | Unverified | 0 |