| Beyond Gold Standards: Epistemic Ensemble of LLM Judges for Formal Mathematical Reasoning | Jun 12, 2025 | Mathematical Reasoning | —Unverified | 0 | 0 |
| Beyond Lines and Circles: Unveiling the Geometric Reasoning Gap in Large Language Models | Feb 6, 2024 | Mathematical Reasoning, Variable Selection | —Unverified | 0 | 0 |
| Beyond the First Error: Process Reward Models for Reflective Mathematical Reasoning | May 20, 2025 | Mathematical Reasoning | —Unverified | 0 | 0 |
| BitNet b1.58 2B4T Technical Report | Apr 16, 2025 | Computational Efficiency, CPU | —Unverified | 0 | 0 |
| Fewer is More: Boosting LLM Reasoning with Reinforced Context Pruning | Dec 14, 2023 | Arithmetic Reasoning, Few-Shot Learning | —Unverified | 0 | 0 |
| Boosting Lossless Speculative Decoding via Feature Sampling and Partial Alignment Distillation | Aug 28, 2024 | Knowledge Distillation, Language Modelling | —Unverified | 0 | 0 |
| Bottlenecked Transformers: Periodic KV Cache Abstraction for Generalised Reasoning | May 22, 2025 | Mathematical Reasoning | —Unverified | 0 | 0 |
| Brains vs. Bytes: Evaluating LLM Proficiency in Olympiad Mathematics | Apr 1, 2025 | Math, Mathematical Problem-Solving | —Unverified | 0 | 0 |
| Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models | Jun 6, 2024 | Arithmetic Reasoning, Code Generation | —Unverified | 0 | 0 |
| Building Math Agents with Multi-Turn Iterative Preference Learning | Sep 4, 2024 | GSM8K, Math | —Unverified | 0 | 0 |
| Can Language Models Rival Mathematics Students? Evaluating Mathematical Reasoning through Textual Manipulation and Human Experiments | Dec 16, 2024 | Mathematical Reasoning | —Unverified | 0 | 0 |
| Can Large Language Models Explain Themselves? A Study of LLM-Generated Self-Explanations | Oct 17, 2023 | Mathematical Reasoning, Sentiment Analysis | —Unverified | 0 | 0 |
| Can Large Language Models Invent Algorithms to Improve Themselves? | Oct 21, 2024 | Mathematical Reasoning | —Unverified | 0 | 0 |
| Can LLMs understand Math? -- Exploring the Pitfalls in Mathematical Reasoning | May 21, 2025 | Math, Mathematical Reasoning | —Unverified | 0 | 0 |
| Can Pruning Improve Reasoning? Revisiting Long-CoT Compression with Capability in Mind for Better Reasoning | May 20, 2025 | Large Language Model, Mathematical Reasoning | —Unverified | 0 | 0 |
| Can Theoretical Physics Research Benefit from Language Agents? | Jun 6, 2025 | Code Generation, Mathematical Reasoning | —Unverified | 0 | 0 |
| Causal Head Gating: A Framework for Interpreting Roles of Attention Heads in Transformers | May 19, 2025 | In-Context Learning, Instruction Following | —Unverified | 0 | 0 |
| Causal Inference with Large Language Model: A Survey | Sep 15, 2024 | Causal Inference, Language Modeling | —Unverified | 0 | 0 |
| CDW-CoT: Clustered Distance-Weighted Chain-of-Thoughts Reasoning | Jan 21, 2025 | Clustering, Mathematical Reasoning | —Unverified | 0 | 0 |
| Chain-of-Reasoning: Towards Unified Mathematical Reasoning in Large Language Models via a Multi-Paradigm Perspective | Jan 19, 2025 | Automated Theorem Proving, Math | —Unverified | 0 | 0 |
| CHAMP: A Competition-level Dataset for Fine-Grained Analyses of LLMs' Mathematical Reasoning Capabilities | Jan 13, 2024 | Math, Mathematical Reasoning | —Unverified | 0 | 0 |
| Channel Merging: Preserving Specialization for Merged Experts | Dec 18, 2024 | Code Generation, GPU | —Unverified | 0 | 0 |
| CLEAR: Contrasting Textual Feedback with Experts and Amateurs for Reasoning | Mar 24, 2025 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| Coarse-to-Fine Process Reward Modeling for Enhanced Mathematical Reasoning | Jan 23, 2025 | Attribute, Mathematical Reasoning | —Unverified | 0 | 0 |
| CodeGemma: Open Code Models Based on Gemma | Jun 17, 2024 | Code Completion, Mathematical Reasoning | —Unverified | 0 | 0 |