| Beyond Gold Standards: Epistemic Ensemble of LLM Judges for Formal Mathematical Reasoning | Jun 12, 2025 | Mathematical Reasoning | —Unverified | 0 | 0 |
| Beyond Lines and Circles: Unveiling the Geometric Reasoning Gap in Large Language Models | Feb 6, 2024 | Mathematical Reasoning, Variable Selection | —Unverified | 0 | 0 |
| Beyond the First Error: Process Reward Models for Reflective Mathematical Reasoning | May 20, 2025 | Mathematical Reasoning | —Unverified | 0 | 0 |
| BitNet b1.58 2B4T Technical Report | Apr 16, 2025 | Computational Efficiency, CPU | —Unverified | 0 | 0 |
| Fewer is More: Boosting LLM Reasoning with Reinforced Context Pruning | Dec 14, 2023 | Arithmetic Reasoning, Few-Shot Learning | —Unverified | 0 | 0 |
| Boosting Lossless Speculative Decoding via Feature Sampling and Partial Alignment Distillation | Aug 28, 2024 | Knowledge Distillation, Language Modelling | —Unverified | 0 | 0 |
| Bottlenecked Transformers: Periodic KV Cache Abstraction for Generalised Reasoning | May 22, 2025 | Mathematical Reasoning | —Unverified | 0 | 0 |
| Brains vs. Bytes: Evaluating LLM Proficiency in Olympiad Mathematics | Apr 1, 2025 | Math, Mathematical Problem-Solving | —Unverified | 0 | 0 |
| Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models | Jun 6, 2024 | Arithmetic Reasoning, Code Generation | —Unverified | 0 | 0 |
| Building Math Agents with Multi-Turn Iterative Preference Learning | Sep 4, 2024 | GSM8K, Math | —Unverified | 0 | 0 |