| Balancing LoRA Performance and Efficiency with Simple Shard Sharing | Sep 19, 2024 | Computational Efficiency, GSM8K | Code Available | 2 |
| Training Language Models to Self-Correct via Reinforcement Learning | Sep 19, 2024 | HumanEval, Math | Code Available | 2 |
| VAE Explainer: Supplement Learning Variational Autoencoders with Interactive Visualization | Sep 13, 2024 | Math | Code Available | 2 |
| CMM-Math: A Chinese Multimodal Math Dataset To Evaluate and Enhance the Mathematics Reasoning of Large Multimodal Models | Sep 4, 2024 | GSM8K, Math | Code Available | 2 |
| Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models | Aug 1, 2024 | Math | Code Available | 2 |
| Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process | Jul 29, 2024 | GSM8K, Math | Code Available | 2 |
| Weak-to-Strong Reasoning | Jul 18, 2024 | GSM8K, Math | Code Available | 2 |
| We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning? | Jul 1, 2024 | Math, Mathematical Reasoning | Code Available | 2 |
| MathOdyssey: Benchmarking Mathematical Problem-Solving Skills in Large Language Models Using Odyssey Math Data | Jun 26, 2024 | Benchmarking, Math | Code Available | 2 |
| Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models | Jun 25, 2024 | Diversity, Math | Code Available | 2 |