| Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing | Apr 18, 2024 | Arithmetic Reasoning, GSM8K | Code Available | 1 |
| On the Empirical Complexity of Reasoning and Planning in LLMs | Apr 17, 2024 | Math | Unverified | 0 |
| Self-Explore: Enhancing Mathematical Reasoning in Language Models with Fine-grained Rewards | Apr 16, 2024 | GSM8K, Math | Code Available | 2 |
| Mental Stress Detection: Development and Evaluation of a Wearable In-Ear Plethysmography | Apr 12, 2024 | Math, Mental Stress Detection | Unverified | 0 |
| Rho-1: Not All Tokens Are What You Need | Apr 11, 2024 | All, Continual Pretraining | Code Available | 3 |
| Personality-aware Student Simulation for Conversational Intelligent Tutoring Systems | Apr 10, 2024 | Math | Unverified | 0 |
| MathVC: An LLM-Simulated Multi-Character Virtual Classroom for Mathematics Education | Apr 10, 2024 | Math | Unverified | 0 |
| Evaluating Mathematical Reasoning Beyond Accuracy | Apr 8, 2024 | Math, Mathematical Reasoning | Code Available | 2 |
| MM-MATH: Advancing Multimodal Math Evaluation with Process Evaluation and Fine-grained Classification | Apr 7, 2024 | Image Comprehension, Math | Code Available | 0 |
| FRACTAL: Fine-Grained Scoring from Aggregate Text Labels | Apr 7, 2024 | Math, Multiple Instance Learning | Unverified | 0 |