| Step-level Value Preference Optimization for Mathematical Reasoning | Jun 16, 2024 | Learning-To-Rank, Math | Code Available | 3 |
| CLST: Cold-Start Mitigation in Knowledge Tracing by Aligning a Generative Language Model as a Students' Knowledge Tracer | Jun 13, 2024 | Domain Generalization, Knowledge Tracing | Unverified | 0 |
| MiLoRA: Harnessing Minor Singular Components for Parameter-Efficient LLM Finetuning | Jun 13, 2024 | Instruction Following, Math | Code Available | 3 |
| Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback | Jun 13, 2024 | Instruction Following, Math | Code Available | 7 |
| Delta-CoMe: Training-Free Delta-Compression with Mixed-Precision for Large Language Models | Jun 13, 2024 | Math, Quantization | Code Available | 2 |
| Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models | Jun 13, 2024 | Math, object-detection | Code Available | 3 |
| ReMI: A Dataset for Reasoning with Multiple Images | Jun 13, 2024 | Chart Understanding, Math | Unverified | 0 |
| Collective Constitutional AI: Aligning a Language Model with Public Input | Jun 12, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B | Jun 11, 2024 | Decision Making, GSM8K | Code Available | 5 |
| Can I understand what I create? Self-Knowledge Evaluation of Large Language Models | Jun 10, 2024 | Math | Unverified | 0 |
| Human Learning about AI | Jun 8, 2024 | Math | Unverified | 0 |
| CorDA: Context-Oriented Decomposition Adaptation of Large Language Models for Task-Aware Parameter-Efficient Fine-tuning | Jun 7, 2024 | Instruction Following, Math | Code Available | 2 |
| A multi-core periphery perspective: Ranking via relative centrality | Jun 6, 2024 | Math | Unverified | 0 |
| Lean Workbook: A large-scale Lean problem set formalized from natural language math problems | Jun 6, 2024 | Automated Theorem Proving, Math | Code Available | 4 |
| DICE: Detecting In-distribution Contamination in LLM's Fine-tuning Phase for Math Reasoning | Jun 6, 2024 | Math | Code Available | 1 |
| Improve Mathematical Reasoning in Language Models by Automated Process Supervision | Jun 5, 2024 | GSM8K, Math | Unverified | 0 |
| NUMCoT: Numerals and Units of Measurement in Chain-of-Thought Reasoning using Large Language Models | Jun 5, 2024 | Math, Mathematical Reasoning | Code Available | 0 |
| mCoT: Multilingual Instruction Tuning for Reasoning Consistency in Language Models | Jun 4, 2024 | Math | Code Available | 0 |
| D-CPT Law: Domain-specific Continual Pre-Training Scaling Law for Large Language Models | Jun 3, 2024 | GPU, Math | Unverified | 0 |
| Code Pretraining Improves Entity Tracking Abilities of Language Models | May 31, 2024 | Math | Unverified | 0 |
| Cutting Through the Noise: Boosting LLM Performance on Math Word Problems | May 30, 2024 | 8k, Math | Code Available | 0 |
| Divide-and-Conquer Meets Consensus: Unleashing the Power of Functions in Code Generation | May 30, 2024 | Code Generation, HumanEval | Unverified | 0 |
| TAIA: Large Language Models are Out-of-Distribution Data Learners | May 30, 2024 | Math | Code Available | 1 |
| MathChat: Benchmarking Mathematical Reasoning and Instruction Following in Multi-Turn Interactions | May 29, 2024 | Benchmarking, Dialogue Understanding | Code Available | 1 |
| Yuan 2.0-M32: Mixture of Experts with Attention Router | May 28, 2024 | ARC, Math | Code Available | 2 |