| WorkArena++: Towards Compositional Planning and Reasoning-based Common Knowledge Work Tasks | Jul 7, 2024 | Arithmetic Reasoning | Code Available | 3 |
| Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs | Jun 26, 2024 | Arithmetic Reasoning, GSM8K | Code Available | 3 |
| Improving Arithmetic Reasoning Ability of Large Language Models through Relation Tuples, Verification and Dynamic Feedback | Jun 25, 2024 | Arithmetic Reasoning, Relation | Code Available | 0 |
| DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving | Jun 18, 2024 | Arithmetic Reasoning, Math | Code Available | 2 |
| Hierarchical Prompting Taxonomy: A Universal Evaluation Framework for Large Language Models Aligned with Human Cognitive Principles | Jun 18, 2024 | Arithmetic Reasoning, Code Generation | Code Available | 1 |
| An Investigation of Neuron Activation as a Unified Lens to Explain Chain-of-Thought Eliciting Arithmetic Reasoning of LLMs | Jun 18, 2024 | Arithmetic Reasoning | Code Available | 1 |
| Breaking the Ceiling of the LLM Community by Treating Token Generation as a Classification for Ensembling | Jun 18, 2024 | Arithmetic Reasoning, Language Modeling | Code Available | 2 |
| Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs | Jun 13, 2024 | Arithmetic Reasoning, Fact Verification | Code Available | 2 |
| Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models | Jun 6, 2024 | Arithmetic Reasoning, Code Generation | —Unverified | 0 |
| QuanTA: Efficient High-Rank Fine-Tuning of LLMs with Quantum-Informed Tensor Adaptation | May 31, 2024 | Arithmetic Reasoning | Code Available | 1 |