Math

Papers

Showing 651–675 of 1596 papers

Title	Date	Tasks	Status	Hype
Evaluating Robustness of Reward Models for Mathematical Reasoning	Oct 2, 2024	Math, Mathematical Reasoning	Unverified	0
Not All LLM Reasoners Are Created Equal	Oct 2, 2024	All, Code Generation	Unverified	0
Layer Swapping for Zero-Shot Cross-Lingual Transfer in Large Language Models	Oct 2, 2024	Cross-Lingual Transfer, Math	Unverified	0
Mind Scramble: Unveiling Large Language Model Psychology Via Typoglycemia	Oct 2, 2024	Language Modeling, Language Modelling	Code Available	0
VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment	Oct 2, 2024	GSM8K, Math	Code Available	2
Automated Knowledge Concept Annotation and Question Representation Learning for Knowledge Tracing	Oct 2, 2024	Contrastive Learning, Knowledge Tracing	Code Available	0
OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data	Oct 2, 2024	Arithmetic Reasoning, Large Language Model	Code Available	4
LASeR: Learning to Adaptively Select Reward Models with Multi-Armed Bandits	Oct 2, 2024	Instruction Following, Math	Code Available	1
Scheherazade: Evaluating Chain-of-Thought Math Reasoning in LLMs with Chain-of-Problems	Sep 30, 2024	GSM8K, Math	Code Available	0
Instance-adaptive Zero-shot Chain-of-Thought Prompting	Sep 30, 2024	GSM8K, Math	Unverified	0
The Perfect Blend: Redefining RLHF with Mixture of Judges	Sep 30, 2024	Instruction Following, Math	Unverified	0
INC-Math: Integrating Natural Language and Code for Enhanced Mathematical Reasoning in Large Language Models	Sep 28, 2024	Math, Mathematical Reasoning	Unverified	0
Revisiting the Superficial Alignment Hypothesis	Sep 27, 2024	Instruction Following, Math	Unverified	0
On the Inductive Bias of Stacking Towards Improving Reasoning	Sep 27, 2024	Inductive Bias, Math	Unverified	0
BEATS: Optimizing LLM Mathematical Capabilities with BackVerify and Adaptive Disambiguate based Efficient Tree Search	Sep 26, 2024	Math, Mathematical Problem-Solving	Code Available	1
Learning to Love Edge Cases in Formative Math Assessment: Using the AMMORE Dataset and Chain-of-Thought Prompting to Improve Grading Accuracy	Sep 26, 2024	Knowledge Tracing, Math	Unverified	0
Democratizing Signal Processing and Machine Learning: Math Learning Equity for Elementary and Middle School Students	Sep 25, 2024	Math	Unverified	0
PMSS: Pretrained Matrices Skeleton Selection for LLM Fine-tuning	Sep 25, 2024	GSM8K, Math	Unverified	0
LLaMa-SciQ: An Educational Chatbot for Answering Science MCQ	Sep 25, 2024	Chatbot, GSM8K	Unverified	0
Models Can and Should Embrace the Communicative Nature of Human-Generated Math	Sep 25, 2024	Math	Unverified	0
Archon: An Architecture Search Framework for Inference-Time Techniques	Sep 23, 2024	Hyperparameter Optimization, Instruction Following	Code Available	2
PTD-SQL: Partitioning and Targeted Drilling with LLMs in Text-to-SQL	Sep 21, 2024	Math, Text to SQL	Code Available	0
Beyond Accuracy Optimization: Computer Vision Losses for Large Language Model Fine-Tuning	Sep 20, 2024	Language Modeling, Language Modelling	Code Available	0
ControlMath: Controllable Data Generation Promotes Math Generalist Models	Sep 20, 2024	Data Augmentation, Diversity	Unverified	0
Balancing LoRA Performance and Efficiency with Simple Shard Sharing	Sep 19, 2024	Computational Efficiency, GSM8K	Code Available	2
