SOTAVerified

Math

Papers

Showing 651–675 of 1596 papers

Title | Status | Hype
Evaluating Robustness of Reward Models for Mathematical Reasoning | - | 0
Not All LLM Reasoners Are Created Equal | - | 0
Layer Swapping for Zero-Shot Cross-Lingual Transfer in Large Language Models | - | 0
Mind Scramble: Unveiling Large Language Model Psychology Via Typoglycemia | Code | 0
VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment | Code | 2
Automated Knowledge Concept Annotation and Question Representation Learning for Knowledge Tracing | Code | 0
OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data | Code | 4
LASeR: Learning to Adaptively Select Reward Models with Multi-Armed Bandits | Code | 1
Scheherazade: Evaluating Chain-of-Thought Math Reasoning in LLMs with Chain-of-Problems | Code | 0
Instance-adaptive Zero-shot Chain-of-Thought Prompting | - | 0
The Perfect Blend: Redefining RLHF with Mixture of Judges | - | 0
INC-Math: Integrating Natural Language and Code for Enhanced Mathematical Reasoning in Large Language Models | - | 0
Revisiting the Superficial Alignment Hypothesis | - | 0
On the Inductive Bias of Stacking Towards Improving Reasoning | - | 0
BEATS: Optimizing LLM Mathematical Capabilities with BackVerify and Adaptive Disambiguate based Efficient Tree Search | Code | 1
Learning to Love Edge Cases in Formative Math Assessment: Using the AMMORE Dataset and Chain-of-Thought Prompting to Improve Grading Accuracy | - | 0
Democratizing Signal Processing and Machine Learning: Math Learning Equity for Elementary and Middle School Students | - | 0
PMSS: Pretrained Matrices Skeleton Selection for LLM Fine-tuning | - | 0
LLaMa-SciQ: An Educational Chatbot for Answering Science MCQ | - | 0
Models Can and Should Embrace the Communicative Nature of Human-Generated Math | - | 0
Archon: An Architecture Search Framework for Inference-Time Techniques | Code | 2
PTD-SQL: Partitioning and Targeted Drilling with LLMs in Text-to-SQL | Code | 0
Beyond Accuracy Optimization: Computer Vision Losses for Large Language Model Fine-Tuning | Code | 0
ControlMath: Controllable Data Generation Promotes Math Generalist Models | - | 0
Balancing LoRA Performance and Efficiency with Simple Shard Sharing | Code | 2

No leaderboard results yet.