SOTAVerified

Math

Papers

Showing 651-700 of 1596 papers

Title | Status | Hype
Evaluating Robustness of Reward Models for Mathematical Reasoning | - | 0
VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment | Code | 2
Not All LLM Reasoners Are Created Equal | - | 0
OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data | Code | 4
Automated Knowledge Concept Annotation and Question Representation Learning for Knowledge Tracing | Code | 0
LASeR: Learning to Adaptively Select Reward Models with Multi-Armed Bandits | Code | 1
Mind Scramble: Unveiling Large Language Model Psychology Via Typoglycemia | Code | 0
Layer Swapping for Zero-Shot Cross-Lingual Transfer in Large Language Models | - | 0
Scheherazade: Evaluating Chain-of-Thought Math Reasoning in LLMs with Chain-of-Problems | Code | 0
Instance-adaptive Zero-shot Chain-of-Thought Prompting | - | 0
The Perfect Blend: Redefining RLHF with Mixture of Judges | - | 0
INC-Math: Integrating Natural Language and Code for Enhanced Mathematical Reasoning in Large Language Models | - | 0
Revisiting the Superficial Alignment Hypothesis | - | 0
On the Inductive Bias of Stacking Towards Improving Reasoning | - | 0
BEATS: Optimizing LLM Mathematical Capabilities with BackVerify and Adaptive Disambiguate based Efficient Tree Search | Code | 1
Learning to Love Edge Cases in Formative Math Assessment: Using the AMMORE Dataset and Chain-of-Thought Prompting to Improve Grading Accuracy | - | 0
Democratizing Signal Processing and Machine Learning: Math Learning Equity for Elementary and Middle School Students | - | 0
PMSS: Pretrained Matrices Skeleton Selection for LLM Fine-tuning | - | 0
LLaMa-SciQ: An Educational Chatbot for Answering Science MCQ | - | 0
Models Can and Should Embrace the Communicative Nature of Human-Generated Math | - | 0
Archon: An Architecture Search Framework for Inference-Time Techniques | Code | 2
PTD-SQL: Partitioning and Targeted Drilling with LLMs in Text-to-SQL | Code | 0
Beyond Accuracy Optimization: Computer Vision Losses for Large Language Model Fine-Tuning | Code | 0
ControlMath: Controllable Data Generation Promotes Math Generalist Models | - | 0
Balancing LoRA Performance and Efficiency with Simple Shard Sharing | Code | 2
InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning | - | 0
Training Language Models to Self-Correct via Reinforcement Learning | Code | 2
Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement | - | 0
MAgICoRe: Multi-Agent, Iterative, Coarse-to-Fine Refinement for Reasoning | Code | 1
GRIN: GRadient-INformed MoE | - | 0
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning | Code | 1
Qwen2.5-Coder Technical Report | Code | 11
Reasoning Graph Enhanced Exemplars Retrieval for In-Context Learning | Code | 0
Diversify and Conquer: Diversity-Centric Data Selection with Iterative Refinement | Code | 1
NVLM: Open Frontier-Class Multimodal LLMs | - | 0
GPT takes the SAT: Tracing changes in Test Difficulty and Math Performance of Students | - | 0
Cracking the Code: Multi-domain LLM Evaluation on Real-World Professional Exams in Indonesia | - | 0
CPL: Critical Plan Step Learning Boosts LLM Generalization in Reasoning Tasks | - | 0
VAE Explainer: Supplement Learning Variational Autoencoders with Interactive Visualization | Code | 2
Explaining Datasets in Words: Statistical Models with Natural Language Parameters | Code | 1
Knowledge Tagging with Large Language Model based Multi-Agent System | - | 0
Alignment with Preference Optimization Is All You Need for LLM Safety | - | 0
Leveraging Unstructured Text Data for Federated Instruction Tuning of Large Language Models | - | 0
A Practice of Post-Training on Llama-3 70B with Optimal Selection of Additional Language Mixture Ratio | - | 0
Mathematical Formalized Problem Solving and Theorem Proving in Different Fields in Lean 4 | Code | 0
Sirius: Contextual Sparsity with Correction for Efficient LLMs | Code | 1
Prompt Baking | - | 0
Wavelet GPT: Wavelet Inspired Large Language Models | - | 0
CMM-Math: A Chinese Multimodal Math Dataset To Evaluate and Enhance the Mathematics Reasoning of Large Multimodal Models | Code | 2
Building Math Agents with Multi-Turn Iterative Preference Learning | - | 0

No leaderboard results yet.