SOTAVerified|Agents Browse Leaderboard About

Math

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 751–775 of 1596 papers

Title	Date	Tasks	Status	Hype
MAPS: A Multilingual Benchmark for Global Agent Performance and Security	May 21, 2025	Code GenerationMath	—Unverified	0
Intriguing Properties of Large Language and Vision Models	Oct 7, 2024	cross-modal alignmentLarge Language Model	—Unverified	0
Interpretable Math Word Problem Solution Generation Via Step-by-step Planning	Jun 1, 2023	GSM8KLanguage Modeling	—Unverified	0
Interpretable Factorization for Neural Network ECG Models	Jun 26, 2020	Math	—Unverified	0
AdaptMI: Adaptive Skill-based In-context Math Instruction for Small Language Models	Apr 30, 2025	In-Context LearningMath	—Unverified	0
Accelerated Sampling from Masked Diffusion Models via Entropy Bounded Unmasking	May 30, 2025	Language ModelingLanguage Modelling	—Unverified	0
Interleaved Reasoning for Large Language Models via Reinforcement Learning	May 26, 2025	Logical ReasoningMath	—Unverified	0
Lean-ing on Quality: How High-Quality Data Beats Diverse Multilingual Data in AutoFormalization	Feb 18, 2025	Math	—Unverified	0
Interactive Sketchpad: A Multimodal Tutoring System for Collaborative, Visual Problem-Solving	Feb 12, 2025	Mathmultimodal interaction	—Unverified	0
Describe-then-Reason: Improving Multimodal Mathematical Reasoning through Visual Comprehension Training	Apr 22, 2024	MathMathematical Reasoning	—Unverified	0
CRANE: Reasoning with constrained LLM generation	Feb 13, 2025	Code GenerationMath	—Unverified	0
Learning Beyond Pattern Matching? Assaying Mathematical Understanding in LLMs	May 24, 2024	In-Context LearningLanguage Modeling	—Unverified	0
Automated Feedback in Math Education: A Comparative Analysis of LLMs for Open-Ended Responses	Oct 29, 2024	MathZero-Shot Learning	—Unverified	0
Learning by Analogy: Enhancing Few-Shot Prompting for Math Word Problem Solving with Computational Graph-Based Retrieval	Nov 25, 2024	MathMath Word Problem Solving	—Unverified	0
MAmmoTH2: Scaling Instructions from the Web	May 6, 2024	ChatbotGSM8K	—Unverified	0
Cramer-Rao bound and absolute sensitivity in chemical reaction networks	Jan 13, 2024	MathSensitivity	—Unverified	0
Integer Networks for Data Compression with Latent-Variable Models	May 1, 2019	Data CompressionMath	—Unverified	0
A Method to Support Difficult Re-finding Tasks	Jan 27, 2016	Math	—Unverified	0
Instruction-Following Pruning for Large Language Models	Jan 3, 2025	Instruction FollowingMath	—Unverified	0
Cracking the Code: Multi-domain LLM Evaluation on Real-World Professional Exams in Indonesia	Sep 13, 2024	MathMultiple-choice	—Unverified	0
MALT: Improving Reasoning with Multi-Agent LLM Training	Dec 2, 2024	Common Sense ReasoningGSM8K	—Unverified	0
Instance-adaptive Zero-shot Chain-of-Thought Prompting	Sep 30, 2024	GSM8KMath	—Unverified	0
CPL: Critical Plan Step Learning Boosts LLM Generalization in Reasoning Tasks	Sep 13, 2024	ARCCode Generation	—Unverified	0
Inspecting Spoken Language Understanding from Kids for Basic Math Learning at Home	Jun 1, 2023	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	—Unverified	0
Innovative Thinking, Infinite Humor: Humor Research of Large Language Models through Structured Thought Leaps	Oct 14, 2024	Math	—Unverified	0

Show:10 25 50

← PrevPage 31 of 64Next →

No leaderboard results yet.