
Math

Papers


Showing 1076–1100 of 1596 papers

Title	Date	Tasks	Status	Hype
Assessing and Verifying Task Utility in LLM-Powered Applications	May 3, 2024	Math	Unverified	0
Self-Refine Instruction-Tuning for Aligning Reasoning in Language Models	May 1, 2024	Math	Unverified	0
A Careful Examination of Large Language Model Performance on Grade School Arithmetic	May 1, 2024	GSM8K, Language Modeling	Unverified	0
Math Multiple Choice Question Generation via Human-Large Language Model Collaboration	May 1, 2024	Language Modeling, Language Modelling	Unverified	0
Iterative Reasoning Preference Optimization	Apr 30, 2024	ARC, GSM8K	Unverified	0
Small Language Models Need Strong Verifiers to Self-Correct Reasoning	Apr 26, 2024	Math	Code Available	0
Describe-then-Reason: Improving Multimodal Mathematical Reasoning through Visual Comprehension Training	Apr 22, 2024	Math, Mathematical Reasoning	Unverified	0
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone	Apr 22, 2024	Language Modeling, Language Modelling	Unverified	0
PARAMANU-GANITA: Language Model with Mathematical Capabilities	Apr 22, 2024	Domain Adaptation, GSM8K	Unverified	0
Improving Automated Distractor Generation for Math Multiple-choice Questions with Overgenerate-and-rank	Apr 19, 2024	Distractor Generation, Math	Unverified	0
On the Empirical Complexity of Reasoning and Planning in LLMs	Apr 17, 2024	Math	Unverified	0
Mental Stress Detection: Development and Evaluation of a Wearable In-Ear Plethysmography	Apr 12, 2024	Math, Mental Stress Detection	Unverified	0
Personality-aware Student Simulation for Conversational Intelligent Tutoring Systems	Apr 10, 2024	Math	Unverified	0
MathVC: An LLM-Simulated Multi-Character Virtual Classroom for Mathematics Education	Apr 10, 2024	Math	Unverified	0
FRACTAL: Fine-Grained Scoring from Aggregate Text Labels	Apr 7, 2024	Math, Multiple Instance Learning	Unverified	0
MM-MATH: Advancing Multimodal Math Evaluation with Process Evaluation and Fine-grained Classification	Apr 7, 2024	Image Comprehension, Math	Code Available	0
Data Augmentation with In-Context Learning and Comparative Evaluation in Math Word Problem Solving	Apr 5, 2024	Data Augmentation, In-Context Learning	Unverified	0
HyperCLOVA X Technical Report	Apr 2, 2024	Instruction Following, Machine Translation	Unverified	0
Exploring Automated Distractor Generation for Math Multiple-choice Questions via Large Language Models	Apr 2, 2024	Distractor Generation, In-Context Learning	Code Available	0
LM^2: A Simple Society of Language Models Solves Complex Reasoning	Apr 2, 2024	Math, MedQA	Code Available	0
IsoBench: Benchmarking Multimodal Foundation Models on Isomorphic Representations	Apr 1, 2024	Benchmarking, Math	Unverified	0
Exploring the Mystery of Influential Data for Mathematical Reasoning	Apr 1, 2024	Math, Mathematical Reasoning	Unverified	0
Stable Code Technical Report	Apr 1, 2024	Code Completion, Language Modelling	Unverified	0
Self-Demos: Eliciting Out-of-Demonstration Generalizability in Large Language Models	Apr 1, 2024	In-Context Learning, Math	Code Available	0
Can LLMs Master Math? Investigating Large Language Models on Math Stack Exchange	Mar 30, 2024	Math, Mathematical Problem-Solving	Code Available	0

