SOTAVerified

Math

Papers

Showing 1076-1100 of 1596 papers

| Title | Status | Hype |
|---|---|---|
| Assessing and Verifying Task Utility in LLM-Powered Applications | | 0 |
| Self-Refine Instruction-Tuning for Aligning Reasoning in Language Models | | 0 |
| A Careful Examination of Large Language Model Performance on Grade School Arithmetic | | 0 |
| Math Multiple Choice Question Generation via Human-Large Language Model Collaboration | | 0 |
| Iterative Reasoning Preference Optimization | | 0 |
| Small Language Models Need Strong Verifiers to Self-Correct Reasoning | Code | 0 |
| Describe-then-Reason: Improving Multimodal Mathematical Reasoning through Visual Comprehension Training | | 0 |
| Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone | | 0 |
| PARAMANU-GANITA: Language Model with Mathematical Capabilities | | 0 |
| Improving Automated Distractor Generation for Math Multiple-choice Questions with Overgenerate-and-rank | | 0 |
| On the Empirical Complexity of Reasoning and Planning in LLMs | | 0 |
| Mental Stress Detection: Development and Evaluation of a Wearable In-Ear Plethysmography | | 0 |
| Personality-aware Student Simulation for Conversational Intelligent Tutoring Systems | | 0 |
| MathVC: An LLM-Simulated Multi-Character Virtual Classroom for Mathematics Education | | 0 |
| FRACTAL: Fine-Grained Scoring from Aggregate Text Labels | | 0 |
| MM-MATH: Advancing Multimodal Math Evaluation with Process Evaluation and Fine-grained Classification | Code | 0 |
| Data Augmentation with In-Context Learning and Comparative Evaluation in Math Word Problem Solving | | 0 |
| HyperCLOVA X Technical Report | | 0 |
| Exploring Automated Distractor Generation for Math Multiple-choice Questions via Large Language Models | Code | 0 |
| LM^2: A Simple Society of Language Models Solves Complex Reasoning | Code | 0 |
| IsoBench: Benchmarking Multimodal Foundation Models on Isomorphic Representations | | 0 |
| Exploring the Mystery of Influential Data for Mathematical Reasoning | | 0 |
| Stable Code Technical Report | | 0 |
| Self-Demos: Eliciting Out-of-Demonstration Generalizability in Large Language Models | Code | 0 |
| Can LLMs Master Math? Investigating Large Language Models on Math Stack Exchange | Code | 0 |

No leaderboard results yet.