SOTAVerified

Math

Papers

Showing 876-900 of 1596 papers

| Title | Status | Hype |
| --- | --- | --- |
| Personality-aware Student Simulation for Conversational Intelligent Tutoring Systems | | 0 |
| MathVC: An LLM-Simulated Multi-Character Virtual Classroom for Mathematics Education | | 0 |
| Evaluating Mathematical Reasoning Beyond Accuracy | Code | 2 |
| MM-MATH: Advancing Multimodal Math Evaluation with Process Evaluation and Fine-grained Classification | Code | 0 |
| FRACTAL: Fine-Grained Scoring from Aggregate Text Labels | | 0 |
| MACM: Utilizing a Multi-Agent System for Condition Mining in Solving Complex Mathematical Problems | Code | 2 |
| Data Augmentation with In-Context Learning and Comparative Evaluation in Math Word Problem Solving | | 0 |
| BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models | Code | 3 |
| ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline | Code | 2 |
| Benchmarking Large Language Models for Persian: A Preliminary Study Focusing on ChatGPT | Code | 1 |
| LM^2: A Simple Society of Language Models Solves Complex Reasoning | Code | 0 |
| Exploring Automated Distractor Generation for Math Multiple-choice Questions via Large Language Models | Code | 0 |
| HyperCLOVA X Technical Report | | 0 |
| Exploring the Mystery of Influential Data for Mathematical Reasoning | | 0 |
| Stable Code Technical Report | | 0 |
| IsoBench: Benchmarking Multimodal Foundation Models on Isomorphic Representations | | 0 |
| Self-Demos: Eliciting Out-of-Demonstration Generalizability in Large Language Models | Code | 0 |
| What is in Your Safe Data? Identifying Benign Data that Breaks Safety | Code | 1 |
| Can LLMs Master Math? Investigating Large Language Models on Math Stack Exchange | Code | 0 |
| ML2SC: Deploying Machine Learning Models as Smart Contracts on the Blockchain | | 0 |
| Large Language Models Are Struggle to Cope with Unreasonability in Math Problems | | 0 |
| Scaling up ridge regression for brain encoding in a massive individual fMRI dataset | Code | 0 |
| Few-Shot Recalibration of Language Models | | 0 |
| The Invalsi Benchmarks: measuring Linguistic and Mathematical understanding of Large Language Models in Italian | | 0 |
| Don't Trust: Verify -- Grounding LLM Quantitative Reasoning with Autoformalization | Code | 1 |
Page 36 of 64

No leaderboard results yet.