SOTAVerified

Math

Papers


Showing 926–950 of 1596 papers

Title	Date	Tasks	Status	Hype
Evaluating and Optimizing Educational Content with Large Language Model Judgments	Mar 5, 2024	Language Modeling; Language Modelling	Code Available	0
Experimenting with Generative AI: Does ChatGPT Really Increase Everyone's Productivity?	Mar 4, 2024	Econometrics; Math	Unverified	0
The Claude 3 Model Family: Opus, Sonnet, Haiku	Mar 4, 2024	1 Image, 2*2 Stitching; Arithmetic Reasoning	Unverified	0
Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training	Mar 4, 2024	Math; Phrase Grounding	Unverified	0
Brilla AI: AI Contestant for the National Science and Maths Quiz	Mar 4, 2024	Math; Question Answering	Code Available	1
Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models	Mar 4, 2024	Data Augmentation; GSM8K	Code Available	1
Key-Point-Driven Data Synthesis with its Enhancement on Mathematical Reasoning	Mar 4, 2024	GSM8K; Math	Unverified	0
Improving the Validity of Automatically Generated Feedback via Reinforcement Learning	Mar 2, 2024	Math; Misconceptions	Code Available	1
ClickTree: A Tree-based Method for Predicting Math Students' Performance Based on Clickstream Data	Mar 1, 2024	Math	Unverified	0
Functional Benchmarks for Robust Evaluation of Reasoning Performance, and the Reasoning Gap	Feb 29, 2024	Math	Code Available	2
PRSA: Prompt Stealing Attacks against Real-World Prompt Services	Feb 29, 2024	Math	Unverified	0
GSM-Plus: A Comprehensive Benchmark for Evaluating the Robustness of LLMs as Mathematical Problem Solvers	Feb 29, 2024	GSM8K; Math	Code Available	2
StarCoder 2 and The Stack v2: The Next Generation	Feb 29, 2024	Code Completion; Code Generation	Code Available	7
Data Interpreter: An LLM Agent For Data Science	Feb 28, 2024	Code Generation; Language Modelling	Unverified	0
Adversarial Math Word Problem Generation	Feb 27, 2024	Math	Code Available	0
MATHSENSEI: A Tool-Augmented Large Language Model for Mathematical Reasoning	Feb 27, 2024	8k; Language Modeling	Code Available	0
Case-Based or Rule-Based: How Do Transformers Do the Math?	Feb 27, 2024	Math; Systematic Generalization	Code Available	1
MathGenie: Generating Synthetic Data with Question Back-translation for Enhancing Mathematical Reasoning of LLMs	Feb 26, 2024	GSM8K; Math	Unverified	0
Stepwise Self-Consistent Mathematical Reasoning with Large Language Models	Feb 24, 2024	Math; Mathematical Reasoning	Code Available	1
How Do Humans Write Code? Large Models Do It the Same Way Too	Feb 24, 2024	Code Generation; Math	Code Available	0
MATHWELL: Generating Educational Math Word Problems Using Teacher Annotations	Feb 24, 2024	Language Modeling; Language Modelling	Code Available	1
Brain-Inspired Two-Stage Approach: Enhancing Mathematical Reasoning by Imitating Human Thought Processes	Feb 23, 2024	Math; Mathematical Reasoning	Code Available	0
ConceptMath: A Bilingual Concept-wise Benchmark for Measuring Mathematical Reasoning of Large Language Models	Feb 22, 2024	Math; Mathematical Reasoning	Code Available	1
Measuring Multimodal Mathematical Reasoning with MATH-Vision Dataset	Feb 22, 2024	Diversity; Math	Code Available	2
MoELoRA: Contrastive Learning Guided Mixture of Experts on Parameter-Efficient Fine-Tuning for Large Language Models	Feb 20, 2024	Common Sense Reasoning; Contrastive Learning	Unverified	0

