Math

Papers


Showing 601–625 of 1596 papers

Title	Date	Tasks	Status	Hype
When Not to Answer: Evaluating Prompts on GPT Models for Effective Abstention in Unanswerable Math Word Problems	Oct 16, 2024	Hallucination, Math	Unverified	0
Not All Votes Count! Programs as Verifiers Improve Self-Consistency of Language Models for Math Reasoning	Oct 16, 2024	All, GSM8K	Code Available	0
LoRA Soups: Merging LoRAs for Practical Skill Composition Tasks	Oct 16, 2024	Math, parameter-efficient fine-tuning	Code Available	1
JudgeBench: A Benchmark for Evaluating LLM-based Judges	Oct 16, 2024	Math	Code Available	2
MIND: Math Informed syNthetic Dialogues for Pretraining LLMs	Oct 15, 2024	GSM8K, Math	Unverified	0
Speculative Knowledge Distillation: Bridging the Teacher-Student Gap Through Interleaved Sampling	Oct 15, 2024	Instruction Following, Knowledge Distillation	Unverified	0
One Language, Many Gaps: Evaluating Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks	Oct 14, 2024	Fairness, GSM8K	Code Available	0
Embedding Self-Correction as an Inherent Ability in Large Language Models for Enhanced Mathematical Reasoning	Oct 14, 2024	Math, Mathematical Reasoning	Unverified	0
Innovative Thinking, Infinite Humor: Humor Research of Large Language Models through Structured Thought Leaps	Oct 14, 2024	Math	Unverified	0
CoMAT: Chain of Mathematically Annotated Thought Improves Mathematical Reasoning	Oct 14, 2024	Math, Mathematical Reasoning	Code Available	1
Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces	Oct 13, 2024	Computational Efficiency, Math	Unverified	0
HARDMath: A Benchmark Dataset for Challenging Problems in Applied Mathematics	Oct 13, 2024	Language Modeling, Language Modelling	Code Available	1
Expanding Search Space with Diverse Prompting Agents: An Efficient Sampling Approach for LLM Mathematical Reasoning	Oct 13, 2024	Math, Mathematical Reasoning	Unverified	0
OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models	Oct 12, 2024	Math, reinforcement-learning	Code Available	5
Testing GPT-4-o1-preview on math and science problems: A follow-up study	Oct 11, 2024	Math, Spatial Reasoning	Unverified	0
Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization	Oct 11, 2024	GSM8K, Language Modeling	Code Available	2
SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights	Oct 11, 2024	GSM8K, Math	Code Available	4
The Geometry of Concepts: Sparse Autoencoder Feature Structure	Oct 10, 2024	Math	Code Available	1
VibeCheck: Discover and Quantify Qualitative Differences in Large Language Models	Oct 10, 2024	Math	Code Available	2
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models	Oct 10, 2024	GSM8K, Math	Code Available	2
Cognitive Noise and Altruistic Preferences	Oct 10, 2024	Math	Unverified	0
Teaching-Inspired Integrated Prompting Framework: A Novel Approach for Enhancing Reasoning in Large Language Models	Oct 10, 2024	Arithmetic Reasoning, Math	Code Available	0
MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code	Oct 10, 2024	Math, Mathematical Reasoning	Code Available	2
Herald: A Natural Language Annotated Lean 4 Dataset	Oct 9, 2024	Math, Mathematical Reasoning	Unverified	0
Hallucinating AI Hijacking Attack: Large Language Models and Malicious Code Recommenders	Oct 9, 2024	Math	Unverified	0

