
Math

Papers


Showing 1076–1100 of 1596 papers

Title	Date	Tasks	Status	Hype
Concise and Organized Perception Facilitates Reasoning in Large Language Models	Oct 5, 2023	LAMBADA, Math	Unverified	0
Retrieval-augmented Generation to Improve Math Question-Answering: Trade-offs Between Groundedness and Human Preference	Oct 4, 2023	Math, Question Answering	Code Available	1
The Rise of Open Science: Tracking the Evolution and Perceived Value of Data and Methods Link-Sharing Practices	Oct 4, 2023	Articles, Math	Code Available	0
Novice Learner and Expert Tutor: Evaluating Math Reasoning Abilities of Large Language Models with Misconceptions	Oct 3, 2023	Math, Mathematical Reasoning	Unverified	0
Instances Need More Care: Rewriting Prompts for Instances with LLMs in the Loop Yields Better Zero-Shot Performance	Oct 3, 2023	Code Generation, Logical Reasoning	Code Available	0
Large Language Models as Analogical Reasoners	Oct 3, 2023	Code Generation, GSM8K	Unverified	0
Benchmarking and Improving Generator-Validator Consistency of Language Models	Oct 3, 2023	Benchmarking, Instruction Following	Unverified	0
SNIP: Bridging Mathematical Symbolic and Numeric Realms with Unified Pre-training	Oct 3, 2023	Contrastive Learning, Equation Discovery	Code Available	1
MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts	Oct 3, 2023	Chatbot, Image Captioning	Code Available	2
A Dynamic LLM-Powered Agent Network for Task-Oriented Agent Collaboration	Oct 3, 2023	Arithmetic Reasoning, Code Generation	Code Available	1
Fill in the Blank: Exploring and Enhancing LLM Capabilities for Backward Reasoning in Math Word Problems	Oct 3, 2023	GSM8K, Math	Code Available	0
FELM: Benchmarking Factuality Evaluation of Large Language Models	Oct 1, 2023	Benchmarking, Math	Code Available	1
Investigating the Efficacy of Large Language Models in Reflective Assessment Methods through Chain of Thoughts Prompting	Sep 30, 2023	Math	Unverified	0
ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving	Sep 29, 2023	Arithmetic Reasoning, Computational Efficiency	Code Available	3
L2CEval: Evaluating Language-to-Code Generation Capabilities of Large Language Models	Sep 29, 2023	Code Generation, Math	Unverified	0
Qwen Technical Report	Sep 28, 2023	Language Modeling, Language Modelling	Code Available	6
NLPBench: Evaluating Large Language Models on Solving NLP Problems	Sep 27, 2023	Benchmarking, Math	Code Available	1
ReConcile: Round-Table Conference Improves Reasoning via Consensus among Diverse LLMs	Sep 22, 2023	Math	Code Available	2
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models	Sep 21, 2023	Arithmetic Reasoning, GSM8K	Code Available	2
Fairness Hub Technical Briefs: AUC Gap	Sep 20, 2023	Fairness, Math	Unverified	0
Design of Chain-of-Thought in Math Problem Solving	Sep 20, 2023	Diversity, GSM8K	Code Available	1
Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning	Sep 19, 2023	Instruction Following, Language Modeling	Code Available	1
Contrastive Decoding Improves Reasoning in Large Language Models	Sep 17, 2023	GSM8K, HellaSwag	Unverified	0
Odd period cycles and ergodic properties in price dynamics for an exchange economy	Sep 17, 2023	Math	Unverified	0
ChatGPT-4 with Code Interpreter can be used to solve introductory college-level vector calculus and electromagnetism problems	Sep 16, 2023	Electrical Engineering, Math	Unverified	0

