Math

Papers

Showing 651–700 of 1596 papers

Title	Date	Tasks	Status	Hype
Evaluating Robustness of Reward Models for Mathematical Reasoning	Oct 2, 2024	Math, Mathematical Reasoning	Unverified	0
VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment	Oct 2, 2024	GSM8K, Math	Code Available	2
Not All LLM Reasoners Are Created Equal	Oct 2, 2024	All, Code Generation	Unverified	0
OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data	Oct 2, 2024	Arithmetic Reasoning, Large Language Model	Code Available	4
Automated Knowledge Concept Annotation and Question Representation Learning for Knowledge Tracing	Oct 2, 2024	Contrastive Learning, Knowledge Tracing	Code Available	0
LASeR: Learning to Adaptively Select Reward Models with Multi-Armed Bandits	Oct 2, 2024	Instruction Following, Math	Code Available	1
Mind Scramble: Unveiling Large Language Model Psychology Via Typoglycemia	Oct 2, 2024	Language Modeling, Language Modelling	Code Available	0
Layer Swapping for Zero-Shot Cross-Lingual Transfer in Large Language Models	Oct 2, 2024	Cross-Lingual Transfer, Math	Unverified	0
Scheherazade: Evaluating Chain-of-Thought Math Reasoning in LLMs with Chain-of-Problems	Sep 30, 2024	GSM8K, Math	Code Available	0
Instance-adaptive Zero-shot Chain-of-Thought Prompting	Sep 30, 2024	GSM8K, Math	Unverified	0
The Perfect Blend: Redefining RLHF with Mixture of Judges	Sep 30, 2024	Instruction Following, Math	Unverified	0
INC-Math: Integrating Natural Language and Code for Enhanced Mathematical Reasoning in Large Language Models	Sep 28, 2024	Math, Mathematical Reasoning	Unverified	0
Revisiting the Superficial Alignment Hypothesis	Sep 27, 2024	Instruction Following, Math	Unverified	0
On the Inductive Bias of Stacking Towards Improving Reasoning	Sep 27, 2024	Inductive Bias, Math	Unverified	0
BEATS: Optimizing LLM Mathematical Capabilities with BackVerify and Adaptive Disambiguate based Efficient Tree Search	Sep 26, 2024	Math, Mathematical Problem-Solving	Code Available	1
Learning to Love Edge Cases in Formative Math Assessment: Using the AMMORE Dataset and Chain-of-Thought Prompting to Improve Grading Accuracy	Sep 26, 2024	Knowledge Tracing, Math	Unverified	0
Democratizing Signal Processing and Machine Learning: Math Learning Equity for Elementary and Middle School Students	Sep 25, 2024	Math	Unverified	0
PMSS: Pretrained Matrices Skeleton Selection for LLM Fine-tuning	Sep 25, 2024	GSM8K, Math	Unverified	0
LLaMa-SciQ: An Educational Chatbot for Answering Science MCQ	Sep 25, 2024	Chatbot, GSM8K	Unverified	0
Models Can and Should Embrace the Communicative Nature of Human-Generated Math	Sep 25, 2024	Math	Unverified	0
Archon: An Architecture Search Framework for Inference-Time Techniques	Sep 23, 2024	Hyperparameter Optimization, Instruction Following	Code Available	2
PTD-SQL: Partitioning and Targeted Drilling with LLMs in Text-to-SQL	Sep 21, 2024	Math, Text to SQL	Code Available	0
Beyond Accuracy Optimization: Computer Vision Losses for Large Language Model Fine-Tuning	Sep 20, 2024	Language Modeling, Language Modelling	Code Available	0
ControlMath: Controllable Data Generation Promotes Math Generalist Models	Sep 20, 2024	Data Augmentation, Diversity	Unverified	0
Balancing LoRA Performance and Efficiency with Simple Shard Sharing	Sep 19, 2024	Computational Efficiency, GSM8K	Code Available	2
InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning	Sep 19, 2024	Math, Mathematical Reasoning	Unverified	0
Training Language Models to Self-Correct via Reinforcement Learning	Sep 19, 2024	HumanEval, Math	Code Available	2
Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement	Sep 18, 2024	GSM8K, Math	Unverified	0
MAgICoRe: Multi-Agent, Iterative, Coarse-to-Fine Refinement for Reasoning	Sep 18, 2024	Math	Code Available	1
GRIN: GRadient-INformed MoE	Sep 18, 2024	HellaSwag, HumanEval	Unverified	0
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning	Sep 18, 2024	Math, MMLU	Code Available	1
Qwen2.5-Coder Technical Report	Sep 18, 2024	Code Generation	Code Available	11
Reasoning Graph Enhanced Exemplars Retrieval for In-Context Learning	Sep 17, 2024	Few-Shot Learning, In-Context Learning	Code Available	0
Diversify and Conquer: Diversity-Centric Data Selection with Iterative Refinement	Sep 17, 2024	Active Learning, Diversity	Code Available	1
NVLM: Open Frontier-Class Multimodal LLMs	Sep 17, 2024	Math, Multimodal Reasoning	Unverified	0
GPT takes the SAT: Tracing changes in Test Difficulty and Math Performance of Students	Sep 16, 2024	Math	Unverified	0
Cracking the Code: Multi-domain LLM Evaluation on Real-World Professional Exams in Indonesia	Sep 13, 2024	Math, Multiple-choice	Unverified	0
CPL: Critical Plan Step Learning Boosts LLM Generalization in Reasoning Tasks	Sep 13, 2024	ARC, Code Generation	Unverified	0
VAE Explainer: Supplement Learning Variational Autoencoders with Interactive Visualization	Sep 13, 2024	Math	Code Available	2
Explaining Datasets in Words: Statistical Models with Natural Language Parameters	Sep 13, 2024	Clustering, Language Modeling	Code Available	1
Knowledge Tagging with Large Language Model based Multi-Agent System	Sep 12, 2024	Language Modeling, Language Modelling	Unverified	0
Alignment with Preference Optimization Is All You Need for LLM Safety	Sep 12, 2024	All, Math	Unverified	0
Leveraging Unstructured Text Data for Federated Instruction Tuning of Large Language Models	Sep 11, 2024	Language Modelling, Large Language Model	Unverified	0
A Practice of Post-Training on Llama-3 70B with Optimal Selection of Additional Language Mixture Ratio	Sep 10, 2024	Emotional Intelligence, Math	Unverified	0
Mathematical Formalized Problem Solving and Theorem Proving in Different Fields in Lean 4	Sep 9, 2024	Abstract Algebra, Automated Theorem Proving	Code Available	0
Sirius: Contextual Sparsity with Correction for Efficient LLMs	Sep 5, 2024	Math	Code Available	1
Prompt Baking	Sep 4, 2024	ARC, GSM8K	Unverified	0
Wavelet GPT: Wavelet Inspired Large Language Models	Sep 4, 2024	Decoder, Math	Unverified	0
CMM-Math: A Chinese Multimodal Math Dataset To Evaluate and Enhance the Mathematics Reasoning of Large Multimodal Models	Sep 4, 2024	GSM8K, Math	Code Available	2
Building Math Agents with Multi-Turn Iterative Preference Learning	Sep 4, 2024	GSM8K, Math	Unverified	0

