Math

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 901–950 of 1596 papers

Title	Date	Tasks	Status
Can Stories Help LLMs Reason? Curating Information Space Through Narrative	Oct 25, 2024	Math	—Unverified
ReasonAgain: Using Extractable Symbolic Programs to Evaluate Mathematical Reasoning	Oct 24, 2024	GSM8KMath	—Unverified
From Blind Solvers to Logical Thinkers: Benchmarking LLMs' Logical Integrity on Faulty Mathematical Problems	Oct 24, 2024	BenchmarkingCommon Sense Reasoning	—Unverified
Mixture of Parrots: Experts improve memorization more than reasoning	Oct 24, 2024	MathMemorization	—Unverified
MiLoRA: Efficient Mixture of Low-Rank Adaptation for Large Language Models Fine-tuning	Oct 23, 2024	MathMixture-of-Experts	—Unverified
Polyak's Heavy Ball Method Achieves Accelerated Local Rate of Convergence under Polyak-Lojasiewicz Inequality	Oct 22, 2024	Math	—Unverified
JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation	Oct 22, 2024	Math	—Unverified
Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration	Oct 22, 2024	Math	—Unverified
Optimizing Chain-of-Thought Reasoning: Tackling Arranging Bottleneck via Plan Augmentation	Oct 22, 2024	GSM8KMath	—Unverified
No more hard prompts: SoftSRV prompting for synthetic data generation	Oct 21, 2024	Language ModelingLanguage Modelling	—Unverified
PromptHive: Bringing Subject Matter Experts Back to the Forefront with Collaborative Prompt Engineering for Educational Content Creation	Oct 21, 2024	MathPrompt Engineering	—Unverified
On Designing Effective RL Reward at Training Time for LLM Reasoning	Oct 19, 2024	GSM8KMath	—Unverified
Do Large Language Models Truly Grasp Mathematics? An Empirical Exploration From Cognitive Psychology	Oct 19, 2024	Logical ReasoningMath	—Unverified
LLM The Genius Paradox: A Linguistic and Math Expert's Struggle with Simple Word-based Counting Problems	Oct 18, 2024	In-Context LearningMath	—Unverified
Step Guided Reasoning: Improving Mathematical Reasoning using Guidance Generation and Step Reasoning	Oct 18, 2024	MathMathematical Reasoning	—Unverified
Bridging the Training-Inference Gap in LLMs by Leveraging Self-Generated Tokens	Oct 18, 2024	MathQuestion Answering	—Unverified
SBI-RAG: Enhancing Math Word Problem Solving for Students through Schema-Based Instruction and Retrieval-Augmented Generation	Oct 17, 2024	GSM8KLanguage Modeling	CodeCode Available
Not All Votes Count! Programs as Verifiers Improve Self-Consistency of Language Models for Math Reasoning	Oct 16, 2024	AllGSM8K	CodeCode Available
When Not to Answer: Evaluating Prompts on GPT Models for Effective Abstention in Unanswerable Math Word Problems	Oct 16, 2024	HallucinationMath	—Unverified
Speculative Knowledge Distillation: Bridging the Teacher-Student Gap Through Interleaved Sampling	Oct 15, 2024	Instruction FollowingKnowledge Distillation	—Unverified
MIND: Math Informed syNthetic Dialogues for Pretraining LLMs	Oct 15, 2024	GSM8KMath	—Unverified
Innovative Thinking, Infinite Humor: Humor Research of Large Language Models through Structured Thought Leaps	Oct 14, 2024	Math	—Unverified
One Language, Many Gaps: Evaluating Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks	Oct 14, 2024	FairnessGSM8K	CodeCode Available
Embedding Self-Correction as an Inherent Ability in Large Language Models for Enhanced Mathematical Reasoning	Oct 14, 2024	MathMathematical Reasoning	—Unverified
Expanding Search Space with Diverse Prompting Agents: An Efficient Sampling Approach for LLM Mathematical Reasoning	Oct 13, 2024	MathMathematical Reasoning	—Unverified
Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces	Oct 13, 2024	Computational EfficiencyMath	—Unverified
Testing GPT-4-o1-preview on math and science problems: A follow-up study	Oct 11, 2024	MathSpatial Reasoning	—Unverified
Cognitive Noise and Altruistic Preferences	Oct 10, 2024	Math	—Unverified
Teaching-Inspired Integrated Prompting Framework: A Novel Approach for Enhancing Reasoning in Large Language Models	Oct 10, 2024	Arithmetic ReasoningMath	CodeCode Available
Herald: A Natural Language Annotated Lean 4 Dataset	Oct 9, 2024	MathMathematical Reasoning	—Unverified
Subtle Errors Matter: Preference Learning via Error-injected Self-editing	Oct 9, 2024	GSM8KMath	—Unverified
Hallucinating AI Hijacking Attack: Large Language Models and Malicious Code Recommenders	Oct 9, 2024	Math	—Unverified
Give me a hint: Can LLMs take a hint to solve math problems?	Oct 8, 2024	Adversarial RobustnessMath	CodeCode Available
FG-PRM: Fine-grained Hallucination Detection and Mitigation in Language Model Mathematical Reasoning	Oct 8, 2024	GSM8KHallucination	—Unverified
Beyond Captioning: Task-Specific Prompting for Improved VLM Performance in Mathematical Reasoning	Oct 8, 2024	Image RetrievalMath	—Unverified
Solving Functional Optimization with Deep Networks and Variational Principles	Oct 8, 2024	Math	—Unverified
Intriguing Properties of Large Language and Vision Models	Oct 7, 2024	cross-modal alignmentLarge Language Model	—Unverified
Rule-based Data Selection for Large Language Models	Oct 7, 2024	BenchmarkingMath	—Unverified
Reasoning Paths Optimization: Learning to Reason and Explore From Diverse Paths	Oct 7, 2024	AttributeGSM8K	—Unverified
fPLSA: Learning Semantic Structures in Document Collections Using Foundation Models	Oct 7, 2024	Math	—Unverified
Improving LLM Reasoning through Scaling Inference Computation with Collaborative Verification	Oct 5, 2024	GSM8KMath	—Unverified
BloomWise: Enhancing Problem-Solving capabilities of Large Language Models using Bloom's-Taxonomy-Inspired Prompts	Oct 5, 2024	Math	—Unverified
Deliberate Reasoning for LLMs as Structure-aware Planning with Accurate World Model	Oct 4, 2024	DiversityLogical Reasoning	—Unverified
Geometry is All You Need: A Unified Taxonomy of Matrix and Tensor Factorization for Compression of Generative Language Models	Oct 3, 2024	AllLanguage Modeling	—Unverified
CodePMP: Scalable Preference Model Pretraining for Large Language Model Reasoning	Oct 3, 2024	GSM8KLanguage Modeling	—Unverified
Towards the Pedagogical Steering of Large Language Models for Tutoring: A Case Study with Modeling Productive Failure	Oct 3, 2024	Math	CodeCode Available
Llama SLayer 8B: Shallow Layers Hold the Key to Knowledge Injection	Oct 3, 2024	Mathparameter-efficient fine-tuning	CodeCode Available
Adaptive Inference-Time Compute: LLMs Can Predict if They Can Do Better, Even Mid-Generation	Oct 3, 2024	GSM8KMath	—Unverified
An Exploration of Self-Supervised Mutual Information Alignment for Multi-Task Settings	Oct 2, 2024	8kMath	CodeCode Available
Evaluating Robustness of Reward Models for Mathematical Reasoning	Oct 2, 2024	MathMathematical Reasoning	—Unverified

Show:10 25 50

← PrevPage 19 of 32Next →

No leaderboard results yet.