SOTAVerified

Math

Papers

Showing 701-750 of 1596 papers

Title | Status | Hype
Deconfounded Causality-aware Parameter-Efficient Fine-Tuning for Problem-Solving Improvement of LLMs | - | 0
More is More: Addition Bias in Large Language Models | Code | 0
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model | Code | 9
S^3c-Math: Spontaneous Step-level Self-correction Makes Large Language Models Better Mathematical Reasoners | - | 0
MultiMath: Bridging Visual and Mathematical Reasoning for Large Language Models | Code | 1
Logic Contrastive Reasoning with Lightweight Large Language Model for Math Word Problems | - | 0
Entropic Distribution Matching in Supervised Fine-tuning of LLMs: Less Overfitting and Better Diversity | - | 0
Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems | - | 0
Critic-CoT: Boosting the reasoning abilities of large language model via Chain-of-thoughts Critic | - | 0
SIaM: Self-Improving Code-Assisted Mathematical Reasoning of Large Language Models | - | 0
What makes math problems hard for reinforcement learning: a case study | Code | 1
Generative Verifiers: Reward Modeling as Next-Token Prediction | - | 0
Students' Perceived Roles, Opportunities, and Challenges of a Generative AI-powered Teachable Agent: A Case of Middle School Math Class | - | 0
Multi-tool Integration Application for Math Reasoning Using Large Language Model | - | 0
SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models | Code | 1
Mathematical Information Retrieval: Search and Question Answering | - | 0
Benchmarking Large Language Models for Math Reasoning Tasks | Code | 0
QPO: Query-dependent Prompt Optimization via Multi-Loop Offline Reinforcement Learning | - | 0
A Study of PHOC Spatial Region Configurations for Math Formula Retrieval | - | 0
Large Language Models Might Not Care What You Are Saying: Prompt Format Beats Descriptions | - | 0
Math-PUMA: Progressive Upward Multimodal Alignment to Enhance Mathematical Reasoning | Code | 1
Does Reasoning Emerge? Examining the Probabilities of Causation in Large Language Models | - | 0
Leveraging Web-Crawled Data for High-Quality Fine-Tuning | Code | 0
Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization | Code | 1
MathScape: Evaluating MLLMs in multimodal Math Scenarios through a Hierarchical Benchmark | Code | 0
A Perspective on Large Language Models, Intelligent Machines, and Knowledge Acquisition | - | 0
Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers | Code | 4
P3: A Policy-Driven, Pace-Adaptive, and Diversity-Promoted Framework for data pruning in LLM Training | - | 0
Examining the Behavior of LLM Architectures Within the Framework of Standardized National Exams in Brazil | - | 0
Mathfish: Evaluating Language Model Math Reasoning via Grounding in Educational Curricula | Code | 1
AltCanvas: A Tile-Based Image Editor with Generative AI for Blind or Visually Impaired People | - | 0
The Logic of Political Survival Revisited: Consequences of Elite Uncertainty Under Authoritarian Rule | - | 0
On the Resilience of LLM-Based Multi-Agent Collaboration with Faulty Agents | Code | 1
MM-Vet v2: A Challenging Benchmark to Evaluate Large Multimodal Models for Integrated Capabilities | Code | 3
Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models | Code | 2
Large Language Monkeys: Scaling Inference Compute with Repeated Sampling | Code | 3
AI-Assisted Generation of Difficult Math Questions | Code | 0
Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process | Code | 2
Towards Effective and Efficient Continual Pre-training of Large Language Models | Code | 0
Recursive Introspection: Teaching Language Model Agents How to Self-Improve | - | 0
Boosting Large Language Models with Socratic Method for Conversational Mathematics Teaching | Code | 1
MathViz-E: A Case-study in Domain-Specialized Tool-Using Agents | Code | 1
LEAN-GitHub: Compiling GitHub LEAN repositories for a versatile LEAN prover | Code | 4
Nerva: a Truly Sparse Implementation of Neural Networks | Code | 1
TaskGen: A Task-Based, Memory-Infused Agentic Framework using StrictJSON | Code | 3
Toward Adaptive Reasoning in Large Language Models with Thought Rollback | Code | 1
Generalization v.s. Memorization: Tracing Language Models' Capabilities Back to Pretraining Data | - | 0
Learning Goal-Conditioned Representations for Language Reward Models | Code | 1
Weak-to-Strong Reasoning | Code | 2
Prover-Verifier Games improve legibility of LLM outputs | Code | 0

No leaderboard results yet.