SOTAVerified

Math

Papers

Showing 451-500 of 1596 papers

Title | Status | Hype
DrawEduMath: Evaluating Vision Language Models with Expert-Annotated Students' Hand-Drawn Math Images | - | 0
Leveraging Online Olympiad-Level Math Problems for LLMs Training and Contamination-Resistant Evaluation | Code | 1
Advancing Mathematical Reasoning in Language Models: The Impact of Problem-Solving Data, Data Synthesis Methods, and Training Stages | - | 0
Pairwise RM: Perform Best-of-N Sampling with Knockout Tournament | Code | 1
Kimi k1.5: Scaling Reinforcement Learning with LLMs | Code | 7
An Optimal Transport approach to arbitrage correction: Application to volatility Stress-Tests | - | 0
Is your LLM trapped in a Mental Set? Investigative study on how mental sets affect the reasoning capabilities of LLMs | - | 0
RedStar: Does Scaling Long-CoT Data Unlock Better Slow-Reasoning Systems? | - | 0
Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling | Code | 2
Control LLM: Controlled Evolution for Intelligence Retention in LLM | Code | 1
Chain-of-Reasoning: Towards Unified Mathematical Reasoning in Large Language Models via a Multi-Paradigm Perspective | - | 0
Language Representation Favored Zero-Shot Cross-Domain Cognitive Diagnosis | Code | 0
Step-KTO: Optimizing Mathematical Reasoning through Stepwise Binary Feedback | - | 0
Iterative Label Refinement Matters More than Preference Optimization under Weak Supervision | Code | 0
ArithmAttack: Evaluating Robustness of LLMs to Noisy Context in Math Problem Solving | Code | 0
Can Vision-Language Models Evaluate Handwritten Math? | Code | 0
ZNO-Eval: Benchmarking reasoning capabilities of large language models in Ukrainian | Code | 1
Open Eyes, Then Reason: Fine-grained Visual Mathematical Understanding in MLLMs | Code | 1
Cascaded Self-Evaluation Augmented Training for Efficient Multimodal Large Language Models | - | 0
Stream Aligner: Efficient Sentence-Level Alignment via Distribution Induction | Code | 0
A General Retrieval-Augmented Generation Framework for Multimodal Case-Based Reasoning Applications | - | 0
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking | Code | 7
End-to-End Bangla AI for Solving Math Olympiad Problem Benchmark: Leveraging Large Language Model Using Integrated Approach | - | 0
URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics | Code | 2
A Diversity-Enhanced Knowledge Distillation Model for Practical Math Word Problem Solving | Code | 0
BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning | Code | 1
InfiFusion: A Unified Framework for Enhanced Cross-Model Reasoning via LLM Fusion | - | 0
Quantization Meets Reasoning: Exploring LLM Low-Bit Quantization Degradation for Mathematical Reasoning | - | 0
Understand, Solve and Translate: Bridging the Multilingual Mathematical Reasoning Gap | - | 0
Empowering Bengali Education with AI: Solving Bengali Math Word Problems through Transformer Models | - | 0
Instruction-Following Pruning for Large Language Models | - | 0
A Probabilistic Model for Node Classification in Directed Graphs | Code | 0
Recursive Decomposition of Logical Thoughts: Framework for Superior Reasoning and Knowledge Propagation in Large Language Models | - | 0
CoT-based Synthesizer: Enhancing LLM Performance through Answer Synthesis | Code | 1
DIVE: Diversified Iterative Self-Improvement | Code | 0
Experimental Demonstration of an Optical Neural PDE Solver via On-Chip PINN Training | - | 0
Rethink Delay Doppler Channels and Time-Frequency Coding | - | 0
Measuring Large Language Models Capacity to Annotate Journalistic Sourcing | - | 0
Slow Perception: Let's Perceive Geometric Figures Step-by-step | - | 0
Toward Adaptive Reasoning in Large Language Models with Thought Rollback | Code | 1
Dynamic Skill Adaptation for Large Language Models | - | 0
CARL-GT: Evaluating Causal Reasoning Capabilities of Large Language Models | Code | 1
StructTest: Benchmarking LLMs' Reasoning through Compositional Structured Outputs | - | 0
Towards Intrinsic Self-Correction Enhancement in Monte Carlo Tree Search Boosted Reasoning via Iterative Preference Learning | - | 0
DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought | Code | 3
Evaluating the Design Features of an Intelligent Tutoring System for Advanced Mathematics Learning | - | 0
Ask-Before-Detection: Identifying and Mitigating Conformity Bias in LLM-Powered Error Detector for Math Word Problem Solutions | - | 0
System-2 Mathematical Reasoning via Enriched Instruction Tuning | - | 0
Correct implied volatility shapes and reliable pricing in the rough Heston model | - | 0
Ensembling Large Language Models with Process Reward-Guided Tree Search for Better Complex Reasoning | - | 0
Page 10 of 32

No leaderboard results yet.