SOTAVerified

Math

Papers

Showing 1251–1300 of 1596 papers

Title | Status | Hype
CCoE: A Compact LLM with Collaboration of Experts | | 0
Democratizing Signal Processing and Machine Learning: Math Learning Equity for Elementary and Middle School Students | | 0
Describe-then-Reason: Improving Multimodal Mathematical Reasoning through Visual Comprehension Training | | 0
V-STaR: Training Verifiers for Self-Taught Reasoners | | 0
Designing a Tag-Based Statistical Math Word Problem Solver with Reasoning and Explanation | | 0
Teaching LLMs According to Their Aptitude: Adaptive Reasoning for Mathematical Problem Solving | | 0
DialogueReason: Rule-Based RL Sparks Dialogue Reasoning in LLMs | | 0
DianJin-R1: Evaluating and Enhancing Financial Reasoning in Large Language Models | | 0
Teaching LLMs for Step-Level Automatic Math Correction via Reinforcement Learning | | 0
Dictionary Insertion Prompting for Multilingual Reasoning on Multilingual Large Language Models | | 0
DIF: A Framework for Benchmarking and Verifying Implicit Bias in LLMs | | 0
DiffSampling: Enhancing Diversity and Accuracy in Neural Text Generation | | 0
Causal Decomposition Analysis with Synergistic Interventions: A Triply-Robust Machine Learning Approach to Addressing Multiple Dimensions of Social Disparities | | 0
Digenes: genetic algorithms to discover conjectures about directed and undirected graphs | | 0
Dimensionality reduction: theoretical perspective on practical measures | | 0
Dimension Reduction via Colour Refinement | | 0
DINGO: Constrained Inference for Diffusion LLMs | | 0
Dipper: Diversity in Prompts for Producing Large Language Model Ensembles in Reasoning tasks | | 0
Direct Reasoning Optimization: LLMs Can Reward And Refine Their Own Reasoning for Open-Ended Tasks | | 0
DISC: Dynamic Decomposition Improves LLM Inference Scaling | | 0
Technical Domain Identification using word2vec and BiLSTM | | 0
DISK: Domain-constrained Instance Sketch for Math Word Problem Generation | | 0
Distributed Skellam Mechanism: a Novel Approach to Federated Learning with Differential Privacy | | 0
Cats Confuse Reasoning LLM: Query Agnostic Adversarial Triggers for Reasoning Models | | 0
DiversiGATE: A Comprehensive Framework for Reliable Large Language Models | | 0
Divide-and-Conquer Meets Consensus: Unleashing the Power of Functions in Code Generation | | 0
dMath: A Scalable Linear Algebra and Math Library for Heterogeneous GP-GPU Architectures | | 0
dMath: Distributed Linear Algebra for DL | | 0
Does Reasoning Emerge? Examining the Probabilities of Causation in Large Language Models | | 0
Does Reasoning Introduce Bias? A Study of Social Bias Evaluation and Mitigation in LLM Reasoning | | 0
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? | | 0
Does Representation Intervention Really Identify Desired Concepts and Elicit Alignment? | | 0
TelecomGPT: A Framework to Build Telecom-Specfic Large Language Models | | 0
Do Large Language Models Truly Grasp Mathematics? An Empirical Exploration From Cognitive Psychology | | 0
Dolphin: A Spoken Language Proficiency Assessment System for Elementary Education | | 0
Done Is Better than Perfect: Unlocking Efficient Reasoning by Structured Multi-Turn Decomposition | | 0
Temperature and Persona Shape LLM Agent Consensus With Minimal Accuracy Gains in Qualitative Coding | | 0
Walk Before You Run! Concise LLM Reasoning via Reinforcement Learning | | 0
Do Thinking Tokens Help or Trap? Towards More Efficient Large Reasoning Model | | 0
DrawEduMath: Evaluating Vision Language Models with Expert-Annotated Students' Hand-Drawn Math Images | | 0
Cascaded Self-Evaluation Augmented Training for Efficient Multimodal Large Language Models | | 0
Can you hear me now? Sensitive comparisons of human and machine perception | | 0
Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces | | 0
DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models | | 0
Can We Further Elicit Reasoning in LLMs? Critic-Guided Planning with Retrieval-Augmentation for Solving Challenging Tasks | | 0
Testing GPT-4-o1-preview on math and science problems: A follow-up study | | 0
Dynamic Scheduling of MPI-based Distributed Deep Learning Training Jobs | | 0
Dynamic Skill Adaptation for Large Language Models | | 0
Testing GPT-4 with Wolfram Alpha and Code Interpreter plug-ins on math and science problems | | 0
EasyMath: A 0-shot Math Benchmark for SLMs | | 0
Page 26 of 32

No leaderboard results yet.