SOTAVerified

Math

Papers

Showing 1001-1025 of 1596 papers

| Title | Status | Hype |
| --- | --- | --- |
| QPO: Query-dependent Prompt Optimization via Multi-Loop Offline Reinforcement Learning | | 0 |
| Benchmarking Large Language Models for Math Reasoning Tasks | Code | 0 |
| A Study of PHOC Spatial Region Configurations for Math Formula Retrieval | | 0 |
| Large Language Models Might Not Care What You Are Saying: Prompt Format Beats Descriptions | | 0 |
| Does Reasoning Emerge? Examining the Probabilities of Causation in Large Language Models | | 0 |
| Leveraging Web-Crawled Data for High-Quality Fine-Tuning | Code | 0 |
| MathScape: Evaluating MLLMs in multimodal Math Scenarios through a Hierarchical Benchmark | Code | 0 |
| A Perspective on Large Language Models, Intelligent Machines, and Knowledge Acquisition | | 0 |
| P3: A Policy-Driven, Pace-Adaptive, and Diversity-Promoted Framework for data pruning in LLM Training | | 0 |
| Examining the Behavior of LLM Architectures Within the Framework of Standardized National Exams in Brazil | | 0 |
| AltCanvas: A Tile-Based Image Editor with Generative AI for Blind or Visually Impaired People | | 0 |
| The Logic of Political Survival Revisited: Consequences of Elite Uncertainty Under Authoritarian Rule | | 0 |
| AI-Assisted Generation of Difficult Math Questions | Code | 0 |
| Towards Effective and Efficient Continual Pre-training of Large Language Models | Code | 0 |
| Recursive Introspection: Teaching Language Model Agents How to Self-Improve | | 0 |
| Generalization v.s. Memorization: Tracing Language Models' Capabilities Back to Pretraining Data | | 0 |
| Prover-Verifier Games improve legibility of LLM outputs | Code | 0 |
| A LLM Benchmark based on the Minecraft Builder Dialog Agent Task | | 0 |
| CCoE: A Compact LLM with Collaboration of Experts | | 0 |
| Reasoning with Large Language Models, a Survey | | 0 |
| Token-Supervised Value Models for Enhancing Mathematical Reasoning Capabilities of Large Language Models | | 0 |
| TelecomGPT: A Framework to Build Telecom-Specfic Large Language Models | | 0 |
| Stepwise Verification and Remediation of Student Reasoning Errors with Large Language Model Tutors | Code | 0 |
| Skywork-Math: Data Scaling Laws for Mathematical Reasoning in Large Language Models -- The Story Goes On | | 0 |
| Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist | | 0 |
