SOTAVerified

GSM8K

Papers

Showing 171-180 of 439 papers

Title | Status | Hype
Explicit Knowledge Transfer for Weakly-Supervised Code Generation | | 0
Contrastive Decoding Improves Reasoning in Large Language Models | | 0
Excessive Reasoning Attack on Reasoning LLMs | | 0
Evolving LLMs' Self-Refinement Capability via Iterative Preference Optimization | | 0
Concise Thoughts: Impact of Output Length on LLM Reasoning and Cost | | 0
Evolutionary Pre-Prompt Optimization for Mathematical Reasoning | | 0
Evaluation of LLMs for mathematical problem solving | | 0
Complexity-Based Prompting for Multi-Step Reasoning | | 0
Advancing Process Verification for Large Language Models via Tree-Based Preference Learning | | 0
Entropy-Guided Watermarking for LLMs: A Test-Time Framework for Robust and Traceable Text Generation | | 0
Page 18 of 44

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Xolver | Accuracy | 98.1 | | Unverified
2 | Orange-mini | 0-shot MRR | 98 | | Unverified