SOTAVerified

GSM8K

Papers

Showing 1-10 of 439 papers

| Title | Status | Hype |
|---|---|---|
| GEMMAS: Graph-based Evaluation Metrics for Multi Agent Systems | | 0 |
| DAC: A Dynamic Attention-aware Approach for Task-Agnostic Prompt Compression | Code | 0 |
| KisMATH: Do LLMs Have Knowledge of Implicit Structures in Mathematical Reasoning? | | 0 |
| CoRE: Enhancing Metacognition with Label-free Self-evaluation in LRMs | | 0 |
| Activation Steering for Chain-of-Thought Compression | Code | 0 |
| any4: Learned 4-bit Numeric Representation for LLMs | Code | 2 |
| IRanker: Towards Ranking Foundation Model | Code | 1 |
| Scaling Speculative Decoding with Lookahead Reasoning | Code | 0 |
| Plan for Speed -- Dilated Scheduling for Masked Diffusion Language Models | | 0 |
| CommVQ: Commutative Vector Quantization for KV Cache Compression | Code | 1 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Xolver | Accuracy | 98.1 | | Unverified |
| 2 | Orange-mini | 0-shot MRR | 98 | | Unverified |