SOTAVerified

Memorization

Papers

Showing 11-20 of 1088 papers

| Title | Status | Hype |
|---|---|---|
| MathArena: Evaluating LLMs on Uncontaminated Math Competitions | Code | 3 |
| From Matching to Generation: A Survey on Generative Information Retrieval | Code | 3 |
| AgentTuning: Enabling Generalized Agent Abilities for LLMs | Code | 3 |
| HeuriGym: An Agentic Benchmark for LLM-Crafted Heuristics in Combinatorial Optimization | Code | 2 |
| LLM-SRBench: A New Benchmark for Scientific Equation Discovery with Large Language Models | Code | 2 |
| RARE: Retrieval-Augmented Reasoning Modeling | Code | 2 |
| Detecting, Explaining, and Mitigating Memorization in Diffusion Models | Code | 2 |
| We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning? | Code | 2 |
| Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs | Code | 2 |
| HMT: Hierarchical Memory Transformer for Long Context Language Processing | Code | 2 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PaLM-540B (few-shot, k=5) | Accuracy | 95.4 | | Unverified |
| 2 | Gopher-280B (few-shot, k=5) | Accuracy | 80 | | Unverified |
| 3 | PaLM-62B (few-shot, k=5) | Accuracy | 77.7 | | Unverified |