Memorization

Papers

Showing 11–20 of 1088 papers

Title	Date	Tasks	Status	Hype	Score
From Matching to Generation: A Survey on Generative Information Retrieval	Apr 23, 2024	Incremental Learning, Information Retrieval	Code Available	3	5
AgentTuning: Enabling Generalized Agent Abilities for LLMs	Oct 19, 2023	Memorization	Code Available	3	5
MathArena: Evaluating LLMs on Uncontaminated Math Competitions	May 29, 2025	Math, Mathematical Reasoning	Code Available	3	5
LawBench: Benchmarking Legal Knowledge of Large Language Models	Sep 28, 2023	Articles, Benchmarking	Code Available	2	5
Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs	Jun 14, 2024	Memorization	Code Available	2	5
Learning explanations that are hard to vary	Sep 1, 2020	Memorization	Code Available	2	5
HeuriGym: An Agentic Benchmark for LLM-Crafted Heuristics in Combinatorial Optimization	Jun 9, 2025	Combinatorial Optimization, Memorization	Code Available	2	5
HMT: Hierarchical Memory Transformer for Long Context Language Processing	May 9, 2024	Language Modeling, Language Modelling	Code Available	2	5
Exposing flaws of generative model evaluation metrics and their unfair treatment of diffusion models	Jun 7, 2023	Diversity, Image Generation	Code Available	2	5
Drive Like a Human: Rethinking Autonomous Driving with Large Language Models	Jul 14, 2023	Autonomous Driving, Common Sense Reasoning	Code Available	2	5

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	PaLM-540B (few-shot, k=5)	Accuracy	95.4	—	Unverified
2	Gopher-280B (few-shot, k=5)	Accuracy	80	—	Unverified
3	PaLM-62B (few-shot, k=5)	Accuracy	77.7	—	Unverified