SOTAVerified

Memorization

Papers

Showing 26–50 of 1088 papers

Title | Status | Hype
A Decade's Battle on Dataset Bias: Are We There Yet? | Code | 2
SimplyRetrieve: A Private and Lightweight Retrieval-Centric Generative AI Tool | Code | 2
Learning explanations that are hard to vary | Code | 2
HeuriGym: An Agentic Benchmark for LLM-Crafted Heuristics in Combinatorial Optimization | Code | 2
Exposing flaws of generative model evaluation metrics and their unfair treatment of diffusion models | Code | 2
Detecting, Explaining, and Mitigating Memorization in Diffusion Models | Code | 2
Decoupling Knowledge from Memorization: Retrieval-augmented Prompt Learning | Code | 2
Drive Like a Human: Rethinking Autonomous Driving with Large Language Models | Code | 2
DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation | Code | 2
LLM-SRBench: A New Benchmark for Scientific Equation Discovery with Large Language Models | Code | 2
Data Contamination Quiz: A Tool to Detect and Estimate Contamination in Large Language Models | Code | 1
Data Unlearning in Diffusion Models | Code | 1
Beyond Gradient Averaging in Parallel Optimization: Improved Robustness through Gradient Agreement Filtering | Code | 1
Advancing Cross-domain Discriminability in Continual Learning of Vision-Language Models | Code | 1
Data Contamination Can Cross Language Barriers | Code | 1
DAT: Training Deep Networks Robust To Label-Noise by Matching the Feature Distributions | Code | 1
Cousins Of The Vendi Score: A Family Of Similarity-Based Diversity Metrics For Science And Machine Learning | Code | 1
C-SFDA: A Curriculum Learning Aided Self-Training Framework for Efficient Source Free Domain Adaptation | Code | 1
Benchmarking Chinese Commonsense Reasoning of LLMs: From Chinese-Specifics to Reasoning-Memorization Correlations | Code | 1
AutomaTikZ: Text-Guided Synthesis of Scientific Vector Graphics with TikZ | Code | 1
Zero-Shot Compositional Policy Learning via Language Grounding | Code | 1
Co-teaching: Robust Training of Deep Neural Networks with Extremely Noisy Labels | Code | 1
DASH: Warm-Starting Neural Network Training in Stationary Settings without Loss of Plasticity | Code | 1
Adaptive Early-Learning Correction for Segmentation from Noisy Annotations | Code | 1
Deciphering the Factors Influencing the Efficacy of Chain-of-Thought: Probability, Memorization, and Noisy Reasoning | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PaLM-540B (few-shot, k=5) | Accuracy | 95.4 | | Unverified
2 | Gopher-280B (few-shot, k=5) | Accuracy | 80 | | Unverified
3 | PaLM-62B (few-shot, k=5) | Accuracy | 77.7 | | Unverified