SOTAVerified|Agents Browse Leaderboard About Blog

Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 2371–2380 of 5548 papers

Title	Date	Tasks	Status	Hype	Score
Benchmarking Linguistic Diversity of Large Language Models	Dec 13, 2024	BenchmarkingDiversity	CodeCode Available	0	5
DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition	Jul 16, 2025	BenchmarkingKnowledge Distillation	CodeCode Available	0	5
Ducho meets Elliot: Large-scale Benchmarks for Multimodal Recommendation	Sep 24, 2024	BenchmarkingMovie Recommendation	CodeCode Available	0	5
Geological Inference from Textual Data using Word Embeddings	Apr 10, 2025	BenchmarkingWord Embeddings	CodeCode Available	0	5
Global Prediction of COVID-19 Variant Emergence Using Dynamics-Informed Graph Neural Networks	Jan 7, 2024	BenchmarkingGraph Neural Network	CodeCode Available	0	5
Flexible Generation of Preference Data for Recommendation Analysis	Jul 23, 2024	BenchmarkingRecommendation Systems	CodeCode Available	0	5
Are Synthetic Corruptions A Reliable Proxy For Real-World Corruptions?	May 7, 2025	BenchmarkingSemantic Segmentation	CodeCode Available	0	5
DrugOOD: Out-of-Distribution (OOD) Dataset Curator and Benchmark for AI-aided Drug Discovery -- A Focus on Affinity Prediction Problems with Noise Annotations	Jan 24, 2022	BenchmarkingDrug Discovery	CodeCode Available	0	5
Benchmarking Learning Efficiency in Deep Reservoir Computing	Sep 29, 2022	Benchmarking	CodeCode Available	0	5
DQI: Measuring Data Quality in NLP	May 2, 2020	Active LearningBenchmarking	CodeCode Available	0	5

Show:10 25 50

← PrevPage 238 of 555Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified