Benchmarking

Papers

Showing 2531–2540 of 5548 papers

Title	Date	Tasks	Status	Hype	Score
From MNIST to ImageNet and Back: Benchmarking Continual Curriculum Learning	Mar 16, 2023	Benchmarking, Continual Learning	Code Available	0	5
MIP-GAF: A MLLM-annotated Benchmark for Most Important Person Localization and Group Context Understanding	Sep 10, 2024	Benchmarking, Language Modeling	Code Available	0	5
FRAMES-VQA: Benchmarking Fine-Tuning Robustness across Multi-Modal Shifts in Visual Question Answering	May 27, 2025	Benchmarking, Question Answering	Code Available	0	5
FR-MRInet: A Deep Convolutional Encoder-Decoder for Brain Tumor Segmentation with Relu-RGB and Sliding-window	Jul 26, 2018	Benchmarking, Brain Tumor Segmentation	Code Available	0	5
From Past to Present: A Survey of Malicious URL Detection Techniques, Datasets and Code Repositories	Apr 23, 2025	Benchmarking	Code Available	0	5
Fast Benchmarking of Accuracy vs. Training Time with Cyclic Learning Rates	Jun 2, 2022	Benchmarking	Code Available	0	5
Arabic Speech Recognition by End-to-End, Modular Systems and Human	Jan 21, 2021	Arabic Speech Recognition, Automatic Speech Recognition	Code Available	0	5
Detecting Stereotypes and Anti-stereotypes the Correct Way Using Social Psychological Underpinnings	Apr 4, 2025	Benchmarking	Code Available	0	5
Recognizing Object Affordances to Support Scene Reasoning for Manipulation Tasks	Sep 12, 2019	Affordance Detection, Affordance Recognition	Code Available	0	5
Forecasting time series with constraints	Feb 14, 2025	Additive models, Benchmarking	Code Available	0	5

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified