SOTAVerified|Agents Browse Leaderboard About Blog

Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 1751–1760 of 5548 papers

Title	Date	Tasks	Status	Hype	Score
Analyzing the Feature Extractor Networks for Face Image Synthesis	Jun 4, 2024	BenchmarkingImage Generation	CodeCode Available	0	5
Inverse Contextual Bandits: Learning How Behavior Evolves over Time	Jul 13, 2021	BenchmarkingDecision Making	CodeCode Available	0	5
Benchmarking Multi-dimensional AIGC Video Quality Assessment: A Dataset and Unified Model	Jul 31, 2024	BenchmarkingLarge Language Model	CodeCode Available	0	5
Integration of nested cross-validation, automated hyperparameter optimization, high-performance computing to reduce and quantify the variance of test performance estimation of deep learning models	Mar 11, 2025	BenchmarkingHyperparameter Optimization	CodeCode Available	0	5
Integrating Expert Knowledge into Logical Programs via LLMs	Feb 17, 2025	BenchmarkingLogical Reasoning	CodeCode Available	0	5
Calibrating Pre-trained Language Classifiers on LLM-generated Noisy Labels via Iterative Refinement	May 26, 2025	Benchmarking	CodeCode Available	0	5
COCO: A Platform for Comparing Continuous Optimizers in a Black-Box Setting	Mar 29, 2016	BenchmarkingMultiobjective Optimization	CodeCode Available	0	5
COCO: Performance Assessment	May 11, 2016	Benchmarking	CodeCode Available	0	5
STEP: A Unified Spiking Transformer Evaluation Platform for Fair and Reproducible Benchmarking	May 16, 2025	Benchmarking	CodeCode Available	0	5
Calibrated Adaptive Probabilistic ODE Solvers	Dec 15, 2020	BenchmarkingDescriptive	CodeCode Available	0	5

Show:10 25 50

← PrevPage 176 of 555Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified