SOTAVerified

Benchmarking

Papers

Showing 4621–4630 of 5548 papers

Title	Date	Tasks	Status	Hype
Beyond Optimism: Exploration With Partially Observable Rewards	Jun 20, 2024	Benchmarking, Reinforcement Learning (RL)	Code Available	0
M3Dsynth: A dataset of medical 3D images with AI-generated local manipulations	Sep 14, 2023	Benchmarking, Computed Tomography (CT)	Code Available	0
M4Fog: A Global Multi-Regional, Multi-Modal, and Multi-Stage Dataset for Marine Fog Detection and Forecasting to Bridge Ocean and Atmosphere	Jun 19, 2024	Benchmarking, Spatio-Temporal Forecasting	Code Available	0
The Elusive Pursuit of Reproducing PATE-GAN: Benchmarking, Auditing, Debugging	Jun 20, 2024	Benchmarking	Code Available	0
Back to Basics: Benchmarking Canonical Evolution Strategies for Playing Atari	Feb 24, 2018	Atari Games, Benchmarking	Code Available	0
Machine-assisted quantitizing designs: augmenting humanities and social sciences with artificial intelligence	Sep 24, 2023	Benchmarking, Change Detection	Code Available	0
Beyond Marginal Uncertainty: How Accurately can Bayesian Regression Models Estimate Posterior Predictive Correlations?	Nov 6, 2020	Active Learning, Benchmarking	Code Available	0
Machine learning classification of non-Markovian noise disturbing quantum dynamics	Jan 8, 2021	Benchmarking, BIG-bench Machine Learning	Code Available	0
Machine Learning Automation Toolbox (MLaut)	Jan 11, 2019	Benchmarking, BIG-bench Machine Learning	Code Available	0
3D fluorescence microscopy data synthesis for segmentation and benchmarking	Jul 21, 2021	Benchmarking	Code Available	0

Page 463 of 555

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified