Benchmarking

Papers

Showing 4171–4180 of 5548 papers

Title	Date	Tasks	Status	Hype
Benchmarking Data-driven Surrogate Simulators for Artificial Electromagnetic Materials	Nov 6, 2021	Benchmarking, Neural Network simulation	Code Available	1
A new baseline for retinal vessel segmentation: Numerical identification and correction of methodological inconsistencies affecting 100+ papers	Nov 6, 2021	Benchmarking, Retinal Vessel Segmentation	Code Available	0
Benchmarking Multimodal AutoML for Tabular Data with Text Fields	Nov 4, 2021	AutoML, Benchmarking	Code Available	3
B-Pref: Benchmarking Preference-Based Reinforcement Learning	Nov 4, 2021	Benchmarking, reinforcement-learning	Code Available	1
OpenFWI: Large-Scale Multi-Structural Benchmark Datasets for Seismic Full Waveform Inversion	Nov 4, 2021	2k, Benchmarking	Code Available	1
Is Bang-Bang Control All You Need? Solving Continuous Control with Bernoulli Policies	Nov 3, 2021	All, Benchmarking	Unverified	0
Virus-MNIST: Machine Learning Baseline Calculations for Image Classification	Nov 3, 2021	Benchmarking, BIG-bench Machine Learning	Unverified	0
Procedural Generalization by Planning with Self-Supervised World Models	Nov 2, 2021	Benchmarking, Meta-Learning	Unverified	0
Don’t be Contradicted with Anything! CI-ToD: Towards Benchmarking Consistency for Task-oriented Dialogue System	Nov 1, 2021	Benchmarking, Response Generation	Code Available	1
Constructing a Psychometric Testbed for Fair Natural Language Processing	Nov 1, 2021	Benchmarking, Fairness	Code Available	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified