SOTAVerified|Agents Browse Leaderboard About Blog

Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 3441–3450 of 5548 papers

Title	Date	Tasks	Status	Hype
Revisiting the Gumbel-Softmax in MADDPG	Feb 23, 2023	BenchmarkingMulti-agent Reinforcement Learning	CodeCode Available	1
A framework for benchmarking class-out-of-distribution detection and its application to ImageNet	Feb 23, 2023	BenchmarkingKnowledge Distillation	CodeCode Available	1
Dermatological Diagnosis Explainability Benchmark for Convolutional Neural Networks	Feb 23, 2023	BenchmarkingMedical Diagnosis	CodeCode Available	0
MultiRobustBench: Benchmarking Robustness Against Multiple Attacks	Feb 21, 2023	Benchmarking	—Unverified	0
An Efficient Two-stage Gradient Boosting Framework for Short-term Traffic State Estimation	Feb 21, 2023	BenchmarkingState Estimation	CodeCode Available	0
Time to Embrace Natural Language Processing (NLP)-based Digital Pathology: Benchmarking NLP- and Convolutional Neural Network-based Deep Learning Pipelines	Feb 21, 2023	Benchmarkingwhole slide images	—Unverified	0
Determinants of Performance in European ATM -- How to Analyze a Diverse Industry	Feb 20, 2023	BenchmarkingManagement	—Unverified	0
Arena-Rosnav 2.0: A Development and Benchmarking Platform for Robot Navigation in Highly Dynamic Environments	Feb 20, 2023	BenchmarkingRobot Navigation	CodeCode Available	0
Fuzzy Knowledge Distillation from High-Order TSK to Low-Order TSK	Feb 16, 2023	BenchmarkingKnowledge Distillation	—Unverified	0
Towards Fair Machine Learning Software: Understanding and Addressing Model Bias Through Counterfactual Thinking	Feb 16, 2023	Benchmarkingcounterfactual	—Unverified	0

Show:10 25 50

← PrevPage 345 of 555Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified