Benchmarking

Papers

Showing 2631–2640 of 5548 papers

Title	Date	Tasks	Status	Hype
FuzzWiz -- Fuzzing Framework for Efficient Hardware Coverage	Oct 23, 2024	Benchmarking	Unverified	0
Benchmarking Smoothness and Reducing High-Frequency Oscillations in Continuous Control Policies	Oct 22, 2024	Benchmarking, continuous-control	Unverified	0
Safe Load Balancing in Software-Defined-Networking	Oct 22, 2024	Benchmarking, Deep Reinforcement Learning	Unverified	0
ISImed: A Framework for Self-Supervised Learning using Intrinsic Spatial Information in Medical Images	Oct 22, 2024	Benchmarking, Self-Supervised Learning	Code Available	0
Polyp-E: Benchmarking the Robustness of Deep Segmentation Models via Polyp Editing	Oct 22, 2024	Attribute, Benchmarking	Unverified	0
Benchmarking Large Language Models for Image Classification of Marine Mammals	Oct 22, 2024	Benchmarking, image-classification	Code Available	0
Building Conformal Prediction Intervals with Approximate Message Passing	Oct 21, 2024	Benchmarking, Conformal Prediction	Code Available	0
Hiding in Plain Sight: Reframing Hardware Trojan Benchmarking as a Hide&Seek Modification	Oct 21, 2024	Benchmarking	Unverified	0
Sketch2Code: Evaluating Vision-Language Models for Interactive Web Design Prototyping	Oct 21, 2024	Benchmarking	Unverified	0
Benchmarking Pathology Foundation Models: Adaptation Strategies and Scenarios	Oct 21, 2024	Benchmarking, Few-Shot Learning	Code Available	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified