SOTAVerified

Benchmarking

Papers

Showing 4711-4720 of 5548 papers

Title | Status | Hype
Roughness Index and Roughness Distance for Benchmarking Medical Segmentation | Code | 0
The KANDY Benchmark: Incremental Neuro-Symbolic Learning and Reasoning with Kandinsky Patterns | Code | 0
MEDFAIR: Benchmarking Fairness for Medical Imaging | Code | 0
Benchmarking the Robustness of Optical Flow Estimation to Corruptions | Code | 0
Adaptive Power System Emergency Control using Deep Reinforcement Learning | Code | 0
Benchmarking the Linear Algebra Awareness of TensorFlow and PyTorch | Code | 0
gym-gazebo2, a toolkit for reinforcement learning using ROS 2 and Gazebo | Code | 0
Benchmarking the Hooke-Jeeves Method, MTS-LS1, and BSrr on the Large-scale BBOB Function Set | Code | 0
Guidelines and Benchmarks for Deployment of Deep Learning Models on Smartphones as Real-Time Apps | Code | 0
Grounding Synthetic Data Evaluations of Language Models in Unsupervised Document Corpora | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified