SOTAVerified

Benchmarking

Papers

Showing 2731–2740 of 5548 papers

Title | Status | Hype
ASI: Accuracy-Stability Index for Evaluating Deep Learning Models | - | 0
An Empirical Investigation into Benchmarking Model Multiplicity for Trustworthy Machine Learning: A Case Study on Image Classification | - | 0
Benchmarking Robustness of Text-Image Composed Retrieval | Code | 1
Large Language Models as Automated Aligners for Benchmarking Vision-Language Models | - | 0
Dialogue Quality and Emotion Annotations for Customer Support Conversations | Code | 0
Learning Dynamic Selection and Pricing of Out-of-Home Deliveries | Code | 0
Creating and Leveraging a Synthetic Dataset of Cloud Optical Thickness Measures for Cloud Detection in MSI | Code | 0
Automated 3D Tumor Segmentation using Temporal Cubic PatchGAN (TCuP-GAN) | - | 0
PG-Video-LLaVA: Pixel Grounding Large Video-Language Models | Code | 2
Benchmarking Toxic Molecule Classification using Graph Neural Networks and Few Shot Learning | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified