SOTAVerified

Benchmarking

Papers

Showing 3401-3410 of 5548 papers

Title | Status | Hype
Adaptive Experimentation at Scale: A Computational Framework for Flexible Batches | | 0
DeID-GPT: Zero-shot Medical Text De-Identification by GPT-4 | Code | 1
A Multi-Task Deep Learning Approach for Sensor-based Human Activity Recognition and Segmentation | | 0
Benchmarking Robustness of 3D Object Detection to Common Corruptions in Autonomous Driving | Code | 0
Revisiting Realistic Test-Time Training: Sequential Inference and Adaptation by Anchored Clustering Regularized Self-Training | Code | 1
COVID-19 event extraction from Twitter via extractive question answering with continuous prompts | Code | 1
CCTV-Gun: Benchmarking Handgun Detection in CCTV Images | Code | 1
NoisyHate: Mining Online Human-Written Perturbations for Realistic Robustness Benchmarking of Content Moderation Models | | 0
DeAR: Debiasing Vision-Language Models with Additive Residuals | | 0
Highly Accurate Quantum Chemical Property Prediction with Uni-Mol+ | Code | 3
Page 341 of 555

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified