SOTAVerified

Benchmarking

Papers

Showing 3651–3660 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| MTEB: Massive Text Embedding Benchmark | Code | 4 |
| OpenOOD: Benchmarking Generalized Out-of-Distribution Detection | Code | 0 |
| Benchmarking Long-tail Generalization with Likelihood Splits | Code | 0 |
| Simulated Contextual Bandits for Personalization Tasks from Recommendation Datasets | Code | 0 |
| Vote'n'Rank: Revision of Benchmarking with Social Choice Theory | Code | 0 |
| DCL-Net: Deep Correspondence Learning Network for 6D Pose Estimation | Code | 1 |
| Understanding or Manipulation: Rethinking Online Performance Gains of Modern Recommender Systems | | 0 |
| Benchmarking saliency methods for chest X-ray interpretation | Code | 1 |
| A Survey of Methods for Addressing Class Imbalance in Deep-Learning Based Natural Language Processing | | 0 |
| Benchmarking Reinforcement Learning Techniques for Autonomous Navigation | Code | 1 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |