SOTAVerified

Benchmarking

Papers

Showing 911–920 of 5548 papers

Title | Status | Hype
NLPBench: Evaluating Large Language Models on Solving NLP Problems | Code | 1
OceanBench: The Sea Surface Height Edition | Code | 1
Benchmarking Local Robustness of High-Accuracy Binary Neural Networks for Enhanced Traffic Sign Recognition | Code | 1
Benchmarking Encoder-Decoder Architectures for Biplanar X-ray to 3D Shape Reconstruction | Code | 1
Grad DFT: a software library for machine learning enhanced density functional theory | Code | 1
Prompt Tuned Embedding Classification for Multi-Label Industry Sector Allocation | Code | 1
An Image Dataset for Benchmarking Recommender Systems with Raw Pixels | Code | 1
Formalizing Multimedia Recommendation through Multimodal Deep Learning | Code | 1
FreeMan: Towards Benchmarking 3D Human Pose Estimation under Real-World Conditions | Code | 1
RecAD: Towards A Unified Library for Recommender Attack and Defense | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified