SOTAVerified

Benchmarking

Papers

Showing 3151-3160 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| NoiseBench: Benchmarking the Impact of Real Label Noise on Named Entity Recognition | Code | 0 |
| Comparative analysis of neural network architectures for short-term FOREX forecasting | | 0 |
| UCCIX: Irish-eXcellence Large Language Model | | 0 |
| Divergent Creativity in Humans and Large Language Models | Code | 0 |
| oTTC: Object Time-to-Contact for Motion Estimation in Autonomous Driving | | 0 |
| Benchmarking Retrieval-Augmented Large Language Models in Biomedical NLP: Application, Robustness, and Self-Awareness | | 0 |
| Benchmarking Cross-Domain Audio-Visual Deception Detection | | 0 |
| Replication Study and Benchmarking of Real-Time Object Detection Models | Code | 0 |
| Automating Code Adaptation for MLOps -- A Benchmarking Study on LLMs | | 0 |
| Agent-oriented Joint Decision Support for Data Owners in Auction-based Federated Learning | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |