SOTAVerified

Benchmarking

Papers

Showing 3951–3975 of 5548 papers

Title | Status | Hype
Structure-Based Experimental Datasets for Benchmarking Protein Simulation Force Fields |  | 0
Learning to Adapt to Online Streams with Distribution Shifts |  | 0
Benchmarking Self-Supervised Contrastive Learning Methods for Image-Based Plant Phenotyping | Code | 0
A Comprehensive Study on Robustness of Image Classification Models: Benchmarking and Rethinking |  | 0
Benchmarking Deepart Detection |  | 0
Predicting the Performance of a Computing System with Deep Networks |  | 0
Benchmarking of Cancelable Biometrics for Deep Templates |  | 0
STA: Self-controlled Text Augmentation for Improving Text Classifications | Code | 0
Dermatological Diagnosis Explainability Benchmark for Convolutional Neural Networks | Code | 0
Dynamic Benchmarking of Masked Language Models on Temporal Concept Drift with Multiple Views |  | 0
MultiRobustBench: Benchmarking Robustness Against Multiple Attacks |  | 0
Time to Embrace Natural Language Processing (NLP)-based Digital Pathology: Benchmarking NLP- and Convolutional Neural Network-based Deep Learning Pipelines |  | 0
An Efficient Two-stage Gradient Boosting Framework for Short-term Traffic State Estimation | Code | 0
Determinants of Performance in European ATM -- How to Analyze a Diverse Industry |  | 0
Arena-Rosnav 2.0: A Development and Benchmarking Platform for Robot Navigation in Highly Dynamic Environments | Code | 0
Fuzzy Knowledge Distillation from High-Order TSK to Low-Order TSK |  | 0
Towards Fair Machine Learning Software: Understanding and Addressing Model Bias Through Counterfactual Thinking |  | 0
Benchmarking Continuous Time Models for Predicting Multiple Sclerosis Progression |  | 0
Efficiency in European Air Traffic Management -- A Fundamental Analysis of Data, Models, and Methods |  | 0
Model-Based Underwater 6D Pose Estimation from RGB |  | 0
A Neuromorphic Dataset for Object Segmentation in Indoor Cluttered Environment | Code | 0
Deep Imputation of Missing Values in Time Series Health Data: A Review with Benchmarking |  | 0
AI Sound Recognition on Asthma Medication Adherence: Evaluation With the RDA Benchmark Suite | Code | 0
CrossCodeBench: Benchmarking Cross-Task Generalization of Source Code Models |  | 0
Participatory Personalization in Classification |  | 0
Page 159 of 222

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 |  | Unverified