SOTAVerified

Benchmarking

Papers

Showing 4151–4175 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| CLMB: deep contrastive learning for robust metagenomic binning | Code | 0 |
| Benchmarking and scaling of deep learning models for land cover image classification | Code | 1 |
| Benchmarking Quality-Dependent and Cost-Sensitive Score-Level Multimodal Biometric Fusion Algorithms | | 0 |
| MSAMSum: Towards Benchmarking Multi-lingual Dialogue Summarization | | 0 |
| Fantastic Questions and Where to Find Them: FairytaleQA--An Authentic Dataset for Narrative Comprehension | | 0 |
| FewNLU: Benchmarking State-of-the-Art Methods for Few-Shot Natural Language Understanding | | 0 |
| Mukayese: Turkish NLP Strikes Back | | 0 |
| CleanRL: High-quality Single-file Implementations of Deep Reinforcement Learning Algorithms | Code | 3 |
| Multiclass Optimal Classification Trees with SVM-splits | | 0 |
| Benchmarking deep generative models for diverse antibody sequence design | | 0 |
| ADCB: An Alzheimer's disease benchmark for evaluating observational estimators of causal effects | | 0 |
| Bi-Discriminator Class-Conditional Tabular GAN | | 0 |
| MLHarness: A Scalable Benchmarking System for MLCommons | | 0 |
| Which priors matter? Benchmarking models for learning latent dynamics | Code | 1 |
| EvoLearner: Learning Description Logics with Evolutionary Algorithms | Code | 0 |
| Practical, Fast and Robust Point Cloud Registration for 3D Scene Stitching and Object Localization | | 0 |
| Characterizing the adversarial vulnerability of speech self-supervised learning | | 0 |
| Graph Robustness Benchmark: Benchmarking the Adversarial Robustness of Graph Machine Learning | Code | 1 |
| Personalized Benchmarking with the Ludwig Benchmarking Toolkit | Code | 3 |
| IOHexperimenter: Benchmarking Platform for Iterative Optimization Heuristics | Code | 1 |
| Benchmarking Data-driven Surrogate Simulators for Artificial Electromagnetic Materials | Code | 1 |
| A new baseline for retinal vessel segmentation: Numerical identification and correction of methodological inconsistencies affecting 100+ papers | Code | 0 |
| Benchmarking Multimodal AutoML for Tabular Data with Text Fields | Code | 3 |
| B-Pref: Benchmarking Preference-Based Reinforcement Learning | Code | 1 |
| OpenFWI: Large-Scale Multi-Structural Benchmark Datasets for Seismic Full Waveform Inversion | Code | 1 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |