SOTAVerified

Benchmarking

Papers

Showing 4451-4475 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| Bi-Discriminator Class-Conditional Tabular GAN | | 0 |
| Benchmarking deep generative models for diverse antibody sequence design | | 0 |
| ADCB: An Alzheimer's disease benchmark for evaluating observational estimators of causal effects | | 0 |
| MLHarness: A Scalable Benchmarking System for MLCommons | | 0 |
| Practical, Fast and Robust Point Cloud Registration for 3D Scene Stitching and Object Localization | | 0 |
| Characterizing the adversarial vulnerability of speech self-supervised learning | | 0 |
| EvoLearner: Learning Description Logics with Evolutionary Algorithms | Code | 0 |
| A new baseline for retinal vessel segmentation: Numerical identification and correction of methodological inconsistencies affecting 100+ papers | Code | 0 |
| Is Bang-Bang Control All You Need? Solving Continuous Control with Bernoulli Policies | | 0 |
| Virus-MNIST: Machine Learning Baseline Calculations for Image Classification | | 0 |
| Procedural Generalization by Planning with Self-Supervised World Models | | 0 |
| Who’s on First?: Probing the Learning and Representation Capabilities of Language Models on Deterministic Closed Domains | Code | 0 |
| Automatic Resolution of Domain Name Disputes | Code | 0 |
| Constructing a Psychometric Testbed for Fair Natural Language Processing | Code | 0 |
| Livestock Monitoring with Transformer | | 0 |
| Distributing Deep Learning Hyperparameter Tuning for 3D Medical Image Segmentation | Code | 0 |
| Towards a Taxonomy of Graph Learning Datasets | | 0 |
| Identifying and Benchmarking Natural Out-of-Context Prediction Problems | Code | 0 |
| Which Model to Trust: Assessing the Influence of Models on the Performance of Reinforcement Learning Algorithms for Continuous Control Tasks | Code | 0 |
| Quantum Boosting using Domain-Partitioning Hypotheses | Code | 0 |
| Scientific Machine Learning Benchmarks | | 0 |
| Benchmarking of Lightweight Deep Learning Architectures for Skin Cancer Classification using ISIC 2017 Dataset | | 0 |
| MLPerf HPC: A Holistic Benchmark Suite for Scientific Machine Learning on HPC Systems | | 0 |
| Improved Multilingual Language Model Pretraining for Social Media Text via Translation Pair Prediction | Code | 0 |
| An Open Natural Language Processing Development Framework for EHR-based Clinical Research: A case demonstration using the National COVID Cohort Collaborative (N3C) | | 0 |
Page 179 of 222

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |