SOTAVerified

Benchmarking

Papers

Showing 4451–4500 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| Bi-Discriminator Class-Conditional Tabular GAN | | 0 |
| Benchmarking deep generative models for diverse antibody sequence design | | 0 |
| ADCB: An Alzheimer's disease benchmark for evaluating observational estimators of causal effects | | 0 |
| MLHarness: A Scalable Benchmarking System for MLCommons | | 0 |
| Practical, Fast and Robust Point Cloud Registration for 3D Scene Stitching and Object Localization | | 0 |
| Characterizing the adversarial vulnerability of speech self-supervised learning | | 0 |
| EvoLearner: Learning Description Logics with Evolutionary Algorithms | Code | 0 |
| A new baseline for retinal vessel segmentation: Numerical identification and correction of methodological inconsistencies affecting 100+ papers | Code | 0 |
| Is Bang-Bang Control All You Need? Solving Continuous Control with Bernoulli Policies | | 0 |
| Virus-MNIST: Machine Learning Baseline Calculations for Image Classification | | 0 |
| Procedural Generalization by Planning with Self-Supervised World Models | | 0 |
| Who’s on First?: Probing the Learning and Representation Capabilities of Language Models on Deterministic Closed Domains | Code | 0 |
| Automatic Resolution of Domain Name Disputes | Code | 0 |
| Constructing a Psychometric Testbed for Fair Natural Language Processing | Code | 0 |
| Livestock Monitoring with Transformer | | 0 |
| Distributing Deep Learning Hyperparameter Tuning for 3D Medical Image Segmentation | Code | 0 |
| Towards a Taxonomy of Graph Learning Datasets | | 0 |
| Identifying and Benchmarking Natural Out-of-Context Prediction Problems | Code | 0 |
| Which Model to Trust: Assessing the Influence of Models on the Performance of Reinforcement Learning Algorithms for Continuous Control Tasks | Code | 0 |
| Quantum Boosting using Domain-Partitioning Hypotheses | Code | 0 |
| Scientific Machine Learning Benchmarks | | 0 |
| Benchmarking of Lightweight Deep Learning Architectures for Skin Cancer Classification using ISIC 2017 Dataset | | 0 |
| MLPerf HPC: A Holistic Benchmark Suite for Scientific Machine Learning on HPC Systems | | 0 |
| Improved Multilingual Language Model Pretraining for Social Media Text via Translation Pair Prediction | Code | 0 |
| An Open Natural Language Processing Development Framework for EHR-based Clinical Research: A case demonstration using the National COVID Cohort Collaborative (N3C) | | 0 |
| GAN-based disentanglement learning for chest X-ray rib suppression | | 0 |
| Benchmarking Biomedical Nested NER and Relation Extraction Models | | 0 |
| MTG: A Benchmarking Suite for Multilingual Text Generation | | 0 |
| OG-SPACE: Optimized Stochastic Simulation of Spatial Models of Cancer Evolution | Code | 0 |
| Benchmarking human visual search computational models in natural scenes: models comparison and reference datasets | | 0 |
| What can 5.17 billion regression fits tell us about artificial models of the human visual system? | | 0 |
| The CaLiGraph Ontology as a Challenge for OWL Reasoners | Code | 0 |
| SCEHR: Supervised Contrastive Learning for Clinical Risk Prediction using Electronic Health Records | Code | 0 |
| Beyond Accuracy: A Consolidated Tool for Visual Question Answering Benchmarking | Code | 0 |
| Evolving Evolutionary Algorithms with Patterns | Code | 0 |
| Hybrid Random Features | Code | 0 |
| Explicitly Multi-Modal Benchmarks for Multi-Objective Optimization | | 0 |
| Process Extraction from Text: Benchmarking the State of the Art and Paving the Way for Future Challenges | Code | 0 |
| Benchmarking Safety Monitors for Image Classifiers with Machine Learning | Code | 0 |
| A New Approach for Image Authentication Framework for Media Forensics Purpose | | 0 |
| Less is more: Selecting the right benchmarking set of data for time series classification | | 0 |
| Decentralized Learning for Overparameterized Problems: A Multi-Agent Kernel Approximation Approach | | 0 |
| NAS-Bench-Zero: A Large Scale Dataset for Understanding Zero-Shot Neural Architecture Search | | 0 |
| Modelling neuronal behaviour with time series regression: Recurrent Neural Networks on synthetic C. elegans data | | 0 |
| Benchmarking Machine Learning Robustness in Covid-19 Spike Sequence Classification | | 0 |
| FastEnsemble: Benchmarking and Accelerating Ensemble-based Uncertainty Estimation for Image-to-Image Translation | | 0 |
| Best Practices in Pool-based Active Learning for Image Classification | | 0 |
| Benchmarking person re-identification approaches and training datasets for practical real-world implementations | | 0 |
| A Two-Stage Neural-Filter Pareto Front Extractor and the need for Benchmarking | | 0 |
| Deep Learning of Intrinsically Motivated Options in the Arcade Learning Environment | | 0 |
Page 90 of 111

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |