SOTAVerified

Benchmarking

Papers

Showing 3926-3950 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| Multi-label classification for biomedical literature: an overview of the BioCreative VII LitCovid Track for COVID-19 literature topic annotations | | 0 |
| Analyzing the Impact of Undersampling on the Benchmarking and Configuration of Evolutionary Algorithms | | 0 |
| K-LITE: Learning Transferable Visual Models with External Knowledge | Code | 2 |
| Radio Galaxy Zoo: Using semi-supervised learning to leverage large unlabelled data-sets for radio galaxy classification under data-set shift | Code | 0 |
| Label Efficient Regularization and Propagation for Graph Node Classification | | 0 |
| Benchmarking Domain Generalization on EEG-based Emotion Recognition | | 0 |
| NICO++: Towards Better Benchmarking for Domain Generalization | Code | 1 |
| Stress-Testing Point Cloud Registration on Automotive LiDAR | Code | 1 |
| Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks | Code | 3 |
| Deep learning model solves change point detection for multiple change types | Code | 1 |
| SoccerNet-Tracking: Multiple Object Tracking Dataset and Benchmark in Soccer Videos | | 0 |
| From Environmental Sound Representation to Robustness of 2D CNN Models Against Adversarial Attacks | | 0 |
| Do You Really Mean That? Content Driven Audio-Visual Deepfake Dataset and Multimodal Method for Temporal Forgery Localization | Code | 1 |
| Benchmarking Active Learning Strategies for Materials Optimization and Discovery | | 0 |
| EVOPS Benchmark: Evaluation of Plane Segmentation from RGBD and LiDAR Data | | 0 |
| From Modern CNNs to Vision Transformers: Assessing the Performance, Robustness, and Classification Strategies of Deep Learning Models in Histopathology | Code | 0 |
| Data Splits and Metrics for Method Benchmarking on Surgical Action Triplet Datasets | Code | 1 |
| Metaethical Perspectives on 'Benchmarking' AI Ethics | | 0 |
| Benchmarking for Public Health Surveillance tasks on Social Media with a Domain-Specific Pretrained Language Model | | 0 |
| BioRED: A Rich Biomedical Relation Extraction Dataset | Code | 1 |
| Disability prediction in multiple sclerosis using performance outcome measures and demographic data | | 0 |
| tmVar 3.0: an improved variant concept recognition and normalization tool | | 0 |
| Deep Visual Geo-localization Benchmark | Code | 2 |
| The Moral Integrity Corpus: A Benchmark for Ethical Dialogue Systems | Code | 1 |
| CLEAVE: Scalable and Edge-native Benchmarking of Networked Control Systems | Code | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |