SOTAVerified

Benchmarking

Papers

Showing 4101-4150 of 5548 papers

Title | Status | Hype
Benchmarking Safe Deep Reinforcement Learning in Aquatic Navigation |  | 0
High-Dimensional Inference in Bayesian Networks | Code | 1
Logically at Factify 2022: Multimodal Fact Verification |  | 0
A Modular Workflow for Performance Benchmarking of Neuronal Network Simulations | Code | 0
On the Use of Quality Diversity Algorithms for The Traveling Thief Problem |  | 0
Boosting Neural Image Compression for Machines Using Latent Space Masking | Code | 1
On the Value of ML Models |  | 0
GUNNEL: Guided Mixup Augmentation and Multi-View Fusion for Aquatic Animal Segmentation | Code | 0
Benchmarking human visual search computational models in natural scenes: models comparison and reference datasets | Code | 1
Learning Representations with Contrastive Self-Supervised Learning for Histopathology Applications | Code | 1
Label, Verify, Correct: A Simple Few Shot Object Detection Method | Code | 1
7th AI Driving Olympics: 1st Place Report for Panoptic Tracking |  | 0
GreenPCO: An Unsupervised Lightweight Point Cloud Odometry Method |  | 0
Object Shape Error Response Using Bayesian 3-D Convolutional Neural Networks for Assembly Systems With Compliant Parts | Code | 1
HyFactor: Hydrogen-count labelled graph-based defactorization Autoencoder | Code | 1
Neuro-Symbolic Inductive Logic Programming with Logical Neural Networks | Code | 1
BenchML: an extensible pipelining framework for benchmarking representations of materials and molecules at scale | Code | 1
Reduced, Reused and Recycled: The Life of a Dataset in Machine Learning Research |  | 0
TISE: Bag of Metrics for Text-to-Image Synthesis Evaluation | Code | 1
CSAW-M: An Ordinal Classification Dataset for Benchmarking Mammographic Masking of Cancer | Code | 1
NEORL: NeuroEvolution Optimization with Reinforcement Learning | Code | 1
Certified Adversarial Defenses Meet Out-of-Distribution Corruptions: Benchmarking Robustness and Simple Baselines |  | 0
MC-Blur: A Comprehensive Benchmark for Image Deblurring | Code | 1
Neural Regression, Representational Similarity, Model Zoology & Neural Taskonomy at Scale in Rodent Visual Cortex | Code | 1
TinyML Platforms Benchmarking |  | 0
An implementation of the "Guess who?" game using CLIP | Code | 0
Synthetic weather radar using hybrid quantum-classical machine learning |  | 0
Dyna-bAbI: unlocking bAbI's potential with dynamic synthetic benchmarking |  | 0
HRNET: AI on Edge for mask detection and social distancing | Code | 0
3D Compositional Zero-shot Learning with DeCompositional Consensus |  | 0
ClimART: A Benchmark Dataset for Emulating Atmospheric Radiative Transfer in Weather and Climate Models | Code | 1
OOD-CV: A Benchmark for Robustness to Out-of-Distribution Shifts of Individual Nuisances in Natural Images |  | 0
An in-depth experimental study of sensor usage and visual reasoning of robots navigating in real environments |  | 0
EffCNet: An Efficient CondenseNet for Image Classification on NXP BlueBox |  | 0
Learning to Transfer for Traffic Forecasting via Multi-task Learning | Code | 0
Benchmarking Shadow Removal for Facial Landmark Detection and Beyond |  | 0
Benchmarking Accuracy and Generalizability of Four Graph Neural Networks Using Large In Vitro ADME Datasets from Different Chemical Spaces | Code | 1
A War Beyond Deepfake: Benchmarking Facial Counterfeits and Countermeasures |  | 0
Using Color To Identify Insider Threats | Code | 0
Investigating Tradeoffs in Real-World Video Super-Resolution | Code | 2
EH-DNAS: End-to-End Hardware-aware Differentiable Neural Architecture Search | Code | 1
RadFusion: Benchmarking Performance and Fairness for Multimodal Pulmonary Embolism Detection from CT and EHR |  | 0
A Modular Framework for Centrality and Clustering in Complex Networks |  | 0
Filter Methods for Feature Selection in Supervised Machine Learning Applications -- Review and Benchmark |  | 0
Evaluating Adversarial Attacks on ImageNet: A Reality Check on Misclassification Classes | Code | 1
Benchmarking Detection Transfer Learning with Vision Transformers | Code | 1
FedCV: A Federated Learning Framework for Diverse Computer Vision Tasks | Code | 1
Benchmarking emergency department triage prediction models with machine learning and large public electronic health records | Code | 1
GRecX: An Efficient and Unified Benchmark for GNN-based Recommendation | Code | 1
Novel Real-Time EMT-TS Modeling Architecture for Feeder Blackstart Simulations |  | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 |  | Unverified