SOTAVerified

Benchmarking

Papers

Showing 1401–1450 of 5548 papers

Title | Status | Hype
PT-Ranking: A Benchmarking Platform for Neural Learning-to-Rank | Code | 1
NATS-Bench: Benchmarking NAS Algorithms for Architecture Topology and Size | Code | 1
Image Colorization: A Survey and Dataset | Code | 1
ScrewNet: Category-Independent Articulation Model Estimation From Depth Images Using Screw Theory | Code | 1
Quantitative Survey of the State of the Art in Sign Language Recognition | Code | 1
Automatic sleep stage classification with deep residual networks in a mixed-cohort setting | Code | 1
ISSAFE: Improving Semantic Segmentation in Accidents by Fusing Event-based Data | Code | 1
AIPerf: Automated machine learning as an AI-HPC benchmark | Code | 1
dMelodies: A Music Dataset for Disentanglement Learning | Code | 1
WordCraft: An Environment for Benchmarking Commonsense Agents | Code | 1
Are We There Yet? Evaluating State-of-the-Art Neural Network based Geoparsers Using EUPEG as a Benchmarking Platform | Code | 1
Emoji Prediction: Extensions and Benchmarking | Code | 1
CheXphoto: 10,000+ Photos and Transformations of Chest X-rays for Benchmarking Deep Learning Robustness | Code | 1
GAMA: a General Automated Machine learning Assistant | Code | 1
Enhancing spatial and textual analysis with EUPEG: an extensible and unified platform for evaluating geoparsers | Code | 1
RobFR: Benchmarking Adversarial Robustness on Face Recognition | Code | 1
URSABench: Comprehensive Benchmarking of Approximate Bayesian Inference Methods for Deep Neural Networks | Code | 1
IOHanalyzer: Detailed Performance Analyses for Iterative Optimization Heuristics | Code | 1
Re-thinking Co-Salient Object Detection | Code | 1
Wiki-CS: A Wikipedia-Based Benchmark for Graph Neural Networks | Code | 1
Quo Vadis, Skeleton Action Recognition ? | Code | 1
Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers | Code | 1
Meta-SAC: Auto-tune the Entropy Temperature of Soft Actor-Critic via Metagradient | Code | 1
EndoSLAM Dataset and An Unsupervised Monocular Visual Odometry and Depth Estimation Approach for Endoscopic Videos: Endo-SfMLearner | Code | 1
Labelling unlabelled videos from scratch with multi-modal self-supervision | Code | 1
Monash University, UEA, UCR Time Series Extrinsic Regression Archive | Code | 1
Mitigating Gender Bias in Captioning Systems | Code | 1
Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks | Code | 1
Benchmarking Unsupervised Object Representations for Video Sequences | Code | 1
Supervised learning is an accurate method for network-based gene classification | Code | 1
Benchmarking Adversarial Robustness on Image Classification | Code | 1
Taking a Deeper Look at Co-Salient Object Detection | Code | 1
UGC-VQA: Benchmarking Blind Video Quality Assessment for User Generated Content | Code | 1
Reference Pose Generation for Long-term Visual Localization via Learned Features and View Synthesis | Code | 1
Curious Hierarchical Actor-Critic Reinforcement Learning | Code | 1
A Ladder of Causal Distances | Code | 1
NTIRE 2020 Challenge on Real-World Image Super-Resolution: Methods and Results | Code | 1
Introducing the VoicePrivacy Initiative | Code | 1
Benchmarking Multidomain English-Indonesian Machine Translation | Code | 1
Benchmarking Robustness of Machine Reading Comprehension Models | Code | 1
Machine Learning Methods for Brain Network Classification: Application to Autism Diagnosis using Cortical Morphological Networks | Code | 1
MAVEN: A Massive General Domain Event Detection Dataset | Code | 1
Deep Learning for ECG Analysis: Benchmarks and Insights from PTB-XL | Code | 1
A Global Benchmark of Algorithms for Segmenting Late Gadolinium-Enhanced Cardiac Magnetic Resonance Imaging | Code | 1
Global Wheat Head Detection (GWHD) dataset: a large and diverse dataset of high resolution RGB labelled images to develop and benchmark wheat head detection methods | Code | 1
New Protocols and Negative Results for Textual Entailment Data Collection | Code | 1
Shortcut Learning in Deep Neural Networks | Code | 1
Evaluating Multimodal Representations on Visual Semantic Textual Similarity | Code | 1
Benchmarking End-to-End Behavioural Cloning on Video Games | Code | 1
Event Probability Mask (EPM) and Event Denoising Convolutional Neural Network (EDnCNN) for Neuromorphic Cameras | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified