SOTAVerified

Benchmarking

Papers

Showing 1201–1250 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| When Do Flat Minima Optimizers Work? | Code | 1 |
| Benchmarking Chinese Text Recognition: Datasets, Baselines, and an Empirical Study | Code | 1 |
| Are we really making much progress? Revisiting, benchmarking, and refining heterogeneous graph neural networks | Code | 1 |
| Leveraging Trust for Joint Multi-Objective and Multi-Fidelity Optimization | Code | 1 |
| Autonomous Reinforcement Learning: Formalism and Benchmarking | Code | 1 |
| High-Dimensional Inference in Bayesian Networks | Code | 1 |
| Boosting Neural Image Compression for Machines Using Latent Space Masking | Code | 1 |
| Label, Verify, Correct: A Simple Few Shot Object Detection Method | Code | 1 |
| Learning Representations with Contrastive Self-Supervised Learning for Histopathology Applications | Code | 1 |
| Benchmarking human visual search computational models in natural scenes: models comparison and reference datasets | Code | 1 |
| Object Shape Error Response Using Bayesian 3-D Convolutional Neural Networks for Assembly Systems With Compliant Parts | Code | 1 |
| Neuro-Symbolic Inductive Logic Programming with Logical Neural Networks | Code | 1 |
| HyFactor: Hydrogen-count labelled graph-based defactorization Autoencoder | Code | 1 |
| BenchML: an extensible pipelining framework for benchmarking representations of materials and molecules at scale | Code | 1 |
| CSAW-M: An Ordinal Classification Dataset for Benchmarking Mammographic Masking of Cancer | Code | 1 |
| TISE: Bag of Metrics for Text-to-Image Synthesis Evaluation | Code | 1 |
| Neural Regression, Representational Similarity, Model Zoology & Neural Taskonomy at Scale in Rodent Visual Cortex | Code | 1 |
| MC-Blur: A Comprehensive Benchmark for Image Deblurring | Code | 1 |
| NEORL: NeuroEvolution Optimization with Reinforcement Learning | Code | 1 |
| ClimART: A Benchmark Dataset for Emulating Atmospheric Radiative Transfer in Weather and Climate Models | Code | 1 |
| Benchmarking Accuracy and Generalizability of Four Graph Neural Networks Using Large In Vitro ADME Datasets from Different Chemical Spaces | Code | 1 |
| EH-DNAS: End-to-End Hardware-aware Differentiable Neural Architecture Search | Code | 1 |
| Benchmarking Detection Transfer Learning with Vision Transformers | Code | 1 |
| Evaluating Adversarial Attacks on ImageNet: A Reality Check on Misclassification Classes | Code | 1 |
| Benchmarking emergency department triage prediction models with machine learning and large public electronic health records | Code | 1 |
| FedCV: A Federated Learning Framework for Diverse Computer Vision Tasks | Code | 1 |
| GRecX: An Efficient and Unified Benchmark for GNN-based Recommendation | Code | 1 |
| Benchmarking and scaling of deep learning models for land cover image classification | Code | 1 |
| Which priors matter? Benchmarking models for learning latent dynamics | Code | 1 |
| Graph Robustness Benchmark: Benchmarking the Adversarial Robustness of Graph Machine Learning | Code | 1 |
| IOHexperimenter: Benchmarking Platform for Iterative Optimization Heuristics | Code | 1 |
| Benchmarking Data-driven Surrogate Simulators for Artificial Electromagnetic Materials | Code | 1 |
| OpenFWI: Large-Scale Multi-Structural Benchmark Datasets for Seismic Full Waveform Inversion | Code | 1 |
| B-Pref: Benchmarking Preference-Based Reinforcement Learning | Code | 1 |
| AdaPool: Exponential Adaptive Pooling for Information-Retaining Downsampling | Code | 1 |
| OPF-Learn: An Open-Source Framework for Creating Representative AC Optimal Power Flow Datasets | Code | 1 |
| Don’t be Contradicted with Anything! CI-ToD: Towards Benchmarking Consistency for Task-oriented Dialogue System | Code | 1 |
| Benchmarking Meta-embeddings: What Works and What Does Not | Code | 1 |
| FTNet: Feature Transverse Network for Thermal Image Semantic Segmentation | Code | 1 |
| Learning with Noisy Labels Revisited: A Study Using Real-World Human Annotations | Code | 1 |
| OpenABC-D: A Large-Scale Dataset For Machine Learning Guided Integrated Circuit Synthesis | Code | 1 |
| Text-Based Person Search with Limited Data | Code | 1 |
| NAS-HPO-Bench-II: A Benchmark Dataset on Joint Optimization of Convolutional Neural Network Architecture and Training Hyperparameters | Code | 1 |
| HUMAN4D: A Human-Centric Multimodal Dataset for Motions and Immersive Media | Code | 1 |
| Benchmarking the Robustness of Spatial-Temporal Models Against Corruptions | Code | 1 |
| Codabench: Flexible, Easy-to-Use and Reproducible Benchmarking Platform | Code | 1 |
| NAS-Bench-360: Benchmarking Neural Architecture Search on Diverse Tasks | Code | 1 |
| S3PRL-VC: Open-source Voice Conversion Framework with Self-supervised Speech Representations | Code | 1 |
| EDFace-Celeb-1M: Benchmarking Face Hallucination with a Million-scale Dataset | Code | 1 |
| Performance Evaluation of Deep Transfer Learning on Multiclass Identification of Common Weed Species in Cotton Production Systems | Code | 1 |
Page 25 of 111

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |