SOTAVerified

Benchmarking

Papers

Showing 1226–1250 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| FedCV: A Federated Learning Framework for Diverse Computer Vision Tasks | Code | 1 |
| GRecX: An Efficient and Unified Benchmark for GNN-based Recommendation | Code | 1 |
| Benchmarking and scaling of deep learning models for land cover image classification | Code | 1 |
| Which priors matter? Benchmarking models for learning latent dynamics | Code | 1 |
| Graph Robustness Benchmark: Benchmarking the Adversarial Robustness of Graph Machine Learning | Code | 1 |
| IOHexperimenter: Benchmarking Platform for Iterative Optimization Heuristics | Code | 1 |
| Benchmarking Data-driven Surrogate Simulators for Artificial Electromagnetic Materials | Code | 1 |
| OpenFWI: Large-Scale Multi-Structural Benchmark Datasets for Seismic Full Waveform Inversion | Code | 1 |
| B-Pref: Benchmarking Preference-Based Reinforcement Learning | Code | 1 |
| AdaPool: Exponential Adaptive Pooling for Information-Retaining Downsampling | Code | 1 |
| OPF-Learn: An Open-Source Framework for Creating Representative AC Optimal Power Flow Datasets | Code | 1 |
| Don’t be Contradicted with Anything! CI-ToD: Towards Benchmarking Consistency for Task-oriented Dialogue System | Code | 1 |
| Benchmarking Meta-embeddings: What Works and What Does Not | Code | 1 |
| FTNet: Feature Transverse Network for Thermal Image Semantic Segmentation | Code | 1 |
| Learning with Noisy Labels Revisited: A Study Using Real-World Human Annotations | Code | 1 |
| OpenABC-D: A Large-Scale Dataset For Machine Learning Guided Integrated Circuit Synthesis | Code | 1 |
| Text-Based Person Search with Limited Data | Code | 1 |
| NAS-HPO-Bench-II: A Benchmark Dataset on Joint Optimization of Convolutional Neural Network Architecture and Training Hyperparameters | Code | 1 |
| HUMAN4D: A Human-Centric Multimodal Dataset for Motions and Immersive Media | Code | 1 |
| Benchmarking the Robustness of Spatial-Temporal Models Against Corruptions | Code | 1 |
| Codabench: Flexible, Easy-to-Use and Reproducible Benchmarking Platform | Code | 1 |
| NAS-Bench-360: Benchmarking Neural Architecture Search on Diverse Tasks | Code | 1 |
| S3PRL-VC: Open-source Voice Conversion Framework with Self-supervised Speech Representations | Code | 1 |
| EDFace-Celeb-1M: Benchmarking Face Hallucination with a Million-scale Dataset | Code | 1 |
| Performance Evaluation of Deep Transfer Learning on Multiclass Identification of Common Weed Species in Cotton Production Systems | Code | 1 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |