SOTAVerified

Benchmarking

Papers

Showing 1376-1400 of 5548 papers

Title | Status | Hype
SoftGym: Benchmarking Deep Reinforcement Learning for Deformable Object Manipulation | Code | 1
tvopt: A Python Framework for Time-Varying Optimization | Code | 1
Long Range Arena: A Benchmark for Efficient Transformers | Code | 1
Collective Knowledge: organizing research projects as a database of reusable components and portable workflows with common APIs | Code | 1
Benchmarking Meaning Representations in Neural Semantic Parsing | Code | 1
A Critical Assessment of State-of-the-Art in Entity Alignment | Code | 1
Benchmarking Deep Learning Interpretability in Time Series Predictions | Code | 1
Kvasir-Instrument: Diagnostic and therapeutic tool segmentation dataset in gastrointestinal endoscopy | Code | 1
KINNEWS and KIRNEWS: Benchmarking Cross-Lingual Text Classification for Kinyarwanda and Kirundi | Code | 1
Exploiting News Article Structure for Automatic Corpus Generation of Entailment Datasets | Code | 1
Self-Alignment Pretraining for Biomedical Entity Representations | Code | 1
German's Next Language Model | Code | 1
Promoting High Diversity Ensemble Learning with EnsembleBench | Code | 1
RobustBench: a standardized adversarial robustness benchmark | Code | 1
RADIATE: A Radar Dataset for Automotive Perception in Bad Weather | Code | 1
Light Field Salient Object Detection: A Review and Benchmark | Code | 1
Olympus: a benchmarking framework for noisy optimization and experiment planning | Code | 1
OpenTraj: Assessing Prediction Complexity in Human Trajectories Datasets | Code | 1
Bag of Tricks for Adversarial Training | Code | 1
HINT3: Raising the bar for Intent Detection in the Wild | Code | 1
Benchmarking deep inverse models over time, and the neural-adjoint method | Code | 1
A BFS-Tree of Ranking References for Unsupervised Manifold Learning | Code | 1
CoDEx: A Comprehensive Knowledge Graph Completion Benchmark | Code | 1
BARS-CTR: Open Benchmarking for Click-Through Rate Prediction | Code | 1
IndoNLU: Benchmark and Resources for Evaluating Indonesian Natural Language Understanding | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified