SOTAVerified

Benchmarking

Papers

Showing 1776–1800 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| Cable Tree Wiring -- Benchmarking Solvers on a Real-World Scheduling Problem with a Variety of Precedence Constraints | Code | 0 |
| Inverse Contextual Bandits: Learning How Behavior Evolves over Time | Code | 0 |
| Introducing SLAMBench, a performance and accuracy benchmarking methodology for SLAM | Code | 0 |
| InViG: Benchmarking Interactive Visual Grounding with 500K Human-Robot Interactions | Code | 0 |
| B-XAIC Dataset: Benchmarking Explainable AI for Graph Neural Networks Using Chemical Data | Code | 0 |
| INTERSPEECH 2009 Emotion Challenge Revisited: Benchmarking 15 Years of Progress in Speech Emotion Recognition | Code | 0 |
| Multitask learning and benchmarking with clinical time series data (published 17 June 2019) | Code | 0 |
| Building Conformal Prediction Intervals with Approximate Message Passing | Code | 0 |
| Building and benchmarking an Arabic Speech Commands dataset for small-footprint keyword spotting | Code | 0 |
| Adaptive Visual Scene Understanding: Incremental Scene Graph Generation | Code | 0 |
| Integrating Expert Knowledge into Logical Programs via LLMs | Code | 0 |
| Building a Large Scale Dataset for Image Emotion Recognition: The Fine Print and The Benchmark | Code | 0 |
| ColorGrid: A Multi-Agent Non-Stationary Environment for Goal Inference and Assistance | Code | 0 |
| Integration of nested cross-validation, automated hyperparameter optimization, high-performance computing to reduce and quantify the variance of test performance estimation of deep learning models | Code | 0 |
| Bugs in the Data: How ImageNet Misrepresents Biodiversity | Code | 0 |
| CleanPatrick: A Benchmark for Image Data Cleaning | Code | 0 |
| BubGAN: Bubble Generative Adversarial Networks for Synthesizing Realistic Bubbly Flow Images | Code | 0 |
| InstaIndoor and Multi-modal Deep Learning for Indoor Scene Recognition | Code | 0 |
| bsnsing: A decision tree induction method based on recursive optimal boolean rule composition | Code | 0 |
| BSBench: will your LLM find the largest prime number? | Code | 0 |
| Adaptive Shrinkage Estimation For Personalized Deep Kernel Regression In Modeling Brain Trajectories | Code | 0 |
| inMOTIFin: a lightweight end-to-end simulation software for regulatory sequences | Code | 0 |
| Towards Learning Universal, Regional, and Local Hydrological Behaviors via Machine-Learning Applied to Large-Sample Datasets | Code | 0 |
| Bridging the Generalisation Gap: Synthetic Data Generation for Multi-Site Clinical Model Validation | Code | 0 |
| Adaptive Power System Emergency Control using Deep Reinforcement Learning | Code | 0 |
Page 72 of 222

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |