SOTAVerified

Benchmarking

Papers

Showing 2901–2910 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| Profit: Benchmarking Personalization and Robustness Trade-off in Federated Prompt Tuning | | 0 |
| CIFAR-10-Warehouse: Broad and More Realistic Testbeds in Model Generalization Analysis | | 0 |
| Bringing Quantum Algorithms to Automated Machine Learning: A Systematic Review of AutoML Frameworks Regarding Extensibility for QML Algorithms | | 0 |
| A Review of Deep Reinforcement Learning in Serverless Computing: Function Scheduling and Resource Auto-Scaling | | 0 |
| PepMLM: Target Sequence-Conditioned Generation of Therapeutic Peptide Binders via Span Masked Language Modeling | Code | 1 |
| Benchmarking a foundation LLM on its ability to re-label structure names in accordance with the AAPM TG-263 report | | 0 |
| MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation | Code | 2 |
| Deep Reinforcement Learning Algorithms for Hybrid V2X Communication: A Benchmarking Study | | 0 |
| Can Language Models Employ the Socratic Method? Experiments with Code Debugging | Code | 1 |
| Fully Automatic Segmentation of Gross Target Volume and Organs-at-Risk for Radiotherapy Planning of Nasopharyngeal Carcinoma | Code | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |