SOTAVerified

Benchmarking

Papers

Showing 4861–4870 of 5548 papers

Title | Status | Hype
ALDI++: Automatic and parameter-less discord and outlier detection for building energy load profiles | Code | 0
Benchmarking Robustness in Object Detection: Autonomous Driving when Winter is Coming | Code | 0
Motley: Benchmarking Heterogeneity and Personalization in Federated Learning | Code | 0
ScoNe: Benchmarking Negation Reasoning in Language Models With Fine-Tuning and In-Context Learning | Code | 0
Benchmarking Retinal Blood Vessel Segmentation Models for Cross-Dataset and Cross-Disease Generalization | Code | 0
The Role of Model Architecture and Scale in Predicting Molecular Properties: Insights from Fine-Tuning RoBERTa, BART, and LLaMA | Code | 0
AutoJudger: An Agent-Driven Framework for Efficient Benchmarking of MLLMs | Code | 0
Benchmarking Representation Learning for Natural World Image Collections | Code | 0
Benchmarking Reinforcement Learning Algorithms on Real-World Robots | Code | 0
Benchmarking Quantum Reinforcement Learning | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified