SOTAVerified

Benchmarking

Papers

Showing 4851–4875 of 5548 papers

Title | Status | Hype
Mol-MoE: Training Preference-Guided Routers for Molecule Generation | Code | 0
Benchmarking Robust Self-Supervised Learning Across Diverse Downstream Tasks | Code | 0
Fine-grained Hand Gesture Recognition in Multi-viewpoint Hand Hygiene | Code | 0
Moment Matching for Multi-Source Domain Adaptation | Code | 0
Benchmarking Robustness to Text-Guided Corruptions | Code | 0
Fine-grained Entity Recognition with Reduced False Negatives and Large Type Coverage | Code | 0
Finding the Perfect Fit: Applying Regression Models to ClimateBench v1.0 | Code | 0
Benchmarking Robustness of Endoscopic Depth Estimation with Synthetically Corrupted Data | Code | 0
Benchmarking Robustness of 3D Object Detection to Common Corruptions in Autonomous Driving | Code | 0
Scission: Performance-driven and Context-aware Cloud-Edge Distribution of Deep Neural Networks | Code | 0
ALDI++: Automatic and parameter-less discord and outlier detection for building energy load profiles | Code | 0
Benchmarking Robustness in Object Detection: Autonomous Driving when Winter is Coming | Code | 0
Motley: Benchmarking Heterogeneity and Personalization in Federated Learning | Code | 0
ScoNe: Benchmarking Negation Reasoning in Language Models With Fine-Tuning and In-Context Learning | Code | 0
Benchmarking Retinal Blood Vessel Segmentation Models for Cross-Dataset and Cross-Disease Generalization | Code | 0
The Role of Model Architecture and Scale in Predicting Molecular Properties: Insights from Fine-Tuning RoBERTa, BART, and LLaMA | Code | 0
AutoJudger: An Agent-Driven Framework for Efficient Benchmarking of MLLMs | Code | 0
Benchmarking Representation Learning for Natural World Image Collections | Code | 0
Benchmarking Reinforcement Learning Algorithms on Real-World Robots | Code | 0
Benchmarking Quantum Reinforcement Learning | Code | 0
MSAMSum: Towards Benchmarking Multi-lingual Dialogue Summarization | Code | 0
Alchemy: A Quantum Chemistry Dataset for Benchmarking AI Models | Code | 0
FHBench: Towards Efficient and Personalized Federated Learning for Multimodal Healthcare | Code | 0
Benchmarking quantum machine learning kernel training for classification tasks | Code | 0
The Saudi Privacy Policy Dataset | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified