Benchmarking

Papers

Showing 4851–4875 of 5548 papers

Title	Date	Tasks	Status
Mol-MoE: Training Preference-Guided Routers for Molecule Generation	Feb 8, 2025	Benchmarking, Drug Design	Code Available
Benchmarking Robust Self-Supervised Learning Across Diverse Downstream Tasks	Jul 17, 2024	Adversarial Robustness, Benchmarking	Code Available
Fine-grained Hand Gesture Recognition in Multi-viewpoint Hand Hygiene	Sep 7, 2021	Benchmarking, Fine-Grained Image Recognition	Code Available
Moment Matching for Multi-Source Domain Adaptation	Dec 4, 2018	Benchmarking, Domain Adaptation	Code Available
Benchmarking Robustness to Text-Guided Corruptions	Apr 6, 2023	Benchmarking, Data Augmentation	Code Available
Fine-grained Entity Recognition with Reduced False Negatives and Large Type Coverage	Apr 30, 2019	Benchmarking	Code Available
Finding the Perfect Fit: Applying Regression Models to ClimateBench v1.0	Aug 23, 2023	Benchmarking, regression	Code Available
Benchmarking Robustness of Endoscopic Depth Estimation with Synthetically Corrupted Data	Sep 24, 2024	Benchmarking, Depth Estimation	Code Available
Benchmarking Robustness of 3D Object Detection to Common Corruptions in Autonomous Driving	Mar 20, 2023	3D Object Detection, Autonomous Driving	Code Available
Scission: Performance-driven and Context-aware Cloud-Edge Distribution of Deep Neural Networks	Aug 8, 2020	Benchmarking, Decision Making	Code Available
ALDI++: Automatic and parameter-less discord and outlier detection for building energy load profiles	Mar 13, 2022	Benchmarking, BIG-bench Machine Learning	Code Available
Benchmarking Robustness in Object Detection: Autonomous Driving when Winter is Coming	Jul 17, 2019	Autonomous Driving, Benchmarking	Code Available
Motley: Benchmarking Heterogeneity and Personalization in Federated Learning	Jun 18, 2022	Benchmarking, Fairness	Code Available
ScoNe: Benchmarking Negation Reasoning in Language Models With Fine-Tuning and In-Context Learning	May 30, 2023	Benchmarking, In-Context Learning	Code Available
Benchmarking Retinal Blood Vessel Segmentation Models for Cross-Dataset and Cross-Disease Generalization	Jun 21, 2024	Benchmarking, Segmentation	Code Available
The Role of Model Architecture and Scale in Predicting Molecular Properties: Insights from Fine-Tuning RoBERTa, BART, and LLaMA	May 2, 2024	Benchmarking, Drug Discovery	Code Available
AutoJudger: An Agent-Driven Framework for Efficient Benchmarking of MLLMs	May 27, 2025	Benchmarking, Question Selection	Code Available
Benchmarking Representation Learning for Natural World Image Collections	Mar 30, 2021	Benchmarking, Binary Classification	Code Available
Benchmarking Reinforcement Learning Algorithms on Real-World Robots	Sep 20, 2018	Benchmarking, continuous-control	Code Available
Benchmarking Quantum Reinforcement Learning	Jan 27, 2025	Benchmarking, reinforcement-learning	Code Available
MSAMSum: Towards Benchmarking Multi-lingual Dialogue Summarization	May 1, 2022	Benchmarking, dialogue summary	Code Available
Alchemy: A Quantum Chemistry Dataset for Benchmarking AI Models	Jun 22, 2019	Benchmarking, BIG-bench Machine Learning	Code Available
FHBench: Towards Efficient and Personalized Federated Learning for Multimodal Healthcare	Apr 15, 2025	Benchmarking, Diagnostic	Code Available
Benchmarking quantum machine learning kernel training for classification tasks	Aug 17, 2024	Benchmarking, Quantum Machine Learning	Code Available
The Saudi Privacy Policy Dataset	Apr 5, 2023	Benchmarking	Code Available

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified