Benchmarking

Papers

Showing 926–950 of 5548 papers

Title	Date	Tasks	Status	Hype	Score
Benchmarking deep inverse models over time, and the neural-adjoint method	Sep 27, 2020	Benchmarking	Code Available	1	5
A Call to Reflect on Evaluation Practices for Failure Detection in Image Classification	Nov 28, 2022	Benchmarking, image-classification	Code Available	1	5
Benchmarking Offline Reinforcement Learning on Real-Robot Hardware	Jul 28, 2023	Benchmarking, reinforcement-learning	Code Available	1	5
AnomalyHop: An SSL-based Image Anomaly Localization Method	May 8, 2021	Anomaly Localization, Benchmarking	Code Available	1	5
Evaluating Multimodal Representations on Visual Semantic Textual Similarity	Apr 4, 2020	Benchmarking, Image Captioning	Code Available	1	5
Evaluation of large language models for discovery of gene set function	Sep 7, 2023	Benchmarking, Language Modelling	Code Available	1	5
Benchmarking Natural Language Understanding Services for building Conversational Agents	Mar 13, 2019	Benchmarking, General Classification	Code Available	1	5
Evaluating Adversarial Attacks on ImageNet: A Reality Check on Misclassification Classes	Nov 22, 2021	Benchmarking	Code Available	1	5
Benchmarking Deep Learning Interpretability in Time Series Predictions	Oct 26, 2020	Benchmarking, Deep Learning	Code Available	1	5
Benchmarking Multimodal Variational Autoencoders: CdSprites+ Dataset and Toolkit	Sep 7, 2022	Benchmarking	Code Available	1	5
Guardians of Image Quality: Benchmarking Defenses Against Adversarial Attacks on Image Quality Metrics	Aug 2, 2024	Adversarial Attack, Adversarial Purification	Code Available	1	5
An Open-source Benchmark of Deep Learning Models for Audio-visual Apparent and Self-reported Personality Recognition	Oct 17, 2022	Benchmarking	Code Available	1	5
Benchmarking Deep Models for Salient Object Detection	Feb 7, 2022	Benchmarking, Object	Code Available	1	5
Benchmarking Multi-Scene Fire and Smoke Detection	Oct 22, 2024	Benchmarking	Code Available	1	5
Evaluating Attribution for Graph Neural Networks	Dec 1, 2020	Benchmarking	Code Available	1	5
Benchmarking Deep Reinforcement Learning for Navigation in Denied Sensor Environments	Oct 18, 2024	Autonomous Navigation, Benchmarking	Code Available	1	5
CSAW-M: An Ordinal Classification Dataset for Benchmarking Mammographic Masking of Cancer	Dec 2, 2021	Benchmarking, Ordinal Classification	Code Available	1	5
Benchmarking Neural Network Generalization for Grammar Induction	Aug 16, 2023	Benchmarking	Code Available	1	5
Data-Driven Denoising of Stationary Accelerometer Signals	Jun 13, 2022	Benchmarking, Denoising	Code Available	1	5
Curious Hierarchical Actor-Critic Reinforcement Learning	May 7, 2020	Benchmarking, Hierarchical Reinforcement Learning	Code Available	1	5
Benchmarking emergency department triage prediction models with machine learning and large public electronic health records	Nov 22, 2021	Benchmarking	Code Available	1	5
Benchmarking Multimodal Knowledge Conflict for Large Multimodal Models	May 26, 2025	Benchmarking, RAG	Code Available	1	5
Benchmarking Detection Transfer Learning with Vision Transformers	Nov 22, 2021	Benchmarking, object-detection	Code Available	1	5
3DYoga90: A Hierarchical Video Dataset for Yoga Pose Understanding	Oct 16, 2023	Action Recognition, Benchmarking	Code Available	1	5
Benchmarking Multi-modal Semantic Segmentation under Sensor Failures: Missing and Noisy Modality Robustness	Mar 24, 2025	Benchmarking, Semantic Segmentation	Code Available	1	5

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified