Benchmarking

Papers

Showing 976–1000 of 5548 papers

Title	Date	Tasks	Status	Hype	Score
Delving into Out-of-Distribution Detection with Medical Vision-Language Models	Mar 2, 2025	Benchmarking, image-classification	Code Available	1	5
Demystifying Learning Rate Policies for High Accuracy Training of Deep Neural Networks	Aug 18, 2019	Benchmarking, Image Classification	Code Available	1	5
DependEval: Benchmarking LLMs for Repository Dependency Understanding	Mar 9, 2025	Benchmarking, Code Generation	Code Available	1	5
Benchmarking Offline Reinforcement Learning on Real-Robot Hardware	Jul 28, 2023	Benchmarking, reinforcement-learning	Code Available	1	5
Benchmarking Object Detectors under Real-World Distribution Shifts in Satellite Imagery	Mar 24, 2025	Benchmarking, Humanitarian	Code Available	1	5
Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers	Jan 1, 2021	Benchmarking, Deep Learning	Code Available	1	5
Multimodal Fusion via Teacher-Student Network for Indoor Action Recognition	May 18, 2021	Action Recognition, Action Recognition In Videos	Code Available	1	5
Experimental Validation of Ultrasound Beamforming with End-to-End Deep Learning for Single Plane Wave Imaging	Apr 22, 2024	Benchmarking	Code Available	1	5
Detecting beats in the photoplethysmogram: benchmarking open-source algorithms	Jul 19, 2022	Benchmarking, Photoplethysmography (PPG) beat detection	Code Available	1	5
MultiRes-NetVLAD: Augmenting Place Recognition Training with Low-Resolution Imagery	Feb 18, 2022	Benchmarking, Representation Learning	Code Available	1	5
Multi-Stream Cellular Test-Time Adaptation of Real-Time Models Evolving in Dynamic Environments	Apr 27, 2024	Autonomous Vehicles, Benchmarking	Code Available	1	5
API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs	Feb 23, 2024	Benchmarking, slot-filling	Code Available	1	5
Developing a Scalable Benchmark for Assessing Large Language Models in Knowledge Graph Engineering	Aug 31, 2023	Benchmarking, Dataset Generation	Code Available	1	5
Adversarial Prompt Evaluation: Systematic Benchmarking of Guardrails Against Prompt Input Attacks on LLMs	Feb 21, 2025	Benchmarking	Code Available	1	5
Benchmarking Fish Dataset and Evaluation Metric in Keypoint Detection -- Towards Precise Fish Morphological Assessment in Aquaculture Breeding	May 21, 2024	Benchmarking, Keypoint Detection	Code Available	1	5
Explainable Benchmarking for Iterative Optimization Heuristics	Jan 31, 2024	Benchmarking, Evolutionary Algorithms	Code Available	1	5
DialogueLLM: Context and Emotion Knowledge-Tuned Large Language Models for Emotion Recognition in Conversations	Oct 17, 2023	Benchmarking, Emotion Recognition	Code Available	1	5
NAS-Bench-360: Benchmarking Neural Architecture Search on Diverse Tasks	Oct 12, 2021	Benchmarking, image-classification	Code Available	1	5
NAS-Bench-Graph: Benchmarking Graph Neural Architecture Search	Jun 18, 2022	Benchmarking, Graph Neural Network	Code Available	1	5
Benchmarking Neural Network Robustness to Common Corruptions and Surface Variations	Jul 4, 2018	Adversarial Defense, Benchmarking	Code Available	1	5
Benchmarking for Biomedical Natural Language Processing Tasks with a Domain Specific ALBERT	Jul 9, 2021	Benchmarking, Document Classification	Code Available	1	5
DIG In: Evaluating Disparities in Image Generations with Indicators for Geographic Diversity	Aug 11, 2023	Benchmarking, Diversity	Code Available	1	5
DiffuSETS: 12-lead ECG Generation Conditioned on Clinical Text Reports and Patient-Specific Information	Jan 10, 2025	Benchmarking, Data Augmentation	Code Available	1	5
Protein Structure Tokenization: Benchmarking and New Recipe	Feb 28, 2025	Benchmarking, Language Modeling	Code Available	1	5
Benchmarking Neural Network Generalization for Grammar Induction	Aug 16, 2023	Benchmarking	Code Available	1	5

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified