Benchmarking

Papers

Showing 701–725 of 5548 papers

Title	Date	Tasks	Status	Hype	Score
Depth-Driven Geometric Prompt Learning for Laparoscopic Liver Landmark Detection	Jun 25, 2024	Benchmarking, Prompt Learning	Code Available	1	5
AllClear: A Comprehensive Dataset and Benchmark for Cloud Removal in Satellite Imagery	Oct 31, 2024	Benchmarking, Cloud Removal	Code Available	1	5
Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers	Jul 3, 2020	Benchmarking, Deep Learning	Code Available	1	5
A Ladder of Causal Distances	May 5, 2020	Benchmarking, Causal Discovery	Code Available	1	5
Automatic sleep stage classification with deep residual networks in a mixed-cohort setting	Aug 21, 2020	Automatic Sleep Stage Classification, Benchmarking	Code Available	1	5
Benchmarking Multimodal Variational Autoencoders: CdSprites+ Dataset and Toolkit	Sep 7, 2022	Benchmarking	Code Available	1	5
Benchmarking Neural Network Robustness to Common Corruptions and Surface Variations	Jul 4, 2018	Adversarial Defense, Benchmarking	Code Available	1	5
Benchmarking Retrieval-Augmented Multimomal Generation for Document Question Answering	May 22, 2025	Benchmarking, Evidence Selection	Code Available	1	5
ATOMMIC: An Advanced Toolbox for Multitask Medical Imaging Consistency to facilitate Artificial Intelligence applications from acquisition to analysis in Magnetic Resonance Imaging	Apr 30, 2024	Benchmarking, Image Reconstruction	Code Available	1	5
Autonomous Microscopy Experiments through Large Language Model Agents	Dec 18, 2024	Benchmarking, Experimental Design	Code Available	1	5
Atom-Level Optical Chemical Structure Recognition with Limited Supervision	Apr 2, 2024	Benchmarking	Code Available	1	5
Demystifying Learning Rate Policies for High Accuracy Training of Deep Neural Networks	Aug 18, 2019	Benchmarking, Image Classification	Code Available	1	5
Benchmarking Object Detectors under Real-World Distribution Shifts in Satellite Imagery	Mar 24, 2025	Benchmarking, Humanitarian	Code Available	1	5
Descending through a Crowded Valley — Benchmarking Deep Learning Optimizers	Jan 1, 2021	Benchmarking, Deep Learning	Code Available	1	5
A Critical Assessment of State-of-the-Art in Entity Alignment	Oct 30, 2020	Benchmarking, Entity Alignment	Code Available	1	5
Benchmarking Offline Reinforcement Learning on Real-Robot Hardware	Jul 28, 2023	Benchmarking, reinforcement-learning	Code Available	1	5
DFGC 2021: A DeepFake Game Competition	Jun 2, 2021	Benchmarking, DeepFake Detection	Code Available	1	5
DeID-GPT: Zero-shot Medical Text De-Identification by GPT-4	Mar 20, 2023	Benchmarking, De-identification	Code Available	1	5
Benchmarking Language Models for Code Syntax Understanding	Oct 26, 2022	Benchmarking	Code Available	1	5
Deluca -- A Differentiable Control Library: Environments, Methods, and Benchmarking	Feb 19, 2021	Benchmarking, OpenAI Gym	Code Available	1	5
BabySLM: language-acquisition-friendly benchmark of self-supervised spoken language models	Jun 2, 2023	Benchmarking, Language Acquisition	Code Available	1	5
Benchmarking: Past, Present and Future	Aug 1, 2021	Benchmarking, Reading Comprehension	Code Available	1	5
Benchmarking Large Language Models for Automated Verilog RTL Code Generation	Dec 13, 2022	Benchmarking, Code Generation	Code Available	1	5
Element-aware Summarization with Large Language Models: Expert-aligned Evaluation and Chain-of-Thought Method	May 22, 2023	Benchmarking, Hallucination	Code Available	1	5
A Japanese Dataset for Subjective and Objective Sentiment Polarity Classification in Micro Blog Domain	Jun 1, 2022	Benchmarking, Emotion Recognition	Code Available	1	5

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified