Benchmarking

Papers


Showing 1176–1200 of 5548 papers

Title	Date	Tasks	Status	Hype	Score
Graph Neural Network-Based Anomaly Detection for River Network Systems	Apr 19, 2023	Anomaly Detection, Benchmarking	Code Available	1	5
Benchmarking Skeleton-based Motion Encoder Models for Clinical Applications: Estimating Parkinson's Disease Severity in Walking Sequences	May 28, 2024	Benchmarking, Feature Engineering	Code Available	1	5
BLADE: Benchmarking Language Model Agents for Data-Driven Science	Aug 19, 2024	Benchmarking, Decision Making	Code Available	1	5
Benchmarking Simulation-Based Inference	Jan 12, 2021	Benchmarking	Code Available	1	5
Benchmarking Visual Localization for Autonomous Navigation	Mar 24, 2022	Autonomous Navigation, Benchmarking	Code Available	1	5
A skeletonization algorithm for gradient-based optimization	Sep 5, 2023	Benchmarking, Deep Learning	Code Available	1	5
Benchmarking Multi-Scene Fire and Smoke Detection	Oct 22, 2024	Benchmarking	Code Available	1	5
AutoAdvExBench: Benchmarking autonomous exploitation of adversarial example defenses	Mar 3, 2025	Benchmarking	Code Available	1	5
Bongard-HOI: Benchmarking Few-Shot Visual Reasoning for Human-Object Interactions	May 27, 2022	Benchmarking, Few-Shot Image Classification	Code Available	1	5
Boosting Healthcare LLMs Through Retrieved Context	Sep 23, 2024	Benchmarking, Multiple-choice	Code Available	1	5
Boosting Neural Image Compression for Machines Using Latent Space Masking	Dec 15, 2021	Benchmarking, Image Compression	Code Available	1	5
GraphArena: Benchmarking Large Language Models on Graph Computational Problems	Jun 29, 2024	Benchmarking, Hallucination	Code Available	1	5
Graph Robustness Benchmark: Benchmarking the Adversarial Robustness of Graph Machine Learning	Nov 8, 2021	Adversarial Robustness, Benchmarking	Code Available	1	5
BRIDGE: Benchmarking Large Language Models for Understanding Real-world Clinical Practice Text	Apr 28, 2025	Benchmarking	Code Available	1	5
Grounding Descriptions in Images informs Zero-Shot Visual Recognition	Dec 5, 2024	Attribute, Benchmarking	Code Available	1	5
AI Accelerator Survey and Trends	Sep 18, 2021	Benchmarking, Computational Efficiency	Code Available	1	5
ISLES 2022: A multi-center magnetic resonance imaging stroke lesion segmentation dataset	Jun 14, 2022	Benchmarking, Ischemic Stroke Lesion Segmentation	Code Available	1	5
Benchmarking Neural Network Generalization for Grammar Induction	Aug 16, 2023	Benchmarking	Code Available	1	5
Benchmarking Neural Network Robustness to Common Corruptions and Surface Variations	Jul 4, 2018	Adversarial Defense, Benchmarking	Code Available	1	5
Benchmarking Segmentation Models with Mask-Preserved Attribute Editing	Mar 2, 2024	Attribute, Benchmarking	Code Available	1	5
Are Large Language Models Really Good Logical Reasoners? A Comprehensive Evaluation and Beyond	Jun 16, 2023	Benchmarking, Evidence Selection	Code Available	1	5
Benchmarking Large Language Models for Persian: A Preliminary Study Focusing on ChatGPT	Apr 3, 2024	Benchmarking, General Knowledge	Code Available	1	5
GNNX-BENCH: Unravelling the Utility of Perturbation-based GNN Explainers through In-depth Benchmarking	Oct 3, 2023	Benchmarking, counterfactual	Code Available	1	5
GoMatching++: Parameter- and Data-Efficient Arbitrary-Shaped Video Text Spotting and Benchmarking	May 28, 2025	Benchmarking, Text Spotting	Code Available	1	5
GraCoRe: Benchmarking Graph Comprehension and Complex Reasoning in Large Language Models	Jul 3, 2024	Benchmarking	Code Available	1	5


Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified