Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 551–575 of 5548 papers

Title	Date	Tasks	Status	Hype	Score
Benchmarking Image Retrieval for Visual Localization	Nov 24, 2020	Autonomous DrivingBenchmarking	CodeCode Available	1	5
A Benchmarking Study of Kolmogorov-Arnold Networks on Tabular Data	Jun 20, 2024	BenchmarkingKolmogorov-Arnold Networks	CodeCode Available	1	5
Cross-Modal Bidirectional Interaction Model for Referring Remote Sensing Image Segmentation	Oct 11, 2024	BenchmarkingImage Segmentation	CodeCode Available	1	5
Curious Hierarchical Actor-Critic Reinforcement Learning	May 7, 2020	BenchmarkingHierarchical Reinforcement Learning	CodeCode Available	1	5
A Benchmarking Study of Embedding-based Entity Alignment for Knowledge Graphs	Mar 10, 2020	BenchmarkingEntity Alignment	CodeCode Available	1	5
Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation	Dec 26, 2019	BenchmarkingDomain Adaptation	CodeCode Available	1	5
Benchmarking Graph Neural Networks for FMRI analysis	Nov 16, 2022	Benchmarking	CodeCode Available	1	5
A Dataset for Answering Time-Sensitive Questions	Aug 13, 2021	Benchmarking	CodeCode Available	1	5
Benchmarking Graph Neural Networks on Dynamic Link Prediction	Sep 29, 2021	BenchmarkingDynamic Link Prediction	CodeCode Available	1	5
CovDocker: Benchmarking Covalent Drug Design with Tasks, Datasets, and Solutions	Jun 26, 2025	BenchmarkingDrug Design	CodeCode Available	1	5
CHOICE: Benchmarking the Remote Sensing Capabilities of Large Vision-Language Models	Nov 27, 2024	BenchmarkingEarth Observation	CodeCode Available	1	5
CosPGD: an efficient white-box adversarial attack for pixel-wise prediction tasks	Feb 4, 2023	Adversarial AttackAdversarial Robustness	CodeCode Available	1	5
Benchmarking Geospatial Question Answering Engines using the Dataset GeoQuestions1089	Nov 6, 2023	BenchmarkingKnowledge Base Question Answering	CodeCode Available	1	5
Benchmarking Implicit Neural Representation and Geometric Rendering in Real-Time RGB-D SLAM	Mar 28, 2024	Benchmarking	CodeCode Available	1	5
CounselBench: A Large-Scale Expert Evaluation and Adversarial Benchmark of Large Language Models in Mental Health Counseling	Jun 10, 2025	Benchmarking	CodeCode Available	1	5
COVID-19 event extraction from Twitter via extractive question answering with continuous prompts	Mar 19, 2023	BenchmarkingEvent Extraction	CodeCode Available	1	5
Benchmarking Generated Poses: How Rational is Structure-based Drug Design with Generative Models?	Aug 14, 2023	BenchmarkingDrug Design	CodeCode Available	1	5
Contemporary Symbolic Regression Methods and their Relative Performance	Jul 29, 2021	Benchmarkingparameter estimation	CodeCode Available	1	5
Constellation Dataset: Benchmarking High-Altitude Object Detection for an Urban Intersection	Apr 25, 2024	Benchmarkingobject-detection	CodeCode Available	1	5
Benchmarking Generation and Evaluation Capabilities of Large Language Models for Instruction Controllable Summarization	Nov 15, 2023	BenchmarkingInstruction Following	CodeCode Available	1	5
ConsumerBench: Benchmarking Generative AI Applications on End-User Devices	Jun 21, 2025	BenchmarkingCPU	CodeCode Available	1	5
Controlgym: Large-Scale Control Environments for Benchmarking Reinforcement Learning Algorithms	Nov 30, 2023	BenchmarkingOpenAI Gym	CodeCode Available	1	5
Benchmarking for Biomedical Natural Language Processing Tasks with a Domain Specific ALBERT	Jul 9, 2021	BenchmarkingDocument Classification	CodeCode Available	1	5
ComplexBench-Edit: Benchmarking Complex Instruction-Driven Image Editing via Compositional Dependencies	Jun 15, 2025	Benchmarking	CodeCode Available	1	5
CompanyKG: A Large-Scale Heterogeneous Graph for Company Similarity Quantification	Jun 18, 2023	BenchmarkingRetrieval	CodeCode Available	1	5

Show:10 25 50

← PrevPage 23 of 222Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified