Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 751–775 of 5548 papers

Title	Date	Tasks	Status	Hype
M4-SAR: A Multi-Resolution, Multi-Polarization, Multi-Scene, Multi-Source Dataset and Benchmark for Optical-SAR Fusion Object Detection	May 16, 2025	Benchmarkingobject-detection	CodeCode Available	1
Benchmarking Classical and Learning-Based Multibeam Point Cloud Registration	May 10, 2024	BenchmarkingPoint Cloud Registration	CodeCode Available	1
A Multifaceted Benchmarking of Synthetic Electronic Health Record Generation Models	Aug 2, 2022	BenchmarkingSynthetic Data Generation	CodeCode Available	1
BeHonest: Benchmarking Honesty in Large Language Models	Jun 19, 2024	BenchmarkingMisinformation	CodeCode Available	1
ClearPose: Large-scale Transparent Object Dataset and Benchmark	Mar 8, 2022	BenchmarkingDepth Completion	CodeCode Available	1
AdaPool: Exponential Adaptive Pooling for Information-Retaining Downsampling	Nov 1, 2021	Benchmarkingobject-detection	CodeCode Available	1
CHILI: Chemically-Informed Large-scale Inorganic Nanomaterials Dataset for Advancing Graph Machine Learning	Feb 20, 2024	Atomic number classificationBenchmarking	CodeCode Available	1
A Japanese Dataset for Subjective and Objective Sentiment Polarity Classification in Micro Blog Domain	Jun 1, 2022	BenchmarkingEmotion Recognition	CodeCode Available	1
CIBench: Evaluating Your LLMs with a Code Interpreter Plugin	Jul 15, 2024	Benchmarking	CodeCode Available	1
Benchmarking Chinese Text Recognition: Datasets, Baselines, and an Empirical Study	Dec 30, 2021	AttributeBenchmarking	CodeCode Available	1
A multi-schematic classifier-independent oversampling approach for imbalanced datasets	Jul 15, 2021	Benchmarking	CodeCode Available	1
Emotion and Intent Joint Understanding in Multimodal Conversation: A Benchmarking Dataset	Jul 3, 2024	BenchmarkingDiversity	CodeCode Available	1
End-to-end Emotion-Cause Pair Extraction via Learning to Link	Feb 25, 2020	BenchmarkingEmotion Cause Extraction	CodeCode Available	1
Bencher: Simple and Reproducible Benchmarking for Black-Box Optimization	May 27, 2025	Benchmarking	CodeCode Available	1
Enhancing Biomedical Relation Extraction with Directionality	Jan 23, 2025	BenchmarkingDocument-level Relation Extraction	CodeCode Available	1
BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models	Dec 5, 2023	BenchmarkingVisual Question Answering	CodeCode Available	1
Geometric Deep Learning for Structure-Based Drug Design: A Survey	Jun 20, 2023	BenchmarkingDeep Learning	CodeCode Available	1
A Comprehensive Study of the Robustness for LiDAR-based 3D Object Detectors against Adversarial Attacks	Dec 20, 2022	3D Object DetectionBenchmarking	CodeCode Available	1
CheXphoto: 10,000+ Photos and Transformations of Chest X-rays for Benchmarking Deep Learning Robustness	Jul 13, 2020	Benchmarking	CodeCode Available	1
CIDEr: Consensus-based Image Description Evaluation	Nov 20, 2014	Action RecognitionAttribute	CodeCode Available	1
ClimART: A Benchmark Dataset for Emulating Atmospheric Radiative Transfer in Weather and Climate Models	Nov 29, 2021	BenchmarkingPhysical Simulations	CodeCode Available	1
Evaluating Adversarial Attacks on ImageNet: A Reality Check on Misclassification Classes	Nov 22, 2021	Benchmarking	CodeCode Available	1
A Systematic Benchmarking Analysis of Transfer Learning for Medical Image Analysis	Aug 12, 2021	BenchmarkingMedical Image Analysis	CodeCode Available	1
Evaluating histopathology transfer learning with ChampKit	Jun 14, 2022	BenchmarkingCell Detection	CodeCode Available	1
On the Detectability of ChatGPT Content: Benchmarking, Methodology, and Evaluation through the Lens of Academic Writing	Jun 7, 2023	BenchmarkingPrompt Engineering	CodeCode Available	1

Show:10 25 50

← PrevPage 31 of 222Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified