Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 801–825 of 5548 papers

Title	Date	Tasks	Status	Hype
A SWAT-based Reinforcement Learning Framework for Crop Management	Feb 10, 2023	BenchmarkingDecision Making	CodeCode Available	1
Continual Learning with Foundation Models: An Empirical Study of Latent Replay	Apr 30, 2022	BenchmarkingContinual Learning	CodeCode Available	1
FreeMan: Towards Benchmarking 3D Human Pose Estimation under Real-World Conditions	Sep 10, 2023	3D Human Pose Estimation3D Pose Estimation	CodeCode Available	1
ClimART: A Benchmark Dataset for Emulating Atmospheric Radiative Transfer in Weather and Climate Models	Nov 29, 2021	BenchmarkingPhysical Simulations	CodeCode Available	1
Benchmarking AI scientists in omics data-driven biological research	May 13, 2025	BenchmarkingMultiple-choice	CodeCode Available	1
FTNet: Feature Transverse Network for Thermal Image Semantic Segmentation	Oct 26, 2021	BenchmarkingScene Segmentation	CodeCode Available	1
Benchmarking Chinese Commonsense Reasoning of LLMs: From Chinese-Specifics to Reasoning-Memorization Correlations	Mar 21, 2024	BenchmarkingMemorization	CodeCode Available	1
Benchmarking Algorithms for Federated Domain Generalization	Jul 11, 2023	BenchmarkingDiversity	CodeCode Available	1
Benchmarking Algorithms for Submodular Optimization Problems Using IOHProfiler	Feb 2, 2023	BenchmarkingEvolutionary Algorithms	CodeCode Available	1
GAMA: a General Automated Machine learning Assistant	Jul 9, 2020	AutoMLBenchmarking	CodeCode Available	1
ClearPose: Large-scale Transparent Object Dataset and Benchmark	Mar 8, 2022	BenchmarkingDepth Completion	CodeCode Available	1
Coarse-to-Fine Q-attention with Learned Path Ranking	Apr 4, 2022	Benchmarking	CodeCode Available	1
Benchmarking and Analysis of Unsupervised Object Segmentation from Real-world Single Images	Dec 8, 2023	BenchmarkingObject	CodeCode Available	1
Benchmarking and Analyzing 3D-aware Image Synthesis with a Modularized Codebase	Jun 21, 2023	3D-Aware Image SynthesisBenchmarking	CodeCode Available	1
Benchmarking and Analyzing 3D Human Pose and Shape Estimation Beyond Algorithms	Sep 21, 2022	3D human pose and shape estimationBenchmarking	CodeCode Available	1
A Benchmarking Study of Kolmogorov-Arnold Networks on Tabular Data	Jun 20, 2024	BenchmarkingKolmogorov-Arnold Networks	CodeCode Available	1
Collab-Overcooked: Benchmarking and Evaluating Large Language Models as Collaborative Agents	Feb 27, 2025	Benchmarking	CodeCode Available	1
Benchmarking and Analyzing Point Cloud Classification under Corruptions	Feb 7, 2022	BenchmarkingClassification	CodeCode Available	1
Benchmarking and Analyzing Robust Point Cloud Recognition: Bag of Tricks for Defending Adversarial Examples	Jul 31, 2023	Adversarial RobustnessBenchmarking	CodeCode Available	1
Generative and reproducible benchmarks for comprehensive evaluation of machine learning classifiers	Jul 14, 2021	BenchmarkingBIG-bench Machine Learning	CodeCode Available	1
Generative Evaluation of Complex Reasoning in Large Language Models	Apr 3, 2025	BenchmarkingMemorization	CodeCode Available	1
Generative Wind Power Curve Modeling Via Machine Vision: A Self-learning Deep Convolutional Network Based Method	Aug 19, 2021	BenchmarkingSynthetic Data Generation	CodeCode Available	1
Depth-Driven Geometric Prompt Learning for Laparoscopic Liver Landmark Detection	Jun 25, 2024	BenchmarkingPrompt Learning	CodeCode Available	1
Evaluating Multimodal Representations on Visual Semantic Textual Similarity	Apr 4, 2020	BenchmarkingImage Captioning	CodeCode Available	1
CHILI: Chemically-Informed Large-scale Inorganic Nanomaterials Dataset for Advancing Graph Machine Learning	Feb 20, 2024	Atomic number classificationBenchmarking	CodeCode Available	1

Show:10 25 50

← PrevPage 33 of 222Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified