Benchmarking

Papers

Title	Date	Tasks	Status	Score
CleanPatrick: A Benchmark for Image Data Cleaning	May 16, 2025	Benchmarking, Label Error Detection	Code Available	5
Comparative Analysis: Violence Recognition from Videos using Transfer Learning	Aug 26, 2024	Action Recognition, Benchmarking	Code Available	5
BubGAN: Bubble Generative Adversarial Networks for Synthesizing Realistic Bubbly Flow Images	Sep 7, 2018	Benchmarking	Code Available	5
Integrating Expert Knowledge into Logical Programs via LLMs	Feb 17, 2025	Benchmarking, Logical Reasoning	Code Available	5
bsnsing: A decision tree induction method based on recursive optimal boolean rule composition	May 30, 2022	Benchmarking	Code Available	5
BSBench: will your LLM find the largest prime number?	Jun 5, 2025	Benchmarking	Code Available	5
Adaptive Shrinkage Estimation For Personalized Deep Kernel Regression In Modeling Brain Trajectories	Apr 10, 2025	Additive models, Benchmarking	Code Available	5
InstaIndoor and Multi-modal Deep Learning for Indoor Scene Recognition	Dec 23, 2021	Benchmarking, Deep Learning	Code Available	5
Towards Learning Universal, Regional, and Local Hydrological Behaviors via Machine-Learning Applied to Large-Sample Datasets	Jul 19, 2019	Benchmarking, BIG-bench Machine Learning	Code Available	5
Bridging the Generalisation Gap: Synthetic Data Generation for Multi-Site Clinical Model Validation	Apr 29, 2025	Benchmarking, Fairness	Code Available	5
Adaptive Power System Emergency Control using Deep Reinforcement Learning	Mar 9, 2019	Benchmarking, Deep Reinforcement Learning	Code Available	5
BRI3L: A Brightness Illusion Image Dataset for Identification and Localization of Regions of Illusory Perception	Feb 7, 2024	Benchmarking	Code Available	5
Benchmarking Abstract and Reasoning Abilities Through A Theoretical Perspective	May 28, 2025	Benchmarking, Memorization	Code Available	5
InDL: A New Dataset and Benchmark for In-Diagram Logic Interpretation based on Visual Illusion	May 28, 2023	Benchmarking, Decision Making	Code Available	5
Benchmarking 6DOF Outdoor Visual Localization in Changing Conditions	Jul 28, 2017	Autonomous Vehicles, Benchmarking	Code Available	5
IndiBias: A Benchmark Dataset to Measure Social Biases in Language Models for Indian Context	Mar 29, 2024	Benchmarking, Sentence	Code Available	5
BoxingGym: Benchmarking Progress in Automated Experimental Design and Model Discovery	Jan 2, 2025	Benchmarking, Experimental Design	Code Available	5
AnaloBench: Benchmarking the Identification of Abstract and Long-context Analogies	Feb 19, 2024	Benchmarking	Code Available	5
Improving Generalization of Neural Vehicle Routing Problem Solvers Through the Lens of Model Architecture	Jun 10, 2024	Benchmarking, Decoder	Code Available	5
Improving Pretrained Models for Zero-shot Multi-label Text Classification through Reinforced Label Hierarchy Reasoning	Apr 4, 2021	Benchmarking, Multi Label Text Classification	Code Available	5
MixMAS: A Framework for Sampling-Based Mixer Architecture Search for Multimodal Fusion and Learning	Dec 24, 2024	Benchmarking	Code Available	5
LMEMs for post-hoc analysis of HPO Benchmarking	Aug 5, 2024	Benchmarking, Hyperparameter Optimization	Code Available	5
Improvements & Evaluations on the MLCommons CloudMask Benchmark	Mar 7, 2024	Benchmarking	Code Available	5
Improving the Perturbation-Based Explanation of Deepfake Detectors Through the Use of Adversarially-Generated Samples	Feb 6, 2025	Benchmarking, DeepFake Detection	Code Available	5
Individual Fairness Guarantees for Neural Networks	May 11, 2022	Benchmarking, Fairness	Code Available	5

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified