Benchmarking

Papers

Showing 5201–5225 of 5548 papers

Title	Date	Tasks	Status
2017 Robotic Instrument Segmentation Challenge	Feb 18, 2019	Benchmarking, Person Re-Identification	Code Available
AI Fairness 360: An Extensible Toolkit for Detecting, Understanding, and Mitigating Unwanted Algorithmic Bias	Oct 3, 2018	Benchmarking, Decision Making	Code Available
Benchmarking Intersectional Biases in NLP	Jul 1, 2022	Benchmarking, BIG-bench Machine Learning	Code Available
Benchmarking Commercial Intent Detection Services with Practice-Driven Evaluations	Dec 7, 2020	Benchmarking, Goal-Oriented Dialog	Code Available
Towards Fair and Privacy-Preserving Federated Deep Models	Jun 4, 2019	Benchmarking, Deep Learning	Code Available
SPDEBench: An Extensive Benchmark for Learning Regular and Singular Stochastic PDEs	May 24, 2025	Benchmarking	Code Available
Deep Neural Network Benchmarks for Selective Classification	Jan 23, 2024	Benchmarking, Classification	Code Available
Abstraction Alignment: Comparing Model-Learned and Human-Encoded Conceptual Relationships	Jul 17, 2024	Benchmarking	Code Available
Arabic Speech Recognition by End-to-End, Modular Systems and Human	Jan 21, 2021	Arabic Speech Recognition, Automatic Speech Recognition	Code Available
Benchmarking Image Perturbations for Testing Automated Driving Assistance Systems	Jan 21, 2025	Autonomous Vehicles, Benchmarking	Code Available
Deep Metric Learning Meets Deep Clustering: An Novel Unsupervised Approach for Feature Embedding	Sep 9, 2020	Benchmarking, Clustering	Code Available
Deepened Graph Auto-Encoders Help Stabilize and Enhance Link Prediction	Mar 21, 2021	Benchmarking, Clustering	Code Available
Oral Imaging for Malocclusion Issues Assessments: OMNI Dataset, Deep Learning Baselines and Benchmarking	May 21, 2025	Benchmarking, Diagnostic	Code Available
Orchestrator-Agent Trust: A Modular Agentic AI Visual Classification System with Trust-Aware Orchestration and RAG-Based Reasoning	Jul 9, 2025	Benchmarking, Image Retrieval	Code Available
ORCHID: A Chinese Debate Corpus for Target-Independent Stance Detection and Argumentative Dialogue Summarization	Oct 17, 2024	Benchmarking, Stance Detection	Code Available
Benchmarking Human and Automated Prompting in the Segment Anything Model	Oct 29, 2024	Benchmarking, Image Segmentation	Code Available
Speech Self-Supervised Representation Benchmarking: Are We Doing it Right?	Jun 1, 2023	Benchmarking, Decoder	Code Available
Deep Emotion Recognition in Textual Conversations: A Survey	Nov 16, 2022	Benchmarking, Emotion Recognition	Code Available
Neural Style Transfer Improves 3D Cardiovascular MR Image Segmentation on Inconsistent Data	Sep 20, 2019	Benchmarking, Ensemble Learning	Code Available
OSS-Bench: Benchmark Generator for Coding LLMs	May 18, 2025	Benchmarking	Code Available
DeepDrug3D: Classification of ligand-binding pockets in proteins with a convolutional neural network	Feb 4, 2019	Benchmarking, Specificity	Code Available
deepCR: Cosmic Ray Rejection with Deep Learning	Jul 22, 2019	Benchmarking, CPU	Code Available
A quantum-classical reinforcement learning model to play Atari games	Dec 11, 2024	Atari Games, Benchmarking	Code Available
Towards Ground-truth-free Evaluation of Any Segmentation in Medical Images	Sep 23, 2024	Benchmarking, Segmentation	Code Available
Deep Attention Driven Reinforcement Learning (DAD-RL) for Autonomous Decision-Making in Dynamic Environment	Jul 12, 2024	Benchmarking, Decision Making	Code Available

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified