Benchmarking

Papers


Showing 2601–2650 of 5548 papers

Title	Date	Tasks	Status	Score
Fine-grained Hand Gesture Recognition in Multi-viewpoint Hand Hygiene	Sep 7, 2021	Benchmarking, Fine-Grained Image Recognition	Code Available	5
Benchmarking Hierarchical Script Knowledge	Jun 1, 2019	Benchmarking	Code Available	5
Delta-Influence: Unlearning Poisons via Influence Functions	Nov 20, 2024	Attribute, Benchmarking	Code Available	5
Aesthetic Image Captioning From Weakly-Labelled Photographs	Aug 29, 2019	Aesthetic Image Captioning, Benchmarking	Code Available	5
Defense-friendly Images in Adversarial Attacks: Dataset and Metrics for Perturbation Difficulty	Nov 5, 2020	Adversarial Attack, Benchmarking	Code Available	5
Fully Automatic Segmentation of Gross Target Volume and Organs-at-Risk for Radiotherapy Planning of Nasopharyngeal Carcinoma	Oct 4, 2023	Benchmarking, Segmentation	Code Available	5
DefAn: Definitive Answer Dataset for LLMs Hallucination Evaluation	Jun 13, 2024	Benchmarking, Hallucination	Code Available	5
GenCeption: Evaluate Multimodal LLMs with Unlabeled Unimodal Data	Feb 22, 2024	Benchmarking	Code Available	5
From raw affiliations to organization identifiers	May 12, 2025	Benchmarking, Metadata quality	Code Available	5
Benchmarking Hallucination in Large Language Models based on Unanswerable Math Word Problem	Mar 6, 2024	Benchmarking, Hallucination	Code Available	5
From Variability to Stability: Advancing RecSys Benchmarking Practices	Feb 15, 2024	Benchmarking, Collaborative Filtering	Code Available	5
Benchmarking Graph Representations and Graph Neural Networks for Multivariate Time Series Classification	Jan 14, 2025	Benchmarking, Graph Representation Learning	Code Available	5
A projected nonlinear state-space model for forecasting time series signals	Nov 22, 2023	Benchmarking, Computational Efficiency	Code Available	5
From MNIST to ImageNet and Back: Benchmarking Continual Curriculum Learning	Mar 16, 2023	Benchmarking, Continual Learning	Code Available	5
From Past to Present: A Survey of Malicious URL Detection Techniques, Datasets and Code Repositories	Apr 23, 2025	Benchmarking	Code Available	5
Deep Reinforcement Learning for General Video Game AI	Jun 6, 2018	Atari Games, Benchmarking	Code Available	5
From Modern CNNs to Vision Transformers: Assessing the Performance, Robustness, and Classification Strategies of Deep Learning Models in Histopathology	Apr 11, 2022	Benchmarking, Cancer Classification	Code Available	5
Benchmarking Robust Self-Supervised Learning Across Diverse Downstream Tasks	Jul 17, 2024	Adversarial Robustness, Benchmarking	Code Available	5
From Knowledge to Reasoning: Evaluating LLMs for Ionic Liquids Research in Chemical and Biological Engineering	May 11, 2025	Benchmarking, General Knowledge	Code Available	5
FRAMES-VQA: Benchmarking Fine-Tuning Robustness across Multi-Modal Shifts in Visual Question Answering	May 27, 2025	Benchmarking, Question Answering	Code Available	5
2017 Robotic Instrument Segmentation Challenge	Feb 18, 2019	Benchmarking, Person Re-Identification	Code Available	5
Okapi: Generalising Better by Making Statistical Matches Match	Nov 7, 2022	Benchmarking, Binary Classification	Code Available	5
FR-MRInet: A Deep Convolutional Encoder-Decoder for Brain Tumor Segmentation with Relu-RGB and Sliding-window	Jul 26, 2018	Benchmarking, Brain Tumor Segmentation	Code Available	5
DeepPatent2: A Large-Scale Benchmarking Corpus for Technical Drawing Understanding	Nov 7, 2023	3D Reconstruction, Benchmarking	Code Available	5
A predictive analytics approach for stroke prediction using machine learning and neural networks	Mar 1, 2022	Benchmarking, BIG-bench Machine Learning	Code Available	5
DeepOBS: A Deep Learning Optimizer Benchmark Suite	Mar 13, 2019	Benchmarking, Deep Learning	Code Available	5
Deep Neural Network Benchmarks for Selective Classification	Jan 23, 2024	Benchmarking, Classification	Code Available	5
From Bytes to Borsch: Fine-Tuning Gemma and Mistral for the Ukrainian Language Representation	Apr 14, 2024	Benchmarking, Diversity	Code Available	5
fMRI-S4: learning short- and long-range dynamic fMRI dependencies using 1D Convolutions and State Space Models	Aug 8, 2022	Benchmarking, State Space Models	Code Available	5
GenderBench: Evaluation Suite for Gender Biases in LLMs	May 17, 2025	Benchmarking	Code Available	5
GPT4Graph: Can Large Language Models Understand Graph Structured Data ? An Empirical Evaluation and Benchmarking	May 24, 2023	Benchmarking, Graph Mining	Code Available	5
HRIBench: Benchmarking Vision-Language Models for Real-Time Human Perception in Human-Robot Interaction	Jun 25, 2025	Benchmarking, Person Identification	Code Available	5
KhabarChin: Automatic Detection of Important News in the Persian Language	Dec 6, 2023	Articles, Benchmarking	Code Available	5
Deep Nets: What have they ever done for Vision?	May 10, 2018	Benchmarking	Unverified	0
Deeply Supervised Depth Map Super-Resolution as Novel View Synthesis	Aug 27, 2018	Benchmarking, Blocking	Unverified	0
Deep Learning vs. Gradient Boosting: Benchmarking state-of-the-art machine learning algorithms for credit scoring	May 21, 2022	Benchmarking, Binary Classification	Unverified	0
Benchmarking Graph Learning for Drug-Drug Interaction Prediction	Oct 24, 2024	Benchmarking, Graph Learning	Unverified	0
Deep Learning of Intrinsically Motivated Options in the Arcade Learning Environment	Sep 29, 2021	Atari Games, Benchmarking	Unverified	0
Benchmarking GPUs on SVBRDF Extractor Model	Oct 19, 2023	Benchmarking, GPU	Unverified	0
Deep Learning Models for UAV-Assisted Bridge Inspection: A YOLO Benchmark Analysis	Nov 7, 2024	Benchmarking, Model Selection	Unverified	0
Deep Learning Logo Detection with Data Expansion by Synthesising Context	Dec 29, 2016	Benchmarking, Deep Learning	Unverified	0
Benchmarking GPU and TPU Performance with Graph Neural Networks	Oct 21, 2022	Benchmarking, GPU	Unverified	0
A practical generalization metric for deep networks benchmarking	Sep 2, 2024	Benchmarking, Diversity	Unverified	0
Deep Learning for Virtual Screening: Five Reasons to Use ROC Cost Functions	Jun 25, 2020	Benchmarking, Drug Discovery	Unverified	0
Optimal Design of Volt/VAR Control Rules of Inverters using Deep Learning	Nov 17, 2022	Benchmarking, Unity	Unverified	0
Benchmarking GPT-4 on Algorithmic Problems: A Systematic Evaluation of Prompting Strategies	Feb 27, 2024	Benchmarking, Systematic Generalization	Unverified	0
Deep learning for molecular design - a review of the state of the art	Mar 11, 2019	Benchmarking, reinforcement-learning	Unverified	0
Deep learning for extracting protein-protein interactions from biomedical literature	Jun 5, 2017	Benchmarking, Cross-corpus	Unverified	0
Approaches for benchmarking single-cell gene regulatory network inference methods	Jul 17, 2023	Benchmarking	Unverified	0
Deep learning for action spotting in association football videos	Oct 2, 2024	Action Spotting, Benchmarking	Unverified	0



Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified