Benchmarking

Papers

Showing 2601–2625 of 5548 papers

Title	Date	Tasks	Status	Score
Fine-grained Hand Gesture Recognition in Multi-viewpoint Hand Hygiene	Sep 7, 2021	Benchmarking, Fine-Grained Image Recognition	Code Available	5
FR-MRInet: A Deep Convolutional Encoder-Decoder for Brain Tumor Segmentation with Relu-RGB and Sliding-window	Jul 26, 2018	Benchmarking, Brain Tumor Segmentation	Code Available	5
From Bytes to Borsch: Fine-Tuning Gemma and Mistral for the Ukrainian Language Representation	Apr 14, 2024	Benchmarking, Diversity	Code Available	5
Aesthetic Image Captioning From Weakly-Labelled Photographs	Aug 29, 2019	Aesthetic Image Captioning, Benchmarking	Code Available	5
Defense-friendly Images in Adversarial Attacks: Dataset and Metrics for Perturbation Difficulty	Nov 5, 2020	Adversarial Attack, Benchmarking	Code Available	5
DefAn: Definitive Answer Dataset for LLMs Hallucination Evaluation	Jun 13, 2024	Benchmarking, Hallucination	Code Available	5
FRAMES-VQA: Benchmarking Fine-Tuning Robustness across Multi-Modal Shifts in Visual Question Answering	May 27, 2025	Benchmarking, Question Answering	Code Available	5
From Modern CNNs to Vision Transformers: Assessing the Performance, Robustness, and Classification Strategies of Deep Learning Models in Histopathology	Apr 11, 2022	Benchmarking, Cancer Classification	Code Available	5
Benchmarking Hallucination in Large Language Models based on Unanswerable Math Word Problem	Mar 6, 2024	Benchmarking, Hallucination	Code Available	5
Benchmarking Graph Representations and Graph Neural Networks for Multivariate Time Series Classification	Jan 14, 2025	Benchmarking, Graph Representation Learning	Code Available	5
A projected nonlinear state-space model for forecasting time series signals	Nov 22, 2023	Benchmarking, Computational Efficiency	Code Available	5
First-frame Supervised Video Polyp Segmentation via Propagative and Semantic Dual-teacher Network	Dec 21, 2024	Benchmarking, Transfer Learning	Code Available	5
FORLORN: A Framework for Comparing Offline Methods and Reinforcement Learning for Optimization of RAN Parameters	Sep 8, 2022	Benchmarking, continuous-control	Code Available	5
Deep Reinforcement Learning for General Video Game AI	Jun 6, 2018	Atari Games, Benchmarking	Code Available	5
Forecasting Future International Events: A Reliable Dataset for Text-Based Event Modeling	Nov 21, 2024	Articles, Benchmarking	Code Available	5
Forecasting time series with constraints	Feb 14, 2025	Additive models, Benchmarking	Code Available	5
FlexMol: A Flexible Toolkit for Benchmarking Molecular Relational Learning	Oct 19, 2024	Benchmarking, Drug Discovery	Code Available	5
Benchmarking Robust Self-Supervised Learning Across Diverse Downstream Tasks	Jul 17, 2024	Adversarial Robustness, Benchmarking	Code Available	5
2017 Robotic Instrument Segmentation Challenge	Feb 18, 2019	Benchmarking, Person Re-Identification	Code Available	5
fMRI-S4: learning short- and long-range dynamic fMRI dependencies using 1D Convolutions and State Space Models	Aug 8, 2022	Benchmarking, State Space Models	Code Available	5
DeepPatent2: A Large-Scale Benchmarking Corpus for Technical Drawing Understanding	Nov 7, 2023	3D Reconstruction, Benchmarking	Code Available	5
A predictive analytics approach for stroke prediction using machine learning and neural networks	Mar 1, 2022	Benchmarking, BIG-bench Machine Learning	Code Available	5
Fluorescence Reference Target Quantitative Analysis Library	Apr 22, 2025	Benchmarking	Code Available	5
DeepOBS: A Deep Learning Optimizer Benchmark Suite	Mar 13, 2019	Benchmarking, Deep Learning	Code Available	5
Deep Neural Network Benchmarks for Selective Classification	Jan 23, 2024	Benchmarking, Classification	Code Available	5

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified