Benchmarking

Papers

Showing 5351–5400 of 5548 papers

Title	Date	Tasks	Status
Affine Non-negative Collaborative Representation Based Pattern Classification	Jul 10, 2020	Benchmarking, Classification	Code Available
Subgroup analysis methods for time-to-event outcomes in heterogeneous randomized controlled trials	Jan 22, 2024	Benchmarking, Synthetic Data Generation	Code Available
A Benchmarking Dataset with 2440 Organic Molecules for Volume Distribution at Steady State	Nov 10, 2022	Benchmarking, feature selection	Code Available
Constructing Confidence Intervals for 'the' Generalization Error -- a Comprehensive Benchmark Study	Sep 27, 2024	Benchmarking, tabular-regression	Code Available
Subjective Visual Quality Assessment for High-Fidelity Learning-Based Image Compression	Apr 7, 2025	Benchmarking, Image Compression	Code Available
Constructing a Psychometric Testbed for Fair Natural Language Processing	Nov 1, 2021	Benchmarking, Fairness	Code Available
Benchmarking down-scaled (not so large) pre-trained language models	May 11, 2021	Benchmarking	Code Available
VHAKG: A Multi-modal Knowledge Graph Based on Synchronized Multi-view Videos of Daily Activities	Aug 27, 2024	Benchmarking, Knowledge Graphs	Code Available
Constrained Reinforcement Learning for Safe Heat Pump Control	Sep 29, 2024	Benchmarking, reinforcement-learning	Code Available
Benchmarking Domain Generalization Algorithms in Computational Pathology	Sep 25, 2024	Benchmarking, Data Augmentation	Code Available
When Multi-Task Learning Meets Partial Supervision: A Computer Vision Review	Jul 25, 2023	Benchmarking, Multi-Task Learning	Code Available
XFEVER: Exploring Fact Verification across Languages	Oct 25, 2023	Benchmarking, Fact Verification	Code Available
Anomaly Detection in Large-Scale Cloud Systems: An Industry Case and Dataset	Nov 13, 2024	Anomaly Detection, Benchmarking	Code Available
ANN-Benchmarks: A Benchmarking Tool for Approximate Nearest Neighbor Algorithms	Jul 15, 2018	Benchmarking	Code Available
Benchmarking Distributional Alignment of Large Language Models	Nov 8, 2024	Benchmarking	Code Available
ConQRet: Benchmarking Fine-Grained Evaluation of Retrieval Augmented Argumentation with LLM Judges	Dec 6, 2024	Benchmarking, Retrieval	Code Available
PPM: Automated Generation of Diverse Programming Problems for Benchmarking Code Generation Models	Jan 28, 2024	Benchmarking, Code Generation	Code Available
PQA: Zero-shot Protein Question Answering for Free-form Scientific Enquiry with Large Language Models	Feb 21, 2024	Benchmarking, Form	Code Available
VideoMarkBench: Benchmarking Robustness of Video Watermarking	May 27, 2025	Benchmarking	Code Available
Connectivity Matters: Neural Network Pruning Through the Lens of Effective Sparsity	Jul 5, 2021	Benchmarking, Network Pruning	Code Available
ANNA: Abstractive Text-to-Image Synthesis with Filtered News Captions	Jan 5, 2023	Articles, Benchmarking	Code Available
Precise Benchmarking of Explainable AI Attribution Methods	Aug 6, 2023	Benchmarking, image-classification	Code Available
Trade-offs in Privacy-Preserving Eye Tracking through Iris Obfuscation: A Benchmarking Study	Apr 14, 2025	Benchmarking, Gaze Estimation	Code Available
Connecting the Dots: Graph Neural Network Powered Ensemble and Classification of Medical Images	Nov 13, 2023	Benchmarking, Classification	Code Available
PredictaBoard: Benchmarking LLM Score Predictability	Feb 20, 2025	Benchmarking, Common Sense Reasoning	Code Available
Benchmarking Differentially Private Residual Networks for Medical Imagery	May 27, 2020	Benchmarking	Code Available
An Integrated Framework for Multi-Granular Explanation of Video Summarization	May 16, 2024	Benchmarking, Panoptic Segmentation	Code Available
Benchmarking Dependence Measures to Prevent Shortcut Learning in Medical Imaging	Jul 26, 2024	Benchmarking	Code Available
GNN-Suite: a Graph Neural Network Benchmarking Framework for Biomedical Informatics	May 15, 2025	Benchmarking, Graph Neural Network	Code Available
Benchmarking Deep Spiking Neural Networks on Neuromorphic Hardware	Apr 3, 2020	Benchmarking, CPU	Code Available
SurvUnc: A Meta-Model Based Uncertainty Quantification Framework for Survival Analysis	May 20, 2025	Benchmarking, Model Optimization	Code Available
Aesthetic Image Captioning From Weakly-Labelled Photographs	Aug 29, 2019	Aesthetic Image Captioning, Benchmarking	Code Available
Benchmarking Deep Learning Models on NVIDIA Jetson Nano for Real-Time Systems: An Empirical Investigation	Jun 25, 2024	Action Detection, Benchmarking	Code Available
An implementation of the "Guess who?" game using CLIP	Nov 30, 2021	Benchmarking	Code Available
ADVIO: An authentic dataset for visual-inertial odometry	Jul 25, 2018	Benchmarking	Code Available
CONGRA: Benchmarking Automatic Conflict Resolution	Sep 21, 2024	Benchmarking	Code Available
When the Music Stops: Tip-of-the-Tongue Retrieval for Music	May 23, 2023	Benchmarking, Language Modeling	Code Available
Benchmarking Robustness of Deep Learning Classifiers Using Two-Factor Perturbation	Mar 2, 2021	Benchmarking, Deep Learning	Code Available
Present and Future Generalization of Synthetic Image Detectors	Sep 21, 2024	Benchmarking, Diversity	Code Available
SweetRS: Dataset for a recommender systems of sweets	Sep 10, 2017	Benchmarking, Matrix Completion	Code Available
PRGFlow: Benchmarking SWAP-Aware Unified Deep Visual Inertial Odometry	Jun 11, 2020	Benchmarking, Translation	Code Available
An extensible Benchmarking Graph-Mesh dataset for studying Steady-State Incompressible Navier-Stokes Equations	Jun 29, 2022	Benchmarking	Code Available
Benchmarking Deep Learning and Vision Foundation Models for Atypical vs. Normal Mitosis Classification with Cross-Dataset Evaluation	Jun 26, 2025	Benchmarking, Transfer Learning	Code Available
Vi(E)va LLM! A Conceptual Stack for Evaluating and Interpreting Generative AI-based Visualizations	Feb 3, 2024	Benchmarking	Code Available
Conditional out-of-sample generation for unpaired data using trVAE	Oct 4, 2019	Benchmarking, Decoder	Code Available
Deep Jansen-Rit Parameter Inference for Model-Driven Analysis of Brain Activity	Jun 7, 2024	Benchmarking, EEG	Code Available
Adversarial Metric Attack and Defense for Person Re-identification	Jan 30, 2019	Adversarial Attack, Benchmarking	Code Available
Conditional diffusions for amortized neural posterior estimation	Oct 24, 2024	Bayesian Inference, Benchmarking	Code Available
Where are we now? A large benchmark study of recent symbolic regression methods	Apr 25, 2018	Benchmarking, BIG-bench Machine Learning	Code Available
Transcending the Attention Paradigm: Representation Learning from Geospatial Social Media Data	Oct 9, 2023	Benchmarking, Language Modeling	Code Available

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified