Benchmarking

Papers

Showing 5351–5375 of 5548 papers

Title	Date	Tasks	Status
Affine Non-negative Collaborative Representation Based Pattern Classification	Jul 10, 2020	Benchmarking, Classification	Code Available
Subgroup analysis methods for time-to-event outcomes in heterogeneous randomized controlled trials	Jan 22, 2024	Benchmarking, Synthetic Data Generation	Code Available
A Benchmarking Dataset with 2440 Organic Molecules for Volume Distribution at Steady State	Nov 10, 2022	Benchmarking, feature selection	Code Available
Constructing Confidence Intervals for 'the' Generalization Error -- a Comprehensive Benchmark Study	Sep 27, 2024	Benchmarking, tabular-regression	Code Available
Subjective Visual Quality Assessment for High-Fidelity Learning-Based Image Compression	Apr 7, 2025	Benchmarking, Image Compression	Code Available
Constructing a Psychometric Testbed for Fair Natural Language Processing	Nov 1, 2021	Benchmarking, Fairness	Code Available
Benchmarking down-scaled (not so large) pre-trained language models	May 11, 2021	Benchmarking	Code Available
VHAKG: A Multi-modal Knowledge Graph Based on Synchronized Multi-view Videos of Daily Activities	Aug 27, 2024	Benchmarking, Knowledge Graphs	Code Available
Constrained Reinforcement Learning for Safe Heat Pump Control	Sep 29, 2024	Benchmarking, reinforcement-learning	Code Available
Benchmarking Domain Generalization Algorithms in Computational Pathology	Sep 25, 2024	Benchmarking, Data Augmentation	Code Available
When Multi-Task Learning Meets Partial Supervision: A Computer Vision Review	Jul 25, 2023	Benchmarking, Multi-Task Learning	Code Available
XFEVER: Exploring Fact Verification across Languages	Oct 25, 2023	Benchmarking, Fact Verification	Code Available
Anomaly Detection in Large-Scale Cloud Systems: An Industry Case and Dataset	Nov 13, 2024	Anomaly Detection, Benchmarking	Code Available
ANN-Benchmarks: A Benchmarking Tool for Approximate Nearest Neighbor Algorithms	Jul 15, 2018	Benchmarking	Code Available
Benchmarking Distributional Alignment of Large Language Models	Nov 8, 2024	Benchmarking	Code Available
ConQRet: Benchmarking Fine-Grained Evaluation of Retrieval Augmented Argumentation with LLM Judges	Dec 6, 2024	Benchmarking, Retrieval	Code Available
PPM: Automated Generation of Diverse Programming Problems for Benchmarking Code Generation Models	Jan 28, 2024	Benchmarking, Code Generation	Code Available
PQA: Zero-shot Protein Question Answering for Free-form Scientific Enquiry with Large Language Models	Feb 21, 2024	Benchmarking, Form	Code Available
VideoMarkBench: Benchmarking Robustness of Video Watermarking	May 27, 2025	Benchmarking	Code Available
Connectivity Matters: Neural Network Pruning Through the Lens of Effective Sparsity	Jul 5, 2021	Benchmarking, Network Pruning	Code Available
ANNA: Abstractive Text-to-Image Synthesis with Filtered News Captions	Jan 5, 2023	Articles, Benchmarking	Code Available
Precise Benchmarking of Explainable AI Attribution Methods	Aug 6, 2023	Benchmarking, image-classification	Code Available
Trade-offs in Privacy-Preserving Eye Tracking through Iris Obfuscation: A Benchmarking Study	Apr 14, 2025	Benchmarking, Gaze Estimation	Code Available
Connecting the Dots: Graph Neural Network Powered Ensemble and Classification of Medical Images	Nov 13, 2023	Benchmarking, Classification	Code Available
PredictaBoard: Benchmarking LLM Score Predictability	Feb 20, 2025	Benchmarking, Common Sense Reasoning	Code Available

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified