Benchmarking

Papers

Showing 951–975 of 5548 papers

Title	Date	Tasks	Status	Hype	Score
Benchmarking Differential Privacy and Federated Learning for BERT Models	Jun 26, 2021	Benchmarking, Federated Learning	Code Available	1	5
Accelerated and interpretable oblique random survival forests	Aug 1, 2022	Benchmarking, Computational Efficiency	Code Available	1	5
Decoding the Underlying Meaning of Multimodal Hateful Memes	May 28, 2023	Benchmarking, Hateful Meme Classification	Code Available	1	5
Benchmarking Distribution Shift in Tabular Data with TableShift	Dec 10, 2023	Benchmarking, Binary Classification	Code Available	1	5
Decoding the Enigma: Benchmarking Humans and AIs on the Many Facets of Working Memory	Jul 20, 2023	Benchmarking, Decision Making	Code Available	1	5
Mitigating Gender Bias in Captioning Systems	Jun 15, 2020	Benchmarking, Gender Prediction	Code Available	1	5
Decentralized Arena: Towards Democratic and Scalable Automatic Evaluation of Language Models	May 19, 2025	Benchmarking, Chatbot	Code Available	1	5
dEchorate: a Calibrated Room Impulse Response Database for Echo-aware Signal Processing	Apr 27, 2021	Benchmarking, Retrieval	Code Available	1	5
EventEA: Benchmarking Entity Alignment for Event-centric Knowledge Graphs	Nov 5, 2022	Attribute, Benchmarking	Code Available	1	5
Benchmarking Offline Reinforcement Learning on Real-Robot Hardware	Jul 28, 2023	Benchmarking, reinforcement-learning	Code Available	1	5
3DYoga90: A Hierarchical Video Dataset for Yoga Pose Understanding	Oct 16, 2023	Action Recognition, Benchmarking	Code Available	1	5
Benchmarking Econometric and Machine Learning Methodologies in Nowcasting	May 6, 2022	Benchmarking, BIG-bench Machine Learning	Code Available	1	5
Event Probability Mask (EPM) and Event Denoising Convolutional Neural Network (EDnCNN) for Neuromorphic Cameras	Mar 18, 2020	Benchmarking, Denoising	Code Available	1	5
Experimental Validation of Ultrasound Beamforming with End-to-End Deep Learning for Single Plane Wave Imaging	Apr 22, 2024	Benchmarking	Code Available	1	5
MMTU: A Massive Multi-Task Table Understanding and Reasoning Benchmark	Jun 5, 2025	Benchmarking	Code Available	1	5
Benchmarking Embedding Aggregation Methods in Computational Pathology: A Clinical Data Perspective	Jul 10, 2024	Benchmarking, Diagnostic	Code Available	1	5
Failure Detection in Medical Image Classification: A Reality Check and Benchmarking Testbed	May 27, 2022	Benchmarking, Binary Classification	Code Available	1	5
FedScale: Benchmarking Model and System Performance of Federated Learning at Scale	May 24, 2021	Benchmarking, Federated Learning	Code Available	1	5
Deep Learning-Based Synchronization for Uplink NB-IoT	May 22, 2022	Benchmarking, Deep Learning	Code Available	1	5
Deep Learning for ECG Analysis: Benchmarks and Insights from PTB-XL	Apr 28, 2020	All, Benchmarking	Code Available	1	5
Working Memory Capacity of ChatGPT: An Empirical Study	Apr 30, 2023	Benchmarking, Language Modeling	Code Available	1	5
Benchmarking Natural Language Understanding Services for building Conversational Agents	Mar 13, 2019	Benchmarking, General Classification	Code Available	1	5
Monash University, UEA, UCR Time Series Extrinsic Regression Archive	Jun 19, 2020	Benchmarking, Missing Values	Code Available	1	5
MONICA: Benchmarking on Long-tailed Medical Image Classification	Oct 2, 2024	Benchmarking, Classification	Code Available	1	5
Benchmarking Neural Network Generalization for Grammar Induction	Aug 16, 2023	Benchmarking	Code Available	1	5

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified