Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 1551–1600 of 5548 papers

Title	Date	Tasks	Status	Score
Benchmarking Dynamic SLO Compliance in Distributed Computing Continuum Systems	Mar 5, 2025	BenchmarkingCPU	CodeCode Available	5
Answer Consolidation: Formulation and Benchmarking	Apr 29, 2022	BenchmarkingQuestion Answering	CodeCode Available	5
Benchmarking down-scaled (not so large) pre-trained language models	Sep 1, 2021	Benchmarking	CodeCode Available	5
Benchmarking down-scaled (not so large) pre-trained language models	May 11, 2021	Benchmarking	CodeCode Available	5
Large Scale Clustering with Variational EM for Gaussian Mixture Models	Oct 1, 2018	BenchmarkingClustering	CodeCode Available	5
Learned Bayesian Cramér-Rao Bound for Unknown Measurement Models Using Score Neural Networks	Feb 2, 2025	Benchmarking	CodeCode Available	5
Learning an Event Sequence Embedding for Dense Event-Based Deep Stereo	Oct 1, 2019	Benchmarking	CodeCode Available	5
Laughing Heads: Can Transformers Detect What Makes a Sentence Funny?	May 19, 2021	BenchmarkingSentence	CodeCode Available	5
Benchmarking Domain Generalization Algorithms in Computational Pathology	Sep 25, 2024	BenchmarkingData Augmentation	CodeCode Available	5
Large-scale Ridesharing DARP Instances Based on Real Travel Demand	May 30, 2023	Benchmarking	CodeCode Available	5
Benchmarking Distributional Alignment of Large Language Models	Nov 8, 2024	Benchmarking	CodeCode Available	5
A novel evaluation methodology for supervised Feature Ranking algorithms	Jul 9, 2022	BenchmarkingFeature Importance	CodeCode Available	5
Benchmarking Differentially Private Residual Networks for Medical Imagery	May 27, 2020	Benchmarking	CodeCode Available	5
LaRA: Benchmarking Retrieval-Augmented Generation and Long-Context LLMs - No Silver Bullet for LC or RAG Routing	Feb 14, 2025	BenchmarkingRAG	CodeCode Available	5
Benchmarking Dependence Measures to Prevent Shortcut Learning in Medical Imaging	Jul 26, 2024	Benchmarking	CodeCode Available	5
Language-based Image Colorization: A Benchmark and Beyond	Mar 19, 2025	BenchmarkingColorization	CodeCode Available	5
LANTERN: A Machine Learning Framework for Lipid Nanoparticle Transfection Efficiency Prediction	Jul 3, 2025	Benchmarking	CodeCode Available	5
Selecting the motion ground truth for loose-fitting wearables: benchmarking optical MoCap methods	Jul 21, 2023	Benchmarking	CodeCode Available	5
Benchmarking Deep Spiking Neural Networks on Neuromorphic Hardware	Apr 3, 2020	BenchmarkingCPU	CodeCode Available	5
Laparoscopic Image Desmoking Using the U-Net with New Loss Function and Integrated Differentiable Wiener Filter	May 27, 2025	Benchmarking	CodeCode Available	5
Large Language Models for Outpatient Referral: Problem Definition, Benchmarking and Challenges	Mar 11, 2025	Benchmarking	CodeCode Available	5
An Optical Control Environment for Benchmarking Reinforcement Learning Algorithms	Mar 23, 2022	BenchmarkingDeep Reinforcement Learning	CodeCode Available	5
LABCAT: Locally adaptive Bayesian optimization using principal-component-aligned trust regions	Nov 19, 2023	Bayesian OptimizationBenchmarking	CodeCode Available	5
Benchmarking Deep Learning Models on NVIDIA Jetson Nano for Real-Time Systems: An Empirical Investigation	Jun 25, 2024	Action DetectionBenchmarking	CodeCode Available	5
An open unified deep graph learning framework for discovering drug leads	Dec 6, 2022	BenchmarkingDrug Discovery	CodeCode Available	5
LaCViT: A Label-aware Contrastive Fine-tuning Framework for Vision Transformers	Mar 31, 2023	Benchmarkingimage-classification	CodeCode Available	5
Advancing and Benchmarking Personalized Tool Invocation for LLMs	May 7, 2025	BenchmarkingWorld Knowledge	CodeCode Available	5
Benchmarking Robustness of Deep Learning Classifiers Using Two-Factor Perturbation	Mar 2, 2021	BenchmarkingDeep Learning	CodeCode Available	5
Can LLMs replace Neil deGrasse Tyson? Evaluating the Reliability of LLMs as Science Communicators	Sep 21, 2024	Benchmarking	CodeCode Available	5
Anomaly Detection in Large-Scale Cloud Systems: An Industry Case and Dataset	Nov 13, 2024	Anomaly DetectionBenchmarking	CodeCode Available	5
SCoRE: Benchmarking Long-Chain Reasoning in Commonsense Scenarios	Mar 8, 2025	BenchmarkingDiagnostic	CodeCode Available	5
Benchmarking Deep Learning and Vision Foundation Models for Atypical vs. Normal Mitosis Classification with Cross-Dataset Evaluation	Jun 26, 2025	BenchmarkingTransfer Learning	CodeCode Available	5
Deep Jansen-Rit Parameter Inference for Model-Driven Analysis of Brain Activity	Jun 7, 2024	BenchmarkingEEG	CodeCode Available	5
Leak Proof CMap; a framework for training and evaluation of cell line agnostic L1000 similarity methods	Apr 29, 2024	BenchmarkingDrug Discovery	CodeCode Available	5
Abstraction Alignment: Comparing Model-Learned and Human-Encoded Conceptual Relationships	Jul 17, 2024	Benchmarking	CodeCode Available	5
Keep Security! Benchmarking Security Policy Preservation in Large Language Model Contexts Against Indirect Attacks in Question Answering	May 21, 2025	BenchmarkingLanguage Modeling	CodeCode Available	5
ANN-Benchmarks: A Benchmarking Tool for Approximate Nearest Neighbor Algorithms	Jul 15, 2018	Benchmarking	CodeCode Available	5
KhabarChin: Automatic Detection of Important News in the Persian Language	Dec 6, 2023	ArticlesBenchmarking	CodeCode Available	5
Knowing-how & Knowing-that: A New Task for Machine Comprehension of User Manuals	Jun 7, 2023	BenchmarkingMachine Reading Comprehension	CodeCode Available	5
Benchmarking datasets for Anomaly-based Network Intrusion Detection: KDD CUP 99 alternatives	Nov 13, 2018	BenchmarkingIntrusion Detection	CodeCode Available	5
ANNA: Abstractive Text-to-Image Synthesis with Filtered News Captions	Jan 5, 2023	ArticlesBenchmarking	CodeCode Available	5
Benchmarking Data Heterogeneity Evaluation Approaches for Personalized Federated Learning	Oct 9, 2024	BenchmarkingFairness	CodeCode Available	5
AdvancedHMC.jl: A robust, modular and efficient implementation of advanced HMC algorithms	Oct 16, 2019	Bayesian InferenceBenchmarking	CodeCode Available	5
KamNet: An Integrated Spatiotemporal Deep Neural Network for Rare Event Search in KamLAND-Zen	Mar 3, 2022	Benchmarking	CodeCode Available	5
Benchmarking Data Efficiency in Δ-ML and Multifidelity Models for Quantum Chemistry	Oct 15, 2024	Benchmarking	CodeCode Available	5
An Integrated Framework for Multi-Granular Explanation of Video Summarization	May 16, 2024	BenchmarkingPanoptic Segmentation	CodeCode Available	5
HumaniBench: A Human-Centric Framework for Large Multimodal Models Evaluation	May 16, 2025	BenchmarkingEthics	CodeCode Available	5
KArSL: Arabic Sign Language Database	Jan 1, 2021	BenchmarkingSign Language Recognition	CodeCode Available	5
Knowledge-Driven Slot Constraints for Goal-Oriented Dialogue Systems	Jun 1, 2021	BenchmarkingGoal-Oriented Dialogue Systems	CodeCode Available	5
Joint Multi-Scale Tone Mapping and Denoising for HDR Image Enhancement	Mar 16, 2023	BenchmarkingDemosaicking	CodeCode Available	5

Show:10 25 50

← PrevPage 32 of 111Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified