Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 4401–4450 of 5548 papers

Title	Date	Tasks	Status
LAVIS: A Library for Language-Vision Intelligence	Sep 15, 2022	BenchmarkingImage Captioning	—Unverified
LayoutXLM vs. GNN: An Empirical Evaluation of Relation Extraction for Documents	May 9, 2022	BenchmarkingGraph Neural Network	—Unverified
LCFO: Long Context and Long Form Output Dataset and Benchmarking	Dec 11, 2024	BenchmarkingForm	—Unverified
LEAF: A Benchmark for Federated Settings	May 16, 2019	Autonomous VehiclesBenchmarking	—Unverified
Leaf Segmentation and Counting with Deep Learning: on Model Certainty, Test-Time Augmentation, Trade-Offs	Dec 21, 2020	BenchmarkingPlant Phenotyping	—Unverified
Learning a CNN-based End-to-End Controller for a Formula SAE Racecar	Jul 12, 2017	BenchmarkingGPU	—Unverified
Learning a quantum computer's capability	Apr 20, 2023	Benchmarking	—Unverified
Learning a Representation with the Block-Diagonal Structure for Pattern Classification	Nov 23, 2019	BenchmarkingClassification	—Unverified
Learning a Saliency Evaluation Metric Using Crowdsourced Perceptual Judgments	Jun 27, 2018	Benchmarking	—Unverified
Learning Best Paths in Quantum Networks	Jun 14, 2025	Benchmarking	—Unverified
Learning Disentangled Audio Representations through Controlled Synthesis	Feb 16, 2024	BenchmarkingDisentanglement	—Unverified
Learning Disentangled Speech Representations	Nov 4, 2023	BenchmarkingDisentanglement	—Unverified
LABCAT: Locally adaptive Bayesian optimization using principal-component-aligned trust regions	Nov 19, 2023	Bayesian OptimizationBenchmarking	CodeCode Available
SCoRE: Benchmarking Long-Chain Reasoning in Commonsense Scenarios	Mar 8, 2025	BenchmarkingDiagnostic	CodeCode Available
Benchmark data and method for real-time people counting in cluttered scenes using depth sensors	Apr 12, 2018	Benchmarking	CodeCode Available
Reassessing Layer Pruning in LLMs: New Insights and Methods	Nov 23, 2024	BenchmarkingGPU	CodeCode Available
LaCViT: A Label-aware Contrastive Fine-tuning Framework for Vision Transformers	Mar 31, 2023	Benchmarkingimage-classification	CodeCode Available
Re-Benchmarking Pool-Based Active Learning for Binary Classification	Jun 15, 2023	Active LearningBenchmarking	CodeCode Available
Knowledge Enhanced Conditional Imputation for Healthcare Time-series	Dec 27, 2023	BenchmarkingImputation	CodeCode Available
Selecting the motion ground truth for loose-fitting wearables: benchmarking optical MoCap methods	Jul 21, 2023	Benchmarking	CodeCode Available
Knowledge-Driven Slot Constraints for Goal-Oriented Dialogue Systems	Jun 1, 2021	BenchmarkingGoal-Oriented Dialogue Systems	CodeCode Available
CEBench: A Benchmarking Toolkit for the Cost-Effectiveness of LLM Pipelines	Jun 20, 2024	BenchmarkingDecision Making	CodeCode Available
Causality-enhanced Decision-Making for Autonomous Mobile Robots in Dynamic Environments	Apr 16, 2025	BenchmarkingCausal Inference	CodeCode Available
Capsule Vision 2024 Challenge: Multi-Class Abnormality Classification for Video Capsule Endoscopy	Aug 9, 2024	BenchmarkingMedical Image Analysis	CodeCode Available
Language-based Image Colorization: A Benchmark and Beyond	Mar 19, 2025	BenchmarkingColorization	CodeCode Available
TF1-EN-3M: Three Million Synthetic Moral Fables for Training Small, Open Language Models	Apr 29, 2025	BenchmarkingDataset Generation	CodeCode Available
BenchENAS: A Benchmarking Platform for Evolutionary Neural Architecture Search	Dec 1, 2022	BenchmarkingGPU	CodeCode Available
Knowing-how & Knowing-that: A New Task for Machine Comprehension of User Manuals	Jun 7, 2023	BenchmarkingMachine Reading Comprehension	CodeCode Available
TFW2V: An Enhanced Document Similarity Method for the Morphologically Rich Finnish Language	Dec 23, 2021	BenchmarkingClustering	CodeCode Available
Can Tree Based Approaches Surpass Deep Learning in Anomaly Detection? A Benchmarking Study	Feb 11, 2024	Anomaly DetectionBenchmarking	CodeCode Available
LANTERN: A Machine Learning Framework for Lipid Nanoparticle Transfection Efficiency Prediction	Jul 3, 2025	Benchmarking	CodeCode Available
Laparoscopic Image Desmoking Using the U-Net with New Loss Function and Integrated Differentiable Wiener Filter	May 27, 2025	Benchmarking	CodeCode Available
LaRA: Benchmarking Retrieval-Augmented Generation and Long-Context LLMs - No Silver Bullet for LC or RAG Routing	Feb 14, 2025	BenchmarkingRAG	CodeCode Available
Can LLMs Grasp Implicit Cultural Values? Benchmarking LLMs' Metacognitive Cultural Intelligence with CQ-Bench	Apr 1, 2025	Benchmarking	CodeCode Available
Recurrent Quantum Neural Networks	Jun 25, 2020	BenchmarkingBIG-bench Machine Learning	CodeCode Available
KhabarChin: Automatic Detection of Important News in the Persian Language	Dec 6, 2023	ArticlesBenchmarking	CodeCode Available
Can geometric combinatorics improve RNA branching predictions?	Mar 26, 2025	Benchmarking	CodeCode Available
BenchENAS: A Benchmarking Platform for Evolutionary Neural Architecture Search	Aug 9, 2021	BenchmarkingGPU	CodeCode Available
Can a single neuron learn predictive uncertainty?	Jun 7, 2021	BenchmarkingConformal Prediction	CodeCode Available
Keep Security! Benchmarking Security Policy Preservation in Large Language Model Contexts Against Indirect Attacks in Question Answering	May 21, 2025	BenchmarkingLanguage Modeling	CodeCode Available
Large Language Models for Outpatient Referral: Problem Definition, Benchmarking and Challenges	Mar 11, 2025	Benchmarking	CodeCode Available
Reference Matters: Benchmarking Factual Error Correction for Dialogue Summarization with Fine-grained Evaluation Framework	Jun 8, 2023	Benchmarking	CodeCode Available
KArSL: Arabic Sign Language Database	Jan 1, 2021	BenchmarkingSign Language Recognition	CodeCode Available
Can AI Validate Science? Benchmarking LLMs for Accurate Scientific Claim Evidence Reasoning	Jun 9, 2025	BenchmarkingDiagnostic	CodeCode Available
JavaBench: A Benchmark of Object-Oriented Code Generation for Evaluating Large Language Models	Jun 10, 2024	BenchmarkingCode Generation	CodeCode Available
TGB-Seq Benchmark: Challenging Temporal GNNs with Complex Sequential Dynamics	Feb 5, 2025	BenchmarkingLink Prediction	CodeCode Available
Refining Joint Text and Source Code Embeddings for Retrieval Task with Parameter-Efficient Fine-Tuning	May 7, 2024	BenchmarkingContrastive Learning	CodeCode Available
KamNet: An Integrated Spatiotemporal Deep Neural Network for Rare Event Search in KamLAND-Zen	Mar 3, 2022	Benchmarking	CodeCode Available
Joint Multi-Scale Tone Mapping and Denoising for HDR Image Enhancement	Mar 16, 2023	BenchmarkingDemosaicking	CodeCode Available
Ref-Long: Benchmarking the Long-context Referencing Capability of Long-context Language Models	Jul 13, 2025	AttributeBenchmarking	CodeCode Available

Show:10 25 50

← PrevPage 89 of 111Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified