Benchmarking

Papers

Showing 4401–4425 of 5548 papers

Title	Date	Tasks	Status
LAVIS: A Library for Language-Vision Intelligence	Sep 15, 2022	Benchmarking, Image Captioning	Unverified
LayoutXLM vs. GNN: An Empirical Evaluation of Relation Extraction for Documents	May 9, 2022	Benchmarking, Graph Neural Network	Unverified
LCFO: Long Context and Long Form Output Dataset and Benchmarking	Dec 11, 2024	Benchmarking, Form	Unverified
LEAF: A Benchmark for Federated Settings	May 16, 2019	Autonomous Vehicles, Benchmarking	Unverified
Leaf Segmentation and Counting with Deep Learning: on Model Certainty, Test-Time Augmentation, Trade-Offs	Dec 21, 2020	Benchmarking, Plant Phenotyping	Unverified
Learning a CNN-based End-to-End Controller for a Formula SAE Racecar	Jul 12, 2017	Benchmarking, GPU	Unverified
Learning a quantum computer's capability	Apr 20, 2023	Benchmarking	Unverified
Learning a Representation with the Block-Diagonal Structure for Pattern Classification	Nov 23, 2019	Benchmarking, Classification	Unverified
Learning a Saliency Evaluation Metric Using Crowdsourced Perceptual Judgments	Jun 27, 2018	Benchmarking	Unverified
Learning Best Paths in Quantum Networks	Jun 14, 2025	Benchmarking	Unverified
Learning Disentangled Audio Representations through Controlled Synthesis	Feb 16, 2024	Benchmarking, Disentanglement	Unverified
Learning Disentangled Speech Representations	Nov 4, 2023	Benchmarking, Disentanglement	Unverified
LABCAT: Locally adaptive Bayesian optimization using principal-component-aligned trust regions	Nov 19, 2023	Bayesian Optimization, Benchmarking	Code Available
SCoRE: Benchmarking Long-Chain Reasoning in Commonsense Scenarios	Mar 8, 2025	Benchmarking, Diagnostic	Code Available
Benchmark data and method for real-time people counting in cluttered scenes using depth sensors	Apr 12, 2018	Benchmarking	Code Available
Reassessing Layer Pruning in LLMs: New Insights and Methods	Nov 23, 2024	Benchmarking, GPU	Code Available
LaCViT: A Label-aware Contrastive Fine-tuning Framework for Vision Transformers	Mar 31, 2023	Benchmarking, image-classification	Code Available
Re-Benchmarking Pool-Based Active Learning for Binary Classification	Jun 15, 2023	Active Learning, Benchmarking	Code Available
Knowledge Enhanced Conditional Imputation for Healthcare Time-series	Dec 27, 2023	Benchmarking, Imputation	Code Available
Selecting the motion ground truth for loose-fitting wearables: benchmarking optical MoCap methods	Jul 21, 2023	Benchmarking	Code Available
Knowledge-Driven Slot Constraints for Goal-Oriented Dialogue Systems	Jun 1, 2021	Benchmarking, Goal-Oriented Dialogue Systems	Code Available
CEBench: A Benchmarking Toolkit for the Cost-Effectiveness of LLM Pipelines	Jun 20, 2024	Benchmarking, Decision Making	Code Available
Causality-enhanced Decision-Making for Autonomous Mobile Robots in Dynamic Environments	Apr 16, 2025	Benchmarking, Causal Inference	Code Available
Capsule Vision 2024 Challenge: Multi-Class Abnormality Classification for Video Capsule Endoscopy	Aug 9, 2024	Benchmarking, Medical Image Analysis	Code Available
Language-based Image Colorization: A Benchmark and Beyond	Mar 19, 2025	Benchmarking, Colorization	Code Available

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified