Benchmarking

Papers

Title	Date	Tasks	Status
A Hong Kong Sign Language Corpus Collected from Sign-interpreted TV News	May 2, 2024	Benchmarking, Sign Language Recognition	Unverified
GiCCS: A German in-Context Conversational Similarity Benchmark	Dec 16, 2022	Benchmarking, Semantic Textual Similarity	Unverified
GIMMICK -- Globally Inclusive Multimodal Multitask Cultural Knowledge Benchmarking	Feb 19, 2025	Benchmarking	Unverified
GIQ: Benchmarking 3D Geometric Reasoning of Vision Foundation Models with Simulated and Real Polyhedra	Jun 9, 2025	3D Reconstruction, Benchmarking	Unverified
Beyond Single-Model Views for Deep Learning: Optimization versus Generalizability of Stochastic Optimization Algorithms	Mar 1, 2024	Benchmarking, Stochastic Optimization	Unverified
Beyond Self-Talk: A Communication-Centric Survey of LLM-Based Multi-Agent Systems	Feb 20, 2025	Benchmarking, Decision Making	Unverified
The Benchmark Lottery	Jul 14, 2021	Benchmarking, BIG-bench Machine Learning	Unverified
Global Rice Multi-Class Segmentation Dataset (RiceSEG): A Comprehensive and Diverse High-Resolution RGB-Annotated Images for the Development and Benchmarking of Rice Segmentation Algorithms	Apr 2, 2025	Benchmarking, Semantic Segmentation	Unverified
Global Wheat Head Dataset 2021: more diversity to improve the benchmarking of wheat head localization methods	May 17, 2021	Benchmarking, Diversity	Unverified
Beyond Monocular Deraining: Stereo Image Deraining via Semantic Understanding	Aug 1, 2020	Benchmarking, Rain Removal	Unverified
GLOVER++: Unleashing the Potential of Affordance Learning from Human Behaviors for Robotic Manipulation	May 17, 2025	Benchmarking	Unverified
GNNBENCH: Fair and Productive Benchmarking for Single-GPU GNN System	Apr 5, 2024	Benchmarking, GPU	Unverified
A Benchmark for Multi-speaker Anonymization	Jul 8, 2024	Benchmarking, Disentanglement	Unverified
Beyond Monocular Deraining: Parallel Stereo Deraining Network Via Semantic Prior	May 9, 2021	Benchmarking, Rain Removal	Unverified
Beyond Metrics: A Critical Analysis of the Variability in Large Language Model Evaluation Frameworks	Jul 29, 2024	Benchmarking, Language Model Evaluation	Unverified
GNUMAP: A Parameter-Free Approach to Unsupervised Dimensionality Reduction via Graph Neural Networks	Jul 30, 2024	Benchmarking, Contrastive Learning	Unverified
Goal-Driven Sequential Data Abstraction	Jul 29, 2019	Benchmarking, General Reinforcement Learning	Unverified
A Holistic Framework Towards Vision-based Traffic Signal Control with Microscopic Simulation	Mar 11, 2024	Benchmarking, Traffic Signal Control	Unverified
Domain Adaptation with Joint Learning for Generic, Optical Car Part Recognition and Detection Systems (Go-CaRD)	Jun 15, 2020	Benchmarking, Domain Adaptation	Unverified
Beyond Emotion: A Multi-Modal Dataset for Human Desire Understanding	Jul 1, 2022	Benchmarking	Unverified
The Brain Tumor Segmentation (BraTS-METS) Challenge 2023: Brain Metastasis Segmentation on Pre-treatment MRI	Jun 1, 2023	Benchmarking, Brain Tumor Segmentation	Unverified
GoodDrag: Towards Good Practices for Drag Editing with Diffusion Models	Apr 10, 2024	Benchmarking, Denoising	Unverified
GreenPCO: An Unsupervised Lightweight Point Cloud Odometry Method	Dec 8, 2021	Benchmarking, Object	Unverified
Ahead-of-Time P-Tuning	May 18, 2023	Benchmarking, parameter-efficient fine-tuning	Unverified
Beyond Emotion: A Multi-Modal Dataset for Human Desire Understanding	Jan 16, 2022	Benchmarking	Unverified

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified