Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 4601–4625 of 5548 papers

Title	Date	Tasks	Status
The eBible Corpus: Data and Model Benchmarks for Bible Translation for Low-Resource Languages	Apr 19, 2023	BenchmarkingMachine Translation	CodeCode Available
LoopDB: A Loop Closure Dataset for Large Scale Simultaneous Localization and Mapping	Jun 7, 2025	BenchmarkingSimultaneous Localization and Mapping	CodeCode Available
Bilingual BSARD: Extending Statutory Article Retrieval to Dutch	Dec 10, 2024	ArticlesBenchmarking	CodeCode Available
Hyperparameter-Free Losses for Model-Based Monocular Reconstruction	Aug 16, 2019	3D ReconstructionBenchmarking	CodeCode Available
Lost in Benchmarks? Rethinking Large Language Model Benchmarking with Item Response Theory	May 21, 2025	BenchmarkingLanguage Modeling	CodeCode Available
Hyperopt-Sklearn: Automatic Hyperparameter Configuration for Scikit-Learn	Jan 1, 2014	AutoMLBenchmarking	CodeCode Available
Hyperbolic Benchmarking Unveils Network Topology-Feature Relationship in GNN Performance	Jun 4, 2024	BenchmarkingDrug Discovery	CodeCode Available
Bias Reduction via Cooperative Bargaining in Synthetic Graph Dataset Generation	May 27, 2022	BenchmarkingDataset Generation	CodeCode Available
Low Complexity Hybrid Beamforming for mmWave Full-Duplex Integrated Access and Backhaul	Sep 5, 2022	Benchmarking	CodeCode Available
Bias Analysis and Mitigation in the Evaluation of Authorship Verification	Jul 1, 2019	Authorship VerificationBenchmarking	CodeCode Available
Beyond Supervised vs. Unsupervised: Representative Benchmarking and Analysis of Image Representation Learning	Jun 16, 2022	BenchmarkingClustering	CodeCode Available
Balancing policy constraint and ensemble size in uncertainty-based offline reinforcement learning	Mar 26, 2023	Behavioural cloningBenchmarking	CodeCode Available
AnaloBench: Benchmarking the Identification of Abstract and Long-context Analogies	Feb 19, 2024	Benchmarking	CodeCode Available
Hybrid Random Features	Oct 8, 2021	Benchmarking	CodeCode Available
Beyond Slow Signs in High-fidelity Model Extraction	Jun 14, 2024	Benchmarkingmodel	CodeCode Available
Hybrid Machine Learning Models of Classifying Residential Requests for Smart Dispatching	Dec 22, 2019	BenchmarkingBIG-bench Machine Learning	CodeCode Available
BaDLAD: A Large Multi-Domain Bengali Document Layout Analysis Dataset	Mar 9, 2023	BenchmarkingDeep Learning	CodeCode Available
HuSc3D: Human Sculpture dataset for 3D object reconstruction	Jun 9, 2025	3D Object Reconstruction3D Reconstruction	CodeCode Available
LVLM-Compress-Bench: Benchmarking the Broader Impact of Large Vision-Language Model Compression	Mar 6, 2025	BenchmarkingCommon Sense Reasoning	CodeCode Available
HSSBench: Benchmarking Humanities and Social Sciences Ability for Multimodal Large Language Models	Jun 4, 2025	BenchmarkingGeneral Knowledge	CodeCode Available
Beyond Optimism: Exploration With Partially Observable Rewards	Jun 20, 2024	BenchmarkingReinforcement Learning (RL)	CodeCode Available
M3Dsynth: A dataset of medical 3D images with AI-generated local manipulations	Sep 14, 2023	BenchmarkingComputed Tomography (CT)	CodeCode Available
M4Fog: A Global Multi-Regional, Multi-Modal, and Multi-Stage Dataset for Marine Fog Detection and Forecasting to Bridge Ocean and Atmosphere	Jun 19, 2024	BenchmarkingSpatio-Temporal Forecasting	CodeCode Available
The Elusive Pursuit of Reproducing PATE-GAN: Benchmarking, Auditing, Debugging	Jun 20, 2024	Benchmarking	CodeCode Available
Back to Basics: Benchmarking Canonical Evolution Strategies for Playing Atari	Feb 24, 2018	Atari GamesBenchmarking	CodeCode Available

Show:10 25 50

← PrevPage 185 of 222Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified