Benchmarking

Papers

Showing 4601–4650 of 5548 papers

Title	Date	Tasks	Status
The eBible Corpus: Data and Model Benchmarks for Bible Translation for Low-Resource Languages	Apr 19, 2023	Benchmarking, Machine Translation	Code Available
LoopDB: A Loop Closure Dataset for Large Scale Simultaneous Localization and Mapping	Jun 7, 2025	Benchmarking, Simultaneous Localization and Mapping	Code Available
Bilingual BSARD: Extending Statutory Article Retrieval to Dutch	Dec 10, 2024	Articles, Benchmarking	Code Available
Hyperparameter-Free Losses for Model-Based Monocular Reconstruction	Aug 16, 2019	3D Reconstruction, Benchmarking	Code Available
Lost in Benchmarks? Rethinking Large Language Model Benchmarking with Item Response Theory	May 21, 2025	Benchmarking, Language Modeling	Code Available
Hyperopt-Sklearn: Automatic Hyperparameter Configuration for Scikit-Learn	Jan 1, 2014	AutoML, Benchmarking	Code Available
Hyperbolic Benchmarking Unveils Network Topology-Feature Relationship in GNN Performance	Jun 4, 2024	Benchmarking, Drug Discovery	Code Available
Bias Reduction via Cooperative Bargaining in Synthetic Graph Dataset Generation	May 27, 2022	Benchmarking, Dataset Generation	Code Available
Low Complexity Hybrid Beamforming for mmWave Full-Duplex Integrated Access and Backhaul	Sep 5, 2022	Benchmarking	Code Available
Bias Analysis and Mitigation in the Evaluation of Authorship Verification	Jul 1, 2019	Authorship Verification, Benchmarking	Code Available
Beyond Supervised vs. Unsupervised: Representative Benchmarking and Analysis of Image Representation Learning	Jun 16, 2022	Benchmarking, Clustering	Code Available
Balancing policy constraint and ensemble size in uncertainty-based offline reinforcement learning	Mar 26, 2023	Behavioural cloning, Benchmarking	Code Available
AnaloBench: Benchmarking the Identification of Abstract and Long-context Analogies	Feb 19, 2024	Benchmarking	Code Available
Hybrid Random Features	Oct 8, 2021	Benchmarking	Code Available
Beyond Slow Signs in High-fidelity Model Extraction	Jun 14, 2024	Benchmarking, model	Code Available
Hybrid Machine Learning Models of Classifying Residential Requests for Smart Dispatching	Dec 22, 2019	Benchmarking, BIG-bench Machine Learning	Code Available
BaDLAD: A Large Multi-Domain Bengali Document Layout Analysis Dataset	Mar 9, 2023	Benchmarking, Deep Learning	Code Available
HuSc3D: Human Sculpture dataset for 3D object reconstruction	Jun 9, 2025	3D Object Reconstruction, 3D Reconstruction	Code Available
LVLM-Compress-Bench: Benchmarking the Broader Impact of Large Vision-Language Model Compression	Mar 6, 2025	Benchmarking, Common Sense Reasoning	Code Available
HSSBench: Benchmarking Humanities and Social Sciences Ability for Multimodal Large Language Models	Jun 4, 2025	Benchmarking, General Knowledge	Code Available
Beyond Optimism: Exploration With Partially Observable Rewards	Jun 20, 2024	Benchmarking, Reinforcement Learning (RL)	Code Available
M3Dsynth: A dataset of medical 3D images with AI-generated local manipulations	Sep 14, 2023	Benchmarking, Computed Tomography (CT)	Code Available
M4Fog: A Global Multi-Regional, Multi-Modal, and Multi-Stage Dataset for Marine Fog Detection and Forecasting to Bridge Ocean and Atmosphere	Jun 19, 2024	Benchmarking, Spatio-Temporal Forecasting	Code Available
The Elusive Pursuit of Reproducing PATE-GAN: Benchmarking, Auditing, Debugging	Jun 20, 2024	Benchmarking	Code Available
Back to Basics: Benchmarking Canonical Evolution Strategies for Playing Atari	Feb 24, 2018	Atari Games, Benchmarking	Code Available
Machine-assisted quantitizing designs: augmenting humanities and social sciences with artificial intelligence	Sep 24, 2023	Benchmarking, Change Detection	Code Available
Beyond Marginal Uncertainty: How Accurately can Bayesian Regression Models Estimate Posterior Predictive Correlations?	Nov 6, 2020	Active Learning, Benchmarking	Code Available
Machine learning classification of non-Markovian noise disturbing quantum dynamics	Jan 8, 2021	Benchmarking, BIG-bench Machine Learning	Code Available
Machine Learning Automation Toolbox (MLaut)	Jan 11, 2019	Benchmarking, BIG-bench Machine Learning	Code Available
3D fluorescence microscopy data synthesis for segmentation and benchmarking	Jul 21, 2021	Benchmarking	Code Available
Machine Learning Cryptanalysis of a Quantum Random Number Generator	May 7, 2019	Benchmarking, BIG-bench Machine Learning	Code Available
Visual-RAG: Benchmarking Text-to-Image Retrieval Augmented Generation for Visual Knowledge Intensive Queries	Feb 23, 2025	Benchmarking, Image Retrieval	Code Available
Visual-Inertial SLAM for Unstructured Outdoor Environments: Benchmarking the Benefits and Computational Costs of Loop Closing	Aug 3, 2024	Autonomous Navigation, Benchmarking	Code Available
Machine-learning for photoplethysmography analysis: Benchmarking feature, image, and signal-based approaches	Feb 27, 2025	Benchmarking, Photoplethysmography (PPG)	Code Available
Beyond Document Page Classification: Design, Datasets, and Challenges	Aug 24, 2023	Benchmarking, Classification	Code Available
HR-VILAGE-3K3M: A Human Respiratory Viral Immunization Longitudinal Gene Expression Dataset for Systems Immunity	May 19, 2025	Benchmarking, feature selection	Code Available
VizNet: Towards A Large-Scale Visualization Learning and Benchmarking Repository	May 12, 2019	Benchmarking	Code Available
HRNET: AI on Edge for mask detection and social distancing	Nov 30, 2021	Benchmarking, Edge-computing	Code Available
HRIBench: Benchmarking Vision-Language Models for Real-Time Human Perception in Human-Robot Interaction	Jun 25, 2025	Benchmarking, Person Identification	Code Available
How to Manage Tiny Machine Learning at Scale: An Industrial Perspective	Feb 18, 2022	Benchmarking, BIG-bench Machine Learning	Code Available
AMQA: An Adversarial Dataset for Benchmarking Bias of LLMs in Medicine and Healthcare	May 26, 2025	Benchmarking, Medical Diagnosis	Code Available
Towards Segment Anything Model (SAM) for Medical Image Segmentation: A Survey	May 5, 2023	Benchmarking, Image Generation	Code Available
How Far Are We from Optimal Reasoning Efficiency?	Jun 8, 2025	16k, Benchmarking	Code Available
Magnetic Resonance Imaging Feature-Based Subtyping and Model Ensemble for Enhanced Brain Tumor Segmentation	Dec 5, 2024	Benchmarking, Brain Tumor Segmentation	Code Available
Mahalanobis k-NN: A Statistical Lens for Robust Point-Cloud Registrations	Sep 10, 2024	Benchmarking, Point Cloud Registration	Code Available
Beyond Atomic Geometry Representations in Materials Science: A Human-in-the-Loop Multimodal Framework	May 30, 2025	Benchmarking	Code Available
Beyond Accuracy: A Consolidated Tool for Visual Question Answering Benchmarking	Oct 11, 2021	Benchmarking, Question Answering	Code Available
Malliavin-Mancino estimators implemented with non-uniform fast Fourier transforms	Mar 5, 2020	Benchmarking	Code Available
HopaDIFF: Holistic-Partial Aware Fourier Conditioned Diffusion for Referring Human Action Segmentation in Multi-Person Scenarios	Jun 11, 2025	Action Recognition, Action Segmentation	Code Available
HOEG: A New Approach for Object-Centric Predictive Process Monitoring	Apr 8, 2024	Benchmarking, Graph Neural Network	Code Available

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified