Benchmarking

Papers

Showing 2051–2100 of 5548 papers

Title	Date	Tasks	Status	Score
ImpliRet: Benchmarking the Implicit Fact Retrieval Challenge	Jun 17, 2025	Benchmarking, Retrieval	Code Available	5
Improved Multilingual Language Model Pretraining for Social Media Text via Translation Pair Prediction	Oct 20, 2021	Benchmarking, Language Modeling	Code Available	5
Learning from Integral Losses in Physics Informed Neural Networks	May 27, 2023	Benchmarking	Code Available	5
Benchmarking Traditional Machine Learning and Deep Learning Models for Fault Detection in Power Transformers	May 7, 2025	Benchmarking, Fault Detection	Code Available	5
Benchmarking TPU, GPU, and CPU Platforms for Deep Learning	Jul 24, 2019	Benchmarking, CPU	Code Available	5
ImmersePro: End-to-End Stereo Video Synthesis Via Implicit Disparity Learning	Sep 30, 2024	Benchmarking, Disparity Estimation	Code Available	5
A Baseline Statistical Method For Robust User-Assisted Multiple Segmentation	Jan 8, 2022	Benchmarking, Image Segmentation	Code Available	5
Benchmarking Top-K Keyword and Top-K Document Processing with T^2K^2 and T^2K^2D^2	Apr 20, 2018	Benchmarking	Code Available	5
Benchmarking tools for a priori identifiability analysis	Jul 20, 2022	Benchmarking	Code Available	5
Automatic benchmarking of large multimodal models via iterative experiment programming	Jun 18, 2024	Benchmarking, Language Modeling	Code Available	5
Automated Text-to-Table for Reasoning-Intensive Table QA: Pipeline Design and Benchmarking Insights	May 26, 2025	Benchmarking, Question Answering	Code Available	5
Benchmarking time series classification -- Functional data vs machine learning approaches	Nov 18, 2019	Additive models, Benchmarking	Code Available	5
A Linear Constrained Optimization Benchmark For Probabilistic Search Algorithms: The Rotated Klee-Minty Problem	Jul 26, 2018	Benchmarking, Evolutionary Algorithms	Code Available	5
A Continuous Information Gain Measure to Find the Most Discriminatory Problems for AI Benchmarking	Sep 9, 2018	Benchmarking, Game Design	Code Available	5
Illuminating the Diversity-Fitness Trade-Off in Black-Box Optimization	Aug 29, 2024	Benchmarking, Diversity	Code Available	5
Benchmarking the Robustness of UAV Tracking Against Common Corruptions	Mar 18, 2024	Benchmarking	Code Available	5
Illusory VQA: Benchmarking and Enhancing Multimodal Models on Visual Illusions	Dec 11, 2024	Benchmarking, Question Answering	Code Available	5
Immunofluorescence Capillary Imaging Segmentation: Cases Study	Jul 14, 2022	Benchmarking, Image Segmentation	Code Available	5
IHCV: Discovery of Hidden Time-Dependent Control Variables in Non-Linear Dynamical Systems	Apr 5, 2023	Benchmarking	Code Available	5
IJCB 2022 Mobile Behavioral Biometrics Competition (MobileB2C)	Oct 6, 2022	Benchmarking	Code Available	5
Benchmarking the Robustness of Optical Flow Estimation to Corruptions	Nov 22, 2024	Autonomous Driving, Benchmarking	Code Available	5
Automated Detection of Label Errors in Semantic Segmentation Datasets via Deep Learning and Uncertainty Quantification	Jul 13, 2022	Benchmarking, Label Error Detection	Code Available	5
A Context-Aware Citation Recommendation Model with BERT and Graph Convolutional Networks	Mar 15, 2019	Benchmarking, Citation Recommendation	Code Available	5
Identifying Money Laundering Subgraphs on the Blockchain	Oct 10, 2024	Benchmarking	Code Available	5
Identifying the Smallest Adversarial Load Perturbations that Render DC-OPF Infeasible	Jul 10, 2025	Adversarial Attack, Benchmarking	Code Available	5
Automated deep learning segmentation of high-resolution 7 T postmortem MRI for quantitative analysis of structure-pathology correlations in neurodegenerative diseases	Mar 21, 2023	Anatomy, Benchmarking	Code Available	5
IceBench: A Benchmark for Deep Learning based Sea Ice Type Classification	Mar 22, 2025	Benchmarking, Classification	Code Available	5
Integrating Large Language Models and Knowledge Graphs for Extraction and Validation of Textual Test Data	Aug 3, 2024	Benchmarking, Knowledge Graphs	Code Available	5
IdeaBench: Benchmarking Large Language Models for Research Idea Generation	Oct 31, 2024	Benchmarking, scientific discovery	Code Available	5
AutoJudger: An Agent-Driven Framework for Efficient Benchmarking of MLLMs	May 27, 2025	Benchmarking, Question Selection	Code Available	5
HypoTermQA: Hypothetical Terms Dataset for Benchmarking Hallucination Tendency of LLMs	Feb 25, 2024	Benchmarking, Chatbot	Code Available	5
Identifying and Benchmarking Natural Out-of-Context Prediction Problems	Oct 25, 2021	Benchmarking	Code Available	5
Impact of ImageNet Model Selection on Domain Adaptation	Feb 6, 2020	Benchmarking, Domain Adaptation	Code Available	5
Benchmarking the Linear Algebra Awareness of TensorFlow and PyTorch	Feb 20, 2022	Benchmarking	Code Available	5
Hyperbolic Benchmarking Unveils Network Topology-Feature Relationship in GNN Performance	Jun 4, 2024	Benchmarking, Drug Discovery	Code Available	5
AutoBench-V: Can Large Vision-Language Models Benchmark Themselves?	Oct 28, 2024	Benchmarking, Question Answering	Code Available	5
Benchmarking the Hooke-Jeeves Method, MTS-LS1, and BSrr on the Large-scale BBOB Function Set	Apr 28, 2022	Benchmarking	Code Available	5
ALDI++: Automatic and parameter-less discord and outlier detection for building energy load profiles	Mar 13, 2022	Benchmarking, BIG-bench Machine Learning	Code Available	5
Hyperopt-Sklearn: Automatic Hyperparameter Configuration for Scikit-Learn	Jan 1, 2014	AutoML, Benchmarking	Code Available	5
Benchmarking the Hill-Valley Evolutionary Algorithm for the GECCO 2018 Competition on Niching Methods Multimodal Optimization	Jun 30, 2018	Benchmarking	Code Available	5
Hybrid Machine Learning Models of Classifying Residential Requests for Smart Dispatching	Dec 22, 2019	Benchmarking, BIG-bench Machine Learning	Code Available	5
Hybrid Random Features	Oct 8, 2021	Benchmarking	Code Available	5
HuSc3D: Human Sculpture dataset for 3D object reconstruction	Jun 9, 2025	3D Object Reconstruction, 3D Reconstruction	Code Available	5
Hyperparameter-Free Losses for Model-Based Monocular Reconstruction	Aug 16, 2019	3D Reconstruction, Benchmarking	Code Available	5
Benchmarking the Fairness of Image Upsampling Methods	Jan 24, 2024	Benchmarking, Diversity	Code Available	5
AuthNet: A Deep Learning based Authentication Mechanism using Temporal Facial Feature Movements	Dec 4, 2020	Benchmarking, Lip password classification	Code Available	5
Authentic Emotion Mapping: Benchmarking Facial Expressions in Real News	Apr 21, 2024	Benchmarking, Emotion Recognition	Code Available	5
Alchemy: A Quantum Chemistry Dataset for Benchmarking AI Models	Jun 22, 2019	Benchmarking, BIG-bench Machine Learning	Code Available	5
HSSBench: Benchmarking Humanities and Social Sciences Ability for Multimodal Large Language Models	Jun 4, 2025	Benchmarking, General Knowledge	Code Available	5
HRNET: AI on Edge for mask detection and social distancing	Nov 30, 2021	Benchmarking, Edge-computing	Code Available	5

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified