Benchmarking

Papers

Showing 3101–3150 of 5548 papers

Title	Date	Tasks	Status
Benchmarking Sample Selection Strategies for Batch Reinforcement Learning	Sep 29, 2021	Benchmarking, Imitation Learning	Unverified
InteriorNet: Mega-scale Multi-sensor Photo-realistic Indoor Scenes Dataset	Sep 3, 2018	Benchmarking, Simultaneous Localization and Mapping	Unverified
InterLoc: LiDAR-based Intersection Localization using Road Segmentation with Automated Evaluation Method	May 1, 2025	Benchmarking, Motion Planning	Unverified
InternalInspector I^2: Robust Confidence Estimation in LLMs through Internal States	Jun 17, 2024	Benchmarking, Contrastive Learning	Unverified
Interpretable Feature Construction for Time Series Extrinsic Regression	Mar 15, 2021	Benchmarking, regression	Unverified
Interpretable graph-based models on multimodal biomedical data integration: A technical review and benchmarking	May 3, 2025	Benchmarking, Data Integration	Unverified
Interpretable machine learning applied to on-farm biosecurity and porcine reproductive and respiratory syndrome virus	Jun 11, 2021	Benchmarking, BIG-bench Machine Learning	Unverified
Benchmarking Safe Deep Reinforcement Learning in Aquatic Navigation	Dec 16, 2021	Benchmarking, Deep Reinforcement Learning	Unverified
Benchmarking Rotary Position Embeddings for Automatic Speech Recognition	Jan 10, 2025	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
The Role of Local Intrinsic Dimensionality in Benchmarking Nearest Neighbor Search	Jul 17, 2019	Benchmarking, Diversity	Unverified
Benchmarking Robustness of Deep Reinforcement Learning approaches to Online Portfolio Management	Jun 19, 2023	Benchmarking, Deep Reinforcement Learning	Unverified
Benchmarking Robustness of Deep Learning Classifiers Using Two-Factor Perturbation	Mar 2, 2022	Benchmarking, Deep Learning	Unverified
Intrinsic uncertainties and where to find them	Jul 6, 2021	Benchmarking	Unverified
Introducing a new benchmarked dataset for activity monitoring	Jun 18, 2012	Benchmarking, Classification	Unverified
Introducing CausalBench: A Flexible Benchmark Framework for Causal Analysis and Machine Learning	Sep 12, 2024	Benchmarking, Fairness	Unverified
Benchmarking Robustness of Contrastive Learning Models for Medical Image-Report Retrieval	Jan 15, 2025	Benchmarking, Contrastive Learning	Unverified
Introducing RezoJDM16k: a French KnowledgeGraph DataSet for Link Prediction	Jun 1, 2022	16k, Benchmarking	Unverified
7th AI Driving Olympics: 1st Place Report for Panoptic Tracking	Dec 9, 2021	Benchmarking, Panoptic Segmentation	Unverified
Benchmarking Robustness of AI-Enabled Multi-sensor Fusion Systems: Challenges and Opportunities	Jun 6, 2023	Benchmarking, Depth Completion	Unverified
Introduction to Voice Presentation Attack Detection and Recent Advances	Jan 4, 2019	Benchmarking, Speaker Recognition	Unverified
Intuitive or Dependent? Investigating LLMs' Behavior Style to Conflicting Prompts	Sep 29, 2023	Benchmarking, Decision Making	Unverified
InverseBench: Benchmarking Plug-and-Play Diffusion Priors for Inverse Problems in Physical Sciences	Mar 14, 2025	Benchmarking, Image Restoration	Unverified
A Framework for Benchmarking and Aligning Task-Planning Safety in LLM-Based Embodied Agents	Apr 20, 2025	Benchmarking, Task Planning	Unverified
Investigating Deep-Learning NLP for Automating the Extraction of Oncology Efficacy Endpoints from Scientific Literature	Nov 3, 2023	Benchmarking	Unverified
Investigating Energy Efficiency and Performance Trade-offs in LLM Inference Across Tasks and DVFS Settings	Jan 14, 2025	Benchmarking, Question Answering	Unverified
The Russian practice of applying cluster approach in regional development	Jun 8, 2021	Benchmarking	Unverified
Investigating the Robustness and Properties of Detection Transformers (DETR) Toward Difficult Images	Oct 12, 2023	Benchmarking, Decoder	Unverified
Benchmarking Robustness of Adaptation Methods on Pre-trained Vision-Language Models	Jun 3, 2023	Benchmarking	Unverified
Investigating the Vision Transformer Model for Image Retrieval Tasks	Jan 11, 2021	Benchmarking, Image Retrieval	Unverified
Benchmarking Robustness in Neural Radiance Fields	Jan 10, 2023	Benchmarking, Camera Calibration	Unverified
The Principle of Unchanged Optimality in Reinforcement Learning Generalization	Jun 2, 2019	Benchmarking, reinforcement-learning	Unverified
Invisible Stitch: Generating Smooth 3D Scenes with Depth Inpainting	Apr 30, 2024	Benchmarking, Depth Completion	Unverified
Benchmarking Robustness and Generalization in Multi-Agent Systems: A Case Study on Neural MMO	Aug 30, 2023	Benchmarking, Reinforcement Learning (RL)	Unverified
Benchmarking Robot Manipulation with the Rubik's Cube	Feb 14, 2022	Benchmarking, Robot Manipulation	Unverified
Benchmarking Retrieval-Augmented Large Language Models in Biomedical NLP: Application, Robustness, and Self-Awareness	May 13, 2024	Benchmarking, counterfactual	Unverified
The Seeker's Dilemma: Realistic Formulation and Benchmarking for Hardware Trojan Detection	Feb 27, 2024	Benchmarking	Unverified
4Seasons: Benchmarking Visual SLAM and Long-Term Localization for Autonomous Driving in Challenging Conditions	Dec 31, 2022	Autonomous Driving, Benchmarking	Unverified
IoT-LLM: Enhancing Real-World IoT Task Reasoning with Large Language Models	Oct 3, 2024	Benchmarking, In-Context Learning	Unverified
IO-VNBD: Inertial and Odometry Benchmark Dataset for Ground Vehicle Positioning	May 4, 2020	Autonomous Vehicles, Benchmarking	Unverified
The Sparsity Roofline: Understanding the Hardware Limits of Sparse Neural Networks	Sep 30, 2023	Benchmarking	Unverified
Iris Liveness Detection Competition (LivDet-Iris) -- The 2020 Edition	Sep 1, 2020	Benchmarking	Unverified
Is Bang-Bang Control All You Need? Solving Continuous Control with Bernoulli Policies	Nov 3, 2021	All, Benchmarking	Unverified
Is Bio-Inspired Learning Better than Backprop? Benchmarking Bio Learning vs. Backprop	Dec 9, 2022	Benchmarking	Unverified
Benchmarking Retrieval-Augmented Generation for Chemistry	May 12, 2025	Benchmarking, RAG	Unverified
A Framework and Benchmarking Study for Counterfactual Generating Methods on Tabular Data	Jul 9, 2021	Benchmarking, counterfactual	Unverified
Evaluating Ising Processing Units with Integer Programming	Jul 2, 2017	Benchmarking	Unverified
Benchmarking Resource Usage for Efficient Distributed Deep Learning	Jan 28, 2022	Benchmarking, Deep Learning	Unverified
Benchmarking Reinforcement Learning Methods for Dexterous Robotic Manipulation with a Three-Fingered Gripper	Aug 27, 2024	Benchmarking, Reinforcement Learning (RL)	Unverified
ISLES'24: Improving final infarct prediction in ischemic stroke using multimodal imaging and clinical data	Aug 20, 2024	Benchmarking	Unverified
Benchmarking Reasoning Robustness in Large Language Models	Mar 6, 2025	Benchmarking, Math	Unverified

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified