Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 3101–3125 of 5548 papers

Title	Date	Tasks	Status
Benchmarking Sample Selection Strategies for Batch Reinforcement Learning	Sep 29, 2021	BenchmarkingImitation Learning	—Unverified
InteriorNet: Mega-scale Multi-sensor Photo-realistic Indoor Scenes Dataset	Sep 3, 2018	BenchmarkingSimultaneous Localization and Mapping	—Unverified
InterLoc: LiDAR-based Intersection Localization using Road Segmentation with Automated Evaluation Method	May 1, 2025	BenchmarkingMotion Planning	—Unverified
InternalInspector I^2: Robust Confidence Estimation in LLMs through Internal States	Jun 17, 2024	BenchmarkingContrastive Learning	—Unverified
Interpretable Feature Construction for Time Series Extrinsic Regression	Mar 15, 2021	Benchmarkingregression	—Unverified
Interpretable graph-based models on multimodal biomedical data integration: A technical review and benchmarking	May 3, 2025	BenchmarkingData Integration	—Unverified
Interpretable machine learning applied to on-farm biosecurity and porcine reproductive and respiratory syndrome virus	Jun 11, 2021	BenchmarkingBIG-bench Machine Learning	—Unverified
Benchmarking Safe Deep Reinforcement Learning in Aquatic Navigation	Dec 16, 2021	BenchmarkingDeep Reinforcement Learning	—Unverified
Benchmarking Rotary Position Embeddings for Automatic Speech Recognition	Jan 10, 2025	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	—Unverified
The Role of Local Intrinsic Dimensionality in Benchmarking Nearest Neighbor Search	Jul 17, 2019	BenchmarkingDiversity	—Unverified
Benchmarking Robustness of Deep Reinforcement Learning approaches to Online Portfolio Management	Jun 19, 2023	BenchmarkingDeep Reinforcement Learning	—Unverified
Benchmarking Robustness of Deep Learning Classifiers Using Two-Factor Perturbation	Mar 2, 2022	BenchmarkingDeep Learning	—Unverified
Intrinsic uncertainties and where to find them	Jul 6, 2021	Benchmarking	—Unverified
Introducing a new benchmarked dataset for activity monitoring	Jun 18, 2012	BenchmarkingClassification	—Unverified
Introducing CausalBench: A Flexible Benchmark Framework for Causal Analysis and Machine Learning	Sep 12, 2024	BenchmarkingFairness	—Unverified
Benchmarking Robustness of Contrastive Learning Models for Medical Image-Report Retrieval	Jan 15, 2025	BenchmarkingContrastive Learning	—Unverified
Introducing RezoJDM16k: a French KnowledgeGraph DataSet for Link Prediction	Jun 1, 2022	16kBenchmarking	—Unverified
7th AI Driving Olympics: 1st Place Report for Panoptic Tracking	Dec 9, 2021	BenchmarkingPanoptic Segmentation	—Unverified
Benchmarking Robustness of AI-Enabled Multi-sensor Fusion Systems: Challenges and Opportunities	Jun 6, 2023	BenchmarkingDepth Completion	—Unverified
Introduction to Voice Presentation Attack Detection and Recent Advances	Jan 4, 2019	BenchmarkingSpeaker Recognition	—Unverified
Intuitive or Dependent? Investigating LLMs' Behavior Style to Conflicting Prompts	Sep 29, 2023	BenchmarkingDecision Making	—Unverified
InverseBench: Benchmarking Plug-and-Play Diffusion Priors for Inverse Problems in Physical Sciences	Mar 14, 2025	BenchmarkingImage Restoration	—Unverified
A Framework for Benchmarking and Aligning Task-Planning Safety in LLM-Based Embodied Agents	Apr 20, 2025	BenchmarkingTask Planning	—Unverified
Investigating Deep-Learning NLP for Automating the Extraction of Oncology Efficacy Endpoints from Scientific Literature	Nov 3, 2023	Benchmarking	—Unverified
Investigating Energy Efficiency and Performance Trade-offs in LLM Inference Across Tasks and DVFS Settings	Jan 14, 2025	BenchmarkingQuestion Answering	—Unverified

Show:10 25 50

← PrevPage 125 of 222Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified