| Benchmarking Adversarial Robustness of Image Shadow Removal with Shadow-adaptive Attacks | Mar 15, 2024 | Adversarial Attack, Adversarial Robustness | —Unverified | 0 | 0 |
| OSWorld-Human: Benchmarking the Efficiency of Computer-Use Agents | Jun 19, 2025 | Benchmarking | —Unverified | 0 | 0 |
| oTTC: Object Time-to-Contact for Motion Estimation in Autonomous Driving | May 13, 2024 | Attribute, Autonomous Driving | —Unverified | 0 | 0 |
| Benchmarking Adversarial Robustness of Compressed Deep Learning Models | Aug 16, 2023 | Adversarial Robustness, Benchmarking | —Unverified | 0 | 0 |
| Tropical Attention: Neural Algorithmic Reasoning for Combinatorial Algorithms | May 22, 2025 | Adversarial Attack, Benchmarking | —Unverified | 0 | 0 |
| Out of Distribution Performance of State of Art Vision Model | Jan 25, 2023 | Benchmarking | —Unverified | 0 | 0 |
| Benchmarking Adversarial Robustness | Dec 26, 2019 | Adversarial Attack, Adversarial Robustness | —Unverified | 0 | 0 |
| Overconfident Oracles: Limitations of In Silico Sequence Design Benchmarking | Feb 24, 2025 | Benchmarking | —Unverified | 0 | 0 |
| Overview and practical recommendations on using Shapley Values for identifying predictive biomarkers via CATE modeling | May 2, 2025 | Benchmarking | —Unverified | 0 | 0 |
| Overview of Todai Robot Project and Evaluation Framework of its NLP-based Problem Solving | May 1, 2014 | Benchmarking | —Unverified | 0 | 0 |
| Benchmarking Adversarially Robust Quantum Machine Learning at Scale | Nov 23, 2022 | Adversarial Attack, Adversarial Attack Detection | —Unverified | 0 | 0 |
| OVQA: A Clinically Generated Visual Question Answering Dataset | Jul 7, 2022 | Benchmarking, Medical Visual Question Answering | —Unverified | 0 | 0 |
| Paddy Doctor: A Visual Image Dataset for Automated Paddy Disease Classification and Benchmarking | May 23, 2022 | Benchmarking, Classification | —Unverified | 0 | 0 |
| Benchmarking adversarial attacks and defenses for time-series data | Aug 30, 2020 | Adversarial Defense, Benchmarking | —Unverified | 0 | 0 |
| PalmBench: A Comprehensive Benchmark of Compressed Large Language Models on Mobile Platforms | Oct 5, 2024 | Benchmarking, GPU | —Unverified | 0 | 0 |
| Benchmarking Advanced Text Anonymisation Methods: A Comparative Study on Novel and Traditional Approaches | Apr 22, 2024 | Benchmarking, Diversity | —Unverified | 0 | 0 |
| Benchmarking Adaptive Intelligence and Computer Vision on Human-Robot Collaboration | Sep 30, 2024 | Benchmarking, Intent Detection | —Unverified | 0 | 0 |
| Benchmarking Adaptative Variational Quantum Algorithms on QUBO Instances | Aug 3, 2023 | Benchmarking | —Unverified | 0 | 0 |
| Paradigm Shift in Sustainability Disclosure Analysis: Empowering Stakeholders with CHATREPORT, a Language Model-Based Tool | Jun 27, 2023 | Benchmarking, Language Modeling | —Unverified | 0 | 0 |
| Para-Lane: Multi-Lane Dataset Registering Parallel Scans for Benchmarking Novel View Synthesis | Feb 21, 2025 | 3DGS, Autonomous Driving | —Unverified | 0 | 0 |
| Benchmarking Active Learning Strategies for Materials Optimization and Discovery | Apr 12, 2022 | Active Learning, Benchmarking | —Unverified | 0 | 0 |
| A critical analysis of metrics used for measuring progress in artificial intelligence | Aug 6, 2020 | Benchmarking | —Unverified | 0 | 0 |
| True Online TD-Replan(lambda) Achieving Planning through Replaying | Jan 31, 2025 | Benchmarking | —Unverified | 0 | 0 |
| Benchmarking Active Learning for NILM | Nov 24, 2024 | Active Learning, Benchmarking | —Unverified | 0 | 0 |
| Benchmarking Abstractive Summarisation: A Dataset of Human-authored Summaries of Norwegian News Articles | Jan 13, 2025 | Articles, Benchmarking | —Unverified | 0 | 0 |