| Benchmarking Machine Reading Comprehension: A Psychological Perspective | Apr 4, 2020 | Benchmarking, Machine Reading Comprehension | —Unverified | 0 |
| Pretraining boosts out-of-domain robustness for pose estimation | Sep 24, 2019 | Animal Pose Estimation, Benchmarking | —Unverified | 0 |
| Principles and Guidelines for Evaluating Social Robot Navigation Algorithms | Jun 29, 2023 | Benchmarking, Robot Navigation | —Unverified | 0 |
| PRISM: Complete Online Decentralized Multi-Agent Pathfinding with Rapid Information Sharing using Motion Constraints | May 12, 2025 | Benchmarking | —Unverified | 0 |
| Prism: Dynamic and Flexible Benchmarking of LLMs Code Generation with Monte Carlo Tree Search | Apr 7, 2025 | Benchmarking, Code Generation | —Unverified | 0 |
| Privacy-Preserving Language Model Inference with Instance Obfuscation | Feb 13, 2024 | Benchmarking, Language Modeling | —Unverified | 0 |
| Privacy Protection in Street-View Panoramas using Depth and Multi-View Imagery | Mar 27, 2019 | Benchmarking, Object | —Unverified | 0 |
| Probabilistic Robustness in Deep Learning: A Concise yet Comprehensive Guide | Feb 20, 2025 | Adversarial Robustness, Benchmarking | —Unverified | 0 |
| ProBench: Benchmarking Large Language Models in Competitive Programming | Feb 28, 2025 | Attribute, Benchmarking | —Unverified | 0 |
| Problem-solving benefits of down-sampled lexicase selection | Jun 10, 2021 | Benchmarking | —Unverified | 0 |