| Predicting Football Match Outcomes with eXplainable Machine Learning and the Kelly Index | Nov 28, 2022 | Benchmarking | —Unverified | 0 |
| Predicting Quantum Potentials by Deep Neural Network and Metropolis Sampling | Jun 6, 2021 | Benchmarking | —Unverified | 0 |
| Predicting the Performance of a Computing System with Deep Networks | Feb 27, 2023 | Benchmarking | —Unverified | 0 |
| Predicting the Probability of Collision of a Satellite with Space Debris: A Bayesian Machine Learning Approach | Nov 17, 2023 | Benchmarking, Collision Avoidance | —Unverified | 0 |
| Prediction Accuracy & Reliability: Classification and Object Localization under Distribution Shift | Sep 5, 2024 | Autonomous Driving, Benchmarking | —Unverified | 0 |
| Prediction of cancer driver genes and mutations: the potential of integrative computational frameworks | Mar 30, 2023 | Benchmarking | —Unverified | 0 |
| Prediction of the Influence of Navigation Scan-path on Perceived Quality of Free-Viewpoint Videos | Oct 10, 2018 | Benchmarking, Video Quality Assessment | —Unverified | 0 |
| Predictive modelling of a novel anti-adhesion therapy to combat bacterial colonisation of burn wounds | Aug 10, 2017 | Benchmarking | —Unverified | 0 |
| Predictive Models from Quantum Computer Benchmarks | May 15, 2023 | Benchmarking, image-classification | —Unverified | 0 |
| Prepare for Trouble and Make it Double. Supervised and Unsupervised Stacking for AnomalyBased Intrusion Detection | Feb 28, 2022 | Benchmarking, Intrusion Detection | —Unverified | 0 |
| Benchmarking Machine Reading Comprehension: A Psychological Perspective | Apr 4, 2020 | Benchmarking, Machine Reading Comprehension | —Unverified | 0 |
| Pretraining boosts out-of-domain robustness for pose estimation | Sep 24, 2019 | Animal Pose Estimation, Benchmarking | —Unverified | 0 |
| Principles and Guidelines for Evaluating Social Robot Navigation Algorithms | Jun 29, 2023 | Benchmarking, Robot Navigation | —Unverified | 0 |
| PRISM: Complete Online Decentralized Multi-Agent Pathfinding with Rapid Information Sharing using Motion Constraints | May 12, 2025 | Benchmarking | —Unverified | 0 |
| Prism: Dynamic and Flexible Benchmarking of LLMs Code Generation with Monte Carlo Tree Search | Apr 7, 2025 | Benchmarking, Code Generation | —Unverified | 0 |
| Privacy-Preserving Language Model Inference with Instance Obfuscation | Feb 13, 2024 | Benchmarking, Language Modeling | —Unverified | 0 |
| Privacy Protection in Street-View Panoramas using Depth and Multi-View Imagery | Mar 27, 2019 | Benchmarking, Object | —Unverified | 0 |
| Probabilistic Robustness in Deep Learning: A Concise yet Comprehensive Guide | Feb 20, 2025 | Adversarial Robustness, Benchmarking | —Unverified | 0 |
| ProBench: Benchmarking Large Language Models in Competitive Programming | Feb 28, 2025 | Attribute, Benchmarking | —Unverified | 0 |
| Problem-solving benefits of down-sampled lexicase selection | Jun 10, 2021 | Benchmarking | —Unverified | 0 |
| Procedural Content Generation: Better Benchmarks for Transfer Reinforcement Learning | May 31, 2021 | Benchmarking, Deep Learning | —Unverified | 0 |
| Procedural Generalization by Planning with Self-Supervised World Models | Nov 2, 2021 | Benchmarking, Meta-Learning | —Unverified | 0 |
| ProductAgent: Benchmarking Conversational Product Search Agent with Asking Clarification Questions | Jul 1, 2024 | Benchmarking, Question Generation | —Unverified | 0 |
| Profit: Benchmarking Personalization and Robustness Trade-off in Federated Prompt Tuning | Oct 6, 2023 | Benchmarking, Federated Learning | —Unverified | 0 |
| Progressive Class-level Distillation | May 30, 2025 | Benchmarking, Knowledge Distillation | —Unverified | 0 |
| Progressive Multi-view Human Mesh Recovery with Self-Supervision | Dec 10, 2022 | Benchmarking, Diversity | —Unverified | 0 |
| Progressive with Purpose: Guiding Progressive Inpainting DNNs through Context and Structure | Sep 21, 2022 | Benchmarking, Image Inpainting | —Unverified | 0 |
| Projective simulation applied to the grid-world and the mountain-car problem | May 21, 2014 | Benchmarking, reinforcement-learning | —Unverified | 0 |
| Project MPG: towards a generalized performance benchmark for LLM capabilities | Oct 28, 2024 | Benchmarking, Chatbot | —Unverified | 0 |
| Prompting ChatGPT for Chinese Learning as L2: A CEFR and EBCL Level Study | Jan 25, 2025 | Benchmarking | —Unverified | 0 |
| Prompting Scientific Names for Zero-Shot Species Recognition | Oct 15, 2023 | Benchmarking, Zero-Shot Learning | —Unverified | 0 |
| Prompt Sketching for Large Language Models | Nov 8, 2023 | Arithmetic Reasoning, Benchmarking | —Unverified | 0 |
| Proof of Humanity: A Multi-Layer Network Framework for Certifying Human-Originated Content in an AI-Dominated Internet | Apr 2, 2025 | Benchmarking | —Unverified | 0 |
| Proof of Thought: Neurosymbolic Program Synthesis allows Robust and Interpretable Reasoning | Sep 25, 2024 | Benchmarking, Formal Logic | —Unverified | 0 |
| ProtIR: Iterative Refinement between Retrievers and Predictors for Protein Function Annotation | Feb 10, 2024 | Benchmarking, Language Modeling | —Unverified | 0 |
| Protocol for Executing and Benchmarking Eight Computational Doublet-Detection Methods in Single-Cell RNA Sequencing Data Analysis | Jan 21, 2021 | Benchmarking | —Unverified | 0 |
| Provably Safe Reinforcement Learning: Conceptual Analysis, Survey, and Benchmarking | May 13, 2022 | Benchmarking, reinforcement-learning | —Unverified | 0 |
| ProverbEval: Exploring LLM Evaluation Challenges for Low-resource Language Understanding | Nov 7, 2024 | Benchmarking, Multiple-choice | —Unverified | 0 |
| PsychBench: A comprehensive and professional benchmark for evaluating the performance of LLM-assisted psychiatric clinical practice | Feb 28, 2025 | Benchmarking, Diagnostic | —Unverified | 0 |
| PSYCHE: A Multi-faceted Patient Simulation Framework for Evaluation of Psychiatric Assessment Conversational Agents | Jan 3, 2025 | Benchmarking | —Unverified | 0 |
| Psychoacoustic Challenges Of Speech Enhancement On VoIP Platforms | Oct 11, 2023 | Benchmarking, Denoising | —Unverified | 0 |
| Share, Collaborate, Benchmark: Advancing Travel Demand Research through rigorous open-source collaboration | Jun 9, 2023 | Benchmarking, Time Series | —Unverified | 0 |
| PUB: Plot Understanding Benchmark and Dataset for Evaluating Large Language Models on Synthetic Visual Data Interpretation | Sep 4, 2024 | Benchmarking | —Unverified | 0 |
| Pulse Shape-Aided Multipath Delay Estimation for Fine-Grained WiFi Sensing | Jun 27, 2023 | Benchmarking | —Unverified | 0 |
| PunchBench: Benchmarking MLLMs in Multimodal Punchline Comprehension | Dec 16, 2024 | Benchmarking, Image Captioning | —Unverified | 0 |
| Pushing Boundaries: Exploring Zero Shot Object Classification with Large Multimodal Models | Dec 30, 2023 | Benchmarking, image-classification | —Unverified | 0 |
| Pushing the Frontiers of Unconstrained Face Detection and Recognition: IARPA Janus Benchmark A | Jun 1, 2015 | Benchmarking, Face Detection | —Unverified | 0 |
| PySTACHIO: Python Single-molecule TrAcking stoiCHiometry Intensity and simulatiOn, a flexible, extensible, beginner-friendly and optimized program for analysis of single-molecule microscopy | Mar 18, 2021 | Art Analysis, Benchmarking | —Unverified | 0 |
| Pythae: Unifying Generative Autoencoders in Python -- A Benchmarking Use Case | Jun 16, 2022 | Benchmarking, Density Estimation | —Unverified | 0 |
| Python Random Graph Generator | Sep 20, 2017 | Benchmarking, Graph Generation | —Unverified | 0 |