| Perona: Robust Infrastructure Fingerprinting for Resource-Efficient Big Data Analytics | Nov 15, 2022 | Benchmarking | —Unverified | 0 |
| PerSEval: Assessing Personalization in Text Summarizers | Jun 29, 2024 | Benchmarking, Human Judgment Correlation | —Unverified | 0 |
| Personalised Feedback Framework for Online Education Programmes Using Generative AI | Oct 14, 2024 | Benchmarking, Management | —Unverified | 0 |
| Personalized Multimodal Large Language Models: A Survey | Dec 3, 2024 | Benchmarking, Survey | —Unverified | 0 |
| Personalized On-Device E-health Analytics with Decentralized Block Coordinate Descent | Dec 17, 2021 | Benchmarking, Diagnostic | —Unverified | 0 |
| Person Re-Identification by Unsupervised Video Matching | Nov 25, 2016 | Benchmarking, Dynamic Time Warping | —Unverified | 0 |
| Person Re-Identification in Identity Regression Space | Jun 25, 2018 | Benchmarking, Incremental Learning | —Unverified | 0 |
| Person Re-identification in the Wild | Apr 9, 2016 | Benchmarking, Pedestrian Detection | —Unverified | 0 |
| Person Search by Multi-Scale Matching | Jul 23, 2018 | Benchmarking, Human Detection | —Unverified | 0 |
| Person Search by Multi-Scale Matching | Sep 1, 2018 | Benchmarking, Human Detection | —Unverified | 0 |
| Perspective on recent developments and challenges in regulatory and systems genomics | Nov 7, 2024 | Benchmarking | —Unverified | 0 |
| Perspectives on the State and Future of Deep Learning -- 2023 | Dec 7, 2023 | Benchmarking, Deep Learning | —Unverified | 0 |
| Perturbation-based exploration methods in deep reinforcement learning | Nov 10, 2020 | Atari Games, Benchmarking | —Unverified | 0 |
| PGLearn -- An Open-Source Learning Toolkit for Optimal Power Flow | May 28, 2025 | Benchmarking | —Unverified | 0 |
| PGLib-CO2: A Power Grid Library for Computing and Optimizing Carbon Emissions | Jun 17, 2025 | Benchmarking | —Unverified | 0 |
| PhD Thesis on Code Modulated Interferometric Imaging System using Phased Arrays | Jul 19, 2021 | Benchmarking | —Unverified | 0 |
| Phi-3 Safety Post-Training: Aligning Language Models with a "Break-Fix" Cycle | Jul 18, 2024 | Benchmarking, Language Modeling | —Unverified | 0 |
| PhilHumans: Benchmarking Machine Learning for Personal Health | May 4, 2024 | Action Anticipation, Benchmarking | —Unverified | 0 |
| PhysBench: Benchmarking and Enhancing Vision-Language Models for Physical World Understanding | Jan 27, 2025 | Benchmarking, Common Sense Reasoning | —Unverified | 0 |
| PhySense: Principle-Based Physics Reasoning Benchmarking for Large Language Models | May 30, 2025 | Benchmarking | —Unverified | 0 |
| Physics-Learning AI Datamodel (PLAID) datasets: a collection of physics simulations for machine learning | May 5, 2025 | Benchmarking | —Unverified | 0 |
| PhytoSynth: Leveraging Multi-modal Generative Models for Crop Disease Data Generation with Novel Benchmarking and Prompt Engineering Approach | May 3, 2025 | Benchmarking, Image-to-Image Translation | —Unverified | 0 |
| PieTrack: An MOT solution based on synthetic data training and self-supervised domain adaptation | Jul 22, 2022 | Benchmarking, Domain Adaptation | —Unverified | 0 |
| PISTOL: Dataset Compilation Pipeline for Structural Unlearning of LLMs | Jun 24, 2024 | Benchmarking, Machine Unlearning | —Unverified | 0 |
| Pitfalls of topology-aware image segmentation | Dec 19, 2024 | Benchmarking, Image Segmentation | —Unverified | 0 |
| pix2pockets: Shot Suggestions in 8-Ball Pool from a Single Image in the Wild | Apr 16, 2025 | Benchmarking, object-detection | —Unverified | 0 |
| PKLot-A robust dataset for parking lot classification | Jul 1, 2015 | Benchmarking, Classification | —Unverified | 0 |
| PLAICraft: Large-Scale Time-Aligned Vision-Speech-Action Dataset for Embodied AI | May 19, 2025 | Benchmarking, Minecraft | —Unverified | 0 |
| Plant in Cupboard, Orange on Rably, Inat Aphone. Benchmarking Incremental Learning of Situation and Language Model using a Text-Simulated Situated Environment | Feb 17, 2025 | Benchmarking, Common Sense Reasoning | —Unverified | 0 |
| Point Cloud Compression and Objective Quality Assessment: A Survey | Jun 28, 2025 | Autonomous Driving, Benchmarking | —Unverified | 0 |
| Point Cloud Objective Quality: Benchmarking Features and Quality Evaluation | Apr 4, 2025 | Attribute, Benchmarking | —Unverified | 0 |
| Polarization and Index Modulations: a Theoretical and Practical Perspective | Mar 20, 2018 | Benchmarking, Navigate | —Unverified | 0 |
| Policy Entropy for Out-of-Distribution Classification | May 25, 2020 | Benchmarking, Classification | —Unverified | 0 |
| Polyp-E: Benchmarking the Robustness of Deep Segmentation Models via Polyp Editing | Oct 22, 2024 | Attribute, Benchmarking | —Unverified | 0 |
| Portfolio Benchmarking under Drawdown Constraint and Stochastic Sharpe Ratio | Oct 26, 2016 | Benchmarking | —Unverified | 0 |
| PoseBench: Benchmarking the Robustness of Pose Estimation Models under Corruptions | Jun 20, 2024 | Animal Pose Estimation, Autonomous Driving | —Unverified | 0 |
| Pose Estimation for Non-Cooperative Spacecraft Rendezvous Using Convolutional Neural Networks | Sep 19, 2018 | Benchmarking, Image Generation | —Unverified | 0 |
| Position: AI Competitions Provide the Gold Standard for Empirical Rigor in GenAI Evaluation | May 1, 2025 | Benchmarking, Position | —Unverified | 0 |
| Position: Benchmarking is Limited in Reinforcement Learning Research | Jun 23, 2024 | Benchmarking, Position | —Unverified | 0 |
| Position: Graph Learning Will Lose Relevance Due To Poor Benchmarks | Feb 20, 2025 | Benchmarking, Combinatorial Optimization | —Unverified | 0 |
| Position: There are no Champions in Long-Term Time Series Forecasting | Feb 19, 2025 | Benchmarking, Position | —Unverified | 0 |
| Post-FEC BER Benchmarking for Bit-Interleaved Coded Modulation with Probabilistic Shaping | Apr 24, 2020 | Benchmarking | —Unverified | 0 |
| Post-hoc labeling of arbitrary EEG recordings for data-efficient evaluation of neural decoding methods | Nov 22, 2017 | Benchmarking, EEG | —Unverified | 0 |
| Deep Neural Operator Driven Real Time Inference for Nuclear Systems to Enable Digital Twin Solutions | Aug 15, 2023 | Benchmarking, Computational Efficiency | —Unverified | 0 |
| PowerGraph: A power grid benchmark dataset for graph neural networks | Feb 5, 2024 | Articles, Benchmarking | —Unverified | 0 |
| Power Line Communication vs. Talkative Power Conversion: A Benchmarking Study | Apr 16, 2025 | Benchmarking | —Unverified | 0 |
| Practical Design and Benchmarking of Generative AI Applications for Surgical Billing and Coding | Jan 7, 2025 | Benchmarking, Code Generation | —Unverified | 0 |
| Practical, Fast and Robust Point Cloud Registration for 3D Scene Stitching and Object Localization | Nov 8, 2021 | 3D Feature Matching, Benchmarking | —Unverified | 0 |
| Precise Model Benchmarking with Only a Few Observations | Oct 7, 2024 | Benchmarking, model | —Unverified | 0 |
| Predicting credit default probabilities using machine learning techniques in the face of unequal class distributions | Jul 30, 2019 | Benchmarking, BIG-bench Machine Learning | —Unverified | 0 |