| Online Learning with Off-Policy Feedback | Jul 18, 2022 | Decision Making, Multi-Armed Bandits | —Unverified | 0 |
| Online Planning Algorithms for POMDPs | Jan 15, 2014 | Decision Making, Decision Making Under Uncertainty | —Unverified | 0 |
| Online Planning for Decentralized Stochastic Control with Partial History Sharing | Aug 6, 2019 | Decision Making, Sequential Decision Making | —Unverified | 0 |
| Online Restless Multi-Armed Bandits with Long-Term Fairness Constraints | Dec 16, 2023 | Decision Making, Fairness | —Unverified | 0 |
| Online Sequential Decision-Making with Unknown Delays | Feb 12, 2024 | Decision Making, Sequential Decision Making | —Unverified | 0 |
| Online Statistical Inference for Contextual Bandits via Stochastic Gradient Descent | Dec 30, 2022 | Decision Making, Multi-Armed Bandits | —Unverified | 0 |
| Online Statistical Inference in Decision-Making with Matrix Context | Dec 21, 2022 | Decision Making, Sequential Decision Making | —Unverified | 0 |
| On Optimal Robustness to Adversarial Corruption in Online Decision Problems | Sep 22, 2021 | Decision Making, Sequential Decision Making | —Unverified | 0 |
| Towards Tractable Optimism in Model-Based Reinforcement Learning | Jun 21, 2020 | continuous-control, Continuous Control | —Unverified | 0 |
| On preserving non-discrimination when combining expert advice | Oct 28, 2018 | Decision Making, Sequential Decision Making | —Unverified | 0 |
| On Sample-Efficient Offline Reinforcement Learning: Data Diversity, Posterior Sampling, and Beyond | Jan 6, 2024 | Decision Making, Diversity | —Unverified | 0 |
| On the Brittle Foundations of ReAct Prompting for Agentic Large Language Models | May 22, 2024 | Decision Making, Sequential Decision Making | —Unverified | 0 |
| On the Expressivity of Multidimensional Markov Reward | Jul 22, 2023 | Decision Making, Decision Making Under Uncertainty | —Unverified | 0 |
| On the Global Convergence of Risk-Averse Policy Gradient Methods with Expected Conditional Risk Measures | Jan 26, 2023 | Decision Making, Policy Gradient Methods | —Unverified | 0 |
| On the Modeling Capabilities of Large Language Models for Sequential Decision Making | Oct 8, 2024 | Decision Making, Diversity | —Unverified | 0 |
| On the Performance of Empirical Risk Minimization with Smoothed Data | Feb 22, 2024 | Decision Making, Sequential Decision Making | —Unverified | 0 |
| On the Relationship Between Structure in Natural Language and Models of Sequential Decision Processes | Jun 12, 2020 | Decision Making, Sequential Decision Making | —Unverified | 0 |
| On the Role of Information Structure in Reinforcement Learning for Partially-Observable Sequential Teams and Games | Mar 1, 2024 | Decision Making, reinforcement-learning | —Unverified | 0 |
| Open Problem: Approximate Planning of POMDPs in the class of Memoryless Policies | Aug 17, 2016 | Decision Making, Reinforcement Learning | —Unverified | 0 |
| OPERA: Automatic Offline Policy Evaluation with Re-weighted Aggregates of Multiple Estimators | May 27, 2024 | Decision Making, Offline RL | —Unverified | 0 |
| Robust optimal policies for team Markov games | May 16, 2021 | Decision Making, Sequential Decision Making | —Unverified | 0 |
| Optimal Inspection and Maintenance Planning for Deteriorating Structural Components through Dynamic Bayesian Networks and Markov Decision Processes | Sep 9, 2020 | Decision Making, Sequential Decision Making | —Unverified | 0 |
| Optimal Learning for Sequential Decision Making for Expensive Cost Functions with Stochastic Binary Feedbacks | Sep 13, 2017 | Decision Making, Multi-Armed Bandits | —Unverified | 0 |
| Optimal Sensing via Multi-armed Bandit Relaxations in Mixed Observability Domains | Mar 15, 2016 | Decision Making, Decision Making Under Uncertainty | —Unverified | 0 |
| Optimal Sequential Decision-Making in Geosteering: A Reinforcement Learning Approach | Oct 7, 2023 | Decision Making, reinforcement-learning | —Unverified | 0 |
| Optimal sequential decision making with probabilistic digital twins | Mar 12, 2021 | Decision Making, Deep Reinforcement Learning | —Unverified | 0 |
| Optimistic MLE -- A Generic Model-based Algorithm for Partially Observable Sequential Decision Making | Sep 29, 2022 | Decision Making, Model-based Reinforcement Learning | —Unverified | 0 |
| Optimization of anemia treatment in hemodialysis patients via reinforcement learning | Sep 14, 2015 | Decision Making, Q-Learning | —Unverified | 0 |
| Optimizing Fantasy Sports Team Selection with Deep Reinforcement Learning | Dec 26, 2024 | Decision Making, Deep Reinforcement Learning | —Unverified | 0 |
| Optimizing Memory Mapping Using Deep Reinforcement Learning | May 11, 2023 | Cloud Computing, Decision Making | —Unverified | 0 |
| Optimizing Sensor Redundancy in Sequential Decision-Making Problems | Dec 10, 2024 | Decision Making, OpenAI Gym | —Unverified | 0 |
| Out-of-Distribution Adaptation in Offline RL: Counterfactual Reasoning via Causal Normalizing Flows | May 6, 2024 | Causal Inference, counterfactual | —Unverified | 0 |
| PAC Reinforcement Learning with Rich Observations | Feb 8, 2016 | Decision Making, Multi-Armed Bandits | —Unverified | 0 |
| Parameterized MDPs and Reinforcement Learning Problems -- A Maximum Entropy Principle Based Framework | Jun 17, 2020 | Decision Making, Q-Learning | —Unverified | 0 |
| Pareto Inverse Reinforcement Learning for Diverse Expert Policy Generation | Aug 22, 2024 | Autonomous Driving, Decision Making | —Unverified | 0 |
| Partial-Adaptive Submodular Maximization | Nov 1, 2021 | Active Learning, Decision Making | —Unverified | 0 |
| Partially Observable Stochastic Games with Neural Perception Mechanisms | Oct 17, 2023 | Decision Making, Decision Making Under Uncertainty | —Unverified | 0 |
| Partial-Monotone Adaptive Submodular Maximization | Jul 26, 2022 | Active Learning, Decision Making | —Unverified | 0 |
| Partner-Aware Algorithms in Decentralized Cooperative Bandit Teams | Oct 2, 2021 | Decision Making, Sequential Decision Making | —Unverified | 0 |
| Patterns, predictions, and actions: A story about machine learning | Feb 10, 2021 | BIG-bench Machine Learning, Causal Inference | —Unverified | 0 |
| PDQN - A Deep Reinforcement Learning Method for Planning with Long Delays: Optimization of Manufacturing Dispatching | Sep 29, 2021 | Decision Making, Deep Reinforcement Learning | —Unverified | 0 |
| Pessimistic Model Selection for Offline Deep Reinforcement Learning | Nov 29, 2021 | Decision Making, Deep Reinforcement Learning | —Unverified | 0 |
| Planning with General Objective Functions: Going Beyond Total Rewards | Dec 1, 2020 | Decision Making, Sequential Decision Making | —Unverified | 0 |
| Playing against Nature: causal discovery for decision making under uncertainty | Jul 3, 2018 | Causal Discovery, Decision Making | —Unverified | 0 |
| POLAR: A Pessimistic Model-based Policy Learning Algorithm for Dynamic Treatment Regimes | Jun 25, 2025 | Sequential Decision Making | —Unverified | 0 |
| Mean-Variance Efficient Reinforcement Learning with Applications to Dynamic Financial Investment | Oct 3, 2020 | Decision Making, Decision Making Under Uncertainty | —Unverified | 0 |
| Policy Gradient With Value Function Approximation For Collective Multiagent Planning | Apr 9, 2018 | Decision Making, Reinforcement Learning | —Unverified | 0 |
| Policy-labeled Preference Learning: Is Preference Enough for RLHF? | May 6, 2025 | continuous-control, Continuous Control | —Unverified | 0 |
| Population-based Evaluation in Repeated Rock-Paper-Scissors as a Benchmark for Multiagent Reinforcement Learning | Mar 2, 2023 | Decision Making, Language Modeling | —Unverified | 0 |
| Position Paper: Rethinking Privacy in RL for Sequential Decision-making in the Age of LLMs | Apr 15, 2025 | Autonomous Vehicles, Decision Making | —Unverified | 0 |