| SOPE: Spectrum of Off-Policy Estimators | Nov 6, 2021 | Decision Making, Off-policy evaluation | Code Available | 0 |
| Regular Decision Processes for Grid Worlds | Nov 5, 2021 | Decision Making, Decision Making Under Uncertainty | Unverified | 0 |
| Partial-Adaptive Submodular Maximization | Nov 1, 2021 | Active Learning, Decision Making | Unverified | 0 |
| A Law of Iterated Logarithm for Multi-Agent Reinforcement Learning | Oct 27, 2021 | Decision Making, Multi-agent Reinforcement Learning | Unverified | 0 |
| The Value of Information When Deciding What to Learn | Oct 26, 2021 | Decision Making, Sequential Decision Making | Unverified | 0 |
| HSVI for zs-POSGs using Concavity, Convexity and Lipschitz Properties | Oct 25, 2021 | Decision Making, Heuristic Search | Unverified | 0 |
| Analysis of Thompson Sampling for Partially Observable Contextual Multi-Armed Bandits | Oct 23, 2021 | Decision Making, Multi-Armed Bandits | Unverified | 0 |
| ReLAX: Reinforcement Learning Agent eXplainer for Arbitrary Predictive Models | Oct 22, 2021 | counterfactual, Decision Making | Code Available | 0 |
| Anti-Concentrated Confidence Bonuses for Scalable Exploration | Oct 21, 2021 | Decision Making, Deep Reinforcement Learning | Unverified | 0 |
| Show Me the Whole World: Towards Entire Item Space Exploration for Interactive Personalized Recommendations | Oct 19, 2021 | Decision Making, Model Selection | Code Available | 0 |