| DOPL: Direct Online Preference Learning for Restless Bandits with Preference Feedback | Oct 7, 2024 | Multi-Armed Bandits, Sequential Decision Making | Unverified | 0 |
| Preference Optimization as Probabilistic Inference | Oct 5, 2024 | Decision Making, Sequential Decision Making | Unverified | 0 |
| Minimax-optimal trust-aware multi-armed bandits | Oct 4, 2024 | Decision Making, Multi-Armed Bandits | Unverified | 0 |
| Learning a Fast Mixing Exogenous Block MDP using a Single Trajectory | Oct 3, 2024 | Representation Learning, Sequential Decision Making | Code Available | 0 |
| Adaptive teachers for amortized samplers | Oct 2, 2024 | Decision Making, Efficient Exploration | Code Available | 0 |
| AVID: Adapting Video Diffusion Models to World Models | Oct 1, 2024 | Decision Making, Sequential Decision Making | Unverified | 0 |
| Safe Time-Varying Optimization based on Gaussian Processes with Spatio-Temporal Kernel | Sep 26, 2024 | Bayesian Optimization, Change Detection | Unverified | 0 |
| Collaborative Comic Generation: Integrating Visual Narrative Theories with AI Models for Enhanced Creativity | Sep 25, 2024 | Decision Making, Sequential Decision Making | Code Available | 0 |
| Learning Utilities from Demonstrations in Markov Decision Processes | Sep 25, 2024 | Decision Making, Sequential Decision Making | Unverified | 0 |
| Reference Points, Risk-Taking Behavior, and Competitive Outcomes in Sequential Settings | Sep 20, 2024 | counterfactual, Decision Making | Unverified | 0 |