| Catastrophic-risk-aware reinforcement learning with extreme-value-theory-based policy gradients | Jun 21, 2024 | Decision Making, Management | Code Available | 0 |
| ARDuP: Active Region Video Diffusion for Universal Policies | Jun 19, 2024 | Decision Making, Sequential Decision Making | Unverified | 0 |
| Learned Graph Rewriting with Equality Saturation: A New Paradigm in Relational Query Rewrite and Beyond | Jun 19, 2024 | Decision Making, reinforcement-learning | Unverified | 0 |
| Model Adaptation for Time Constrained Embodied Control | Jun 17, 2024 | Autonomous Driving, Decision Making | Unverified | 0 |
| Constrained Reinforcement Learning with Average Reward Objective: Model-Based and Model-Free Algorithms | Jun 17, 2024 | Autonomous Driving, Decision Making | Unverified | 0 |
| Efficient Sequential Decision Making with Large Language Models | Jun 17, 2024 | Decision Making, Model Selection | Unverified | 0 |
| Data-Driven Upper Confidence Bounds with Near-Optimal Regret for Heavy-Tailed Bandits | Jun 9, 2024 | Decision Making, Multi-Armed Bandits | Unverified | 0 |
| Sound Heuristic Search Value Iteration for Undiscounted POMDPs with Reachability Objectives | Jun 5, 2024 | Decision Making, Efficient Exploration | Code Available | 0 |
| "Give Me an Example Like This": Episodic Active Reinforcement Learning from Demonstrations | Jun 5, 2024 | Active Learning, Reinforcement Learning (RL) | Code Available | 0 |
| Rectifying Reinforcement Learning for Reward Matching | Jun 4, 2024 | Decision Making, reinforcement-learning | Unverified | 0 |
| Combining Experimental and Historical Data for Policy Evaluation | Jun 1, 2024 | Data Integration, Decision Making | Code Available | 0 |
| Reward Machines for Deep RL in Noisy and Uncertain Environments | May 31, 2024 | counterfactual, Decision Making | Code Available | 0 |
| Low-rank finetuning for LLMs: A fairness perspective | May 28, 2024 | Decision Making, Fairness | Unverified | 0 |
| Leveraging Offline Data in Linear Latent Bandits | May 27, 2024 | Decision Making, Movie Recommendation | Unverified | 0 |
| OPERA: Automatic Offline Policy Evaluation with Re-weighted Aggregates of Multiple Estimators | May 27, 2024 | Decision Making, Offline RL | Unverified | 0 |
| Variational Offline Multi-agent Skill Discovery | May 26, 2024 | Decision Making, Multi-agent Reinforcement Learning | Unverified | 0 |
| Inference of Utilities and Time Preference in Sequential Decision-Making | May 24, 2024 | Decision Making, Sequential Decision Making | Unverified | 0 |
| Inverse-RLignment: Large Language Model Alignment from Demonstrations through Inverse Reinforcement Learning | May 24, 2024 | Decision Making, Language Modeling | Unverified | 0 |
| A finite time analysis of distributed Q-learning | May 23, 2024 | Decision Making, Multi-agent Reinforcement Learning | Unverified | 0 |
| Efficiently Training Deep-Learning Parametric Policies using Lagrangian Duality | May 23, 2024 | Decision Making, Decision Making Under Uncertainty | Unverified | 0 |
| Understanding the Training and Generalization of Pretrained Transformer for Sequential Decision Making | May 23, 2024 | Decision Making, Sequential Decision Making | Unverified | 0 |
| Reinforcing Language Agents via Policy Optimization with Action Decomposition | May 23, 2024 | Sequential Decision Making | Unverified | 0 |
| On the Brittle Foundations of ReAct Prompting for Agentic Large Language Models | May 22, 2024 | Decision Making, Sequential Decision Making | Unverified | 0 |
| FLIPHAT: Joint Differential Privacy for High Dimensional Sparse Linear Bandits | May 22, 2024 | Decision Making, Sequential Decision Making | Code Available | 0 |
| Optimistic Query Routing in Clustering-based Approximate Maximum Inner Product Search | May 20, 2024 | Clustering, Sequential Decision Making | Code Available | 0 |