SOTAVerified

Sequential Decision Making

Papers

Showing 251-300 of 1210 papers

Title | Status | Hype
Fair Resource Allocation in Weakly Coupled Markov Decision Processes | - | 0
SANDWICH: Towards an Offline, Differentiable, Fully-Trainable Wireless Neural Ray-Tracing Surrogate | Code | 0
Collaborative and Federated Black-box Optimization: A Bayesian Optimization Perspective | - | 0
Optimal Control of Mechanical Ventilators with Learned Respiratory Dynamics | Code | 0
PageRank Bandits for Link Prediction | Code | 0
EARL-BO: Reinforcement Learning for Multi-Step Lookahead, High-Dimensional Bayesian Optimization | - | 0
Quantum Reinforcement Learning-Based Two-Stage Unit Commitment Framework for Enhanced Power Systems Robustness | - | 0
Annotation Efficiency: Identifying Hard Samples via Blocked Sparse Linear Bandits | - | 0
Robust Thompson Sampling Algorithms Against Reward Poisoning Attacks | - | 0
Reinforcement Learning for Aligning Large Language Models Agents with Interactive Environments: Quantifying and Mitigating Prompt Overfitting | - | 0
Learning Versatile Skills with Curriculum Masking | Code | 0
Convex Markov Games: A New Frontier for Multi-Agent Reinforcement Learning | - | 0
Hierarchical Upper Confidence Bounds for Constrained Online Learning | - | 0
SAC-GLAM: Improving Online RL for LLM agents with Soft Actor-Critic and Hindsight Relabeling | - | 0
Counterfactual Effect Decomposition in Multi-Agent Sequential Decision Making | Code | 0
Communication-Control Codesign for Large-Scale Wireless Networked Control Systems | - | 0
Burning RED: Unlocking Subtask-Driven Reinforcement Learning and Risk-Awareness in Average-Reward Markov Decision Processes | - | 0
Offline Hierarchical Reinforcement Learning via Inverse Optimization | - | 0
Efficient Reinforcement Learning with Large Language Model Priors | - | 0
On the Modeling Capabilities of Large Language Models for Sequential Decision Making | - | 0
DOPL: Direct Online Preference Learning for Restless Bandits with Preference Feedback | - | 0
Preference Optimization as Probabilistic Inference | - | 0
Minimax-optimal trust-aware multi-armed bandits | - | 0
Learning a Fast Mixing Exogenous Block MDP using a Single Trajectory | Code | 0
Adaptive teachers for amortized samplers | Code | 0
AVID: Adapting Video Diffusion Models to World Models | Code | 0
Safe Time-Varying Optimization based on Gaussian Processes with Spatio-Temporal Kernel | - | 0
Collaborative Comic Generation: Integrating Visual Narrative Theories with AI Models for Enhanced Creativity | Code | 0
Learning Utilities from Demonstrations in Markov Decision Processes | - | 0
Reference Points, Risk-Taking Behavior, and Competitive Outcomes in Sequential Settings | - | 0
Visual Language Tracking with Multi-modal Interaction: A Robust Benchmark | - | 0
Hierarchical Reinforcement Learning for Temporal Abstraction of Listwise Recommendation | - | 0
HierLLM: Hierarchical Large Language Model for Question Recommendation | - | 0
Forward KL Regularized Preference Optimization for Aligning Diffusion Policies | - | 0
Bridging Rested and Restless Bandits with Graph-Triggering: Rising and Rotting | - | 0
An Introduction to Quantum Reinforcement Learning (QRL) | - | 0
Sliding-Window Thompson Sampling for Non-Stationary Settings | - | 0
A naive aggregation algorithm for improving generalization in a class of learning problems | - | 0
InfraLib: Enabling Reinforcement Learning and Decision-Making for Large-Scale Infrastructure Management | - | 0
A Sequential Decision-Making Model for Perimeter Identification | - | 0
Temporal Elections: Welfare, Strategyproofness, and Proportionality | - | 0
How to Measure Human-AI Prediction Accuracy in Explainable AI Systems | - | 0
Pareto Inverse Reinforcement Learning for Diverse Expert Policy Generation | - | 0
An End-to-End Reinforcement Learning Based Approach for Micro-View Order-Dispatching in Ride-Hailing | - | 0
Contextual Bandits for Unbounded Context Distributions | - | 0
Enhancing Heterogeneous Multi-Agent Cooperation in Decentralized MARL via GNN-driven Intrinsic Rewards | Code | 0
Meta Clustering of Neural Bandits | - | 0
Structure and Reduction of MCTS for Explainable-AI | - | 0
Non-maximizing policies that fulfill multi-criterion aspirations in expectation | - | 0
Few-shot Scooping Under Domain Shift via Simulated Maximal Deployment Gaps | - | 0
Page 6 of 25

No leaderboard results yet.