SOTAVerified

Sequential Decision Making

Papers

Showing 151–200 of 1210 papers

Title | Status | Hype
Integrated Sensing and Communications for Low-Altitude Economy: A Deep Reinforcement Learning Approach | - | 0
Technical Report on Reinforcement Learning Control on the Lucas-Nülle Inverted Pendulum | - | 0
Time-Series-Informed Closed-loop Learning for Sequential Decision Making and Control | - | 0
Selective Reviews of Bandit Problems in AI via a Statistical View | - | 0
Failure Probability Estimation for Black-Box Autonomous Systems using State-Dependent Importance Sampling Proposals | - | 0
Decision Transformer vs. Decision Mamba: Analysing the Complexity of Sequential Decision Making in Atari Games | Code | 0
STEVE-Audio: Expanding the Goal Conditioning Modalities of Embodied Agents in Minecraft | - | 0
Market Making without Regret | - | 0
On adaptivity and minimax optimality of two-sided nearest neighbors | Code | 0
Robust Markov Decision Processes: A Place Where AI and Formal Methods Meet | - | 0
Towards Sample-Efficiency and Generalization of Transfer and Inverse Reinforcement Learning: A Comprehensive Literature Review | - | 0
Fair Resource Allocation in Weakly Coupled Markov Decision Processes | - | 0
SANDWICH: Towards an Offline, Differentiable, Fully-Trainable Wireless Neural Ray-Tracing Surrogate | Code | 0
Optimal Control of Mechanical Ventilators with Learned Respiratory Dynamics | Code | 0
Collaborative and Federated Black-box Optimization: A Bayesian Optimization Perspective | - | 0
PageRank Bandits for Link Prediction | Code | 0
LogiCity: Advancing Neuro-Symbolic AI with Abstract Urban Simulation | Code | 1
EARL-BO: Reinforcement Learning for Multi-Step Lookahead, High-Dimensional Bayesian Optimization | - | 0
Quantum Reinforcement Learning-Based Two-Stage Unit Commitment Framework for Enhanced Power Systems Robustness | - | 0
Annotation Efficiency: Identifying Hard Samples via Blocked Sparse Linear Bandits | - | 0
Reinforcement Learning for Aligning Large Language Models Agents with Interactive Environments: Quantifying and Mitigating Prompt Overfitting | - | 0
Robust Thompson Sampling Algorithms Against Reward Poisoning Attacks | - | 0
Learning Versatile Skills with Curriculum Masking | Code | 0
Convex Markov Games: A New Frontier for Multi-Agent Reinforcement Learning | - | 0
Hierarchical Upper Confidence Bounds for Constrained Online Learning | - | 0
Counterfactual Effect Decomposition in Multi-Agent Sequential Decision Making | Code | 0
SAC-GLAM: Improving Online RL for LLM agents with Soft Actor-Critic and Hindsight Relabeling | - | 0
Communication-Control Codesign for Large-Scale Wireless Networked Control Systems | - | 0
Burning RED: Unlocking Subtask-Driven Reinforcement Learning and Risk-Awareness in Average-Reward Markov Decision Processes | - | 0
Efficient Reinforcement Learning with Large Language Model Priors | - | 0
Offline Hierarchical Reinforcement Learning via Inverse Optimization | - | 0
On the Modeling Capabilities of Large Language Models for Sequential Decision Making | - | 0
DataEnvGym: Data Generation Agents in Teacher Environments with Student Feedback | Code | 1
DOPL: Direct Online Preference Learning for Restless Bandits with Preference Feedback | - | 0
Preference Optimization as Probabilistic Inference | - | 0
Minimax-optimal trust-aware multi-armed bandits | - | 0
Learning a Fast Mixing Exogenous Block MDP using a Single Trajectory | Code | 0
Adaptive teachers for amortized samplers | Code | 0
AVID: Adapting Video Diffusion Models to World Models | Code | 0
Safe Time-Varying Optimization based on Gaussian Processes with Spatio-Temporal Kernel | - | 0
Collaborative Comic Generation: Integrating Visual Narrative Theories with AI Models for Enhanced Creativity | Code | 0
Learning Utilities from Demonstrations in Markov Decision Processes | - | 0
Reference Points, Risk-Taking Behavior, and Competitive Outcomes in Sequential Settings | - | 0
Learning Discrete World Models for Heuristic Search | Code | 1
Visual Language Tracking with Multi-modal Interaction: A Robust Benchmark | - | 0
Hierarchical Reinforcement Learning for Temporal Abstraction of Listwise Recommendation | - | 0
HierLLM: Hierarchical Large Language Model for Question Recommendation | - | 0
Bridging Rested and Restless Bandits with Graph-Triggering: Rising and Rotting | - | 0
Forward KL Regularized Preference Optimization for Aligning Diffusion Policies | - | 0
An Introduction to Quantum Reinforcement Learning (QRL) | - | 0

No leaderboard results yet.