SOTAVerified

Sequential Decision Making

Papers

Showing 1151-1200 of 1210 papers

Title | Status | Hype
Neural Contextual Bandits without Regret | Code | 0
Interactively Learning Preference Constraints in Linear Bandits | Code | 0
Interactively Teaching an Inverse Reinforcement Learner with Limited Feedback | Code | 0
Interactive Machine Comprehension with Information Seeking Agents | Code | 0
Risk-Sensitive Stochastic Optimal Control as Rao-Blackwellized Markovian Score Climbing | Code | 0
TraCE: Trajectory Counterfactual Explanation Scores | Code | 0
Show Me the Whole World: Towards Entire Item Space Exploration for Interactive Personalized Recommendations | Code | 0
AutoGMap: Learning to Map Large-scale Sparse Graphs on Memristive Crossbars | Code | 0
Noise-Adaptive Confidence Sets for Linear Bandits and Application to Bayesian Optimization | Code | 0
Value-Distributional Model-Based Reinforcement Learning | Code | 0
RLTutor: Reinforcement Learning Based Adaptive Tutoring System by Modeling Virtual Student with Fewer Interactions | Code | 0
Non-monotonic Resource Utilization in the Bandits with Knapsacks Problem | Code | 0
Deep Reinforcement Learning for Surgical Gesture Segmentation and Classification | Code | 0
Nonmyopic Global Optimisation via Approximate Dynamic Programming | Code | 0
Simple Modification of the Upper Confidence Bound Algorithm by Generalized Weighted Averages | Code | 0
Is Mamba Compatible with Trajectory Optimization in Offline Reinforcement Learning? | Code | 0
Robust Active Measuring under Model Uncertainty | Code | 0
Deep Reinforcement Learning for Personalized Diagnostic Decision Pathways Using Electronic Health Records: A Comparative Study on Anemia and Systemic Lupus Erythematosus | Code | 0
Deep Reinforcement Learning for Imbalanced Classification | Code | 0
Bounded rationality for relaxing best response and mutual consistency: The Quantal Hierarchy model of decision-making | Code | 0
Anderson Acceleration for Partially Observable Markov Decision Processes: A Maximum Entropy Approach | Code | 0
Robust Anytime Learning of Markov Decision Processes | Code | 0
Adaptive Action Duration with Contextual Bandits for Deep Reinforcement Learning in Dynamic Environments | Code | 0
Common Benchmarks Undervalue the Generalization Power of Programmatic Policies | Code | 0
Accelerate Model Parallel Training by Using Efficient Graph Traversal Order in Device Placement | Code | 0
LaGR-SEQ: Language-Guided Reinforcement Learning with Sample-Efficient Querying | Code | 0
Combining Experimental and Historical Data for Policy Evaluation | Code | 0
Quantization-Free Autoregressive Action Transformer | Code | 0
Deep Reinforcement Learning Algorithms for Option Hedging | Code | 0
Deep Q-Network for Angry Birds | Code | 0
Offline Dynamic Inventory and Pricing Strategy: Addressing Censored and Dependent Demand | Code | 0
Robust Reinforcement Learning Under Minimax Regret for Green Security | Code | 0
Active Sampling for MRI-based Sequential Decision Making | Code | 0
Quizbowl: The Case for Incremental Question Answering | Code | 0
Skill Disentanglement for Imitation Learning from Suboptimal Demonstrations | Code | 0
Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling | Code | 0
Read, Watch, and Move: Reinforcement Learning for Temporally Grounding Natural Language Descriptions in Videos | Code | 0
Off-Policy Evaluation for Action-Dependent Non-Stationary Environments | Code | 0
An Adaptable Budget Planner for Enhancing Budget-Constrained Auto-Bidding in Online Advertising | Code | 0
Learning a Fast Mixing Exogenous Block MDP using a Single Trajectory | Code | 0
Off-Policy Optimization of Portfolio Allocation Policies under Constraints | Code | 0
Off-policy Policy Evaluation For Sequential Decisions Under Unobserved Confounding | Code | 0
Hierarchical Reinforcement Learning with AI Planning Models | Code | 0
TooBadRL: Trigger Optimization to Boost Effectiveness of Backdoor Attacks on Deep Reinforcement Learning | Code | 0
Learning Coordination Policies over Heterogeneous Graphs for Human-Robot Teams via Recurrent Neural Schedule Propagation | Code | 0
Temporal Shift Reinforcement Learning | Code | 0
Learning Discrete State Abstractions With Deep Variational Inference | Code | 0
Decision Transformer vs. Decision Mamba: Analysing the Complexity of Sequential Decision Making in Atari Games | Code | 0
Decision Making in Non-Stationary Environments with Policy-Augmented Search | Code | 0
Learning Dynamic Selection and Pricing of Out-of-Home Deliveries | Code | 0

No leaderboard results yet.