SOTAVerified

Sequential Decision Making

Papers

Showing 51–100 of 1210 papers

Title | Status | Hype
Deep Reinforcement Learning based Recommendation with Explicit User-Item Interactions Modeling | Code | 1
Skill Set Optimization: Reinforcing Language Model Behavior via Transferable Skills | Code | 1
Layered and Staged Monte Carlo Tree Search for SMT Strategy Synthesis | Code | 1
Deep Reinforcement Learning for Entity Alignment | Code | 1
Can Increasing Input Dimensionality Improve Deep Reinforcement Learning? | Code | 1
LLINBO: Trustworthy LLM-in-the-Loop Bayesian Optimization | Code | 1
The Sandbox Environment for Generalizable Agent Research (SEGAR) | Code | 1
Thinking Fast and Slow with Deep Learning and Tree Search | Code | 1
Does the Markov Decision Process Fit the Data: Testing for the Markov Property in Sequential Decision Making | Code | 1
Training a Generally Curious Agent | Code | 1
Learning Discrete World Models for Heuristic Search | Code | 1
Unified Models of Human Behavioral Agents in Bandits, Contextual Bandits and RL | Code | 1
IQ-Learn: Inverse soft-Q Learning for Imitation | Code | 1
Can language agents be alternatives to PPO? A Preliminary Empirical Study On OpenAI Gym | Code | 1
Large Language Model as a Policy Teacher for Training Reinforcement Learning Agents | Code | 1
Learning Dynamic Belief Graphs to Generalize on Text-Based Games | Code | 1
Free from Bellman Completeness: Trajectory Stitching via Model-based Return-conditioned Supervised Learning | Code | 1
Extracting Reward Functions from Diffusion Models | Code | 1
How Can LLM Guide RL? A Value-Based Approach | Code | 1
Enabling Intelligent Interactions between an Agent and an LLM: A Reinforcement Learning Approach | Code | 1
Efficient Symptom Inquiring and Diagnosis via Adaptive Alignment of Reinforcement Learning and Classification | Code | 1
Breadcrumbs to the Goal: Goal-Conditioned Exploration from Human-in-the-Loop Feedback | Code | 1
Hybrid Multi-agent Deep Reinforcement Learning for Autonomous Mobility on Demand Systems | Code | 1
Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization | Code | 1
Dynamic Multi-Robot Task Allocation under Uncertainty and Temporal Constraints | Code | 1
AdaPlanner: Adaptive Planning from Feedback with Language Models | Code | 1
Bridging POMDPs and Bayesian decision making for robust maintenance planning under model uncertainty: An application to railway systems | Code | 1
Large Language Models for Planning: A Comprehensive and Systematic Survey | Code | 1
Approximate Inference in Discrete Distributions with Monte Carlo Tree Search and Value Functions | Code | 1
Effective Reinforcement Learning through Evolutionary Surrogate-Assisted Prescription | Code | 1
RELIEF: Reinforcement Learning Empowered Graph Feature Prompt Tuning | Code | 1
LogiCity: Advancing Neuro-Symbolic AI with Abstract Urban Simulation | Code | 1
Dynamic Causal Bayesian Optimization | Code | 1
Masked Trajectory Models for Prediction, Representation, and Control | Code | 1
Efficient Nonmyopic Bayesian Optimization via One-Shot Multi-Step Trees | Code | 1
An Alternative Softmax Operator for Reinforcement Learning | Code | 1
Multi-task Causal Learning with Gaussian Processes | Code | 1
ContainerGym: A Real-World Reinforcement Learning Benchmark for Resource Allocation | Code | 1
Occupancy Anticipation for Efficient Exploration and Navigation | Code | 1
On Generalization Across Environments In Multi-Objective Reinforcement Learning | Code | 1
Independent Reinforcement Learning for Weakly Cooperative Multiagent Traffic Control Problem | Code | 1
Premier-TACO is a Few-Shot Policy Learner: Pretraining Multitask Representation via Temporal Action-Driven Contrastive Loss | Code | 1
An empirical evaluation of active inference in multi-armed bandits | Code | 1
Counterfactual Explanations in Sequential Decision Making Under Uncertainty | Code | 1
Adaptive Stress Testing of Trajectory Predictions in Flight Management Systems | Code | 1
Decision Mamba: Reinforcement Learning via Sequence Modeling with Selective State Spaces | Code | 1
Learning Multi-Level Hierarchies with Hindsight | Code | 1
Reinforcement learning with combinatorial actions for coupled restless bandits | Code | 1
Reinforced Sequential Decision-Making for Sepsis Treatment: The POSNEGDM Framework with Mortality Classifier and Transformer | Code | 1
Discrete-Time Distribution Steering using Monte Carlo Tree Search | Code | 0

No leaderboard results yet.