SOTAVerified

Sequential Decision Making

Papers

Showing 51–100 of 1210 papers

Title | Status | Hype
Extracting Reward Functions from Diffusion Models | Code | 1
AdaPlanner: Adaptive Planning from Feedback with Language Models | Code | 1
Masked Trajectory Models for Prediction, Representation, and Control | Code | 1
X-RLflow: Graph Reinforcement Learning for Neural Network Subgraphs Transformation | Code | 1
TempoRL: laser pulse temporal shape optimization with Deep Reinforcement Learning | Code | 1
Variational Information Pursuit for Interpretable Predictions | Code | 1
Risk-Sensitive Policy with Distributional Reinforcement Learning | Code | 1
Bridging POMDPs and Bayesian decision making for robust maintenance planning under model uncertainty: An application to railway systems | Code | 1
Hybrid Multi-agent Deep Reinforcement Learning for Autonomous Mobility on Demand Systems | Code | 1
UniMASK: Unified Inference in Sequential Decision Problems | Code | 1
Markup-to-Image Diffusion Models with Scheduled Sampling | Code | 1
Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization | Code | 1
Transformer Neural Processes: Uncertainty-Aware Meta Learning Via Sequence Modeling | Code | 1
Comparing Deep Reinforcement Learning Algorithms in Two-Echelon Supply Chains | Code | 1
The Sandbox Environment for Generalizable Agent Research (SEGAR) | Code | 1
Curriculum-based Reinforcement Learning for Distribution System Critical Load Restoration | Code | 1
Deep Reinforcement Learning for Entity Alignment | Code | 1
Efficient Symptom Inquiring and Diagnosis via Adaptive Alignment of Reinforcement Learning and Classification | Code | 1
RLDS: an Ecosystem to Generate, Share and Use Datasets in Reinforcement Learning | Code | 1
Object-Aware Regularization for Addressing Causal Confusion in Imitation Learning | Code | 1
Dynamic Causal Bayesian Optimization | Code | 1
Medical Dead-ends and Learning to Identify High-risk States and Treatments | Code | 1
Counterfactual Explanations in Sequential Decision Making Under Uncertainty | Code | 1
IQ-Learn: Inverse soft-Q Learning for Imitation | Code | 1
The Medkit-Learn(ing) Environment: Medical Decision Modelling through Simulation | Code | 1
Independent Reinforcement Learning for Weakly Cooperative Multiagent Traffic Control Problem | Code | 1
Mixed Policy Gradient: off-policy reinforcement learning driven jointly by data and model | Code | 1
An empirical evaluation of active inference in multi-armed bandits | Code | 1
TimeSHAP: Explaining Recurrent Models through Sequence Perturbations | Code | 1
Adaptive Stress Testing of Trajectory Predictions in Flight Management Systems | Code | 1
Text-based RL Agents with Commonsense Knowledge: New Challenges, Environments and Baselines | Code | 1
Multi-task Causal Learning with Gaussian Processes | Code | 1
CertRL: Formalizing Convergence Proofs for Value and Policy Iteration in Coq | Code | 1
Occupancy Anticipation for Efficient Exploration and Navigation | Code | 1
Efficient Nonmyopic Bayesian Optimization via One-Shot Multi-Step Trees | Code | 1
Dynamic Multi-Robot Task Allocation under Uncertainty and Temporal Constraints | Code | 1
Unified Models of Human Behavioral Agents in Bandits, Contextual Bandits and RL | Code | 1
Can Increasing Input Dimensionality Improve Deep Reinforcement Learning? | Code | 1
Learning Dynamic Belief Graphs to Generalize on Text-Based Games | Code | 1
PDDLGym: Gym Environments from PDDL Problems | Code | 1
Effective Reinforcement Learning through Evolutionary Surrogate-Assisted Prescription | Code | 1
Does the Markov Decision Process Fit the Data: Testing for the Markov Property in Sequential Decision Making | Code | 1
Approximate Inference in Discrete Distributions with Monte Carlo Tree Search and Value Functions | Code | 1
Reinforcement Learning for Temporal Logic Control Synthesis with Probabilistic Satisfaction Guarantees | Code | 1
Deep Reinforcement Learning based Recommendation with Explicit User-Item Interactions Modeling | Code | 1
Learning Multi-Level Hierarchies with Hindsight | Code | 1
SkipNet: Learning Dynamic Routing in Convolutional Networks | Code | 1
Thinking Fast and Slow with Deep Learning and Tree Search | Code | 1
An Alternative Softmax Operator for Reinforcement Learning | Code | 1
AirLLM: Diffusion Policy-based Adaptive LoRA for Remote Fine-Tuning of LLM over the Air |  | 0

No leaderboard results yet.