SOTAVerified

Sequential Decision Making

Papers

Showing 51-75 of 1210 papers

Title | Status | Hype
Decision Stacks: Flexible Reinforcement Learning via Modular Generative Models | Code | 1
Masked Trajectory Models for Prediction, Representation, and Control | Code | 1
Does the Markov Decision Process Fit the Data: Testing for the Markov Property in Sequential Decision Making | Code | 1
Dynamic Multi-Robot Task Allocation under Uncertainty and Temporal Constraints | Code | 1
Curriculum-based Reinforcement Learning for Distribution System Critical Load Restoration | Code | 1
Object-Aware Regularization for Addressing Causal Confusion in Imitation Learning | Code | 1
On Generalization Across Environments In Multi-Objective Reinforcement Learning | Code | 1
Out of the Cage: How Stochastic Parrots Win in Cyber Security Environments | Code | 1
Comparing Exploration-Exploitation Strategies of LLMs and Humans: Insights from Standard Multi-armed Bandit Tasks | Code | 1
PRISE: LLM-Style Sequence Compression for Learning Temporal Action Abstractions in Control | Code | 1
Re-ReST: Reflection-Reinforced Self-Training for Language Agents | Code | 1
DataEnvGym: Data Generation Agents in Teacher Environments with Student Feedback | Code | 1
Bridging POMDPs and Bayesian decision making for robust maintenance planning under model uncertainty: An application to railway systems | Code | 1
Can Increasing Input Dimensionality Improve Deep Reinforcement Learning? | Code | 1
Approximate Inference in Discrete Distributions with Monte Carlo Tree Search and Value Functions | Code | 1
RELIEF: Reinforcement Learning Empowered Graph Feature Prompt Tuning | Code | 1
CertRL: Formalizing Convergence Proofs for Value and Policy Iteration in Coq | Code | 1
Co-Activation Graph Analysis of Safety-Verified and Explainable Deep Reinforcement Learning Policies | Code | 1
ContainerGym: A Real-World Reinforcement Learning Benchmark for Resource Allocation | Code | 1
Counterfactual Explanations in Sequential Decision Making Under Uncertainty | Code | 1
Breadcrumbs to the Goal: Goal-Conditioned Exploration from Human-in-the-Loop Feedback | Code | 1
Can language agents be alternatives to PPO? A Preliminary Empirical Study On OpenAI Gym | Code | 1
Deep Reinforcement Learning based Recommendation with Explicit User-Item Interactions Modeling | Code | 1
Comparing Deep Reinforcement Learning Algorithms in Two-Echelon Supply Chains | Code | 1
Decision Mamba: Reinforcement Learning via Sequence Modeling with Selective State Spaces | Code | 1
Page 3 of 49

No leaderboard results yet.