SOTAVerified

Sequential Decision Making

Papers

Showing 51-100 of 1210 papers

Title | Status | Hype
Dynamic Causal Bayesian Optimization | Code | 1
RLDS: an Ecosystem to Generate, Share and Use Datasets in Reinforcement Learning | Code | 1
Simplified Temporal Consistency Reinforcement Learning | Code | 1
Skill Set Optimization: Reinforcing Language Model Behavior via Transferable Skills | Code | 1
Efficient Symptom Inquiring and Diagnosis via Adaptive Alignment of Reinforcement Learning and Classification | Code | 1
Text-based RL Agents with Commonsense Knowledge: New Challenges, Environments and Baselines | Code | 1
The Medkit-Learn(ing) Environment: Medical Decision Modelling through Simulation | Code | 1
The Sandbox Environment for Generalizable Agent Research (SEGAR) | Code | 1
TRAD: Enhancing LLM Agents with Step-Wise Thought Retrieval and Aligned Decision | Code | 1
Training a Generally Curious Agent | Code | 1
Does the Markov Decision Process Fit the Data: Testing for the Markov Property in Sequential Decision Making | Code | 1
Unified Models of Human Behavioral Agents in Bandits, Contextual Bandits and RL | Code | 1
Deep Reinforcement Learning based Recommendation with Explicit User-Item Interactions Modeling | Code | 1
Enabling Intelligent Interactions between an Agent and an LLM: A Reinforcement Learning Approach | Code | 1
Counterfactual Explanations in Sequential Decision Making Under Uncertainty | Code | 1
DataEnvGym: Data Generation Agents in Teacher Environments with Student Feedback | Code | 1
Comparing Exploration-Exploitation Strategies of LLMs and Humans: Insights from Standard Multi-armed Bandit Tasks | Code | 1
Co-Activation Graph Analysis of Safety-Verified and Explainable Deep Reinforcement Learning Policies | Code | 1
ContainerGym: A Real-World Reinforcement Learning Benchmark for Resource Allocation | Code | 1
Curriculum-based Reinforcement Learning for Distribution System Critical Load Restoration | Code | 1
Decision Mamba: Reinforcement Learning via Sequence Modeling with Selective State Spaces | Code | 1
Can Increasing Input Dimensionality Improve Deep Reinforcement Learning? | Code | 1
Comparing Deep Reinforcement Learning Algorithms in Two-Echelon Supply Chains | Code | 1
Deep Reinforcement Learning for Entity Alignment | Code | 1
Bridging POMDPs and Bayesian decision making for robust maintenance planning under model uncertainty: An application to railway systems | Code | 1
AdaPlanner: Adaptive Planning from Feedback with Language Models | Code | 1
Dynamic Multi-Robot Task Allocation under Uncertainty and Temporal Constraints | Code | 1
Effective Reinforcement Learning through Evolutionary Surrogate-Assisted Prescription | Code | 1
Extracting Reward Functions from Diffusion Models | Code | 1
Free from Bellman Completeness: Trajectory Stitching via Model-based Return-conditioned Supervised Learning | Code | 1
Hybrid Multi-agent Deep Reinforcement Learning for Autonomous Mobility on Demand Systems | Code | 1
Independent Reinforcement Learning for Weakly Cooperative Multiagent Traffic Control Problem | Code | 1
Can language agents be alternatives to PPO? A Preliminary Empirical Study On OpenAI Gym | Code | 1
Large Language Model as a Policy Teacher for Training Reinforcement Learning Agents | Code | 1
Layered and Staged Monte Carlo Tree Search for SMT Strategy Synthesis | Code | 1
An Alternative Softmax Operator for Reinforcement Learning | Code | 1
Approximate Inference in Discrete Distributions with Monte Carlo Tree Search and Value Functions | Code | 1
RELIEF: Reinforcement Learning Empowered Graph Feature Prompt Tuning | Code | 1
Markup-to-Image Diffusion Models with Scheduled Sampling | Code | 1
Masked Trajectory Models for Prediction, Representation, and Control | Code | 1
Breadcrumbs to the Goal: Goal-Conditioned Exploration from Human-in-the-Loop Feedback | Code | 1
CertRL: Formalizing Convergence Proofs for Value and Policy Iteration in Coq | Code | 1
An empirical evaluation of active inference in multi-armed bandits | Code | 1
Multi-task Causal Learning with Gaussian Processes | Code | 1
Decision Stacks: Flexible Reinforcement Learning via Modular Generative Models | Code | 1
PDDLGym: Gym Environments from PDDL Problems | Code | 1
PRISE: LLM-Style Sequence Compression for Learning Temporal Action Abstractions in Control | Code | 1
Pursuing Overall Welfare in Federated Learning through Sequential Decision Making | Code | 1
Reinforced Sequential Decision-Making for Sepsis Treatment: The POSNEGDM Framework with Mortality Classifier and Transformer | Code | 1
A Generative Machine Learning Approach to Policy Optimization in Pursuit-Evasion Games |  | 0

No leaderboard results yet.