SOTAVerified

Sequential Decision Making

Papers

Showing 26-50 of 1210 papers

| Title | Status | Hype |
| --- | --- | --- |
| Learning Discrete World Models for Heuristic Search | Code | 1 |
| RELIEF: Reinforcement Learning Empowered Graph Feature Prompt Tuning | Code | 1 |
| Re-ReST: Reflection-Reinforced Self-Training for Language Agents | Code | 1 |
| Pursuing Overall Welfare in Federated Learning through Sequential Decision Making | Code | 1 |
| Rethinking Transformers in Solving POMDPs | Code | 1 |
| Decision Mamba: Reinforcement Learning via Sequence Modeling with Selective State Spaces | Code | 1 |
| Reinforced Sequential Decision-Making for Sepsis Treatment: The POSNEGDM Framework with Mortality Classifier and Transformer | Code | 1 |
| TRAD: Enhancing LLM Agents with Step-Wise Thought Retrieval and Aligned Decision | Code | 1 |
| How Can LLM Guide RL? A Value-Based Approach | Code | 1 |
| PRISE: LLM-Style Sequence Compression for Learning Temporal Action Abstractions in Control | Code | 1 |
| Premier-TACO is a Few-Shot Policy Learner: Pretraining Multitask Representation via Temporal Action-Driven Contrastive Loss | Code | 1 |
| Sym-Q: Adaptive Symbolic Regression via Sequential Decision-Making | Code | 1 |
| Skill Set Optimization: Reinforcing Language Model Behavior via Transferable Skills | Code | 1 |
| Layered and Staged Monte Carlo Tree Search for SMT Strategy Synthesis | Code | 1 |
| LLF-Bench: Benchmark for Interactive Learning from Language Feedback | Code | 1 |
| Can language agents be alternatives to PPO? A Preliminary Empirical Study On OpenAI Gym | Code | 1 |
| Large Language Model as a Policy Teacher for Training Reinforcement Learning Agents | Code | 1 |
| Free from Bellman Completeness: Trajectory Stitching via Model-based Return-conditioned Supervised Learning | Code | 1 |
| Out of the Cage: How Stochastic Parrots Win in Cyber Security Environments | Code | 1 |
| Breadcrumbs to the Goal: Goal-Conditioned Exploration from Human-in-the-Loop Feedback | Code | 1 |
| ContainerGym: A Real-World Reinforcement Learning Benchmark for Resource Allocation | Code | 1 |
| Sampling from Gaussian Process Posteriors using Stochastic Gradient Descent | Code | 1 |
| Simplified Temporal Consistency Reinforcement Learning | Code | 1 |
| Decision Stacks: Flexible Reinforcement Learning via Modular Generative Models | Code | 1 |
| Enabling Intelligent Interactions between an Agent and an LLM: A Reinforcement Learning Approach | Code | 1 |

No leaderboard results yet.