SOTAVerified

Sequential Decision Making

Papers

Showing 251-275 of 1210 papers

Title | Status | Hype
Sound Heuristic Search Value Iteration for Undiscounted POMDPs with Reachability Objectives | Code | 0
"Give Me an Example Like This": Episodic Active Reinforcement Learning from Demonstrations | Code | 0
Rectifying Reinforcement Learning for Reward Matching | | 0
Re-ReST: Reflection-Reinforced Self-Training for Language Agents | Code | 1
Combining Experimental and Historical Data for Policy Evaluation | Code | 0
Reward Machines for Deep RL in Noisy and Uncertain Environments | Code | 0
Pursuing Overall Welfare in Federated Learning through Sequential Decision Making | Code | 1
Low-rank finetuning for LLMs: A fairness perspective | | 0
OPERA: Automatic Offline Policy Evaluation with Re-weighted Aggregates of Multiple Estimators | | 0
Leveraging Offline Data in Linear Latent Bandits | | 0
Rethinking Transformers in Solving POMDPs | Code | 1
Variational Offline Multi-agent Skill Discovery | | 0
Inference of Utilities and Time Preference in Sequential Decision-Making | | 0
Inverse-RLignment: Large Language Model Alignment from Demonstrations through Inverse Reinforcement Learning | | 0
Reinforcing Language Agents via Policy Optimization with Action Decomposition | | 0
Efficiently Training Deep-Learning Parametric Policies using Lagrangian Duality | | 0
A finite time analysis of distributed Q-learning | | 0
Understanding the Training and Generalization of Pretrained Transformer for Sequential Decision Making | | 0
FLIPHAT: Joint Differential Privacy for High Dimensional Sparse Linear Bandits | Code | 0
On the Brittle Foundations of ReAct Prompting for Agentic Large Language Models | | 0
A Unified Linear Programming Framework for Offline Reward Learning from Human Demonstrations and Feedback | | 0
Is Mamba Compatible with Trajectory Optimization in Offline Reinforcement Learning? | Code | 0
Optimistic Query Routing in Clustering-based Approximate Maximum Inner Product Search | Code | 0
CPS-LLM: Large Language Model based Safe Usage Plan Generator for Human-in-the-Loop Human-in-the-Plant Cyber-Physical System | | 0
AgentClinic: a multimodal agent benchmark to evaluate AI in simulated clinical environments | | 0

No leaderboard results yet.