SOTAVerified

Sequential Decision Making

Papers

Showing 301–350 of 1210 papers

Title | Status | Hype
Fast Value Tracking for Deep Reinforcement Learning | - | 0
Offline Imitation of Badminton Player Behavior via Experiential Contexts and Brownian Motion | Code | 0
Supervised Fine-Tuning as Inverse Reinforcement Learning | - | 0
State-Separated SARSA: A Practical Sequential Decision-Making Algorithm with Recovering Rewards | - | 0
Distributed Multi-Objective Dynamic Offloading Scheduling for Air-Ground Cooperative MEC | - | 0
Regret Minimization via Saddle Point Optimization | - | 0
AutoGuide: Automated Generation and Selection of Context-Aware Guidelines for Large Language Model Agents | - | 0
Reinforced Sequential Decision-Making for Sepsis Treatment: The POSNEGDM Framework with Mortality Classifier and Transformer | Code | 1
CoRAL: Collaborative Retrieval-Augmented Large Language Models Improve Long-tail Recommendation | - | 0
LinearAPT: An Adaptive Algorithm for the Fixed-Budget Thresholding Linear Bandit Problem | - | 0
TRAD: Enhancing LLM Agents with Step-Wise Thought Retrieval and Aligned Decision | Code | 1
Inverse Design of Photonic Crystal Surface Emitting Lasers is a Sequence Modeling Problem | - | 0
Cooperative Bayesian Optimization for Imperfect Agents | - | 0
A Survey on Applications of Reinforcement Learning in Spatial Resource Allocation | - | 0
Language Guided Exploration for RL Agents in Text Environments | - | 0
On the Role of Information Structure in Reinforcement Learning for Partially-Observable Sequential Teams and Games | - | 0
Adaptive Learning Rate for Follow-the-Regularized-Leader: Competitive Analysis and Best-of-Both-Worlds | - | 0
Successfully Guiding Humans with Imperfect Instructions by Highlighting Potential Errors and Suggesting Corrections | - | 0
How Can LLM Guide RL? A Value-Based Approach | Code | 1
Reward Design for Justifiable Sequential Decision-Making | Code | 0
Information-Theoretic Safe Bayesian Optimization | - | 0
On the Performance of Empirical Risk Minimization with Smoothed Data | - | 0
BeTAIL: Behavior Transformer Adversarial Imitation Learning from Human Racing Gameplay | - | 0
Deep Generative Models for Offline Policy Learning: Tutorial, Survey, and Perspectives on Future Directions | Code | 2
Toward TransfORmers: Revolutionizing the Solution of Mixed Integer Programs with Transformers | - | 0
Align Your Intents: Offline Imitation Learning via Optimal Transport | - | 0
Self-evolving Autoencoder Embedded Q-Network | - | 0
Probability Tools for Sequential Random Projection | - | 0
PRISE: LLM-Style Sequence Compression for Learning Temporal Action Abstractions in Control | Code | 1
Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent | Code | 2
Epistemic Exploration for Generalizable Planning and Learning in Non-Stationary Settings | Code | 0
Online Sequential Decision-Making with Unknown Delays | - | 0
Noise-Adaptive Confidence Sets for Linear Bandits and Application to Bayesian Optimization | Code | 0
Auxiliary Reward Generation with Transition Distance Representation Learning | - | 0
Premier-TACO is a Few-Shot Policy Learner: Pretraining Multitask Representation via Temporal Action-Driven Contrastive Loss | Code | 1
Offline Risk-sensitive RL with Partial Observability to Enhance Performance in Human-Robot Teaming | - | 0
Sym-Q: Adaptive Symbolic Regression via Sequential Decision-Making | Code | 1
Logical Specifications-guided Dynamic Task Sampling for Reinforcement Learning Agents | Code | 0
A Reinforcement Learning Approach for Dynamic Rebalancing in Bike-Sharing System | - | 0
Skill Set Optimization: Reinforcing Language Model Behavior via Transferable Skills | Code | 1
Multi-Agent Reinforcement Learning for Offloading Cellular Communications with Cooperating UAVs | - | 0
Vertical Symbolic Regression via Deep Policy Gradient | Code | 0
Zero-Shot Reinforcement Learning via Function Encoders | Code | 0
Layered and Staged Monte Carlo Tree Search for SMT Strategy Synthesis | Code | 1
Regularized Q-Learning with Linear Function Approximation | - | 0
Long-Term Fair Decision Making through Deep Generative Models | Code | 0
Stochastic Dynamic Power Dispatch with High Generalization and Few-Shot Adaption via Contextual Meta Graph Reinforcement Learning | - | 0
Learning Non-myopic Power Allocation in Constrained Scenarios | Code | 0
LLMs for Relational Reasoning: How Far are We? | - | 0
Towards Off-Policy Reinforcement Learning for Ranking Policies with Human Feedback | - | 0

No leaderboard results yet.