SOTAVerified

Sequential Decision Making

Papers

Showing 1-50 of 1210 papers

Title | Status | Hype
AirLLM: Diffusion Policy-based Adaptive LoRA for Remote Fine-Tuning of LLM over the Air | - | 0
LLM-Stackelberg Games: Conjectural Reasoning Equilibria and Their Applications to Spearphishing | - | 0
A Survey of Continual Reinforcement Learning | - | 0
Flow-Based Single-Step Completion for Efficient and Expressive Policy Learning | - | 0
POLAR: A Pessimistic Model-based Policy Learning Algorithm for Dynamic Treatment Regimes | - | 0
Efficient Strategy Synthesis for MDPs via Hierarchical Block Decomposition | - | 0
UProp: Investigating the Uncertainty Propagation of LLMs in Multi-Step Agentic Decision-Making | Code | 0
Multi-Armed Bandits With Machine Learning-Generated Surrogate Rewards | - | 0
Adaptive Action Duration with Contextual Bandits for Deep Reinforcement Learning in Dynamic Environments | Code | 0
Common Benchmarks Undervalue the Generalization Power of Programmatic Policies | Code | 0
Leveraging In-Context Learning for Language Model Agents | - | 0
Revisiting Clustering of Neural Bandits: Selective Reinitialization for Mitigating Loss of Plasticity | - | 0
Towards Responsible AI: Advances in Safety, Fairness, and Accountability of Autonomous Systems | - | 0
TooBadRL: Trigger Optimization to Boost Effectiveness of Backdoor Attacks on Deep Reinforcement Learning | Code | 0
How to Provably Improve Return Conditioned Supervised Learning? | - | 0
QForce-RL: Quantized FPGA-Optimized Reinforcement Learning Compute Engine | - | 0
Contextual Experience Replay for Self-Improvement of Language Agents | - | 0
AutoQD: Automatic Discovery of Diverse Behaviors with Quality-Diversity Optimization | - | 0
TextAtari: 100K Frames Game Playing with Language Agents | Code | 0
Active Layer-Contrastive Decoding Reduces Hallucination in Large Language Model Generation | - | 0
Emergent Risk Awareness in Rational Agents under Resource Constraints | - | 0
Adaptive Frontier Exploration on Graphs with Applications to Network-Based Disease Testing | - | 0
Variational Deep Learning via Implicit Regularization | - | 0
Large Language Models for Planning: A Comprehensive and Systematic Survey | Code | 1
DDO: Dual-Decision Optimization via Multi-Agent Collaboration for LLM-Based Medical Consultation | - | 0
Automata Learning of Preferences over Temporal Logic Formulas from Pairwise Comparisons | - | 0
Reward Is Enough: LLMs Are In-Context Reinforcement Learners | - | 0
Web-Shepherd: Advancing PRMs for Reinforcing Web Agents | Code | 2
Sample and Computationally Efficient Continuous-Time Reinforcement Learning with General Function Approximation | Code | 0
LLINBO: Trustworthy LLM-in-the-Loop Bayesian Optimization | Code | 1
Vid2World: Crafting Video Diffusion Models to Interactive World Models | - | 0
OMGPT: A Sequence Modeling Framework for Data-driven Operational Decision Making | - | 0
Generalization Guarantees for Learning Branch-and-Cut Policies in Integer Programming | - | 0
Deep Symbolic Optimization: Reinforcement Learning for Symbolic Mathematics | - | 0
Batched Nonparametric Bandits via k-Nearest Neighbor UCB | - | 0
Comparing Exploration-Exploitation Strategies of LLMs and Humans: Insights from Standard Multi-armed Bandit Tasks | Code | 1
Counterfactual Strategies for Markov Decision Processes | - | 0
Sequential Treatment Effect Estimation with Unmeasured Confounders | - | 0
rfPG: Robust Finite-Memory Policy Gradients for Hidden-Model POMDPs | - | 0
A Practical Introduction to Deep Reinforcement Learning | - | 0
Explainable Reinforcement Learning Agents Using World Models | - | 0
A Multi-Agent Reinforcement Learning Approach for Cooperative Air-Ground-Human Crowdsensing in Emergency Rescue | - | 0
Constrained Online Decision-Making: A Unified Framework | - | 0
RL-DAUNCE: Reinforcement Learning-Driven Data Assimilation with Uncertainty-Aware Constrained Ensembles | - | 0
Active Sampling for MRI-based Sequential Decision Making | Code | 0
Policy-labeled Preference Learning: Is Preference Enough for RLHF? | - | 0
MDPs with a State Sensing Cost | - | 0
D3HRL: A Distributed Hierarchical Reinforcement Learning Approach Based on Causal Discovery and Spurious Correlation Detection | - | 0
Bayesian learning of the optimal action-value function in a Markov decision process | - | 0
A Minimax-MDP Framework with Future-imposed Conditions for Learning-augmented Problems | - | 0

No leaderboard results yet.