SOTAVerified

Sequential Decision Making

Papers

Showing 151–200 of 1210 papers

Title | Status | Hype
FLIPHAT: Joint Differential Privacy for High Dimensional Sparse Linear Bandits | Code | 0
Anderson Acceleration for Partially Observable Markov Decision Processes: A Maximum Entropy Approach | Code | 0
Finding Counterfactually Optimal Action Sequences in Continuous State Spaces | Code | 0
Stay With Me: Lifetime Maximization Through Heteroscedastic Linear Bandits With Reneging | Code | 0
Explainable Knowledge Graph Embedding: Inference Reconciliation for Knowledge Inferences Supporting Robot Actions | Code | 0
Evolutionary Multi-Armed Bandits with Genetic Thompson Sampling | Code | 0
Algorithms for Fairness in Sequential Decision Making | Code | 0
Adapting Static Fairness to Sequential Decision-Making: Bias Mitigation Strategies towards Equal Long-term Benefit Rate | Code | 0
A Biologically Plausible Benchmark for Contextual Bandit Algorithms in Precision Oncology Using in vitro Data | Code | 0
Enforcing Almost-Sure Reachability in POMDPs | Code | 0
Fast reinforcement learning with generalized policy updates | Code | 0
Hindsight and Sequential Rationality of Correlated Play | Code | 0
Efficient Sequence Labeling with Actor-Critic Training | Code | 0
Efficient Deviation Types and Learning for Hindsight Rationality in Extensive-Form Games: Corrections | Code | 0
Efficient Symbolic Policy Learning with Differentiable Symbolic Expression | Code | 0
Dynamic Real-time Multimodal Routing with Hierarchical Hybrid Planning | Code | 0
Dynamic Simplex: Balancing Safety and Performance in Autonomous Cyber Physical Systems | Code | 0
An Adaptable Budget Planner for Enhancing Budget-Constrained Auto-Bidding in Online Advertising | Code | 0
Batch Bayesian optimisation via density-ratio estimation with guarantees | Code | 0
Ecole: A Library for Learning Inside MILP Solvers | Code | 0
Doubly Robust Off-policy Value Evaluation for Reinforcement Learning | Code | 0
Efficient Deviation Types and Learning for Hindsight Rationality in Extensive-Form Games | Code | 0
Doubly Inhomogeneous Reinforcement Learning | Code | 0
Doubly Robust Policy Evaluation and Optimization | Code | 0
Distance Weighted Supervised Learning for Offline Interaction Data | Code | 0
End-to-End Goal-Driven Web Navigation | Code | 0
Enhancing the Accuracy and Fairness of Human Decision Making | Code | 0
Epistemic Exploration for Generalizable Planning and Learning in Non-Stationary Settings | Code | 0
Bellman Meets Hawkes: Model-Based Reinforcement Learning via Temporal Point Processes | Code | 0
Best Arm Identification for Stochastic Rising Bandits | Code | 0
Scalable Exploration via Ensemble++ | Code | 0
Exploring the Inquiry-Diagnosis Relationship with Advanced Patient Simulators | Code | 0
Differentially Private Regret Minimization in Episodic Markov Decision Processes | Code | 0
Federated Online Clustering of Bandits | Code | 0
Act as You Learn: Adaptive Decision-Making in Non-Stationary Markov Decision Processes | Code | 0
FLARE: Fingerprinting Deep Reinforcement Learning Agents using Universal Adversarial Masks | Code | 0
Back to the Future -- Sequential Alignment of Text Representations | Code | 0
Frequentist Uncertainty in Recurrent Neural Networks via Blockwise Influence Functions | Code | 0
β-Multivariational Autoencoder for Entangled Representation Learning in Video Frames | Code | 0
Adaptive Sequence Submodularity | Code | 0
Differential Privacy in Cooperative Multiagent Planning | Code | 0
AVID: Adapting Video Diffusion Models to World Models | Code | 0
Reinforcement Learning applied to Insurance Portfolio Pursuit | Code | 0
Did we personalize? Assessing personalization by an online reinforcement learning algorithm using resampling | Code | 0
Adaptive teachers for amortized samplers | Code | 0
Discrete-Time Distribution Steering using Monte Carlo Tree Search | Code | 0
Bridging by Word: Image Grounded Vocabulary Construction for Visual Captioning | Code | 0
Dynamical Linear Bandits | Code | 0
Hindsight-DICE: Stable Credit Assignment for Deep Reinforcement Learning | Code | 0
Autoregressive Bandits | Code | 0

No leaderboard results yet.