Balancing Act: Prioritization Strategies for LLM-Designed Restless Bandit Rewards | Aug 22, 2024 | Language Modeling, Language Modelling
Unverified | 0 | Multi-agent Multi-armed Bandits with Stochastic Sharable Arm Capacities | Aug 20, 2024 | Multi-Armed Bandits
Unverified | 0 | GINO-Q: Learning an Asymptotically Optimal Index Policy for Restless Multi-armed Bandits | Aug 19, 2024 | Multi-Armed Bandits, Q-Learning
Unverified | 0 | Contextual Bandits for Unbounded Context Distributions | Aug 19, 2024 | Decision Making, Multi-Armed Bandits
Unverified | 0 | Reciprocal Learning | Aug 12, 2024 | Active Learning, Multi-Armed Bandits
Unverified | 0 | Hierarchical Multi-Armed Bandits for the Concurrent Intelligent Tutoring of Concepts and Problems of Varying Difficulty Levels | Aug 10, 2024 | Knowledge Tracing, Multi-Armed Bandits
Code Available | 0 | Mitigating Exposure Bias in Online Learning to Rank Recommendation: A Novel Reward Model for Cascading Bandits | Aug 8, 2024 | Exposure Fairness, Fairness
Code Available | 0 | Combining Diverse Information for Coordinated Action: Stochastic Bandit Algorithms for Heterogeneous Agents | Aug 6, 2024 | Multi-Armed Bandits, Sensitivity
Code Available | 0 | Empathic Responding for Digital Interpersonal Emotion Regulation via Content Recommendation | Aug 5, 2024 | Multi-Armed Bandits
Unverified | 0 | Online Learning for Autonomous Management of Intent-based 6G Networks | Jul 25, 2024 | Efficient Exploration, Management
Unverified | 0 | Identifiable latent bandits: Combining observational data and exploration for personalized healthcare | Jul 23, 2024 | Decision Making, Multi-Armed Bandits
Unverified | 0 | Scalable Exploration via Ensemble++ | Jul 18, 2024 | Computational Efficiency, Decision Making
Code Available | 0 | Satisficing Exploration for Deep Reinforcement Learning | Jul 16, 2024 | Deep Reinforcement Learning, Multi-Armed Bandits
Unverified | 0 | On Speeding Up Language Model Evaluation | Jul 8, 2024 | Language Model Evaluation, Language Modeling
Unverified | 0 | Open Problem: Tight Bounds for Kernelized Multi-Armed Bandits with Bernoulli Rewards | Jul 8, 2024 | Multi-Armed Bandits
Unverified | 0 | Honor Among Bandits: No-Regret Learning for Online Fair Division | Jul 1, 2024 | Fairness, Multi-Armed Bandits
Unverified | 0 | A Contextual Combinatorial Bandit Approach to Negotiation | Jun 30, 2024 | Multi-Armed Bandits
Unverified | 0 | Classical Bandit Algorithms for Entanglement Detection in Parameterized Qubit States | Jun 28, 2024 | Multi-Armed Bandits
Unverified | 0 | Jump Starting Bandits with LLM-Generated Prior Knowledge | Jun 27, 2024 | Multi-Armed Bandits, Recommendation Systems
Code Available | 0 | EduQate: Generating Adaptive Curricula through RMABs in Education Settings | Jun 20, 2024 | Multi-Armed Bandits, Q-Learning
Unverified | 0 | BEACON: Balancing Convenience and Nutrition in Meals With Long-Term Group Recommendations and Reasoning on Multimodal Recipes | Jun 19, 2024 | Multi-Armed Bandits, Nutrition
Unverified | 0 | Towards Bayesian Data Selection | Jun 18, 2024 | Active Learning, Additive models
Unverified | 0 | Improving Reward-Conditioned Policies for Multi-Armed Bandits using Normalized Weight Functions | Jun 16, 2024 | Multi-Armed Bandits, Policy Gradient Methods
Unverified | 0 | Linear Contextual Bandits with Hybrid Payoff: Revisited | Jun 14, 2024 | Diversity, Multi-Armed Bandits
Code Available | 0 | An Adaptive Method for Contextual Stochastic Multi-armed Bandits with Rewards Generated by a Linear Dynamical System | Jun 14, 2024 | Multi-Armed Bandits
Unverified | 0 | Towards Domain Adaptive Neural Contextual Bandits | Jun 13, 2024 | Decision Making, Domain Adaptation
Unverified | 0 | A Federated Online Restless Bandit Framework for Cooperative Resource Allocation | Jun 12, 2024 | Federated Learning, Multi-Armed Bandits
Unverified | 0 | Asymptotically Optimal Regret for Black-Box Predict-then-Optimize | Jun 12, 2024 | Decision Making, Multi-Armed Bandits
Unverified | 0 | Sample Complexity Reduction via Policy Difference Estimation in Tabular Reinforcement Learning | Jun 11, 2024 | Multi-Armed Bandits, Reinforcement Learning (RL)
Unverified | 0 | A conversion theorem and minimax optimality for continuum contextual bandits | Jun 9, 2024 | Multi-Armed Bandits
Unverified | 0 | Data-Driven Upper Confidence Bounds with Near-Optimal Regret for Heavy-Tailed Bandits | Jun 9, 2024 | Decision Making, Multi-Armed Bandits
Unverified | 0 | Adaptively Learning to Select-Rank in Online Platforms | Jun 7, 2024 | Multi-Armed Bandits, Thompson Sampling
Unverified | 0 | Optimal Batched Linear Bandits | Jun 6, 2024 | Computational Efficiency, Multi-Armed Bandits
Code Available | 0 | Combinatorial Multivariant Multi-Armed Bandits with Applications to Episodic Reinforcement Learning and Beyond | Jun 3, 2024 | Multi-Armed Bandits, Reinforcement Learning (RL)
Unverified | 0 | Global Rewards in Restless Multi-Armed Bandits | Jun 2, 2024 | Multi-Armed Bandits
Unverified | 0 | A Batch Sequential Halving Algorithm without Performance Degradation | Jun 1, 2024 | Computational Efficiency, Multi-Armed Bandits
Unverified | 0 | Strategic Linear Contextual Bandits | Jun 1, 2024 | Multi-Armed Bandits, Recommendation Systems
Unverified | 0 | No-Regret Learning for Fair Multi-Agent Social Welfare Optimization | May 31, 2024 | Fairness, Multi-Armed Bandits
Unverified | 0 | Understanding Memory-Regret Trade-Off for Streaming Stochastic Multi-Armed Bandits | May 30, 2024 | Multi-Armed Bandits
Unverified | 0 | Multi-Armed Bandits with Network Interference | May 28, 2024 | Multi-Armed Bandits
Code Available | 0 | Causal Contextual Bandits with Adaptive Context | May 28, 2024 | Multi-Armed Bandits
Code Available | 0 | Offline Oracle-Efficient Learning for Contextual MDPs via Layerwise Exploration-Exploitation Tradeoff | May 28, 2024 | Density Estimation, Multi-Armed Bandits
Unverified | 0 | Optimizing Sharpe Ratio: Risk-Adjusted Decision-Making in Multi-Armed Bandits | May 28, 2024 | Decision Making, Management
Unverified | 0 | Multi-Player Approaches for Dueling Bandits | May 25, 2024 | Multi-Armed Bandits
Unverified | 0 | Indexed Minimum Empirical Divergence-Based Algorithms for Linear Bandits | May 24, 2024 | Multi-Armed Bandits, Thompson Sampling
Unverified | 0 | Budgeted Recommendation with Delayed Feedback | May 19, 2024 | Decision Making, Multi-Armed Bandits
Unverified | 0 | No-Regret is not enough! Bandits with General Constraints through Adaptive Regret Minimization | May 10, 2024 | Multi-Armed Bandits
Unverified | 0 | Imprecise Multi-Armed Bandits | May 9, 2024 | Multi-Armed Bandits
Unverified | 0 | Federated Combinatorial Multi-Agent Multi-Armed Bandits | May 9, 2024 | Combinatorial Optimization, Data Summarization
Unverified | 0 | Optimal Baseline Corrections for Off-Policy Contextual Bandits | May 9, 2024 | Decision Making, Multi-Armed Bandits
Code Available | 0