| "Give Me an Example Like This": Episodic Active Reinforcement Learning from Demonstrations | Jun 5, 2024 | Active Learning, Reinforcement Learning (RL) | Code Available | 0 |
| Sound Heuristic Search Value Iteration for Undiscounted POMDPs with Reachability Objectives | Jun 5, 2024 | Decision Making, Efficient Exploration | Code Available | 0 |
| Rectifying Reinforcement Learning for Reward Matching | Jun 4, 2024 | Decision Making, reinforcement-learning | Unverified | 0 |
| Re-ReST: Reflection-Reinforced Self-Training for Language Agents | Jun 3, 2024 | Code Generation, Image Generation | Code Available | 1 |
| Combining Experimental and Historical Data for Policy Evaluation | Jun 1, 2024 | Data Integration, Decision Making | Code Available | 0 |
| Reward Machines for Deep RL in Noisy and Uncertain Environments | May 31, 2024 | counterfactual, Decision Making | Code Available | 0 |
| Pursuing Overall Welfare in Federated Learning through Sequential Decision Making | May 31, 2024 | Decision Making, Fairness | Code Available | 1 |
| Low-rank finetuning for LLMs: A fairness perspective | May 28, 2024 | Decision Making, Fairness | Unverified | 0 |
| OPERA: Automatic Offline Policy Evaluation with Re-weighted Aggregates of Multiple Estimators | May 27, 2024 | Decision Making, Offline RL | Unverified | 0 |
| Rethinking Transformers in Solving POMDPs | May 27, 2024 | Decision Making, Reinforcement Learning (RL) | Code Available | 1 |