| Multi-expert Prompting Improves Reliability, Safety, and Usefulness of Large Language Models | Nov 1, 2024 | Decision Making, Informativeness | Code Available | 1 |
| Online Intrinsic Rewards for Decision Making Agents from Large Language Model Feedback | Oct 30, 2024 | Decision Making, Language Modeling | Code Available | 1 |
| DiffLight: A Partial Rewards Conditioned Diffusion Model for Traffic Signal Control with Missing Data | Oct 30, 2024 | Decision Making, Imputation | Code Available | 1 |
| Toward Conditional Distribution Calibration in Survival Prediction | Oct 27, 2024 | Conformal Prediction, Decision Making | Code Available | 1 |
| ROCKET-1: Mastering Open-World Interaction with Visual-Temporal Context Prompting | Oct 23, 2024 | Decision Making, Minecraft | Code Available | 1 |
| Reflection-Bench: probing AI intelligence with reflection | Oct 21, 2024 | counterfactual, Decision Making | Code Available | 1 |
| A Comprehensive Evaluation of Cognitive Biases in LLMs | Oct 20, 2024 | Decision Making | Code Available | 1 |
| MobA: Multifaceted Memory-Enhanced Adaptive Planning for Efficient Mobile Task Automation | Oct 17, 2024 | Decision Making, Language Modeling | Code Available | 1 |
| Web Agents with World Models: Learning and Leveraging Environment Dynamics in Web Navigation | Oct 17, 2024 | Decision Making | Code Available | 1 |
| Sliding Puzzles Gym: A Scalable Benchmark for State Representation in Visual Reinforcement Learning | Oct 17, 2024 | Decision Making, Reinforcement Learning (RL) | Code Available | 1 |