| Tapilot-Crossing: Benchmarking and Evolving LLMs Towards Interactive Data Analysis Agents | Mar 8, 2024 | Benchmarking, Decision Making | Code Available | 1 |
| Human vs. Machine: Behavioral Differences Between Expert Humans and Language Models in Wargame Simulations | Mar 6, 2024 | Decision Making | Code Available | 1 |
| MiKASA: Multi-Key-Anchor & Scene-Aware Transformer for 3D Visual Grounding | Mar 5, 2024 | 3D visual grounding, Decision Making | Code Available | 1 |
| AgentsCourt: Building Judicial Decision-Making Agents with Court Debate Simulation and Legal Knowledge Augmentation | Mar 5, 2024 | Articles, Decision Making | Code Available | 1 |
| ComTraQ-MPC: Meta-Trained DQN-MPC Integration for Trajectory Tracking with Limited Active Localization Updates | Mar 3, 2024 | Decision Making, Model Predictive Control | Code Available | 1 |
| Playing NetHack with LLMs: Potential & Limitations as Zero-Shot Agents | Mar 1, 2024 | Decision Making, Minecraft | Code Available | 1 |
| MemoNav: Working Memory Model for Visual Navigation | Feb 29, 2024 | Decision Making, Graph Attention | Code Available | 1 |
| Large Language Models are Learnable Planners for Long-Term Recommendation | Feb 29, 2024 | Decision Making, Language Modelling | Code Available | 1 |
| Benchmarking Data Science Agents | Feb 27, 2024 | Benchmarking, Code Generation | Code Available | 1 |
| How Can LLM Guide RL? A Value-Based Approach | Feb 25, 2024 | Decision Making, Reinforcement Learning (RL) | Code Available | 1 |