| Explainable Claim Verification via Knowledge-Grounded Reasoning with Large Language Models | Oct 8, 2023 | Claim Verification, Decision Making | Code Available | 1 |
| AvalonBench: Evaluating LLMs Playing the Game of Avalon | Oct 8, 2023 | Decision Making | Code Available | 1 |
| Deep Learning for Two-Stage Robust Integer Optimization | Oct 6, 2023 | Decision Making, Deep Learning | Code Available | 1 |
| Beyond Uniform Sampling: Offline Reinforcement Learning with Imbalanced Datasets | Oct 6, 2023 | D4RL, Decision Making | Code Available | 1 |
| Large Language Model Cascades with Mixture of Thoughts Representations for Cost-efficient Reasoning | Oct 4, 2023 | Decision Making, Language Modeling | Code Available | 1 |
| MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use | Oct 4, 2023 | Decision Making | Code Available | 1 |
| Trainable Noise Model as an XAI evaluation method: application on Sobol for remote sensing image segmentation | Oct 3, 2023 | Autonomous Driving, Decision Making | Code Available | 1 |
| Towards Robust Fidelity for Evaluating Explainability of Graph Neural Networks | Oct 3, 2023 | Decision Making | Code Available | 1 |
| Talk2BEV: Language-enhanced Bird's-eye View Maps for Autonomous Driving | Oct 3, 2023 | Autonomous Driving, Decision Making | Code Available | 1 |
| Mini-BEHAVIOR: A Procedurally Generated Benchmark for Long-horizon Decision-Making in Embodied AI | Oct 3, 2023 | Decision Making | Code Available | 1 |