| Can language agents be alternatives to PPO? A Preliminary Empirical Study On OpenAI Gym | Dec 6, 2023 | Benchmarking, Decision Making | Code Available | 1 |
| Reason2Drive: Towards Interpretable and Chain-based Reasoning for Autonomous Driving | Dec 6, 2023 | Autonomous Driving, Decision Making | Code Available | 1 |
| RiskBench: A Scenario-based Benchmark for Risk Identification | Dec 4, 2023 | Decision Making | Code Available | 1 |
| MEDPSeg: Hierarchical polymorphic multitask learning for the segmentation of ground-glass opacities, consolidation, and pulmonary structures on computed tomography | Dec 4, 2023 | Anatomy, Computed Tomography (CT) | Code Available | 1 |
| Understanding the (Extra-)Ordinary: Validating Deep Model Decisions with Prototypical Concept-based Explanations | Nov 28, 2023 | Decision Making | Code Available | 1 |
| Utilizing Explainability Techniques for Reinforcement Learning Model Assurance | Nov 27, 2023 | Decision Making, Deep Reinforcement Learning | Code Available | 1 |
| Video-Bench: A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models | Nov 27, 2023 | Decision Making, Question Answering | Code Available | 1 |
| VSViG: Real-time Video-based Seizure Detection via Skeleton-based Spatiotemporal ViG | Nov 24, 2023 | Action Recognition, Decision Making | Code Available | 1 |
| Large Language Model as a Policy Teacher for Training Reinforcement Learning Agents | Nov 22, 2023 | Decision Making, Language Modeling | Code Available | 1 |
| Labeling Neural Representations with Inverse Recognition | Nov 22, 2023 | Decision Making, Segmentation | Code Available | 1 |
| Physical Reasoning and Object Planning for Household Embodied Agents | Nov 22, 2023 | 2k, Decision Making | Code Available | 1 |
| From Classification to Clinical Insights: Towards Analyzing and Reasoning About Mobile and Behavioral Health Data With Large Language Models | Nov 21, 2023 | Decision Making | Code Available | 1 |
| Inherently Interpretable Time Series Classification via Multiple Instance Learning | Nov 16, 2023 | Decision Making, Multiple Instance Learning | Code Available | 1 |
| DocLens: Multi-aspect Fine-grained Evaluation for Medical Text Generation | Nov 16, 2023 | Decision Making, Instruction Following | Code Available | 1 |
| ToolTalk: Evaluating Tool-Usage in a Conversational Setting | Nov 15, 2023 | Decision Making | Code Available | 1 |
| XplainLLM: A Knowledge-Augmented Dataset for Reliable Grounded Explanations in LLMs | Nov 15, 2023 | Decision Making, Decoder | Code Available | 1 |
| A Comprehensive Evaluation of GPT-4V on Knowledge-Intensive Visual Question Answering | Nov 13, 2023 | Decision Making, Explanation Generation | Code Available | 1 |
| Real-Time Machine-Learning-Based Optimization Using Input Convex Long Short-Term Memory Network | Nov 13, 2023 | Chemical Process, Computational Efficiency | Code Available | 1 |
| Benchmarking PtO and PnO Methods in the Predictive Combinatorial Optimization Regime | Nov 13, 2023 | Benchmarking, Combinatorial Optimization | Code Available | 1 |
| MonoProb: Self-Supervised Monocular Depth Estimation with Interpretable Uncertainty | Nov 10, 2023 | Autonomous Vehicles, Decision Making | Code Available | 1 |
| ADaPT: As-Needed Decomposition and Planning with Language Models | Nov 8, 2023 | Decision Making | Code Available | 1 |
| Everything of Thoughts: Defying the Law of Penrose Triangle for Thought Generation | Nov 7, 2023 | Decision Making | Code Available | 1 |
| ALYMPICS: LLM Agents Meet Game Theory -- Exploring Strategic Decision-Making with AI Agents | Nov 6, 2023 | Decision Making, Language Modeling | Code Available | 1 |
| Cal-DETR: Calibrated Detection Transformer | Nov 6, 2023 | Decision Making | Code Available | 1 |
| An algorithmic framework for synthetic cost-aware decision making in molecular design | Nov 3, 2023 | Decision Making, Property Prediction | Code Available | 1 |
| DEFN: Dual-Encoder Fourier Group Harmonics Network for Three-Dimensional Indistinct-Boundary Object Segmentation | Nov 1, 2023 | 3D Reconstruction, Data Augmentation | Code Available | 1 |
| Advances in Embodied Navigation Using Large Language Models: A Survey | Nov 1, 2023 | Decision Making | Code Available | 1 |
| Interpretable Prototype-based Graph Information Bottleneck | Oct 30, 2023 | Decision Making, Prediction | Code Available | 1 |
| Free from Bellman Completeness: Trajectory Stitching via Model-based Return-conditioned Supervised Learning | Oct 30, 2023 | Decision Making, Offline RL | Code Available | 1 |
| Hierarchical Framework for Interpretable and Probabilistic Model-Based Safe Reinforcement Learning | Oct 28, 2023 | Decision Making, Deep Reinforcement Learning | Code Available | 1 |
| EHRXQA: A Multi-Modal Question Answering Dataset for Electronic Health Records with Chest X-ray Images | Oct 28, 2023 | Decision Making, Medical Visual Question Answering | Code Available | 1 |
| Tree Prompting: Efficient Task Adaptation without Fine-Tuning | Oct 21, 2023 | Classification, Decision Making | Code Available | 1 |
| EconAgent: Large Language Model-Empowered Agents for Simulating Macroeconomic Activities | Oct 16, 2023 | Decision Making, Language Modeling | Code Available | 1 |
| Bridging the Novice-Expert Gap via Models of Decision-Making: A Case Study on Remediating Math Mistakes | Oct 16, 2023 | Decision Making, Math | Code Available | 1 |
| Can GPT-4V(ision) Serve Medical Applications? Case Studies on GPT-4V for Multimodal Medical Diagnosis | Oct 15, 2023 | Anatomy, Computed Tomography (CT) | Code Available | 1 |
| On Statistical Learning of Branch and Bound for Vehicle Routing Optimization | Oct 15, 2023 | Decision Making, Graph Attention | Code Available | 1 |
| QACHECK: A Demonstration System for Question-Guided Multi-Hop Fact-Checking | Oct 11, 2023 | Decision Making, Fact Checking | Code Available | 1 |
| Explainable Image Similarity: Integrating Siamese Networks and Grad-CAM | Oct 11, 2023 | counterfactual, Decision Making | Code Available | 1 |
| Linear Latent World Models in Simple Transformers: A Case Study on Othello-GPT | Oct 11, 2023 | Decision Making | Code Available | 1 |
| What If the TV Was Off? Examining Counterfactual Reasoning Abilities of Multi-modal Language Models | Oct 10, 2023 | Benchmarking, Code Generation | Code Available | 1 |
| Explainable Claim Verification via Knowledge-Grounded Reasoning with Large Language Models | Oct 8, 2023 | Claim Verification, Decision Making | Code Available | 1 |
| AvalonBench: Evaluating LLMs Playing the Game of Avalon | Oct 8, 2023 | Decision Making | Code Available | 1 |
| Deep Learning for Two-Stage Robust Integer Optimization | Oct 6, 2023 | Decision Making, Deep Learning | Code Available | 1 |
| Beyond Uniform Sampling: Offline Reinforcement Learning with Imbalanced Datasets | Oct 6, 2023 | D4RL, Decision Making | Code Available | 1 |
| Large Language Model Cascades with Mixture of Thoughts Representations for Cost-efficient Reasoning | Oct 4, 2023 | Decision Making, Language Modeling | Code Available | 1 |
| MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use | Oct 4, 2023 | Decision Making | Code Available | 1 |
| Trainable Noise Model as an XAI evaluation method: application on Sobol for remote sensing image segmentation | Oct 3, 2023 | Autonomous Driving, Decision Making | Code Available | 1 |
| Towards Robust Fidelity for Evaluating Explainability of Graph Neural Networks | Oct 3, 2023 | Decision Making | Code Available | 1 |
| Talk2BEV: Language-enhanced Bird's-eye View Maps for Autonomous Driving | Oct 3, 2023 | Autonomous Driving, Decision Making | Code Available | 1 |
| Mini-BEHAVIOR: A Procedurally Generated Benchmark for Long-horizon Decision-Making in Embodied AI | Oct 3, 2023 | Decision Making | Code Available | 1 |