| Distribution-Free, Risk-Controlling Prediction Sets | Jan 7, 2021 | BIG-bench Machine Learning, Classification | Code Available | 2 | 5 |
| Large AI Models in Health Informatics: Applications, Challenges, and the Future | Mar 21, 2023 | Decision Making, Drug Discovery | Code Available | 2 | 5 |
| MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation | Oct 5, 2023 | Benchmarking, Decision Making | Code Available | 2 | 5 |
| BEVCar: Camera-Radar Fusion for BEV Map and Object Segmentation | Mar 18, 2024 | Decision Making, Scene Segmentation | Code Available | 2 | 5 |
| A Comprehensive Guide to Explainable AI: From Classical Models to LLMs | Dec 1, 2024 | Causal Inference, counterfactual | Code Available | 2 | 5 |
| Context is Key: A Benchmark for Forecasting with Essential Textual Information | Oct 24, 2024 | Decision Making, Time Series | Code Available | 2 | 5 |
| Continuously Learning, Adapting, and Improving: A Dual-Process Approach to Autonomous Driving | May 24, 2024 | Autonomous Driving, Decision Making | Code Available | 2 | 5 |
| Concept Bottleneck Language Models For protein design | Nov 9, 2024 | Decision Making, Drug Discovery | Code Available | 2 | 5 |
| CombatVLA: An Efficient Vision-Language-Action Model for Combat Tasks in 3D Action Role-Playing Games | Mar 12, 2025 | Decision Making, Vision-Language-Action | Code Available | 2 | 5 |
| Cooperate or Collapse: Emergence of Sustainable Cooperation in a Society of LLM Agents | Apr 25, 2024 | Decision Making, Specificity | Code Available | 2 | 5 |