| EHRNoteQA: An LLM Benchmark for Real-World Clinical Practice Using Discharge Summaries | Feb 25, 2024 | Decision Making, Question Answering | Code Available | 1 |
| XRL-Bench: A Benchmark for Evaluating and Comparing Explainable Reinforcement Learning Techniques | Feb 20, 2024 | Decision Making, Reinforcement Learning (RL) | Code Available | 1 |
| Reflect-RL: Two-Player Online RL Fine-Tuning for LMs | Feb 20, 2024 | Decision Making, Reinforcement Learning (RL) | Code Available | 1 |
| Dynamic planning in hierarchical active inference | Feb 18, 2024 | Decision Making | Code Available | 1 |
| PRISE: LLM-Style Sequence Compression for Learning Temporal Action Abstractions in Control | Feb 16, 2024 | Continuous Control | Code Available | 1 |
| Explaining generative diffusion models via visual analysis for interpretable decision-making process | Feb 16, 2024 | Decision Making, Denoising | Code Available | 1 |
| Uncertainty Quantification for Forward and Inverse Problems of PDEs via Latent Global Evolution | Feb 13, 2024 | Decision Making, Deep Learning | Code Available | 1 |
| A RAG-Based Multi-Agent LLM System for Natural Hazard Resilience and Adaptation | Feb 12, 2024 | Decision Making, Language Modeling | Code Available | 1 |
| Addressing cognitive bias in medical language models | Feb 12, 2024 | Decision Making | Code Available | 1 |
| TELLER: A Trustworthy Framework for Explainable, Generalizable and Controllable Fake News Detection | Feb 12, 2024 | Decision Making, Fake News Detection | Code Available | 1 |