| EHRNoteQA: An LLM Benchmark for Real-World Clinical Practice Using Discharge Summaries | Feb 25, 2024 | Decision Making, Question Answering | Code Available | 1 |
| Reflect-RL: Two-Player Online RL Fine-Tuning for LMs | Feb 20, 2024 | Decision Making, Reinforcement Learning (RL) | Code Available | 1 |
| XRL-Bench: A Benchmark for Evaluating and Comparing Explainable Reinforcement Learning Techniques | Feb 20, 2024 | Decision Making, Reinforcement Learning (RL) | Code Available | 1 |
| Dynamic planning in hierarchical active inference | Feb 18, 2024 | Decision Making | Code Available | 1 |
| Explaining generative diffusion models via visual analysis for interpretable decision-making process | Feb 16, 2024 | Decision Making, Denoising | Code Available | 1 |
| PRISE: LLM-Style Sequence Compression for Learning Temporal Action Abstractions in Control | Feb 16, 2024 | continuous-control, Continuous Control | Code Available | 1 |
| Uncertainty Quantification for Forward and Inverse Problems of PDEs via Latent Global Evolution | Feb 13, 2024 | Decision Making, Deep Learning | Code Available | 1 |
| Addressing cognitive bias in medical language models | Feb 12, 2024 | Decision Making | Code Available | 1 |
| TELLER: A Trustworthy Framework for Explainable, Generalizable and Controllable Fake News Detection | Feb 12, 2024 | Decision Making, Fake News Detection | Code Available | 1 |
| A RAG-Based Multi-Agent LLM System for Natural Hazard Resilience and Adaptation | Feb 12, 2024 | Decision Making, Language Modeling | Code Available | 1 |
| Self-Calibrating Conformal Prediction | Feb 11, 2024 | Binary Classification, Conformal Prediction | Code Available | 1 |
| Entropy-Regularized Token-Level Policy Optimization for Language Agent Reinforcement | Feb 9, 2024 | Code Generation, Decision Making | Code Available | 1 |
| Premier-TACO is a Few-Shot Policy Learner: Pretraining Multitask Representation via Temporal Action-Driven Contrastive Loss | Feb 9, 2024 | Computational Efficiency, continuous-control | Code Available | 1 |
| Sym-Q: Adaptive Symbolic Regression via Sequential Decision-Making | Feb 7, 2024 | Decision Making, regression | Code Available | 1 |
| Conformal Convolution and Monte Carlo Meta-learners for Predictive Inference of Individual Treatment Effects | Feb 7, 2024 | Decision Making, Marketing | Code Available | 1 |
| Measuring Implicit Bias in Explicitly Unbiased Large Language Models | Feb 6, 2024 | Decision Making, Diagnostic | Code Available | 1 |
| Skill Set Optimization: Reinforcing Language Model Behavior via Transferable Skills | Feb 5, 2024 | Decision Making, Language Modeling | Code Available | 1 |
| Deep hybrid models: infer and plan in a dynamic world | Feb 1, 2024 | Decision Making, reinforcement-learning | Code Available | 1 |
| LLM Voting: Human Choices and AI Collective Decision Making | Jan 31, 2024 | Decision Making, Diversity | Code Available | 1 |
| Layered and Staged Monte Carlo Tree Search for SMT Strategy Synthesis | Jan 30, 2024 | Decision Making, Efficient Exploration | Code Available | 1 |
| Prompting Large Language Models for Zero-Shot Clinical Prediction with Structured Longitudinal Electronic Health Record Data | Jan 25, 2024 | Decision Making, In-Context Learning | Code Available | 1 |
| HAZARD Challenge: Embodied Decision Making in Dynamically Changing Environments | Jan 23, 2024 | Common Sense Reasoning, Decision Making | Code Available | 1 |
| Distributional Counterfactual Explanations With Optimal Transport | Jan 23, 2024 | counterfactual, Counterfactual Explanation | Code Available | 1 |
| Word-Level ASR Quality Estimation for Efficient Corpus Sampling and Post-Editing through Analyzing Attentions of a Reference-Free Metric | Jan 20, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| ORGANA: A Robotic Assistant for Automated Chemistry Experimentation and Characterization | Jan 13, 2024 | Decision Making, Scheduling | Code Available | 1 |