| Benchmarking LLMs for Political Science: A United Nations Perspective | Feb 19, 2025 | Benchmarking, Decision Making | Code Available | 1 | 5 |
| Benchmarking saliency methods for chest X-ray interpretation | Oct 10, 2022 | Benchmarking, Decision Making | Code Available | 1 | 5 |
| An Introduction to Deep Reinforcement Learning | Nov 30, 2018 | BIG-bench Machine Learning, Decision Making | Code Available | 1 | 5 |
| BetaZero: Belief-State Planning for Long-Horizon POMDPs using Learned Approximations | May 31, 2023 | Autonomous Driving, Decision Making | Code Available | 1 | 5 |
| Digital Transformation in the Water Distribution System based on the Digital Twins Concept | Dec 9, 2024 | Decision Making, Scheduling | Code Available | 1 | 5 |
| DisastIR: A Comprehensive Information Retrieval Benchmark for Disaster Management | May 20, 2025 | Decision Making, Information Retrieval | Code Available | 1 | 5 |
| DiffSTG: Probabilistic Spatio-Temporal Graph Forecasting with Denoising Diffusion Models | Jan 31, 2023 | Decision Making, Denoising | Code Available | 1 | 5 |
| Beyond ELBOs: A Large-Scale Evaluation of Variational Methods for Sampling | Jun 11, 2024 | Decision Making, Variational Inference | Code Available | 1 | 5 |
| Fraud-R1: A Multi-Round Benchmark for Assessing the Robustness of LLM Against Augmented Fraud and Phishing Inducements | Feb 18, 2025 | Decision Making, Fraud Detection | Code Available | 1 | 5 |
| Diffusion-Based Electrocardiography Noise Quantification via Anomaly Detection | Jun 13, 2025 | Anomaly Detection, Decision Making | Code Available | 1 | 5 |