| LLM-based Agent Simulation for Maternal Health Interventions: Uncertainty Estimation and Decision-focused Evaluation | Mar 25, 2025 | counterfactual, Decision Making | Code Available | 0 | 5 |
| LoFTI: Localization and Factuality Transfer to Indian Locales | Jul 16, 2024 | World Knowledge | Code Available | 0 | 5 |
| LLaVA-VSD: Large Language-and-Vision Assistant for Visual Spatial Description | Aug 9, 2024 | Diversity, Instruction Following | Code Available | 0 | 5 |
| Fact-or-Fair: A Checklist for Behavioral Testing of AI Models on Fairness-Related Queries | Feb 9, 2025 | Diversity, Fairness | Code Available | 0 | 5 |
| DyVo: Dynamic Vocabularies for Learned Sparse Retrieval with Entities | Oct 10, 2024 | Document Ranking, Entity Embeddings | Code Available | 0 | 5 |
| Logic Attention Based Neighborhood Aggregation for Inductive Knowledge Graph Embedding | Nov 4, 2018 | Graph Embedding, Knowledge Graph Completion | Code Available | 0 | 5 |
| Bravo MaRDI: A Wikibase Powered Knowledge Graph on Mathematics | Sep 20, 2023 | World Knowledge | Code Available | 0 | 5 |
| DynaBench: A benchmark dataset for learning dynamical systems from low-resolution data | Jun 9, 2023 | World Knowledge | Code Available | 0 | 5 |
| AntiLeak-Bench: Preventing Data Contamination by Automatically Constructing Benchmarks with Updated Real-World Knowledge | Dec 18, 2024 | Benchmarking, World Knowledge | Code Available | 0 | 5 |
| LitCQD: Multi-Hop Reasoning in Incomplete Knowledge Graphs with Numeric Literals | Apr 28, 2023 | Knowledge Graphs, World Knowledge | Code Available | 0 | 5 |