| BLADE: Benchmarking Language Model Agents for Data-Driven Science | Aug 19, 2024 | Benchmarking, Decision Making | Code Available | 1 |
| Flooding Spread of Manipulated Knowledge in LLM-Based Multi-Agent Communities | Jul 10, 2024 | counterfactual, Fact Checking | Code Available | 1 |
| Reverse Image Retrieval Cues Parametric Memory in Multimodal LLMs | May 29, 2024 | Image Retrieval, Question Answering | Code Available | 1 |
| Large Scale Knowledge Washing | May 26, 2024 | Decoder, Memorization | Code Available | 1 |
| Everything is Editable: Extend Knowledge Editing to Unstructured Data in Large Language Models | May 24, 2024 | knowledge editing, World Knowledge | Code Available | 1 |
| Elements of World Knowledge (EWOK): A cognition-inspired framework for evaluating basic world knowledge in language models | May 15, 2024 | AI Agent, World Knowledge | Code Available | 1 |
| PAC-Bayesian Generalization Bounds for Knowledge Graph Representation Learning | May 10, 2024 | Decoder, Generalization Bounds | Code Available | 1 |
| Cross-Care: Assessing the Healthcare Implications of Pre-training Data on Language Model Bias | May 9, 2024 | Data Visualization, Language Modeling | Code Available | 1 |
| LEARN: Knowledge Adaptation from Large Language Model to Recommendation for Practical Industrial Application | May 7, 2024 | Collaborative Filtering, Language Modeling | Code Available | 1 |
| A User-Centric Multi-Intent Benchmark for Evaluating Large Language Models | Apr 22, 2024 | Benchmarking, World Knowledge | Code Available | 1 |