| On the Role of Long-tail Knowledge in Retrieval Augmented Large Language Models | Jun 24, 2024 | RAGRetrieval | —Unverified | 0 |
| LangSuitE: Planning, Controlling and Interacting with Large Language Models in Embodied Text Environments | Jun 24, 2024 | World Knowledge | CodeCode Available | 2 |
| OCALM: Object-Centric Assessment with Language Models | Jun 24, 2024 | ObjectReinforcement Learning (RL) | —Unverified | 0 |
| What Teaches Robots to Walk, Teaches Them to Trade too -- Regime Adaptive Execution using Informed Data and LLMs | Jun 20, 2024 | reinforcement-learningReinforcement Learning | —Unverified | 0 |
| Locating and Extracting Relational Concepts in Large Language Models | Jun 19, 2024 | World Knowledge | CodeCode Available | 0 |
| WikiContradict: A Benchmark for Evaluating LLMs on Real-World Knowledge Conflicts from Wikipedia | Jun 19, 2024 | Language ModellingRAG | —Unverified | 0 |
| Benchmarking Multi-Image Understanding in Vision and Language Models: Perception, Knowledge, Reasoning, and Multi-Hop Reasoning | Jun 18, 2024 | BenchmarkingWorld Knowledge | CodeCode Available | 0 |
| Are Large Language Models True Healthcare Jacks-of-All-Trades? Benchmarking Across Health Professions Beyond Physician Exams | Jun 17, 2024 | AllBenchmarking | CodeCode Available | 0 |
| A Systematic Analysis of Large Language Models as Soft Reasoners: The Case of Syllogistic Inferences | Jun 17, 2024 | In-Context Learningvalid | CodeCode Available | 0 |
| RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models | Jun 16, 2024 | Adversarial AttackBenchmarking | CodeCode Available | 2 |