| Large Language Models Only Pass Primary School Exams in Indonesia: A Comprehensive Test on IndoMMLU | Oct 7, 2023 | Multi-task Language Understanding, World Knowledge | Code Available | 1 |
| Towards End-to-End Embodied Decision Making via Multi-modal Large Language Model: Explorations with GPT4-Vision and Beyond | Oct 3, 2023 | Decision Making, Language Modeling | Code Available | 1 |
| FELM: Benchmarking Factuality Evaluation of Large Language Models | Oct 1, 2023 | Benchmarking, Math | Code Available | 1 |
| Corex: Pushing the Boundaries of Complex Reasoning through Multi-Model Collaboration | Sep 30, 2023 | World Knowledge | Code Available | 1 |
| Physics of Language Models: Part 3.1, Knowledge Storage and Extraction | Sep 25, 2023 | Question Answering, Sentence | Code Available | 1 |
| Retrieve-Rewrite-Answer: A KG-to-Text Enhanced LLMs Framework for Knowledge Graph Question Answering | Sep 20, 2023 | Graph Question Answering, Language Modeling | Code Available | 1 |
| Do PLMs Know and Understand Ontological Knowledge? | Sep 12, 2023 | Logical Reasoning, Memorization | Code Available | 1 |
| Head-to-Tail: How Knowledgeable are Large Language Models (LLMs)? A.K.A. Will LLMs Replace Knowledge Graphs? | Aug 20, 2023 | Knowledge Graphs, World Knowledge | Code Available | 1 |
| Beyond Generic: Enhancing Image Captioning with Real-World Knowledge using Vision-Language Pre-Training Model | Aug 2, 2023 | Hallucination, Image Captioning | Code Available | 1 |
| Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation | Jul 20, 2023 | Open-Domain Question Answering, Question Answering | Code Available | 1 |