| Lenna: Language Enhanced Reasoning Detection Assistant | Dec 5, 2023 | World Knowledge | Code Available | 1 | 5 |
| Exploiting Diffusion Prior for Real-World Image Dehazing with Unpaired Training | Mar 19, 2025 | Image Dehazing, World Knowledge | Code Available | 1 | 5 |
| BEAR: A Unified Framework for Evaluating Relational Knowledge in Causal and Masked Language Models | Apr 5, 2024 | Factual probe, General Knowledge | Code Available | 1 | 5 |
| Explaining Question Answering Models through Text Generation | Apr 12, 2020 | Question Answering, Text Generation | Code Available | 1 | 5 |
| LiveXiv -- A Multi-Modal Live Benchmark Based on Arxiv Papers Content | Oct 14, 2024 | Visual Question Answering (VQA), World Knowledge | Code Available | 1 | 5 |
| Large Scale Knowledge Washing | May 26, 2024 | Decoder, Memorization | Code Available | 1 | 5 |
| Large Language Models Only Pass Primary School Exams in Indonesia: A Comprehensive Test on IndoMMLU | Oct 7, 2023 | Multi-task Language Understanding, World Knowledge | Code Available | 1 | 5 |
| Large-Scale Relation Learning for Question Answering over Knowledge Bases with Pre-trained Language Models | Nov 1, 2021 | Question Answering, Relation | Code Available | 1 | 5 |
| Enabling Intelligent Interactions between an Agent and an LLM: A Reinforcement Learning Approach | Jun 6, 2023 | Decision Making, Sequential Decision Making | Code Available | 1 | 5 |
| Lbl2Vec: An Embedding-Based Approach for Unsupervised Document Retrieval on Predefined Topics | Oct 12, 2022 | Document Classification, Retrieval | Code Available | 1 | 5 |
| Elephants Never Forget: Memorization and Learning of Tabular Data in Large Language Models | Apr 9, 2024 | Few-Shot Learning, Language Modelling | Code Available | 1 | 5 |
| Language Guided Visual Question Answering: Elevate Your Multimodal Language Model Using Knowledge-Enriched Prompts | Oct 31, 2023 | Image Captioning, Language Modeling | Code Available | 1 | 5 |
| Language Models as Knowledge Bases: On Entity Representations, Storage Capacity, and Paraphrased Queries | Aug 20, 2020 | World Knowledge | Code Available | 1 | 5 |
| CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge | Nov 2, 2018 | Common Sense Reasoning, Multiple-choice | Code Available | 1 | 5 |
| Common Sense or World Knowledge? Investigating Adapter-Based Knowledge Injection into Pretrained Transformers | May 24, 2020 | Common Sense Reasoning, World Knowledge | Code Available | 1 | 5 |
| Common Sense Enhanced Knowledge-based Recommendation with Large Language Model | Mar 27, 2024 | Common Sense Reasoning, Knowledge Graphs | Code Available | 1 | 5 |
| A User-Centric Multi-Intent Benchmark for Evaluating Large Language Models | Apr 22, 2024 | Benchmarking, World Knowledge | Code Available | 1 | 5 |
| Diversify and Conquer: Diversity-Centric Data Selection with Iterative Refinement | Sep 17, 2024 | Active Learning, Diversity | Code Available | 1 | 5 |
| A Unified Encoder-Decoder Framework with Entity Memory | Oct 7, 2022 | Decoder, Question Answering | Code Available | 1 | 5 |
| Combo of Thinking and Observing for Outside-Knowledge VQA | May 10, 2023 | Decoder, Question Answering | Code Available | 1 | 5 |
| ACES: Translation Accuracy Challenge Sets for Evaluating Machine Translation Metrics | Oct 27, 2022 | Machine Translation, Translation | Code Available | 1 | 5 |
| KoLA: Carefully Benchmarking World Knowledge of Large Language Models | Jun 15, 2023 | Benchmarking, Hallucination | Code Available | 1 | 5 |
| Is ChatGPT a Good Recommender? A Preliminary Study | Apr 20, 2023 | Recommendation Systems, World Knowledge | Code Available | 1 | 5 |
| Elements of World Knowledge (EWOK): A cognition-inspired framework for evaluating basic world knowledge in language models | May 15, 2024 | AI Agent, World Knowledge | Code Available | 1 | 5 |
| KELM: Knowledge Enhanced Pre-Trained Language Representations with Message Passing on Hierarchical Relational Graphs | Sep 9, 2021 | Common Sense Reasoning, Language Modelling | Code Available | 1 | 5 |