| BLADE: Benchmarking Language Model Agents for Data-Driven Science | Aug 19, 2024 | Benchmarking, Decision Making | Code Available | 1 | 5 |
| Differentially Private Federated Knowledge Graphs Embedding | May 17, 2021 | Graph Embedding, Knowledge Graph Embedding | Code Available | 1 | 5 |
| MEIM: Multi-partition Embedding Interaction Beyond Block Term Format for Efficient and Expressive Link Prediction | Sep 30, 2022 | Graph Embedding, Knowledge Graph Embedding | Code Available | 1 | 5 |
| Bring Your Own KG: Self-Supervised Program Synthesis for Zero-Shot KGQA | Nov 14, 2023 | In-Context Learning, Program Synthesis | Code Available | 1 | 5 |
| LowFER: Low-rank Bilinear Pooling for Link Prediction | Aug 25, 2020 | Knowledge Graph Completion, Knowledge Graphs | Code Available | 1 | 5 |
| F-ViTA: Foundation Model Guided Visible to Thermal Translation | Apr 3, 2025 | Scene Understanding, Style Transfer | Code Available | 1 | 5 |
| An Automatic Graph Construction Framework based on Large Language Models for Recommendation | Dec 24, 2024 | graph construction, Quantization | Code Available | 1 | 5 |
| Machine Translation Meta Evaluation through Translation Accuracy Challenge Sets | Jan 29, 2024 | Benchmarking, Machine Translation | Code Available | 1 | 5 |
| Is Bigger and Deeper Always Better? Probing LLaMA Across Scales and Layers | Dec 7, 2023 | Math, Multiple-choice | Code Available | 1 | 5 |
| Beyond Generic: Enhancing Image Captioning with Real-World Knowledge using Vision-Language Pre-Training Model | Aug 2, 2023 | Hallucination, Image Captioning | Code Available | 1 | 5 |
| LLM Embeddings Improve Test-time Adaptation to Tabular Y\|X-Shifts | Oct 9, 2024 | Test-time Adaptation, World Knowledge | Code Available | 1 | 5 |
| Head-to-Tail: How Knowledgeable are Large Language Models (LLMs)? A.K.A. Will LLMs Replace Knowledge Graphs? | Aug 20, 2023 | Knowledge Graphs, World Knowledge | Code Available | 1 | 5 |
| Meta-Learning Online Adaptation of Language Models | May 24, 2023 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| Can LLMs' Tuning Methods Work in Medical Multimodal Domain? | Mar 11, 2024 | Transfer Learning, World Knowledge | Code Available | 1 | 5 |
| Beyond Factuality: A Comprehensive Evaluation of Large Language Models as Knowledge Generators | Oct 11, 2023 | Information Retrieval, Informativeness | Code Available | 1 | 5 |
| ASER: A Large-scale Eventuality Knowledge Graph | May 1, 2019 | Knowledge Graphs, World Knowledge | Code Available | 1 | 5 |
| Beyond Embeddings: The Promise of Visual Table in Visual Reasoning | Mar 27, 2024 | Representation Learning, Visual Question Answering | Code Available | 1 | 5 |
| Carpe Diem: On the Evaluation of World Knowledge in Lifelong Language Models | Nov 14, 2023 | Continual Learning, Question Answering | Code Available | 1 | 5 |
| Adapting to Non-Stationary Environments: Multi-Armed Bandit Enhanced Retrieval-Augmented Generation on Knowledge Graphs | Dec 10, 2024 | Knowledge Graphs, RAG | Code Available | 1 | 5 |
| Exploring the Potential of Large Foundation Models for Open-Vocabulary HOI Detection | Apr 9, 2024 | Human-Object Interaction Detection, World Knowledge | Code Available | 1 | 5 |
| Chain-of-Skills: A Configurable Model for Open-domain Question Answering | May 4, 2023 | Open-Domain Question Answering, Question Answering | Code Available | 1 | 5 |
| I Don't Know: Explicit Modeling of Uncertainty with an [IDK] Token | Dec 9, 2024 | World Knowledge | Code Available | 1 | 5 |
| Better Together: Enhancing Generative Knowledge Graph Completion with Language Models and Neighborhood Information | Nov 2, 2023 | Imputation, Knowledge Graph Completion | Code Available | 1 | 5 |
| Analyzing Knowledge Graph Embedding Methods from a Multi-Embedding Interaction Perspective | Mar 27, 2019 | Graph Embedding, Knowledge Graph Embedding | Code Available | 1 | 5 |
| LLaRA: Large Language-Recommendation Assistant | Dec 5, 2023 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| Lenna: Language Enhanced Reasoning Detection Assistant | Dec 5, 2023 | World Knowledge | Code Available | 1 | 5 |
| Exploiting Diffusion Prior for Real-World Image Dehazing with Unpaired Training | Mar 19, 2025 | Image Dehazing, World Knowledge | Code Available | 1 | 5 |
| BEAR: A Unified Framework for Evaluating Relational Knowledge in Causal and Masked Language Models | Apr 5, 2024 | Factual probe, General Knowledge | Code Available | 1 | 5 |
| Explaining Question Answering Models through Text Generation | Apr 12, 2020 | Question Answering, Text Generation | Code Available | 1 | 5 |
| LiveXiv -- A Multi-Modal Live Benchmark Based on Arxiv Papers Content | Oct 14, 2024 | Visual Question Answering (VQA), World Knowledge | Code Available | 1 | 5 |
| Large Scale Knowledge Washing | May 26, 2024 | Decoder, Memorization | Code Available | 1 | 5 |
| Large Language Models Only Pass Primary School Exams in Indonesia: A Comprehensive Test on IndoMMLU | Oct 7, 2023 | Multi-task Language Understanding, World Knowledge | Code Available | 1 | 5 |
| Large-Scale Relation Learning for Question Answering over Knowledge Bases with Pre-trained Language Models | Nov 1, 2021 | Question Answering, Relation | Code Available | 1 | 5 |
| Enabling Intelligent Interactions between an Agent and an LLM: A Reinforcement Learning Approach | Jun 6, 2023 | Decision Making, Sequential Decision Making | Code Available | 1 | 5 |
| Lbl2Vec: An Embedding-Based Approach for Unsupervised Document Retrieval on Predefined Topics | Oct 12, 2022 | Document Classification, Retrieval | Code Available | 1 | 5 |
| Elephants Never Forget: Memorization and Learning of Tabular Data in Large Language Models | Apr 9, 2024 | Few-Shot Learning, Language Modelling | Code Available | 1 | 5 |
| Language Guided Visual Question Answering: Elevate Your Multimodal Language Model Using Knowledge-Enriched Prompts | Oct 31, 2023 | Image Captioning, Language Modeling | Code Available | 1 | 5 |
| Language Models as Knowledge Bases: On Entity Representations, Storage Capacity, and Paraphrased Queries | Aug 20, 2020 | World Knowledge | Code Available | 1 | 5 |
| CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge | Nov 2, 2018 | Common Sense Reasoning, Multiple-choice | Code Available | 1 | 5 |
| Common Sense or World Knowledge? Investigating Adapter-Based Knowledge Injection into Pretrained Transformers | May 24, 2020 | Common Sense Reasoning, World Knowledge | Code Available | 1 | 5 |
| Common Sense Enhanced Knowledge-based Recommendation with Large Language Model | Mar 27, 2024 | Common Sense Reasoning, Knowledge Graphs | Code Available | 1 | 5 |
| A User-Centric Multi-Intent Benchmark for Evaluating Large Language Models | Apr 22, 2024 | Benchmarking, World Knowledge | Code Available | 1 | 5 |
| Diversify and Conquer: Diversity-Centric Data Selection with Iterative Refinement | Sep 17, 2024 | Active Learning, Diversity | Code Available | 1 | 5 |
| A Unified Encoder-Decoder Framework with Entity Memory | Oct 7, 2022 | Decoder, Question Answering | Code Available | 1 | 5 |
| Combo of Thinking and Observing for Outside-Knowledge VQA | May 10, 2023 | Decoder, Question Answering | Code Available | 1 | 5 |
| ACES: Translation Accuracy Challenge Sets for Evaluating Machine Translation Metrics | Oct 27, 2022 | Machine Translation, Translation | Code Available | 1 | 5 |
| KoLA: Carefully Benchmarking World Knowledge of Large Language Models | Jun 15, 2023 | Benchmarking, Hallucination | Code Available | 1 | 5 |
| Is ChatGPT a Good Recommender? A Preliminary Study | Apr 20, 2023 | Recommendation Systems, World Knowledge | Code Available | 1 | 5 |
| Elements of World Knowledge (EWOK): A cognition-inspired framework for evaluating basic world knowledge in language models | May 15, 2024 | AI Agent, World Knowledge | Code Available | 1 | 5 |
| KELM: Knowledge Enhanced Pre-Trained Language Representations with Message Passing on Hierarchical Relational Graphs | Sep 9, 2021 | Common Sense Reasoning, Language Modelling | Code Available | 1 | 5 |