| Morph Call: Probing Morphosyntactic Content of Multilingual Transformers | Apr 26, 2021 | Common Sense Reasoning, MORPH | Code Available | 0 |
| Who Relies More on World Knowledge and Bias for Syntactic Ambiguity Resolution: Humans or LLMs? | Mar 13, 2025 | Navigate, World Knowledge | Code Available | 0 |
| BiasKG: Adversarial Knowledge Graphs to Induce Bias in Large Language Models | May 8, 2024 | Knowledge Graphs, Language Modeling | Code Available | 0 |
| Does Commonsense help in detecting Sarcasm? | Sep 17, 2021 | Language Modeling, Language Modelling | Code Available | 0 |
| AKEW: Assessing Knowledge Editing in the Wild | Feb 29, 2024 | Articles, counterfactual | Code Available | 0 |
| Walk-and-Relate: A Random-Walk-based Algorithm for Representation Learning on Sparse Knowledge Graphs | Sep 19, 2022 | Knowledge Graphs, Representation Learning | Code Available | 0 |
| Arrows are the Verbs of Diagrams | Aug 1, 2018 | BIG-bench Machine Learning, World Knowledge | Code Available | 0 |
| Multi-Preference Lambda-weighted Listwise DPO for Dynamic Preference Alignment | Jun 24, 2025 | Informativeness, reinforcement-learning | Code Available | 0 |
| TimeCausality: Evaluating the Causal Ability in Time Dimension for Vision Language Models | May 21, 2025 | Human Aging, Question Answering | Code Available | 0 |
| LLaVA-VSD: Large Language-and-Vision Assistant for Visual Spatial Description | Aug 9, 2024 | Diversity, Instruction Following | Code Available | 0 |
| My Teacher Thinks The World Is Flat! Interpreting Automatic Essay Scoring Mechanism | Dec 27, 2020 | Common Sense Reasoning, Natural Language Understanding | Code Available | 0 |
| LLMTreeRec: Unleashing the Power of Large Language Models for Cold-Start Recommendations | Mar 31, 2024 | Recommendation Systems, Re-Ranking | Code Available | 0 |
| Benchmarking Spatiotemporal Reasoning in LLMs and Reasoning Models: Capabilities and Challenges | May 16, 2025 | Benchmarking, State Estimation | Code Available | 0 |
| NLITrans at SemEval-2018 Task 12: Transfer of Semantic Knowledge for Argument Comprehension | Apr 23, 2018 | Position, Sentence | Code Available | 0 |
| CoRTEx: Contrastive Learning for Representing Terms via Explanations with Applications on Constructing Biomedical Knowledge Graphs | Dec 13, 2023 | Clustering, Contrastive Learning | Code Available | 0 |
| FLAME: Self-Supervised Low-Resource Taxonomy Expansion using Large Language Models | Feb 21, 2024 | Recommendation Systems, Taxonomy Expansion | Code Available | 0 |
| Log Probabilities Are a Reliable Estimate of Semantic Plausibility in Base and Instruction-Tuned Language Models | Mar 21, 2024 | Sentence, World Knowledge | Code Available | 0 |
| LitCQD: Multi-Hop Reasoning in Incomplete Knowledge Graphs with Numeric Literals | Apr 28, 2023 | Knowledge Graphs, World Knowledge | Code Available | 0 |
| ObjCAViT: Improving Monocular Depth Estimation Using Natural Language Models And Image-Object Cross-Attention | Nov 30, 2022 | Depth Estimation, Image-to-Image Translation | Code Available | 0 |
| Are Large Language Models True Healthcare Jacks-of-All-Trades? Benchmarking Across Health Professions Beyond Physician Exams | Jun 17, 2024 | All, Benchmarking | Code Available | 0 |
| Large Language Models Need Consultants for Reasoning: Becoming an Expert in a Complex Human System Through Behavior Simulation | Mar 27, 2024 | Common Sense Reasoning, World Knowledge | Code Available | 0 |
| Scaling Autoregressive Models for Content-Rich Text-to-Image Generation | Jun 22, 2022 | Decoder, Image Generation | Code Available | 0 |
| Finding Motifs in Knowledge Graphs using Compression | Apr 16, 2021 | Knowledge Graphs, World Knowledge | Code Available | 0 |
| Advancing and Benchmarking Personalized Tool Invocation for LLMs | May 7, 2025 | Benchmarking, World Knowledge | Code Available | 0 |
| Video Summarization: Towards Entity-Aware Captions | Dec 1, 2023 | Image Captioning, Video Captioning | Code Available | 0 |
| Language models show human-like content effects on reasoning tasks | Jul 14, 2022 | Language Modelling, Logical Reasoning | Code Available | 0 |
| Scope Ambiguities in Large Language Models | Apr 5, 2024 | World Knowledge | Code Available | 0 |
| Filling the Image Information Gap for VQA: Prompting Large Language Models to Proactively Ask Questions | Nov 20, 2023 | Question Answering, Visual Question Answering | Code Available | 0 |
| Language Model Behavior: A Comprehensive Survey | Mar 20, 2023 | Language Modeling, Language Modelling | Code Available | 0 |
| Contextual Knowledge Pursuit for Faithful Visual Synthesis | Nov 29, 2023 | Language Modelling, Retrieval | Code Available | 0 |
| ComDensE : Combined Dense Embedding of Relation-aware and Common Features for Knowledge Graph Completion | Jun 29, 2022 | Inductive Bias, Knowledge Graph Completion | Code Available | 0 |
| Figurative Language in Recognizing Textual Entailment | Jun 2, 2021 | Natural Language Inference, RTE | Code Available | 0 |
| Combining Analogy with Language Models for Knowledge Extraction | Jun 22, 2021 | Articles, Language Modeling | Code Available | 0 |
| On the Necessity of World Knowledge for Mitigating Missing Labels in Extreme Classification | Aug 18, 2024 | Imputation, Missing Labels | Code Available | 0 |
| COFAR: Commonsense and Factual Reasoning in Image Search | Oct 16, 2022 | Image Retrieval, Retrieval | Code Available | 0 |
| Knowledge Graph Completion with Mixed Geometry Tensor Factorization | Apr 3, 2025 | Knowledge Graph Completion, Knowledge Graphs | Code Available | 0 |
| Knowledge Generation -- Variational Bayes on Knowledge Graphs | Jan 21, 2021 | Decoder, Graph Matching | Code Available | 0 |
| Towards End-to-End Reinforcement Learning of Dialogue Agents for Information Access | Sep 3, 2016 | reinforcement-learning, Reinforcement Learning | Code Available | 0 |
| Benchmarking Multi-Image Understanding in Vision and Language Models: Perception, Knowledge, Reasoning, and Multi-Hop Reasoning | Jun 18, 2024 | Benchmarking, World Knowledge | Code Available | 0 |
| Open-World Knowledge Graph Completion | Nov 9, 2017 | Entity Linking, Knowledge Graph Completion | Code Available | 0 |
| Knowledge Boundary and Persona Dynamic Shape A Better Social Media Agent | Mar 28, 2024 | World Knowledge | Code Available | 0 |
| Augment or Not? A Comparative Study of Pure and Augmented Large Language Model Recommenders | May 29, 2025 | Language Modeling, Language Modelling | Code Available | 0 |
| Knowledge-Augmented Language Model and its Application to Unsupervised Named-Entity Recognition | Apr 9, 2019 | Language Modeling, Language Modelling | Code Available | 0 |
| StorySparkQA: Expert-Annotated QA Pairs with Real-World Knowledge for Children's Story-Based Learning | Nov 16, 2023 | Question Answering, World Knowledge | Code Available | 0 |
| Knowledge Acquisition Disentanglement for Knowledge-based Visual Question Answering with Large Language Models | Jul 22, 2024 | Disentanglement, Question Answering | Code Available | 0 |
| KGQuiz: Evaluating the Generalization of Encoded Knowledge in Large Language Models | Oct 15, 2023 | Multiple-choice, Triplet | Code Available | 0 |
| CoDA21: Evaluating Language Understanding Capabilities of NLP Models With Context-Definition Alignment | Mar 11, 2022 | Natural Language Understanding, World Knowledge | Code Available | 0 |
| Fact-or-Fair: A Checklist for Behavioral Testing of AI Models on Fairness-Related Queries | Feb 9, 2025 | Diversity, Fairness | Code Available | 0 |
| PCR4ALL: A Comprehensive Evaluation Benchmark for Pronoun Coreference Resolution in English | Jun 1, 2022 | coreference-resolution, Coreference Resolution | Code Available | 0 |
| Is Incoherence Surprising? Targeted Evaluation of Coherence Prediction from Language Models | May 7, 2021 | Coherence Evaluation, Language Modelling | Code Available | 0 |