| GRE Suite: Geo-localization Inference via Fine-Tuned Vision-Language Models and Enhanced Reasoning Chains | May 24, 2025 | geo-localizationVisual Reasoning | CodeCode Available | 1 |
| HeadlineCause: A Dataset of News Headlines for Detecting Causalities | Aug 28, 2021 | Commonsense Causal ReasoningCommon Sense Reasoning | CodeCode Available | 1 |
| ACES: Translation Accuracy Challenge Sets for Evaluating Machine Translation Metrics | Oct 27, 2022 | Machine TranslationTranslation | CodeCode Available | 1 |
| CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge | Nov 2, 2018 | Common Sense ReasoningMultiple-choice | CodeCode Available | 1 |
| Common Sense or World Knowledge? Investigating Adapter-Based Knowledge Injection into Pretrained Transformers | May 24, 2020 | Common Sense ReasoningWorld Knowledge | CodeCode Available | 1 |
| Common Sense Enhanced Knowledge-based Recommendation with Large Language Model | Mar 27, 2024 | Common Sense ReasoningKnowledge Graphs | CodeCode Available | 1 |
| A User-Centric Multi-Intent Benchmark for Evaluating Large Language Models | Apr 22, 2024 | BenchmarkingWorld Knowledge | CodeCode Available | 1 |
| GRILLBot In Practice: Lessons and Tradeoffs Deploying Large Language Models for Adaptable Conversational Task Assistants | Feb 12, 2024 | Code GenerationManagement | CodeCode Available | 1 |
| InGram: Inductive Knowledge Graph Embedding via Relation Graphs | May 31, 2023 | Entity EmbeddingsGraph Embedding | CodeCode Available | 1 |
| A Unified Encoder-Decoder Framework with Entity Memory | Oct 7, 2022 | DecoderQuestion Answering | CodeCode Available | 1 |