| Unsupervised Commonsense Question Answering with Self-Talk | Apr 11, 2020 | Language Modeling, Language Modelling | Code Available | 1 |
| REALM: Retrieval-Augmented Language Model Pre-Training | Feb 10, 2020 | Language Modeling, Language Modelling | Code Available | 1 |
| ASER: A Large-scale Eventuality Knowledge Graph | May 1, 2019 | Knowledge Graphs, World Knowledge | Code Available | 1 |
| Analyzing Knowledge Graph Embedding Methods from a Multi-Embedding Interaction Perspective | Mar 27, 2019 | Graph Embedding, Knowledge Graph Embedding | Code Available | 1 |
| CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge | Nov 2, 2018 | Common Sense Reasoning, Multiple-choice | Code Available | 1 |
| Breaking NLI Systems with Sentences that Require Simple Lexical Inferences | May 6, 2018 | World Knowledge | Code Available | 1 |
| On the Evaluation of Semantic Phenomena in Neural Machine Translation Using Natural Language Inference | Apr 25, 2018 | Machine Translation, Natural Language Inference | Code Available | 1 |
| Imagine This! Scripts to Compositions to Videos | Apr 10, 2018 | Retrieval, World Knowledge | Code Available | 1 |
| Off-Policy General Value Functions to Represent Dynamic Role Assignments in RoboCup 3D Soccer Simulation | Feb 18, 2014 | Reinforcement Learning, Reinforcement Learning (RL) | Code Available | 1 |
| HRSeg: High-Resolution Visual Perception and Enhancement for Reasoning Segmentation | Jul 17, 2025 | Reasoning Segmentation, World Knowledge | Unverified | 0 |
| Comparing Apples to Oranges: A Dataset & Analysis of LLM Humour Understanding from Traditional Puns to Topical Jokes | Jul 17, 2025 | Common Sense Reasoning, World Knowledge | Unverified | 0 |
| KEN: Knowledge Augmentation and Emotion Guidance Network for Multimodal Fake News Detection | Jul 13, 2025 | Fake News Detection, Misinformation | Unverified | 0 |
| Video Event Reasoning and Prediction by Fusing World Knowledge from LLMs with Vision Foundation Models | Jul 8, 2025 | Future prediction, Large Language Model | Unverified | 0 |
| A Semi-supervised Scalable Unified Framework for E-commerce Query Classification | Jun 26, 2025 | Classification, World Knowledge | Unverified | 0 |
| MIRAGE: A Benchmark for Multimodal Information-Seeking and Reasoning in Agricultural Expert-Guided Conversations | Jun 25, 2025 | World Knowledge | Code Available | 0 |
| From 2D to 3D Cognition: A Brief Survey of General World Models | Jun 25, 2025 | Autonomous Driving, Scene Generation | Unverified | 0 |
| Multi-Preference Lambda-weighted Listwise DPO for Dynamic Preference Alignment | Jun 24, 2025 | Informativeness, reinforcement-learning | Code Available | 0 |
| ImpliRet: Benchmarking the Implicit Fact Retrieval Challenge | Jun 17, 2025 | Benchmarking, Retrieval | Code Available | 0 |
| MMMG: A Massive, Multidisciplinary, Multi-Tier Generation Benchmark for Text-to-Image Reasoning | Jun 12, 2025 | Image Generation, Multimodal Reasoning | Unverified | 0 |
| RoCA: Robust Cross-Domain End-to-End Autonomous Driving | Jun 11, 2025 | Autonomous Driving, Domain Adaptation | Unverified | 0 |
| ReCogDrive: A Reinforced Cognitive Framework for End-to-End Autonomous Driving | Jun 9, 2025 | Autonomous Driving, Imitation Learning | Unverified | 0 |
| Serendipitous Recommendation with Multimodal LLM | Jun 9, 2025 | Recommendation Systems, World Knowledge | Unverified | 0 |
| Vid2Sim: Generalizable, Video-based Reconstruction of Appearance, Geometry and Physics for Mesh-free Simulation | Jun 6, 2025 | Computational Efficiency, World Knowledge | Unverified | 0 |
| Quantifying Cross-Modality Memorization in Vision-Language Models | Jun 5, 2025 | Machine Unlearning, Memorization | Unverified | 0 |
| TIIF-Bench: How Does Your T2I Model Follow Your Instructions? | Jun 2, 2025 | Benchmarking, Instruction Following | Unverified | 0 |