| Video Question Answering on Screencast Tutorials | Aug 2, 2020 | Question Answering, Video Question Answering | Unverified | 0 |
| What Gives the Answer Away? Question Answering Bias Analysis on Video QA Datasets | Jul 7, 2020 | Multiple-choice, Question Answering | Unverified | 0 |
| Auto-captions on GIF: A Large-scale Video-sentence Dataset for Vision-language Pre-training | Jul 5, 2020 | Decoder, Question Answering | Unverified | 0 |
| Modality Shifting Attention Network for Multi-modal Video Question Answering | Jul 4, 2020 | Question Answering, Temporal Localization | Unverified | 0 |
| DramaQA: Character-Centered Video Story Understanding with Hierarchical QA | May 7, 2020 | Question Answering, Video Question Answering | Code Available | 0 |
| Knowledge-Based Visual Question Answering in Videos | Apr 17, 2020 | Question Answering, Video Question Answering | Unverified | 0 |
| Noise Estimation Using Density Estimation for Self-Supervised Multimodal Learning | Mar 6, 2020 | Density Estimation, Noise Estimation | Code Available | 0 |
| Multimodal Transformer with Pointer Network for the DSTC8 AVSD Challenge | Feb 25, 2020 | Question Answering, Video Question Answering | Unverified | 0 |
| TutorialVQA: Question Answering Dataset for Tutorial Videos | Dec 2, 2019 | Question Answering, Video Question Answering | Code Available | 0 |
| Video Dialog via Progressive Inference and Cross-Transformer | Nov 1, 2019 | Answer Generation, Question Answering | Unverified | 0 |
| KnowIT VQA: Answering Knowledge-Based Questions about Videos | Oct 23, 2019 | Question Answering, Video Question Answering | Unverified | 0 |
| A Better Way to Attend: Attention with Trees for Video Question Answering | Sep 5, 2019 | Question Answering, Video Question Answering | Code Available | 0 |
| Learning Question-Guided Video Representation for Multi-Turn Video Question Answering | Jul 31, 2019 | Navigate, Question Answering | Unverified | 0 |
| OmniNet: A unified architecture for multi-modal multi-task learning | Jul 17, 2019 | Image Captioning, Multi-Task Learning | Code Available | 0 |
| Neural Reasoning, Fast and Slow, for Video Question Answering | Jul 10, 2019 | Natural Questions, Question Answering | Unverified | 0 |
| Video Question Generation via Cross-Modal Self-Attention Networks Learning | Jul 5, 2019 | Diversity, Question Answering | Unverified | 0 |
| Open-Ended Long-Form Video Question Answering via Hierarchical Convolutional Self-Attention Networks | Jun 28, 2019 | Answer Generation, Decoder | Unverified | 0 |
| Adversarial Multimodal Network for Movie Question Answering | Jun 24, 2019 | Question Answering, Video Question Answering | Unverified | 0 |
| ActivityNet-QA: A Dataset for Understanding Complex Web Videos via Question Answering | Jun 6, 2019 | Question Answering, Video Question Answering | Code Available | 0 |
| Gaining Extra Supervision via Multi-task learning for Multi-Modal Video Question Answering | May 28, 2019 | Inductive Bias, Metric Learning | Unverified | 0 |
| TVQA+: Spatio-Temporal Grounding for Video Question Answering | Apr 25, 2019 | Question Answering, Video Question Answering | Code Available | 0 |
| Heterogeneous Memory Enhanced Multimodal Attention Model for Video Question Answering | Apr 8, 2019 | Question Answering, Video Question Answering | Code Available | 0 |
| Holistic Multi-modal Memory Network for Movie Question Answering | Nov 12, 2018 | Question Answering, Retrieval | Unverified | 0 |
| TVQA: Localized, Compositional Video Question Answering | Sep 5, 2018 | Video Question Answering | Code Available | 0 |
| A Joint Sequence Fusion Model for Video Question Answering and Retrieval | Aug 7, 2018 | Decoder, Multiple-choice | Code Available | 0 |