| Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding | Jun 6, 2016 | Phrase Grounding, Visual Grounding | Code Available | 0 |
| Multimodal Residual Learning for Visual QA | Jun 5, 2016 | Multiple-choice, Question Answering | Code Available | 0 |
| Answer-Type Prediction for Visual Question Answering | Jun 1, 2016 | Object Recognition, Prediction | Unverified | 0 |
| Hierarchical Question-Image Co-Attention for Visual Question Answering | May 31, 2016 | Visual Dialog, Visual Question Answering | Code Available | 1 |
| End-to-End Instance Segmentation with Recurrent Attention | May 30, 2016 | Autonomous Driving, Image Captioning | Code Available | 0 |
| Ask Your Neurons: A Deep Learning Approach to Visual Question Answering | May 9, 2016 | Question Answering, Visual Question Answering | Code Available | 0 |
| Leveraging Visual Question Answering for Image-Caption Ranking | May 4, 2016 | Image Retrieval, Question Answering | Unverified | 0 |
| Learning Models for Actions and Person-Object Interactions with Transfer to Question Answering | Apr 16, 2016 | General Classification, Human-Object Interaction Detection | Unverified | 0 |
| Counting Everyday Objects in Everyday Scenes | Apr 12, 2016 | Object, Object Counting | Code Available | 0 |
| A Focused Dynamic Attention Model for Visual Question Answering | Apr 6, 2016 | Question Answering, Visual Question Answering | Unverified | 0 |
| Neural Attention Models for Sequence Classification: Analysis and Application to Key Term Extraction and Dialogue Act Detection | Mar 31, 2016 | Caption Generation, Classification | Unverified | 0 |
| Image Captioning and Visual Question Answering Based on Attributes and External Knowledge | Mar 9, 2016 | General Knowledge, Image Captioning | Unverified | 0 |
| Dynamic Memory Networks for Visual and Textual Question Answering | Mar 4, 2016 | Question Answering, Visual Question Answering | Code Available | 0 |
| Neural Self Talk: Image Understanding via Continuous Questioning and Answering | Dec 10, 2015 | Question Answering, Question Generation | Unverified | 0 |
| Simple Baseline for Visual Question Answering | Dec 7, 2015 | Visual Question Answering, Visual Question Answering (VQA) | Code Available | 0 |
| A Restricted Visual Turing Test for Deep Scene and Event Understanding | Dec 6, 2015 | Question Answering, Video Captioning | Unverified | 0 |
| Where To Look: Focus Regions for Visual Question Answering | Nov 23, 2015 | Question Answering, Visual Question Answering | Unverified | 0 |
| Ask Me Anything: Free-form Visual Question Answering Based on Knowledge from External Sources | Nov 22, 2015 | Form, General Knowledge | Unverified | 0 |
| Compositional Memory for Visual Question Answering | Nov 18, 2015 | Question Answering, Visual Question Answering | Unverified | 0 |
| ABC-CNN: An Attention Based Convolutional Neural Network for Visual Question Answering | Nov 18, 2015 | Question Answering, Visual Question Answering | Unverified | 0 |
| Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering | Nov 17, 2015 | Image Captioning, Question Answering | Code Available | 0 |
| Yin and Yang: Balancing and Answering Binary Visual Questions | Nov 16, 2015 | Question Answering, Visual Question Answering | Unverified | 0 |
| Visual7W: Grounded Question Answering in Images | Nov 11, 2015 | Multiple-choice, Multiple Choice Question Answering (MCQA) | Unverified | 0 |
| Explicit Knowledge-based Reasoning for Visual Question Answering | Nov 9, 2015 | Question Answering, Visual Question Answering | Unverified | 0 |
| Neural Module Networks | Nov 9, 2015 | Visual Question Answering, Visual Question Answering (VQA) | Code Available | 0 |
| What value do explicit high level concepts have in vision to language problems? | Jun 3, 2015 | Image Captioning, Question Answering | Code Available | 0 |
| VQA: Visual Question Answering | May 3, 2015 | Image Captioning, Multiple-choice | Code Available | 1 |