StarVQA+: Co-training Space-Time Attention for Video Quality Assessment | Jun 21, 2023 | Video Quality Assessment, Visual Question Answering (VQA) | Code available
Encyclopedic VQA: Visual questions about detailed properties of fine-grained categories | Jun 15, 2023 | Question Answering, Retrieval | Unverified
AVIS: Autonomous Visual Information Seeking with Large Language Model Agent | Jun 13, 2023 | Decision Making, Language Modeling | Unverified
Visual Question Answering (VQA) on Images with Superimposed Text | Jun 13, 2023 | Question Answering, Visual Question Answering | Unverified
Weakly Supervised Visual Question Answer Generation | Jun 11, 2023 | Answer Generation, Dependency Parsing | Unverified
Knowledge Detection by Relevant Question and Image Attributes in Visual Question Answering | Jun 8, 2023 | Question Answering, Retrieval | Unverified
Multi-CLIP: Contrastive Vision-Language Pre-training for Question Answering tasks in 3D Scenes | Jun 4, 2023 | Common Sense Reasoning, Question Answering | Unverified
MetaVL: Transferring In-Context Learning Ability From Language Models to Vision-Language Models | Jun 2, 2023 | In-Context Learning, Language Modeling | Unverified
Overcoming Language Bias in Remote Sensing Visual Question Answering via Adversarial Training | Jun 1, 2023 | Question Answering, Visual Question Answering | Unverified
Evaluating the Capabilities of Multi-modal Reasoning Models with Synthetic Task Data | Jun 1, 2023 | Anomaly Detection, Image Generation | Unverified
LiT-4-RSVQA: Lightweight Transformer-based Visual Question Answering in Remote Sensing | Jun 1, 2023 | Question Answering, Visual Question Answering | Unverified
Using Visual Cropping to Enhance Fine-Detail Question Answering of BLIP-Family Models | May 31, 2023 | Question Answering, Visual Question Answering | Unverified
Unveiling Cross Modality Bias in Visual Question Answering: A Causal View with Possible Worlds VQA | May 31, 2023 | counterfactual, Counterfactual Inference | Unverified
Generate then Select: Open-ended Visual Question Answering Guided by World Knowledge | May 30, 2023 | Answer Selection, Question Answering | Unverified
HaVQA: A Dataset for Visual Question Answering and Multimodal Research in Hausa Language | May 28, 2023 | Machine Translation, Multimodal Machine Translation | Code available
Modularized Zero-shot VQA with Pre-trained Models | May 27, 2023 | object-detection, Object Detection | Code available
Zero-shot Visual Question Answering with Language Model Feedback | May 26, 2023 | Language Modeling, Language Modelling | Code available
Study of Subjective and Objective Quality Assessment of Mobile Cloud Gaming Videos | May 26, 2023 | Video Quality Assessment, Visual Question Answering (VQA) | Unverified
Dynamic Clue Bottlenecks: Towards Interpretable-by-Design Visual Question Answering | May 24, 2023 | Question Answering, Visual Question Answering | Unverified
Measuring Faithful and Plausible Visual Grounding in VQA | May 24, 2023 | Question Answering, Visual Grounding | Code available
Transferring Visual Attributes from Natural Language to Verified Image Generation | May 24, 2023 | Image Generation, Text to Image Generation | Unverified
Image Manipulation via Multi-Hop Instructions -- A New Dataset and Weakly-Supervised Neuro-Symbolic Approach | May 23, 2023 | Image Manipulation, Question Answering | Unverified
DUBLIN -- Document Understanding By Language-Image Network | May 23, 2023 | Document Classification, document understanding | Unverified
Getting ViT in Shape: Scaling Laws for Compute-Optimal Model Design | May 22, 2023 | image-classification, Image Classification | Unverified
VLAB: Enhancing Video Language Pre-training by Feature Adapting and Blending | May 22, 2023 | Question Answering, Retrieval | Unverified
Visual Question Answering: A Survey on Techniques and Common Trends in Recent Literature | May 18, 2023 | Question Answering, Visual Question Answering | Unverified
An Empirical Study on the Language Modal in Visual Question Answering | May 17, 2023 | Question Answering, Visual Question Answering | Unverified
TG-VQA: Ternary Game of Video Question Answering | May 17, 2023 | Contrastive Learning, Question Answering | Unverified
A Novel Stochastic LSTM Model Inspired by Quantum Machine Learning | May 17, 2023 | Quantum Machine Learning, Visual Question Answering (VQA) | Unverified
SB-VQA: A Stack-Based Video Quality Assessment Framework for Video Enhancement | May 15, 2023 | Video Enhancement, Video Quality Assessment | Unverified
OpenViVQA: Task, Dataset, and Multimodal Fusion Models for Visual Question Answering in Vietnamese | May 7, 2023 | Information Retrieval, Question Answering | Code available
Adaptive loose optimization for robust question answering | May 6, 2023 | Extractive Question-Answering, Machine Reading Comprehension | Code available
Analysis of Visual Question Answering Algorithms with attention model | May 4, 2023 | Question Answering, Visual Question Answering | Unverified
GAMIVAL: Video Quality Prediction on Mobile Cloud Gaming Content | May 3, 2023 | Video Quality Assessment, Visual Question Answering (VQA) | Code available
An Empirical Comparison of Optimizers for Quantum Machine Learning with SPSA-based Gradients | Apr 27, 2023 | Quantum Machine Learning, Visual Question Answering (VQA) | Unverified
HDR-ChipQA: No-Reference Quality Assessment on High Dynamic Range Videos | Apr 25, 2023 | Video Quality Assessment, Visual Question Answering (VQA) | Unverified
Making Video Quality Assessment Models Robust to Bit Depth | Apr 25, 2023 | Video Quality Assessment, Visual Question Answering (VQA) | Unverified
PDFVQA: A New Dataset for Real-World VQA on PDF Documents | Apr 13, 2023 | document understanding, Key Information Extraction | Unverified
CAVL: Learning Contrastive and Adaptive Representations of Vision and Language | Apr 10, 2023 | Image Retrieval, Phrase Grounding | Unverified
Improving Visual Question Answering Models through Robustness Analysis and In-Context Learning with a Chain of Basic Questions | Apr 6, 2023 | In-Context Learning, Question Answering | Unverified
Locate Then Generate: Bridging Vision and Language with Bounding Box for Scene-Text VQA | Apr 4, 2023 | Answer Generation, Language Modelling | Unverified
SC-ML: Self-supervised Counterfactual Metric Learning for Debiased Visual Question Answering | Apr 4, 2023 | counterfactual, Metric Learning | Unverified
Q2ATransformer: Improving Medical VQA via an Answer Querying Decoder | Apr 4, 2023 | Classification, Decoder | Unverified
Instance-Level Trojan Attacks on Visual Question Answering via Adversarial Learning in Neuron Activation Space | Apr 2, 2023 | Question Answering, Visual Question Answering | Unverified
MaMMUT: A Simple Architecture for Joint Learning for MultiModal Tasks | Mar 29, 2023 | Cross-Modal Retrieval, Decoder | Code available
Unmasked Teacher: Towards Training-Efficient Video Foundation Models | Mar 28, 2023 | Action Classification, Action Recognition | Code available
Curriculum Learning for Compositional Visual Reasoning | Mar 27, 2023 | Question Answering, Visual Question Answering | Unverified
CoBIT: A Contrastive Bi-directional Image-Text Generation Model | Mar 23, 2023 | Decoder, Image Generation | Unverified
Integrating Image Features with Convolutional Sequence-to-sequence Network for Multilingual Visual Question Answering | Mar 22, 2023 | Question Answering, Visual Question Answering | Code available
3D Concept Learning and Reasoning from Multi-View Images | Mar 20, 2023 | Question Answering, Visual Question Answering | Unverified