SimpsonsVQA: Enhancing Inquiry-Based Learning with a Tailored Dataset Oct 30, 2024 Question Answering Visual Question Answering
SimVQA: Exploring Simulated Environments for Visual Question Answering Mar 31, 2022 Data Augmentation Diversity
Single-Modal Entropy based Active Learning for Visual Question Answering Oct 21, 2021 Active Learning Question Answering
SK-VQA: Synthetic Knowledge Generation at Scale for Training Context-Augmented Multimodal LLMs Jun 28, 2024 RAG Retrieval-augmented Generation
SlideChat: A Large Vision-Language Assistant for Whole-Slide Pathology Image Understanding Oct 15, 2024 Instruction Following Visual Question Answering (VQA)
SMMILE: An Expert-Driven Benchmark for Multimodal Medical In-Context Learning Jun 26, 2025 In-Context Learning Medical Visual Question Answering
SnapNTell: Enhancing Entity-Centric Visual Question Answering with Retrieval Augmented Multimodal LLM Mar 7, 2024 Question Answering Retrieval
SocialGesture: Delving into Multi-person Gesture Understanding Apr 3, 2025 Gesture Recognition Question Answering
Social-LLaVA: Enhancing Robot Navigation through Human-Language Reasoning in Social Spaces Dec 30, 2024 2k Robot Navigation
Solving Visual Madlibs with Multiple Cues Aug 11, 2016 Activity Prediction Attribute
Sparks of Artificial General Intelligence(AGI) in Semiconductor Material Science: Early Explorations into the Next Frontier of Generative AI-Assisted Electron Micrograph Analysis Sep 17, 2024 In-Context Learning Question Answering
Sparse Attention Vectors: Generative Multimodal Model Features Are Discriminative Vision-Language Classifiers Nov 28, 2024 Image Captioning image-classification
Spatial Attention as an Interface for Image Captioning Models Sep 29, 2020 Image Captioning Question Answering
Spatial Knowledge Distillation to aid Visual Reasoning Dec 10, 2018 Diagnostic Knowledge Distillation
Spatial Language Understanding with Multimodal Graphs using Declarative Learning based Programming Sep 1, 2017 Image Captioning Image Retrieval
SpatialLLM: A Compound 3D-Informed Design towards Spatially-Intelligent Large Multimodal Models May 1, 2025 Spatial Reasoning Visual Question Answering (VQA)
SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities Jan 22, 2024 Question Answering Spatial Reasoning
Spectral Graph-Based Method of Multimodal Word Embedding Aug 1, 2017 Graph Embedding Image Retrieval
SplatTalk: 3D VQA with Gaussian Splatting Mar 8, 2025 3DGS Question Answering
Spoken question answering for visual queries May 29, 2025 Question Answering Visual Question Answering (VQA)
SQuINTing at VQA Models: Introspecting VQA Models with Sub-Questions Jan 20, 2020 Visual Question Answering (VQA)
Stacked Latent Attention for Multimodal Reasoning Jun 1, 2018 Image Captioning Multimodal Reasoning
Stacking with Auxiliary Features for Visual Question Answering Jun 1, 2018 Common Sense Reasoning Question Answering
StackOverflowVQA: Stack Overflow Visual Question Answering Dataset May 17, 2024 Question Answering Sentence
Steering LVLMs via Sparse Autoencoder for Hallucination Mitigation May 22, 2025 Hallucination Image Captioning
STL-CQA: Structure-based Transformers with Localization and Encoding for Chart Question Answering Nov 1, 2020 Chart Question Answering Question Answering
Straight to the Facts: Learning Knowledge Base Retrieval for Factual Visual Question Answering Sep 4, 2018 Factual Visual Question Answering General Knowledge
StructuralLM: Structural Pre-training for Form Understanding May 24, 2021 document-image-classification Document Image Classification
Structured Two-stream Attention Network for Video Question Answering Jun 2, 2022 Question Answering Video Question Answering
Structure Learning for Neural Module Networks May 27, 2019 Question Answering Visual Question Answering
Study of Subjective and Objective Quality Assessment of Mobile Cloud Gaming Videos May 26, 2023 Video Quality Assessment Visual Question Answering (VQA)
Study of the effect of Sharpness on Blind Video Quality Assessment Apr 6, 2024 SSIM Video Quality Assessment
Subjective and Objective Analysis of Streamed Gaming Videos Mar 24, 2022 Video Quality Assessment Visual Question Answering (VQA)
Subjective and Objective Quality Assessment of Rendered Human Avatar Videos in Virtual Reality Aug 13, 2024 Video Compression Video Quality Assessment
Subtleties in the trainability of quantum machine learning models Oct 27, 2021 BIG-bench Machine Learning Quantum Machine Learning
Sunny and Dark Outside?! Improving Answer Consistency in VQA through Entailed Question Generation Sep 10, 2019 Common Sense Reasoning Data Augmentation
Supervising the Transfer of Reasoning Patterns in VQA Jun 10, 2021 PAC learning Transfer Learning
Surgical-LVLM: Learning to Adapt Large Vision-Language Model for Grounded Visual Question Answering in Robotic Surgery Mar 22, 2024 Language Modeling Language Modelling
SurgicalVLM-Agent: Towards an Interactive AI Co-Pilot for Pituitary Surgery Mar 12, 2025 Activity Recognition Anatomy
Survey of Recent Advances in Visual Question Answering Sep 24, 2017 Question Answering Survey
Survey of Visual Question Answering: Datasets and Techniques May 10, 2017 Deep Learning Question Answering
Survey of Visual-Semantic Embedding Methods for Zero-Shot Image Retrieval May 16, 2021 Graph Generation Image Captioning
Swarm Intelligence in Geo-Localization: A Multi-Agent Large Vision-Language Model Collaborative Framework Aug 21, 2024 geo-localization Language Modeling
Syntax Tree Constrained Graph Network for Visual Question Answering Sep 17, 2023 Question Answering Visual Question Answering
Synthesize Step-by-Step: Tools, Templates and LLMs as Data Generators for Reasoning-Based Chart VQA Mar 25, 2024 Chart Question Answering Data Augmentation
Synthesize Step-by-Step: Tools Templates and LLMs as Data Generators for Reasoning-Based Chart VQA Jan 1, 2024 Chart Question Answering Data Augmentation
T2I-FactualBench: Benchmarking the Factuality of Text-to-Image Models with Knowledge-Intensive Concepts Dec 5, 2024 Benchmarking Image Generation
Tackling VQA with Pretrained Foundation Models without Further Training Sep 27, 2023 Question Answering Visual Question Answering
Take A Step Back: Rethinking the Two Stages in Visual Reasoning Jul 29, 2024 Logical Reasoning Question Answering
Taking a HINT: Leveraging Explanations to Make Vision and Language Models More Grounded Feb 11, 2019 Image Captioning Question Answering