Sentence Attention Blocks for Answer Grounding Sep 20, 2023 Question Answering Sentence
Is the House Ready For Sleeptime? Generating and Evaluating Situational Queries for Embodied Question Answering May 8, 2024 2k Embodied Question Answering
Serving and Optimizing Machine Learning Workflows on Heterogeneous Infrastructures May 10, 2022 AutoML BIG-bench Machine Learning
Sheffield MultiMT: Using Object Posterior Predictions for Multimodal Machine Translation Sep 1, 2017 Image Captioning Image Classification
Show Why the Answer is Correct! Towards Explainable AI using Compositional Temporal Attention May 15, 2021 Question Answering Visual Question Answering
Silence is Not Consensus: Disrupting Agreement Bias in Multi-Agent LLMs via Catfish Agent for Clinical Decision Making May 27, 2025 Decision Making Diagnostic
Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps Dec 9, 2020 Decoder Image Captioning
SimpleLLM4AD: An End-to-End Vision-Language Model with Graph Visual Question Answering for Autonomous Driving Jul 31, 2024 Autonomous Driving Language Modeling
SimpsonsVQA: Enhancing Inquiry-Based Learning with a Tailored Dataset Oct 30, 2024 Question Answering Visual Question Answering
SimVQA: Exploring Simulated Environments for Visual Question Answering Mar 31, 2022 Data Augmentation Diversity
Single-Modal Entropy based Active Learning for Visual Question Answering Oct 21, 2021 Active Learning Question Answering
SK-VQA: Synthetic Knowledge Generation at Scale for Training Context-Augmented Multimodal LLMs Jun 28, 2024 RAG Retrieval-augmented Generation
SlideChat: A Large Vision-Language Assistant for Whole-Slide Pathology Image Understanding Oct 15, 2024 Instruction Following Visual Question Answering (VQA)
SMMILE: An Expert-Driven Benchmark for Multimodal Medical In-Context Learning Jun 26, 2025 In-Context Learning Medical Visual Question Answering
SnapNTell: Enhancing Entity-Centric Visual Question Answering with Retrieval Augmented Multimodal LLM Mar 7, 2024 Question Answering Retrieval
SocialGesture: Delving into Multi-person Gesture Understanding Apr 3, 2025 Gesture Recognition Question Answering
Social-LLaVA: Enhancing Robot Navigation through Human-Language Reasoning in Social Spaces Dec 30, 2024 2k Robot Navigation
Solving Visual Madlibs with Multiple Cues Aug 11, 2016 Activity Prediction Attribute
Sparks of Artificial General Intelligence(AGI) in Semiconductor Material Science: Early Explorations into the Next Frontier of Generative AI-Assisted Electron Micrograph Analysis Sep 17, 2024 In-Context Learning Question Answering
Sparse Attention Vectors: Generative Multimodal Model Features Are Discriminative Vision-Language Classifiers Nov 28, 2024 Image Captioning image-classification
Spatial Attention as an Interface for Image Captioning Models Sep 29, 2020 Image Captioning Question Answering
Spatial Knowledge Distillation to aid Visual Reasoning Dec 10, 2018 Diagnostic Knowledge Distillation
Spatial Language Understanding with Multimodal Graphs using Declarative Learning based Programming Sep 1, 2017 Image Captioning Image Retrieval
SpatialLLM: A Compound 3D-Informed Design towards Spatially-Intelligent Large Multimodal Models May 1, 2025 Spatial Reasoning Visual Question Answering (VQA)
SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities Jan 22, 2024 Question Answering Spatial Reasoning
Spectral Graph-Based Method of Multimodal Word Embedding Aug 1, 2017 Graph Embedding Image Retrieval
SplatTalk: 3D VQA with Gaussian Splatting Mar 8, 2025 3DGS Question Answering
Spoken question answering for visual queries May 29, 2025 Question Answering Visual Question Answering (VQA)
SQuINTing at VQA Models: Introspecting VQA Models with Sub-Questions Jan 20, 2020 Visual Question Answering (VQA)
Stacked Latent Attention for Multimodal Reasoning Jun 1, 2018 Image Captioning Multimodal Reasoning
Stacking with Auxiliary Features for Visual Question Answering Jun 1, 2018 Common Sense Reasoning Question Answering
StackOverflowVQA: Stack Overflow Visual Question Answering Dataset May 17, 2024 Question Answering Sentence
Steering LVLMs via Sparse Autoencoder for Hallucination Mitigation May 22, 2025 Hallucination Image Captioning
STL-CQA: Structure-based Transformers with Localization and Encoding for Chart Question Answering Nov 1, 2020 Chart Question Answering Question Answering
Straight to the Facts: Learning Knowledge Base Retrieval for Factual Visual Question Answering Sep 4, 2018 Factual Visual Question Answering General Knowledge
StructuralLM: Structural Pre-training for Form Understanding May 24, 2021 document-image-classification Document Image Classification
Structured Two-stream Attention Network for Video Question Answering Jun 2, 2022 Question Answering Video Question Answering
Structure Learning for Neural Module Networks May 27, 2019 Question Answering Visual Question Answering
Study of Subjective and Objective Quality Assessment of Mobile Cloud Gaming Videos May 26, 2023 Video Quality Assessment Visual Question Answering (VQA)
Study of the effect of Sharpness on Blind Video Quality Assessment Apr 6, 2024 SSIM Video Quality Assessment
Subjective and Objective Analysis of Streamed Gaming Videos Mar 24, 2022 Video Quality Assessment Visual Question Answering (VQA)
Subjective and Objective Quality Assessment of Rendered Human Avatar Videos in Virtual Reality Aug 13, 2024 Video Compression Video Quality Assessment
Subtleties in the trainability of quantum machine learning models Oct 27, 2021 BIG-bench Machine Learning Quantum Machine Learning
Sunny and Dark Outside?! Improving Answer Consistency in VQA through Entailed Question Generation Sep 10, 2019 Common Sense Reasoning Data Augmentation
Supervising the Transfer of Reasoning Patterns in VQA Jun 10, 2021 PAC learning Transfer Learning
Surgical-LVLM: Learning to Adapt Large Vision-Language Model for Grounded Visual Question Answering in Robotic Surgery Mar 22, 2024 Language Modeling Language Modelling
SurgicalVLM-Agent: Towards an Interactive AI Co-Pilot for Pituitary Surgery Mar 12, 2025 Activity Recognition Anatomy
Survey of Recent Advances in Visual Question Answering Sep 24, 2017 Question Answering Survey
Survey of Visual Question Answering: Datasets and Techniques May 10, 2017 Deep Learning Question Answering
Survey of Visual-Semantic Embedding Methods for Zero-Shot Image Retrieval May 16, 2021 Graph Generation Image Captioning