Title | Date | Tasks | Code status
Self-ReS: Self-Reflection in Large Vision-Language Models for Long Video Understanding | Mar 26, 2025 | GPU, Question Answering | Unverified
A Survey of Multimodal Retrieval-Augmented Generation | Mar 26, 2025 | Information Retrieval, Question Answering | Unverified
Feature4X: Bridging Any Monocular Video to 4D Agentic AI with Versatile Gaussian Feature Fields | Mar 26, 2025 | Question Answering, Visual Question Answering | Unverified
VGAT: A Cancer Survival Analysis Framework Transitioning from Generative Visual Question Answering to Genomic Reconstruction | Mar 25, 2025 | Generative Visual Question Answering, Question Answering | Code Available
Can Vision-Language Models Answer Face to Face Questions in the Real-World? | Mar 25, 2025 | Question Answering | Unverified
DomainCQA: Crafting Expert-Level QA from Domain-Specific Charts | Mar 25, 2025 | Astronomy, Chart Question Answering | Unverified
ORION: A Holistic End-to-End Autonomous Driving Framework by Vision-Language Instructed Action Generation | Mar 25, 2025 | Action Generation, Autonomous Driving | Unverified
VectorFit: Adaptive Singular & Bias Vector Fine-Tuning of Pre-trained Foundation Models | Mar 25, 2025 | image-classification, Image Classification | Unverified
BiblioPage: A Dataset of Scanned Title Pages for Bibliographic Metadata Extraction | Mar 25, 2025 | document understanding, object-detection | Code Available
DeCAP: Context-Adaptive Prompt Generation for Debiasing Zero-shot Question Answering in Large Language Models | Mar 25, 2025 | Fairness, Question Answering | Unverified
Improved Alignment of Modalities in Large Vision Language Models | Mar 25, 2025 | GPU, Image Captioning | Unverified
KSHSeek: Data-Driven Approaches to Mitigating and Detecting Knowledge-Shortcut Hallucinations in Generative Models | Mar 25, 2025 | Hallucination, Question Answering | Unverified
ImF: Implicit Fingerprint for Large Language Models | Mar 25, 2025 | Adversarial Attack, Question Answering | Unverified
Context-Efficient Retrieval with Factual Decomposition | Mar 25, 2025 | Form, Information Retrieval | Unverified
LEGO-Puzzles: How Good Are MLLMs at Multi-Step Spatial Reasoning? | Mar 25, 2025 | Autonomous Navigation, Question Answering | Unverified
Open-Vocabulary Functional 3D Scene Graphs for Real-World Indoor Spaces | Mar 24, 2025 | Question Answering | Unverified
DiN: Diffusion Model for Robust Medical VQA with Semantic Noisy Labels | Mar 24, 2025 | Medical Visual Question Answering, Question Answering | Unverified
Synthetic Function Demonstrations Improve Generation in Low-Resource Programming Languages | Mar 24, 2025 | Question Answering, RAG | Unverified
Where is this coming from? Making groundedness count in the evaluation of Document VQA models | Mar 24, 2025 | Question Answering, Visual Question Answering | Unverified
MAGIC-VQA: Multimodal And Grounded Inference with Commonsense Knowledge for Visual Question Answering | Mar 24, 2025 | Graph Neural Network, Question Answering | Unverified
A Survey of Large Language Model Agents for Question Answering | Mar 24, 2025 | Answer Generation, Information Retrieval | Unverified
When is dataset cartography ineffective? Using training dynamics does not improve robustness against Adversarial SQuAD | Mar 24, 2025 | Adversarial Robustness, Extractive Question-Answering | Unverified
Expanding the Boundaries of Vision Prior Knowledge in Multi-modal Large Language Models | Mar 23, 2025 | Question Answering, Visual Question Answering | Unverified
SUNAR: Semantic Uncertainty based Neighborhood Aware Retrieval for Complex QA | Mar 23, 2025 | Question Answering, Retrieval | Unverified
SLIDE: Sliding Localized Information for Document Extraction | Mar 23, 2025 | Chunking, graph construction | Unverified
Unmasking Deceptive Visuals: Benchmarking Multimodal Large Language Models on Misleading Chart Question Answering | Mar 23, 2025 | Benchmarking, Chart Question Answering | Unverified
Progressive Prompt Detailing for Improved Alignment in Text-to-Image Generative Models | Mar 22, 2025 | Question Answering, Visual Question Answering | Code Available
Relation Extraction with Instance-Adapted Predicate Descriptions | Mar 22, 2025 | Decoder, Question Answering | Code Available
4D-Bench: Benchmarking Multi-modal Large Language Models for 4D Object Understanding | Mar 22, 2025 | Benchmarking, Object | Code Available
Does Chain-of-Thought Reasoning Help Mobile GUI Agent? An Empirical Study | Mar 21, 2025 | Attribute, Mathematical Problem-Solving | Code Available
MARS: A Multi-Agent Framework Incorporating Socratic Guidance for Automated Prompt Optimization | Mar 21, 2025 | Question Answering | Unverified
Dense Passage Retrieval in Conversational Search | Mar 21, 2025 | Conversational Search, Information Retrieval | Code Available
A Study into Investigating Temporal Robustness of LLMs | Mar 21, 2025 | Question Answering, World Knowledge | Unverified
PVChat: Personalized Video Chat with One-Shot Learning | Mar 21, 2025 | One-Shot Learning, Question Answering | Unverified
Typed-RAG: Type-aware Multi-Aspect Decomposition for Non-Factoid Question Answering | Mar 20, 2025 | Question Answering, RAG | Code Available
ECKGBench: Benchmarking Large Language Models in E-commerce Leveraging Knowledge Graph | Mar 20, 2025 | Benchmarking, Hallucination | Unverified
Bridging Technology and Humanities: Evaluating the Impact of Large Language Models on Social Sciences Research with DeepSeek-R1 | Mar 20, 2025 | Large Language Model, Logical Reasoning | Unverified
Big Help or Big Brother? Auditing Tracking, Profiling, and Personalization in Generative AI Assistants | Mar 20, 2025 | Question Answering | Unverified
A Vision Centric Remote Sensing Benchmark | Mar 20, 2025 | Question Answering, Representation Learning | Unverified
DocVideoQA: Towards Comprehensive Understanding of Document-Centric Videos through Question Answering | Mar 20, 2025 | Contrastive Learning, Question Answering | Unverified
AutoDrive-QA- Automated Generation of Multiple-Choice Questions for Autonomous Driving Datasets Using Large Vision-Language Models | Mar 20, 2025 | Autonomous Driving, Multiple-choice | Unverified
GraspCoT: Integrating Physical Property Reasoning for 6-DoF Grasping under Flexible Language Instructions | Mar 20, 2025 | Question Answering | Unverified
MKG-Rank: Enhancing Large Language Models with Knowledge Graph for Multilingual Medical Question Answering | Mar 20, 2025 | Knowledge Graphs, Medical Question Answering | Unverified
UMIT: Unifying Medical Imaging Tasks via Vision-Language Models | Mar 20, 2025 | Diagnostic, Medical Image Analysis | Code Available
UPME: An Unsupervised Peer Review Framework for Multimodal Large Language Model Evaluation | Mar 19, 2025 | Language Model Evaluation, Language Modeling | Unverified
Bias Evaluation and Mitigation in Retrieval-Augmented Medical Question-Answering Systems | Mar 19, 2025 | counterfactual, Decision Making | Unverified
MAMM-Refine: A Recipe for Improving Faithfulness in Generation with Multi-Agent Collaboration | Mar 19, 2025 | Long Form Question Answering, Question Answering | Unverified
Solla: Towards a Speech-Oriented LLM That Hears Acoustic Context | Mar 19, 2025 | Audio captioning, Audio Question Answering | Code Available
EfficientLLaVA: Generalizable Auto-Pruning for Large Vision-language Models | Mar 19, 2025 | MM-Vet, Multimodal Reasoning | Unverified
TruthLens: A Training-Free Paradigm for DeepFake Detection | Mar 19, 2025 | Binary Classification, DeepFake Detection | Unverified