Title | Date | Tasks | Code status
Multimodal Multihop Source Retrieval for Web Question Answering | Jan 7, 2025 | Multi-hop Question Answering, Question Answering | Unverified
Visual question answering: from early developments to recent advances -- a survey | Jan 7, 2025 | Descriptive, Natural Language Understanding | Unverified
Localizing AI: Evaluating Open-Weight Language Models for Languages of Baltic States | Jan 7, 2025 | Machine Translation, Multiple-choice | Unverified
Multilingual Open QA on the MIA Shared Task | Jan 7, 2025 | Cross-Lingual Information Retrieval, Information Retrieval | Unverified
KAnoCLIP: Zero-Shot Anomaly Detection through Knowledge-Driven Prompt Learning and Enhanced Cross-Modal Integration | Jan 7, 2025 | Anomaly Detection, Anomaly Segmentation | Unverified
QuIM-RAG: Advancing Retrieval-Augmented Generation with Inverted Question Matching for Enhanced QA Performance | Jan 6, 2025 | Question Answering, RAG | Unverified
BoundingDocs: a Unified Dataset for Document Question Answering with Spatial Annotations | Jan 6, 2025 | Document AI, document understanding | Unverified
Socratic Questioning: Learn to Self-guide Multimodal Reasoning in the Wild | Jan 6, 2025 | Hallucination, Multimodal Reasoning | Code Available
ReDiT: Re-evaluating large visual question answering model confidence by defining input scenario Difficulty and applying Temperature mapping | Jan 6, 2025 | Question Answering, Visual Question Answering | Code Available
FlippedRAG: Black-Box Opinion Manipulation Adversarial Attacks to Retrieval-Augmented Generation Models | Jan 6, 2025 | Adversarial Attack, Hallucination | Unverified
Survey on Question Answering over Visually Rich Documents: Methods, Challenges, and Trends | Jan 4, 2025 | document understanding, Question Answering | Unverified
Accounting for Focus Ambiguity in Visual Questions | Jan 4, 2025 | Question Answering, Visual Question Answering | Unverified
QuArch: A Question-Answering Dataset for AI Agents in Computer Architecture | Jan 3, 2025 | Benchmarking, Question Answering | Unverified
MoColl: Agent-Based Specific and General Model Collaboration for Image Captioning | Jan 3, 2025 | Diagnostic, General Knowledge | Unverified
HLV-1K: A Large-scale Hour-Long Video Benchmark for Time-Specific Long Video Understanding | Jan 3, 2025 | Question Answering, Video Understanding | Code Available
Interpretable Face Anti-Spoofing: Enhancing Generalization with Multimodal Large Language Models | Jan 3, 2025 | Binary Classification, Face Anti-Spoofing | Unverified
A Survey on Large Language Models with some Insights on their Capabilities and Limitations | Jan 3, 2025 | Code Generation, Question Answering | Unverified
(WhyPHI) Fine-Tuning PHI-3 for Multiple-Choice Question Answering: Methodology, Results, and Challenges | Jan 3, 2025 | Multiple-choice, Question Answering | Code Available
The Essence of Contextual Understanding in Theory of Mind: A Study on Question Answering with Story Characters | Jan 3, 2025 | Question Answering | Unverified
CLIP-UP: CLIP-Based Unanswerable Problem Detection for Visual Question Answering | Jan 2, 2025 | Multiple-choice, Question Answering | Unverified
Citations and Trust in LLM Generated Responses | Jan 2, 2025 | Chatbot, Question Answering | Unverified
Advancing Singlish Understanding: Bridging the Gap with Datasets and Multimodal Models | Jan 2, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available
Enhancing Video-LLM Reasoning via Agent-of-Thoughts Distillation | Jan 1, 2025 | Language Modeling, Language Modelling | Unverified
Alignment, Mining and Fusion: Representation Alignment with Hard Negative Mining and Selective Knowledge Fusion for Medical Visual Question Answering | Jan 1, 2025 | Contrastive Learning, Medical Visual Question Answering | Unverified
Separation of Powers: On Segregating Knowledge from Observation in LLM-enabled Knowledge-based Visual Question Answering | Jan 1, 2025 | Multiple-choice, Question Answering | Unverified
AdaDARE-gamma: Balancing Stability and Plasticity in Multi-modal LLMs through Efficient Adaptation | Jan 1, 2025 | Image Captioning, Question Answering | Unverified
AdaCM^2: On Understanding Extremely Long-Term Video with Adaptive Cross-Modality Memory Reduction | Jan 1, 2025 | GPU, Question Answering | Unverified
SVLTA: Benchmarking Vision-Language Temporal Alignment via Synthetic Video Situation | Jan 1, 2025 | Benchmarking, Diagnostic | Unverified
Efficient Motion-Aware Video MLLM | Jan 1, 2025 | Question Answering, Video Question Answering | Unverified
EfficientLLaVA: Generalizable Auto-Pruning for Large Vision-language Models | Jan 1, 2025 | MM-Vet, Multimodal Reasoning | Unverified
Beyond Text: Implementing Multimodal Large Language Model-Powered Multi-Agent Systems Using a No-Code Platform | Jan 1, 2025 | Code Generation, Image Generation | Unverified
MIMO: A Medical Vision Language Model with Visual Referring Multimodal Input and Pixel Grounding Multimodal Output | Jan 1, 2025 | Instruction Following, Language Modeling | Code Available
Flexible Frame Selection for Efficient Video Reasoning | Jan 1, 2025 | Language Modeling, Language Modelling | Unverified
AVQACL: A Novel Benchmark for Audio-Visual Question Answering Continual Learning | Jan 1, 2025 | Audio-visual Question Answering, Continual Learning | Code Available
Font-Agent: Enhancing Font Understanding with Large Language Models | Jan 1, 2025 | Font Generation, Question Answering | Unverified
JTD-UAV: MLLM-Enhanced Joint Tracking and Description Framework for Anti-UAV Systems | Jan 1, 2025 | Question Answering, Visual Question Answering | Unverified
HierarQ: Task-Aware Hierarchical Q-Former for Enhanced Video Understanding | Jan 1, 2025 | Question Answering, Video Understanding | Unverified
Zero-shot 3D Question Answering via Voxel-based Dynamic Token Compression | Jan 1, 2025 | Question Answering | Unverified
Seeing More with Less: Human-like Representations in Vision Models | Jan 1, 2025 | object-detection, Object Detection | Unverified
A review of faithfulness metrics for hallucination assessment in Large Language Models | Dec 31, 2024 | Benchmarking, Hallucination | Unverified
Probing Visual Language Priors in VLMs | Dec 31, 2024 | Question Answering, Visual Question Answering | Unverified
LLM-MedQA: Enhancing Medical Question Answering through Case Studies in Large Language Models | Dec 31, 2024 | Medical Question Answering, MedQA | Unverified
An Empirical Evaluation of Large Language Models on Consumer Health Questions | Dec 31, 2024 | Medical Question Answering, Question Answering | Unverified
MapEval: A Map-Based Evaluation of Geo-Spatial Reasoning in Foundation Models | Dec 31, 2024 | Multiple-choice, Question Answering | Code Available
EQUATOR: A Deterministic Framework for Evaluating LLM Reasoning with Open-Ended Questions. # v1.0.0-beta | Dec 31, 2024 | Multiple-choice, Question Answering | Unverified
MapQaTor: An Extensible Framework for Efficient Annotation of Map-Based QA Datasets | Dec 30, 2024 | Question Answering | Code Available
Are LLMs Really Not Knowledgable? Mining the Submerged Knowledge in LLMs' Memory | Dec 30, 2024 | Question Answering | Unverified
WalkVLM:Aid Visually Impaired People Walking by Vision Language Model | Dec 30, 2024 | Language Modeling, Language Modelling | Unverified
Hierarchical Banzhaf Interaction for General Video-Language Representation Learning | Dec 30, 2024 | Contrastive Learning, Question Answering | Unverified
Plug-and-Play Training Framework for Preference Optimization | Dec 30, 2024 | Mathematical Reasoning, Question Answering | Unverified