Title | Date | Tasks | Status
Hierarchical Banzhaf Interaction for General Video-Language Representation Learning | Dec 30, 2024 | Contrastive Learning, Question Answering | Unverified
MapQaTor: An Extensible Framework for Efficient Annotation of Map-Based QA Datasets | Dec 30, 2024 | Question Answering | Code Available
WalkVLM: Aid Visually Impaired People Walking by Vision Language Model | Dec 30, 2024 | Language Modelling | Unverified
HALLUCINOGEN: A Benchmark for Evaluating Object Hallucination in Large Visual-Language Models | Dec 29, 2024 | Hallucination, Object | Code Available
Audiopedia: Audio QA with Knowledge | Dec 29, 2024 | Audio Question Answering, Entity Linking | Code Available
Building a Rich Dataset to Empower the Persian Question Answering Systems | Dec 28, 2024 | Question Answering | Unverified
Efficient Multi-Agent Collaboration with Tool Use for Online Planning in Complex Table Question Answering | Dec 28, 2024 | Question Answering | Unverified
ErgoChat: a Visual Query System for the Ergonomic Risk Assessment of Construction Workers | Dec 27, 2024 | Image Captioning, Question Answering | Unverified
TARGA: Targeted Synthetic Data Generation for Practical Reasoning over Structured Data | Dec 27, 2024 | In-Context Learning, Knowledge Base Question Answering | Unverified
Text2Insight: Transform natural language text into insights seamlessly using multi-model architecture | Dec 27, 2024 | Named Entity Recognition | Unverified
Pre-training, Fine-tuning and Re-ranking: A Three-Stage Framework for Legal Question Answering | Dec 27, 2024 | Question Answering, Representation Learning | Unverified
Perceive, Query & Reason: Enhancing Video QA with Question-Guided Temporal Queries | Dec 26, 2024 | Question Answering, Video Question Answering | Unverified
Improving Generated and Retrieved Knowledge Combination Through Zero-shot Generation | Dec 25, 2024 | Open-Domain Question Answering, Question Answering | Unverified
LININ: Logic Integrated Neural Inference Network for Explanatory Visual Question Answering | Dec 24, 2024 | Explanatory Visual Question Answering, Multimodal Reasoning | Code Available
TextMatch: Enhancing Image-Text Consistency Through Multimodal Optimization | Dec 24, 2024 | In-Context Learning, Question Answering | Unverified
GeAR: Graph-enhanced Agent for Retrieval-augmented Generation | Dec 24, 2024 | Multi-hop Question Answering, Question Answering | Unverified
Unlocking the Potential of Multiple BERT Models for Bangla Question Answering in NCTB Textbooks | Dec 24, 2024 | Question Answering, Reading Comprehension | Unverified
Multi-Agents Based on Large Language Models for Knowledge-based Visual Question Answering | Dec 24, 2024 | Question Answering, Visual Question Answering | Unverified
HAUR: Human Annotation Understanding and Recognition Through Text-Heavy Images | Dec 24, 2024 | Optical Character Recognition (OCR), Question Answering | Unverified
Exploring Embedding Priors in Prompt-Tuning for Improved Interpretability and Control | Dec 24, 2024 | Question Answering | Unverified
FFA Sora, video generation as fundus fluorescein angiography simulator | Dec 23, 2024 | Privacy Preserving, Question Answering | Unverified
Multimodal Preference Data Synthetic Alignment with Reward Model | Dec 23, 2024 | 2k, Caption Generation | Code Available
VidCtx: Context-aware Video Question Answering with Image Models | Dec 23, 2024 | Large Language Model, Question Answering | Code Available
RAGONITE: Iterative Retrieval on Induced Databases and Verbalized RDF for Conversational QA over KGs with RAG | Dec 23, 2024 | Conversational Question Answering, Knowledge Graphs | Unverified
From Models to Microtheories: Distilling a Model's Topical Knowledge for Grounded Question Answering | Dec 23, 2024 | Question Answering | Code Available
Factuality or Fiction? Benchmarking Modern LLMs on Ambiguous QA with Citations | Dec 23, 2024 | Benchmarking, Question Answering | Unverified
Cross-Lingual Text-Rich Visual Comprehension: An Information Theory Perspective | Dec 23, 2024 | Question Answering, Visual Question Answering | Code Available
Survey of Large Multimodal Model Datasets, Application Categories and Taxonomy | Dec 23, 2024 | Image Captioning, Question Answering | Unverified
FriendsQA: A New Large-Scale Deep Video Understanding Dataset with Fine-grained Topic Categorization for Story Videos | Dec 22, 2024 | Language Modelling, Large Language Model | Code Available
MINTQA: A Multi-Hop Question Answering Benchmark for Evaluating LLMs on New and Tail Knowledge | Dec 22, 2024 | Multi-hop Question Answering, Question Answering | Code Available
Prompting Large Language Models with Rationale Heuristics for Knowledge-based Visual Question Answering | Dec 22, 2024 | Question Answering, Visual Question Answering | Unverified
Application of Multimodal Large Language Models in Autonomous Driving | Dec 21, 2024 | Autonomous Driving, Decision Making | Unverified
Speech Retrieval-Augmented Generation without Automatic Speech Recognition | Dec 21, 2024 | Automatic Speech Recognition (ASR) | Unverified
DragonVerseQA: Open-Domain Long-Form Context-Aware Question-Answering | Dec 21, 2024 | Articles, Form | Code Available
Automated CVE Analysis: Harnessing Machine Learning In Designing Question-Answering Models For Cybersecurity Information Extraction | Dec 21, 2024 | Question Answering | Unverified
SilVar: Speech Driven Multimodal Model for Reasoning Visual Question Answering and Object Localization | Dec 21, 2024 | Image Captioning, Multimodal Reasoning | Code Available
STAMPsy: Towards SpatioTemporal-Aware Mixed-Type Dialogues for Psychological Counseling | Dec 21, 2024 | Conversational Recommendation, Dialogue Generation | Unverified
MRAG: A Modular Retrieval Framework for Time-Sensitive Question Answering | Dec 20, 2024 | Question Answering, Retrieval | Unverified
NGQA: A Nutritional Graph Question Answering Benchmark for Personalized Health-aware Nutritional Reasoning | Dec 20, 2024 | Graph Question Answering, Nutrition | Unverified
Contrastive Learning for Task-Independent SpeechLLM-Pretraining | Dec 20, 2024 | Contrastive Learning, Question Answering | Code Available
HybGRAG: Hybrid Retrieval-Augmented Generation on Textual and Relational Knowledge Bases | Dec 20, 2024 | Question Answering, RAG | Unverified
Logical Consistency of Large Language Models in Fact-checking | Dec 20, 2024 | Fact Checking, Hallucination | Unverified
PolySmart @ TRECVid 2024 Medical Video Question Answering | Dec 20, 2024 | Question Answering, Retrieval | Unverified
FedPIA -- Permuting and Integrating Adapters leveraging Wasserstein Barycenters for Finetuning Foundation Models in Multi-Modal Federated Learning | Dec 19, 2024 | Federated Learning, parameter-efficient fine-tuning | Unverified
Query pipeline optimization for cancer patient question answering systems | Dec 19, 2024 | Hallucination, Passage Retrieval | Unverified
EarthDial: Turning Multi-sensory Earth Observations to Interactive Dialogues | Dec 19, 2024 | Change Detection, Disaster Response | Unverified
Why We Build Local Large Language Models: An Observational Analysis from 35 Japanese and Multilingual LLMs | Dec 19, 2024 | Arithmetic Reasoning, Code Generation | Unverified
Unveiling Uncertainty: A Deep Dive into Calibration and Performance of Multimodal Large Language Models | Dec 19, 2024 | Autonomous Driving, Image Captioning | Code Available
GraphEQA: Using 3D Semantic Scene Graphs for Real-time Embodied Question Answering | Dec 19, 2024 | Efficient Exploration, Embodied Question Answering | Unverified
FiVL: A Framework for Improved Vision-Language Alignment | Dec 19, 2024 | Answer Generation, Multimodal Reasoning | Code Available