Task Specific Pruning with LLM-Sieve: How Many Parameters Does Your Task Really Need? May 23, 2025 Medical Question Answering Quantization
Unverified 0 FinRAGBench-V: A Benchmark for Multimodal RAG with Visual Citation in the Financial Domain May 23, 2025 Question Answering RAG
Unverified 0 PerMedCQA: Benchmarking Large Language Models on Medical Consumer Question Answering in Persian Language May 23, 2025 Benchmarking Question Answering
Unverified 0 Deep Video Discovery: Agentic Search with Tool Use for Long-form Video Understanding May 23, 2025 Form Question Answering
Unverified 0 Analyzing Mitigation Strategies for Catastrophic Forgetting in End-to-End Training of Spoken Language Models May 23, 2025 Continual Learning Question Answering
Unverified 0 CXReasonBench: A Benchmark for Evaluating Structured Diagnostic Reasoning in Chest X-rays May 23, 2025 Diagnostic Question Answering
Code Available 0 Grounding Chest X-Ray Visual Question Answering with Generated Radiology Reports May 22, 2025 Answer Generation Question Answering
Unverified 0 Are the Hidden States Hiding Something? Testing the Limits of Factuality-Encoding Capabilities in LLMs May 22, 2025 Question Answering
Unverified 0 UNCLE: Uncertainty Expressions in Long-Form Generation May 22, 2025 4k Form
Unverified 0 Evaluating Large Language Model with Knowledge Oriented Language Specific Simple Question Answering May 22, 2025 Global Facts Language Modeling
Code Available 0 CT-Agent: A Multimodal-LLM Agent for 3D CT Radiology Question Answering May 22, 2025 Computed Tomography (CT) Question Answering
Unverified 0 Tools in the Loop: Quantifying Uncertainty of LLM Question Answering Systems That Use Tools May 22, 2025 Information Retrieval Question Answering
Unverified 0 Steering LVLMs via Sparse Autoencoder for Hallucination Mitigation May 22, 2025 Hallucination Image Captioning
Unverified 0 Zero-Shot Anomaly Detection in Battery Thermal Images Using Visual Question Answering with Prior Knowledge May 22, 2025 Anomaly Detection Question Answering
Unverified 0 Seeing Far and Clearly: Mitigating Hallucinations in MLLMs with Attention Causal Decoding May 22, 2025 Causal Inference Hallucination
Unverified 0 Augmenting LLM Reasoning with Dynamic Notes Writing for Complex QA May 22, 2025 Multi-hop Question Answering Question Answering
Unverified 0 VoxRAG: A Step Toward Transcription-Free RAG Systems in Spoken Question Answering May 22, 2025 Question Answering RAG
Unverified 0 Continually Self-Improving Language Models for Bariatric Surgery Question--Answering May 22, 2025 Large Language Model Misinformation
Unverified 0 CUB: Benchmarking Context Utilisation Techniques for Language Models May 22, 2025 Benchmarking Fact Checking
Unverified 0 EarthSE: A Benchmark Evaluating Earth Scientific Exploration Capability for Large Language Models May 22, 2025 Question Answering Specificity
Unverified 0 A Causal Approach to Mitigate Modality Preference Bias in Medical Visual Question Answering May 22, 2025 counterfactual Medical Visual Question Answering
Unverified 0 Collaboration among Multiple Large Language Models for Medical Question Answering May 22, 2025 Medical Question Answering Multiple-choice
Unverified 0 Set-LLM: A Permutation-Invariant LLM May 21, 2025 Multiple-choice Question Answering
Unverified 0 Visual Question Answering on Multiple Remote Sensing Image Modalities May 21, 2025 Question Answering Visual Question Answering
Unverified 0 Towards Spoken Mathematical Reasoning: Benchmarking Speech-based Models over Multi-faceted Math Problems May 21, 2025 Benchmarking Math
Unverified 0 Human-centered Interactive Learning via MLLMs for Text-to-Image Person Re-identification May 21, 2025 Data Augmentation Large Language Model
Unverified 0 ChartCards: A Chart-Metadata Generation Framework for Multi-Task Chart Understanding May 21, 2025 Chart Question Answering Chart Understanding
Code Available 0 Discovering Pathology Rationale and Token Allocation for Efficient Multimodal Pathology Reasoning May 21, 2025 Computational Efficiency Diagnostic
Unverified 0 TimeCausality: Evaluating the Causal Ability in Time Dimension for Vision Language Models May 21, 2025 Human Aging Question Answering
Code Available 0 BR-TaxQA-R: A Dataset for Question Answering with References for Brazilian Personal Income Tax Law, including case law May 21, 2025 Answer Generation Question Answering
Unverified 0 TinyDrive: Multiscale Visual Question Answering with Selective Token Routing for Autonomous Driving May 21, 2025 Autonomous Driving Question Answering
Unverified 0 UrduFactCheck: An Agentic Fact-Checking Framework for Urdu with Evidence Boosting and Benchmarking May 21, 2025 Benchmarking Claim Verification
Code Available 0 SNAP: A Benchmark for Testing the Effects of Capture Conditions on Fundamental Vision Tasks May 21, 2025 image-classification Image Classification
Code Available 0 Leveraging Online Data to Enhance Medical Knowledge in a Small Persian Language Model May 21, 2025 Language Modeling Language Modelling
Code Available 0 KaFT: Knowledge-aware Fine-tuning for Boosting LLMs' Domain-specific Question-Answering Performance May 21, 2025 Hallucination Question Answering
Unverified 0 ViQAgent: Zero-Shot Video Question Answering via Agent with Open-Vocabulary Grounding Validation May 21, 2025 Decision Making Language Modeling
Code Available 0 RAVEN: Query-Guided Representation Alignment for Question Answering over Audio, Video, Embedded Sensors, and Natural Language May 21, 2025 Question Answering
Code Available 0 Robo2VLM: Visual Question Answering from Large-Scale In-the-Wild Robot Manipulation Datasets May 21, 2025 Dataset Generation Descriptive
Unverified 0 StepSearch: Igniting LLMs Search Ability via Step-Wise Proximal Policy Optimization May 21, 2025 Question Answering Reinforcement Learning (RL)
Code Available 0 Improving LLM First-Token Predictions in Multiple-Choice Question Answering via Prefilling Attack May 21, 2025 Multiple-choice Multiple Choice Question Answering (MCQA)
Unverified 0 Traveling Across Languages: Benchmarking Cross-Lingual Consistency in Multimodal LLMs May 21, 2025 Benchmarking Question Answering
Code Available 0 Exploring The Visual Feature Space for Multimodal Neural Decoding May 21, 2025 Brain Decoding Question Answering
Code Available 0 CRAFT: Training-Free Cascaded Retrieval for Tabular QA May 21, 2025 Natural Language Queries Natural Questions
Unverified 0 Single LLM, Multiple Roles: A Unified Retrieval-Augmented Generation Framework Using Role-Specific Token Optimization May 21, 2025 Open-Domain Question Answering Question Answering
Unverified 0 Keep Security! Benchmarking Security Policy Preservation in Large Language Model Contexts Against Indirect Attacks in Question Answering May 21, 2025 Benchmarking Language Modeling
Code Available 0 LiveVLM: Efficient Online Video Understanding via Streaming-Oriented KV Cache and Retrieval May 21, 2025 Autonomous Driving Question Answering
Unverified 0 Social Bias in Popular Question-Answering Benchmarks May 21, 2025 Question Answering Reading Comprehension
Unverified 0 VoQA: Visual-only Question Answering May 20, 2025 Question Answering
Code Available 0 RAVENEA: A Benchmark for Multimodal Retrieval-Augmented Visual Culture Understanding May 20, 2025 Image Captioning Question Answering
Code Available 0 Debating for Better Reasoning: An Unsupervised Multimodal Approach May 20, 2025 Question Answering Visual Question Answering
Unverified 0