SemViQA: A Semantic Question Answering System for Vietnamese Information Fact-Checking | Mar 2, 2025 | Fact Checking, Fact Verification | Code Available | 2
FunBench: Benchmarking Fundus Reading Skills of MLLMs | Mar 2, 2025 | Anatomy, Benchmarking | Unverified | 0
Towards Efficient Educational Chatbots: Benchmarking RAG Frameworks | Mar 2, 2025 | Benchmarking, Chatbot | Unverified | 0
GlossGPT: GPT for Word Sense Disambiguation using Few-shot Chain-of-Thought Prompting | Mar 1, 2025 | Question Answering, Word Sense Disambiguation | Code Available | 0
Streaming Video Question-Answering with In-context Video KV-Cache Retrieval | Mar 1, 2025 | GPU, Question Answering | Code Available | 2
AILS-NTUA at SemEval-2025 Task 8: Language-to-Code prompting and Error Fixing for Tabular Question Answering | Mar 1, 2025 | Diversity, Natural Language Queries | Code Available | 0
CL-MoE: Enhancing Multimodal Large Language Model with Dual Momentum Mixture-of-Experts for Continual Visual Question Answering | Mar 1, 2025 | Continual Learning, Language Modeling | Unverified | 0
Fine-Grained Retrieval-Augmented Generation for Visual Question Answering | Feb 28, 2025 | Question Answering, RAG | Unverified | 0
WebFAQ: A Multilingual Collection of Natural Q&A Datasets for Dense Retrieval | Feb 28, 2025 | Dataset Generation, Open-Domain Question Answering | Unverified | 0
MedHallTune: An Instruction-Tuning Benchmark for Mitigating Medical Hallucination in Vision-Language Models | Feb 28, 2025 | Decision Making, Hallucination | Code Available | 0
TempRetriever: Fusion-based Temporal Dense Passage Retrieval for Time-Sensitive Questions | Feb 28, 2025 | Information Retrieval, Passage Retrieval | Unverified | 0
PreMind: Multi-Agent Video Understanding for Advanced Indexing of Presentation-style Videos | Feb 28, 2025 | Question Answering, Video Understanding | Unverified | 0
Protecting multimodal large language models against misleading visualizations | Feb 27, 2025 | Language Modeling, Language Modelling | Code Available | 0
Bisecting K-Means in RAG for Enhancing Question-Answering Tasks Performance in Telecommunications | Feb 27, 2025 | Clustering, Information Retrieval | Unverified | 0
Few-Shot Multilingual Open-Domain QA from 5 Examples | Feb 27, 2025 | Few-Shot Learning, Open-Domain Question Answering | Code Available | 0
Med-RLVR: Emerging Medical Reasoning from a 3B base model via reinforcement Learning | Feb 27, 2025 | Math, Medical Question Answering | Unverified | 0
ChineseEcomQA: A Scalable E-commerce Concept Evaluation Benchmark for Large Language Models | Feb 27, 2025 | Question Answering, RAG | Code Available | 1
Can Large Language Models Unveil the Mysteries? An Exploration of Their Ability to Unlock Information in Complex Scenarios | Feb 27, 2025 | Data Integration, Question Answering | Unverified | 0
From Retrieval to Generation: Comparing Different Approaches | Feb 27, 2025 | Language Modeling, Language Modelling | Unverified | 0
M-LLM Based Video Frame Selection for Efficient Video Understanding | Feb 27, 2025 | EgoSchema, Language Modeling | Unverified | 0
UQABench: Evaluating User Embedding for Prompting LLMs in Personalized Question Answering | Feb 26, 2025 | Question Answering | Code Available | 1
END: Early Noise Dropping for Efficient and Effective Context Denoising | Feb 26, 2025 | Denoising, In-Context Learning | Unverified | 0
Time-MQA: Time Series Multi-Task Question Answering with Context Enhancement | Feb 26, 2025 | Anomaly Detection, Natural Language Queries | Unverified | 0
Nexus: An Omni-Perceptive And -Interactive Model for Language, Audio, And Vision | Feb 26, 2025 | Audio Synthesis, Automatic Speech Recognition | Unverified | 0
Winning Big with Small Models: Knowledge Distillation vs. Self-Training for Reducing Hallucination in QA Agents | Feb 26, 2025 | Hallucination, Knowledge Distillation | Unverified | 0
MedVLM-R1: Incentivizing Medical Reasoning Capability of Vision-Language Models (VLMs) via Reinforcement Learning | Feb 26, 2025 | Domain Generalization, Medical Image Analysis | Unverified | 0
Talking to the brain: Using Large Language Models as Proxies to Model Brain Semantic Representation | Feb 26, 2025 | Question Answering, valid | Unverified | 0
MEBench: Benchmarking Large Language Models for Cross-Document Multi-Entity Question Answering | Feb 26, 2025 | Benchmarking, Question Answering | Unverified | 0
FSPO: Few-Shot Preference Optimization of Synthetic Preference Data in LLMs Elicits Effective Personalization to Real Users | Feb 26, 2025 | In-Context Learning, Meta-Learning | Code Available | 1
SECURA: Sigmoid-Enhanced CUR Decomposition with Uninterrupted Retention and Low-Rank Adaptation in Large Language Models | Feb 25, 2025 | Continual Learning, GSM8K | Unverified | 0
FilterRAG: Zero-Shot Informed Retrieval-Augmented Generation to Mitigate Hallucinations in VQA | Feb 25, 2025 | Question Answering, Retrieval | Unverified | 0
KiRAG: Knowledge-Driven Iterative Retriever for Enhancing Retrieval-Augmented Generation | Feb 25, 2025 | Multi-hop Question Answering, Question Answering | Unverified | 0
ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents | Feb 25, 2025 | Question Answering, RAG | Code Available | 4
Uncertainty Quantification in Retrieval Augmented Question Answering | Feb 25, 2025 | Question Answering, Retrieval | Code Available | 0
LevelRAG: Enhancing Retrieval-Augmented Generation with Multi-hop Logic Planning over Rewriting Augmented Searchers | Feb 25, 2025 | Multi-hop Question Answering, Question Answering | Code Available | 2
Say Less, Mean More: Leveraging Pragmatics in Retrieval-Augmented Generation | Feb 25, 2025 | ARC, Passage Retrieval | Unverified | 0
Detecting Knowledge Boundary of Vision Large Language Models by Sampling-Based Inference | Feb 25, 2025 | Question Answering, RAG | Code Available | 0
Tip of the Tongue Query Elicitation for Simulated Evaluation | Feb 25, 2025 | Community Question Answering, Question Answering | Code Available | 0
MM-PoisonRAG: Disrupting Multimodal RAG with Local and Global Poisoning Attacks | Feb 25, 2025 | Misinformation, Question Answering | Code Available | 1
Evaluating the Effect of Retrieval Augmentation on Social Biases | Feb 24, 2025 | Large Language Model, Question Answering | Unverified | 0
All-in-one: Understanding and Generation in Multimodal Reasoning with the MAIA Benchmark | Feb 24, 2025 | All, Multimodal Reasoning | Unverified | 0
AAD-LLM: Neural Attention-Driven Auditory Scene Understanding | Feb 24, 2025 | Question Answering, Response Generation | Unverified | 0
HIPPO: Enhancing the Table Understanding Capability of Large Language Models through Hybrid-Modal Preference Optimization | Feb 24, 2025 | Diversity, Fact Verification | Code Available | 1
MultiOCR-QA: Dataset for Evaluating Robustness of LLMs in Question Answering on Multilingual OCR Texts | Feb 24, 2025 | Optical Character Recognition, Optical Character Recognition (OCR) | Code Available | 0
Baichuan-Audio: A Unified Framework for End-to-End Speech Interaction | Feb 24, 2025 | Language Modeling, Language Modelling | Code Available | 3
MULTITAT: Benchmarking Multilingual Table-and-Text Question Answering | Feb 24, 2025 | Benchmarking, Question Answering | Code Available | 0
MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs | Feb 24, 2025 | Question Answering, Visual Question Answering | Code Available | 3
Benchmarking Retrieval-Augmented Generation in Multi-Modal Contexts | Feb 24, 2025 | Benchmarking, Fact Verification | Code Available | 2
Retrieval-Augmented Visual Question Answering via Built-in Autoregressive Search Engines | Feb 23, 2025 | Answer Generation, Language Modeling | Unverified | 0
Visual-RAG: Benchmarking Text-to-Image Retrieval Augmented Generation for Visual Knowledge Intensive Queries | Feb 23, 2025 | Benchmarking, Image Retrieval | Code Available | 0