CSE-SFP: Enabling Unsupervised Sentence Representation Learning via a Single Forward Pass | May 1, 2025 | Contrastive Learning, Information Retrieval | Unverified | 0
HalluMix: A Task-Agnostic, Multi-Domain Benchmark for Real-World Hallucination Detection | May 1, 2025 | Extractive Question-Answering, Hallucination | Unverified | 0
UniBiomed: A Universal Foundation Model for Grounded Biomedical Image Interpretation | Apr 30, 2025 | Diagnostic, Large Language Model | Code Available | 1
Calibrating Uncertainty Quantification of Multi-Modal LLMs using Grounding | Apr 30, 2025 | Medical Question Answering, Question Answering | Unverified | 0
Zoomer: Adaptive Image Focus Optimization for Black-box MLLM | Apr 30, 2025 | Image Captioning, Object Recognition | Unverified | 0
Talk Before You Retrieve: Agent-Led Discussions for Better RAG in Medical QA | Apr 30, 2025 | Information Retrieval, Medical Question Answering | Code Available | 0
ConSens: Assessing context grounding in open-book question answering | Apr 30, 2025 | Question Answering | Unverified | 0
ChestX-Reasoner: Advancing Radiology Foundation Models with Reasoning through Step-by-Step Verification | Apr 29, 2025 | Diagnostic, Question Answering | Code Available | 1
LLM Enhancer: Merged Approach using Vector Embedding for Reducing Large Language Model Hallucinations with External Knowledge | Apr 29, 2025 | Language Modeling, Language Modelling | Unverified | 0
SetKE: Knowledge Editing for Knowledge Elements Overlap | Apr 29, 2025 | Incremental Learning, knowledge editing | Unverified | 0
LMME3DHF: Benchmarking and Evaluating Multimodal 3D Human Face Generation with LMMs | Apr 29, 2025 | Benchmarking, Face Generation | Unverified | 0
UniversalRAG: Retrieval-Augmented Generation over Corpora of Diverse Modalities and Granularities | Apr 29, 2025 | Question Answering, RAG | Code Available | 2
SpatialReasoner: Towards Explicit and Generalizable 3D Spatial Reasoning | Apr 28, 2025 | Question Answering, Spatial Reasoning | Unverified | 0
Toward Evaluative Thinking: Meta Policy Optimization with Evolving Reward Models | Apr 28, 2025 | Mathematical Reasoning, Meta-Learning | Code Available | 0
m-KAILIN: Knowledge-Driven Agentic Scientific Corpus Distillation Framework for Biomedical Large Language Models Training | Apr 28, 2025 | Question Answering | Unverified | 0
OpenTCM: A GraphRAG-Empowered LLM-based System for Traditional Chinese Medicine Knowledge Retrieval and Diagnosis | Apr 28, 2025 | Diagnostic, Information Retrieval | Unverified | 0
Knowledge Distillation of Domain-adapted LLMs for Question-Answering in Telecom | Apr 28, 2025 | Domain Adaptation, Knowledge Distillation | Unverified | 0
TreeHop: Generate and Filter Next Query Embeddings Efficiently for Multi-hop Question Answering | Apr 28, 2025 | Multi-hop Question Answering, Question Answering | Code Available | 1
Uncertainty Quantification for Language Models: A Suite of Black-Box, White-Box, LLM Judge, and Ensemble Scorers | Apr 27, 2025 | Hallucination, Question Answering | Code Available | 5
Test It Before You Trust It: Applying Software Testing for Trustworthy In-context Learning | Apr 26, 2025 | In-Context Learning, Philosophy | Code Available | 0
An Empirical Study of Evaluating Long-form Question Answering | Apr 25, 2025 | Form, Informativeness | Code Available | 0
Kimi-Audio Technical Report | Apr 25, 2025 | Audio Question Answering, Question Answering | Code Available | 7
VideoMultiAgents: A Multi-Agent Framework for Video Question Answering | Apr 25, 2025 | Caption Generation, EgoSchema | Code Available | 1
Pushing the boundary on Natural Language Inference | Apr 25, 2025 | Fact Checking, Information Retrieval | Unverified | 0
Data-Driven Calibration of Prediction Sets in Large Vision-Language Models Based on Inductive Conformal Prediction | Apr 24, 2025 | Conformal Prediction, Hallucination | Unverified | 0
Benchmarking Multimodal Mathematical Reasoning with Explicit Visual Dependency | Apr 24, 2025 | Benchmarking, Math | Code Available | 1
A Comprehensive Survey of Knowledge-Based Vision Question Answering Systems: The Lifecycle of Knowledge in Visual Reasoning Task | Apr 24, 2025 | Question Answering, Retrieval | Unverified | 0
FinBERT-QA: Financial Question Answering with pre-trained BERT Language Models | Apr 24, 2025 | Answer Selection, Information Retrieval | Code Available | 2
TraveLLaMA: Facilitating Multi-modal Large Language Models to Understand Urban Scenes and Provide Travel Assistance | Apr 23, 2025 | Question Answering, Scene Understanding | Unverified | 0
Survey of Video Diffusion Models: Foundations, Implementations, and Applications | Apr 22, 2025 | Computational Efficiency, Denoising | Code Available | 1
FinDER: Financial Dataset for Question Answering and Evaluating Retrieval-Augmented Generation | Apr 22, 2025 | Question Answering, RAG | Unverified | 0
Towards Understanding Camera Motions in Any Video | Apr 21, 2025 | Question Answering, Text Retrieval | Unverified | 0
Efficient Document Retrieval with G-Retriever | Apr 21, 2025 | graph construction, Question Answering | Code Available | 0
The Great Nugget Recall: Automating Fact Extraction and RAG Evaluation with Large Language Models | Apr 21, 2025 | Question Answering, RAG | Unverified | 0
FinSage: A Multi-aspect RAG System for Financial Filings Question Answering | Apr 20, 2025 | Question Answering, RAG | Unverified | 0
FairSteer: Inference Time Debiasing for LLMs with Dynamic Activation Steering | Apr 20, 2025 | counterfactual, Fairness | Unverified | 0
Neglected Risks: The Disturbing Reality of Children's Images in Datasets and the Urgent Call for Accountability | Apr 20, 2025 | Question Answering, Visual Question Answering | Unverified | 0
A Hierarchical Framework for Measuring Scientific Paper Innovation via Large Language Models | Apr 20, 2025 | Question Answering | Unverified | 0
CoLoTa: A Dataset for Entity-based Commonsense Reasoning over Long-Tail Knowledge | Apr 20, 2025 | Claim Verification, Graph Question Answering | Unverified | 0
Are Vision LLMs Road-Ready? A Comprehensive Benchmark for Safety-Critical Driving Video Understanding | Apr 20, 2025 | Autonomous Driving, Image Captioning | Code Available | 0
Bottom-Up Synthesis of Knowledge-Grounded Task-Oriented Dialogues with Iteratively Self-Refined Prompts | Apr 19, 2025 | Conversational Question Answering, Language Modeling | Unverified | 0
LegalRAG: A Hybrid RAG System for Multilingual Legal Information Retrieval | Apr 19, 2025 | Information Retrieval, Question Answering | Unverified | 0
Walk the Talk? Measuring the Faithfulness of Large Language Model Explanations | Apr 19, 2025 | Language Modeling, Language Modelling | Code Available | 1
SConU: Selective Conformal Uncertainty in Large Language Models | Apr 19, 2025 | Conformal Prediction, Question Answering | Unverified | 0
Learning to Attribute with Attention | Apr 18, 2025 | Attribute, Language Modeling | Code Available | 1
Long-context Non-factoid Question Answering in Indic Languages | Apr 18, 2025 | coreference-resolution, Coreference Resolution | Code Available | 0
ChartQA-X: Generating Explanations for Charts | Apr 17, 2025 | Decision Making, Explanation Generation | Unverified | 0
WebLists: Extracting Structured Information From Complex Interactive Websites Using Executable LLM Agents | Apr 17, 2025 | Navigate, Question Answering | Unverified | 0
Benchmarking LLM-based Relevance Judgment Methods | Apr 17, 2025 | Benchmarking, Open-Domain Question Answering | Code Available | 0
Hadamard product in deep learning: Introduction, Advances and Challenges | Apr 17, 2025 | Computational Efficiency, Deep Learning | Unverified | 0