| Title | Date | Tasks | Status |
| --- | --- | --- | --- |
| Relative Overfitting and Accept-Reject Framework | May 12, 2025 | Language Modeling, Language Modelling | Unverified 0 |
| PLHF: Prompt Optimization with Few-Shot Human Feedback | May 11, 2025 | Question Answering | Unverified 0 |
| BioProBench: Comprehensive Dataset and Benchmark in Biological Protocol Understanding and Reasoning | May 11, 2025 | Question Answering | Code Available 1 |
| Overview of the NLPCC 2025 Shared Task 4: Multi-modal, Multilingual, and Multi-hop Medical Instructional Video Question Answering Challenge | May 11, 2025 | Multimodal Reasoning, Question Answering | Unverified 0 |
| Building a Human-Verified Clinical Reasoning Dataset via a Human LLM Hybrid Pipeline for Trustworthy Medical AI | May 11, 2025 | Medical Question Answering, Question Answering | Unverified 0 |
| Multi-Modal Explainable Medical AI Assistant for Trustworthy Human-AI Collaboration | May 11, 2025 | Benchmarking, Descriptive | Unverified 0 |
| OMGM: Orchestrate Multiple Granularities and Modalities for Efficient Multimodal Retrieval | May 10, 2025 | Cross-Modal Retrieval, Question Answering | Unverified 0 |
| SmartPilot: A Multiagent CoPilot for Adaptive and Intelligent Manufacturing | May 10, 2025 | Decision Making, Production Forecasting | Code Available 1 |
| CellVerse: Do Large Language Models Really Understand Cell Biology? | May 9, 2025 | Drug Response Prediction, Question Answering | Unverified 0 |
| Towards Developmentally Plausible Rewards: Communicative Success as a Learning Signal for Interactive Language Models | May 9, 2025 | Language Acquisition, Question Answering | Unverified 0 |
| A Grounded Memory System For Smart Personal Assistants | May 9, 2025 | Entity Disambiguation, Image Captioning | Unverified 0 |
| Natural Reflection Backdoor Attack on Vision Language Model for Autonomous Driving | May 9, 2025 | Autonomous Driving, Backdoor Attack | Unverified 0 |
| NeoQA: Evidence-based Question Answering with Generated News Events | May 9, 2025 | Articles, Question Answering | Code Available 0 |
| Document Attribution: Examining Citation Relationships using Large Language Models | May 9, 2025 | Document Summarization, Natural Language Inference | Unverified 0 |
| MM-Skin: Enhancing Dermatology Vision-Language Model with an Image-Text Dataset Derived from Textbooks | May 9, 2025 | Diagnostic, Instruction Following | Code Available 1 |
| Assessing Robustness to Spurious Correlations in Post-Training Language Models | May 9, 2025 | Instruction Following, Mathematical Reasoning | Unverified 0 |
| Healthy LLMs? Benchmarking LLM Knowledge of UK Government Public Health Information | May 9, 2025 | Benchmarking, Form | Unverified 0 |
| Continuous Thought Machines | May 8, 2025 | Computational Efficiency, Question Answering | Code Available 5 |
| Lost in OCR Translation? Vision-Based Approaches to Robust Document Retrieval | May 8, 2025 | Computational Efficiency, Optical Character Recognition | Unverified 0 |
| Probabilistic Embeddings for Frozen Vision-Language Models: Uncertainty Quantification with Gaussian Process Latent Variable Models | May 8, 2025 | Active Learning, cross-modal alignment | Code Available 0 |
| LiTransProQA: an LLM-based Literary Translation evaluation metric with Professional Question Answering | May 8, 2025 | Machine Translation, Question Answering | Code Available 0 |
| An Open-Source Dual-Loss Embedding Model for Semantic Retrieval in Higher Education | May 8, 2025 | Large Language Model, Question Answering | Unverified 0 |
| SITE: towards Spatial Intelligence Thorough Evaluation | May 8, 2025 | Question Answering, Spatial Reasoning | Unverified 0 |
| Q-Heart: ECG Question Answering via Knowledge-Informed Multimodal LLMs | May 7, 2025 | Electrocardiography (ECG), Language Modeling | Unverified 0 |
| HiPerRAG: High-Performance Retrieval Augmented Generation for Scientific Insights | May 7, 2025 | Articles, Contrastive Learning | Unverified 0 |
| Fine-Tuning Large Language Models and Evaluating Retrieval Methods for Improved Question Answering on Building Codes | May 7, 2025 | Language Modeling, Language Modelling | Unverified 0 |
| EchoInk-R1: Exploring Audio-Visual Reasoning in Multimodal LLMs via Reinforcement Learning | May 7, 2025 | Multiple-choice, Question Answering | Code Available 2 |
| A Reasoning-Focused Legal Retrieval Benchmark | May 6, 2025 | Question Answering, RAG | Unverified 0 |
| Characterising Topic Familiarity and Query Specificity Using Eye-Tracking Data | May 6, 2025 | Pupil Dilation, Question Answering | Code Available 0 |
| VLM Q-Learning: Aligning Vision-Language Models for Interactive Decision-Making | May 6, 2025 | Decision Making, General Knowledge | Unverified 0 |
| IndicSQuAD: A Comprehensive Multilingual Question Answering Dataset for Indic Languages | May 6, 2025 | Question Answering | Code Available 1 |
| DyGEnc: Encoding a Sequence of Textual Scene Graphs to Reason and Answer Questions in Dynamic Scenes | May 6, 2025 | Question Answering | Code Available 0 |
| VITA-Audio: Fast Interleaved Cross-Modal Token Generation for Efficient Large Speech-Language Model | May 6, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available 4 |
| MedArabiQ: Benchmarking Large Language Models on Arabic Medical Tasks | May 6, 2025 | Benchmarking, Multiple-choice | Code Available 0 |
| Invoke Interfaces Only When Needed: Adaptive Invocation for Large Language Models in Question Answering | May 5, 2025 | Hallucination, Question Answering | Code Available 1 |
| Sim2Real Transfer for Vision-Based Grasp Verification | May 5, 2025 | Object, object-detection | Code Available 0 |
| Structure Causal Models and LLMs Integration in Medical Visual Question Answering | May 5, 2025 | Causal Inference, Medical Visual Question Answering | Unverified 0 |
| Task-Oriented Semantic Communication in Large Multimodal Models-based Vehicle Networks | May 5, 2025 | Question Answering, Semantic Communication | Unverified 0 |
| LLaMA-Omni2: LLM-based Real-time Spoken Chatbot with Autoregressive Streaming Speech Synthesis | May 5, 2025 | Chatbot, Decoder | Code Available 3 |
| Compositional Image-Text Matching and Retrieval by Grounding Entities | May 4, 2025 | Image Captioning, Image-text matching | Code Available 0 |
| RTV-Bench: Benchmarking MLLM Continuous Perception, Understanding and Reasoning through Real-Time Video | May 4, 2025 | Benchmarking, Question Answering | Code Available 1 |
| Adaptive Token Boundaries: Integrating Human Chunking Mechanisms into Multimodal LLMs | May 3, 2025 | Chunking, Question Answering | Unverified 0 |
| Knowledge-Augmented Language Models Interpreting Structured Chest X-Ray Findings | May 3, 2025 | Question Answering, Visual Question Answering | Unverified 0 |
| OODTE: A Differential Testing Engine for the ONNX Optimizer | May 3, 2025 | object-detection, Object Detection | Unverified 0 |
| Beyond Attention: Toward Machines with Intrinsic Higher Mental States | May 2, 2025 | Question Answering | Unverified 0 |
| TRAVELER: A Benchmark for Evaluating Temporal Reasoning across Vague, Implicit and Explicit References | May 2, 2025 | Natural Language Understanding, Question Answering | Unverified 0 |
| Transferable Adversarial Attacks on Black-Box Vision-Language Models | May 2, 2025 | Image Captioning, Object Recognition | Unverified 0 |
| Grounding Task Assistance with Multimodal Cues from a Single Demonstration | May 2, 2025 | Question Answering, Visual Question Answering | Unverified 0 |
| AdCare-VLM: Leveraging Large Vision Language Model (LVLM) to Monitor Long-Term Medication Adherence and Care | May 1, 2025 | Language Modeling, Language Modelling | Code Available 0 |
| CSE-SFP: Enabling Unsupervised Sentence Representation Learning via a Single Forward Pass | May 1, 2025 | Contrastive Learning, Information Retrieval | Unverified 0 |