Towards Omnidirectional Reasoning with 360-R1: A Dataset, Benchmark, and GRPO-based Method | May 20, 2025 | Hallucination, Object Localization | Unverified | 0
Toward Effective Reinforcement Learning Fine-Tuning for Medical VQA in Vision-Language Models | May 20, 2025 | Medical Visual Question Answering, Question Answering | Unverified | 0
YESciEval: Robust LLM-as-a-Judge for Scientific Question Answering | May 20, 2025 | Question Answering | Unverified | 0
Reinforcing Question Answering Agents with Minimalist Policy Gradient Optimization | May 20, 2025 | Hallucination, In-Context Learning | Unverified | 0
Abacus: A Cost-Based Optimizer for Semantic Operator Systems | May 20, 2025 | Question Answering | Unverified | 0
Beyond Chains: Bridging Large Language Models and Knowledge Bases in Complex Question Answering | May 20, 2025 | Knowledge Base Question Answering, Question Answering | Unverified | 0
HausaNLP: Current Status, Challenges and Future Directions for Hausa Natural Language Processing | May 20, 2025 | Language Modeling, Language Modelling | Unverified | 0
AutoRev: Automatic Peer Review System for Academic Research Papers | May 20, 2025 | Question Answering, Review Generation | Unverified | 0
VoQA: Visual-only Question Answering | May 20, 2025 | Question Answering | Code Available | 0
Automatic Dataset Generation for Knowledge Intensive Question Answering Tasks | May 20, 2025 | Dataset Generation, Question Answering | Unverified | 0
Debating for Better Reasoning: An Unsupervised Multimodal Approach | May 20, 2025 | Question Answering, Visual Question Answering | Unverified | 0
Interpretable Traces, Unexpected Outcomes: Investigating the Disconnect in Trace-Based Knowledge Distillation | May 20, 2025 | Information Retrieval, Knowledge Distillation | Unverified | 0
Domain Adaptation of VLM for Soccer Video Understanding | May 20, 2025 | Action Classification, Domain Adaptation | Unverified | 0
QA-prompting: Improving Summarization with Large Language Models using Question-Answering | May 20, 2025 | In-Context Learning, Question Answering | Code Available | 0
RAVENEA: A Benchmark for Multimodal Retrieval-Augmented Visual Culture Understanding | May 20, 2025 | Image Captioning, Question Answering | Code Available | 0
Memory-Centric Embodied Question Answer | May 20, 2025 | Embodied Question Answering, Large Language Model | Unverified | 0
Visual Instruction Bottleneck Tuning | May 20, 2025 | Hallucination, Object Hallucination | Unverified | 0
Studying the Role of Input-Neighbor Overlap in Retrieval-Augmented Language Models Training Efficiency | May 20, 2025 | Language Modeling, Language Modelling | Unverified | 0
Alignment-Augmented Speculative Decoding with Alignment Sampling and Conditional Verification | May 19, 2025 | Code Completion, Question Answering | Unverified | 0
ChartMuseum: Testing Visual Reasoning Capabilities of Large Vision-Language Models | May 19, 2025 | Chart Question Answering, Chart Understanding | Unverified | 0
The Hidden Structure -- Improving Legal Document Understanding Through Explicit Text Formatting | May 19, 2025 | document understanding, Optical Character Recognition (OCR) | Unverified | 0
SurveillanceVQA-589K: A Benchmark for Comprehensive Surveillance Video-Language Understanding with Large Models | May 19, 2025 | Causal Inference, Decision Making | Unverified | 0
KIT's Offline Speech Translation and Instruction Following Submission for IWSLT 2025 | May 19, 2025 | Automatic Speech Recognition, Instruction Following | Unverified | 0
Rethinking Predictive Modeling for LLM Routing: When Simple kNN Beats Complex Learned Routers | May 19, 2025 | Instruction Following, Question Answering | Unverified | 0
Tianyi: A Traditional Chinese Medicine all-rounder language model and its Real-World Clinical Practice | May 19, 2025 | All, Hallucination | Unverified | 0
Q^2Forge: Minting Competency Questions and SPARQL Queries for Question-Answering Over Knowledge Graphs | May 19, 2025 | Knowledge Graphs, Question Answering | Unverified | 0
Understanding Complexity in VideoQA via Visual Program Generation | May 19, 2025 | Code Generation, Question Answering | Unverified | 0
AMAQA: A Metadata-based QA Dataset for RAG Systems | May 19, 2025 | Question Answering, RAG | Unverified | 0
ORQA: A Benchmark and Foundation Model for Holistic Operating Room Modeling | May 19, 2025 | Graph Generation, Knowledge Distillation | Unverified | 0
A Case Study of Cross-Lingual Zero-Shot Generalization for Classical Languages in LLMs | May 19, 2025 | Machine Translation, named-entity-recognition | Code Available | 0
Table-R1: Region-based Reinforcement Learning for Table Understanding | May 18, 2025 | Question Answering, reinforcement-learning | Unverified | 0
Disambiguation in Conversational Question Answering in the Era of LLM: A Survey | May 18, 2025 | Benchmarking, Conversational Question Answering | Unverified | 0
GMSA: Enhancing Context Compression via Group Merging and Layer Semantic Alignment | May 18, 2025 | Computational Efficiency, Question Answering | Unverified | 0
RAGXplain: From Explainable Evaluation to Actionable Guidance of RAG Pipelines | May 18, 2025 | Decision Making, Question Answering | Unverified | 0
Enhancing Large Language Models with Reward-guided Tree Search for Knowledge Graph Question and Answering | May 18, 2025 | Graph Question Answering, Knowledge Graphs | Unverified | 0
CCNU at SemEval-2025 Task 3: Leveraging Internal and External Knowledge of Large Language Models for Multilingual Hallucination Annotation | May 17, 2025 | Hallucination, Question Answering | Unverified | 0
TinyRS-R1: Compact Multimodal Language Model for Remote Sensing | May 17, 2025 | Language Modeling, Language Modelling | Unverified | 0
Recursive Question Understanding for Complex Question Answering over Heterogeneous Personal Data | May 17, 2025 | Language Modeling, Language Modelling | Unverified | 0
Beyond Retrieval: Joint Supervision and Multimodal Document Ranking for Textbook Question Answering | May 17, 2025 | Document Ranking, Large Language Model | Unverified | 0
BELLE: A Bi-Level Multi-Agent Reasoning Framework for Multi-Hop Question Answering | May 17, 2025 | Multi-hop Question Answering, Question Answering | Unverified | 0
AutoMedEval: Harnessing Language Models for Automatic Medical Capability Evaluation | May 17, 2025 | Question Answering | Unverified | 0
Unveiling Knowledge Utilization Mechanisms in LLM-based Retrieval-Augmented Generation | May 17, 2025 | Open-Domain Question Answering, Question Answering | Unverified | 0
Masking in Multi-hop QA: An Analysis of How Language Models Perform with Context Permutation | May 16, 2025 | Decoder, Multi-hop Question Answering | Code Available | 0
THELMA: Task Based Holistic Evaluation of Large Language Model Applications-RAG Question Answering | May 16, 2025 | Language Modeling, Language Modelling | Unverified | 0
Time-R1: Towards Comprehensive Temporal Reasoning in LLMs | May 16, 2025 | Question Answering, Reinforcement Learning (RL) | Code Available | 0
TCC-Bench: Benchmarking the Traditional Chinese Culture Understanding Capabilities of MLLMs | May 16, 2025 | Benchmarking, Question Answering | Code Available | 0
HumaniBench: A Human-Centric Framework for Large Multimodal Models Evaluation | May 16, 2025 | Benchmarking, Ethics | Code Available | 0
Temporally-Grounded Language Generation: A Benchmark for Real-Time Vision-Language Models | May 16, 2025 | Image Captioning, Question Answering | Code Available | 0
Semantic Caching of Contextual Summaries for Efficient Question-Answering with Language Models | May 16, 2025 | Question Answering, Retrieval | Unverified | 0
Scaling Reasoning can Improve Factuality in Large Language Models | May 16, 2025 | Knowledge Graphs, Large Language Model | Code Available | 0