RAG-R1: Incentivize the Search and Reasoning Capabilities of LLMs through Multi-query Parallelism | Jun 30, 2025 | Question Answering, RAG
Pixel-SAIL: Single Transformer For Pixel-Grounded Understanding | Apr 14, 2025 | Question Answering
Search-o1: Agentic Search-Enhanced Large Reasoning Models | Jan 9, 2025 | Code Generation
MEIA: Multimodal Embodied Perception and Interaction in Unknown Environments | Feb 1, 2024 | Embodied Question Answering, Language Modeling
Wings: Learning Multimodal LLMs without Text-only Forgetting | Jun 5, 2024 | Question Answering, Visual Question Answering
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks | Jun 12, 2024 | Image Generation, Language Modeling
LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA | Sep 4, 2024 | Question Answering, Sentence
LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day | Jun 1, 2023 | Image Classification, Instruction Following
Learning to Generate Instruction Tuning Datasets for Zero-Shot Task Adaptation | Feb 28, 2024 | Attribute, Extractive Question-Answering
Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces | Dec 18, 2024 | Question Answering, Spatial Reasoning
VideoChat: Chat-Centric Video Understanding | May 10, 2023 | Question Answering, Video-based Generative Performance Benchmarking
Knowledge-tuning Large Language Models with Structured Medical Knowledge Bases for Reliable Response Generation in Chinese | Sep 8, 2023 | Domain Adaptation, Hallucination
Knowledge Fusion of Large Language Models | Jan 19, 2024 | Code Generation, Common Sense Reasoning
FinBen: A Holistic Financial Benchmark for Large Language Models | Feb 20, 2024 | Question Answering, RAG
Tarsier2: Advancing Large Vision-Language Models from Detailed Video Description to Comprehensive Video Understanding | Jan 14, 2025 | Embodied Question Answering, Hallucination
Chain-of-Discussion: A Multi-Model Framework for Complex Evidence-Based Question Answering | Feb 26, 2024 | Evidence Selection, Open-Ended Question Answering
Holistic Evaluation of Language Models | Nov 16, 2022 | Fairness, Question Answering
SQuARE: Sequential Question Answering Reasoning Engine for Enhanced Chain-of-Thought in Large Language Models | Feb 13, 2025 | Question Answering, RAG
The Llama 3 Herd of Models | Jul 31, 2024 | answerability prediction, Language Modeling
GPT-4V(ision) is a Generalist Web Agent, if Grounded | Jan 3, 2024 | Image Captioning, Question Answering
GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation | May 26, 2025 | Question Answering, Synthetic Data Generation
Scaling Up Biomedical Vision-Language Models: Fine-Tuning, Instruction Tuning, and Multi-Modal Learning | May 23, 2025 | Decoder, Image Captioning
Sailor: Open Language Models for South-East Asia | Apr 4, 2024 | Language Modeling, Language Modelling
AlignScore: Evaluating Factual Consistency with a Unified Alignment Function | May 26, 2023 | Fact Verification, Information Retrieval
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection | Oct 17, 2023 | Fact Verification, Question Answering
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks | May 22, 2020 | Fact Verification, Question Answering
ReAct: Synergizing Reasoning and Acting in Language Models | Oct 6, 2022 | Decision Making, Fact Verification
Retrieval-Augmented Generation with Hierarchical Knowledge | Mar 13, 2025 | Multi-hop Question Answering, Question Answering
Benchmarking Retrieval-Augmented Generation for Medicine | Feb 20, 2024 | Benchmarking, Information Retrieval
Rankify: A Comprehensive Python Toolkit for Retrieval, Re-Ranking, and Retrieval-Augmented Generation | Feb 4, 2025 | Benchmarking, Information Retrieval
Flamingo: a Visual Language Model for Few-Shot Learning | Apr 29, 2022 | Few-Shot Learning, Generative Visual Question Answering
Retrieval-Generation Synergy Augmented Large Language Models | Oct 8, 2023 | Question Answering, Retrieval
BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining | Oct 19, 2022 | Document Classification, Language Modelling
Galactica: A Large Language Model for Science | Nov 16, 2022 | Anachronisms, Bias Detection
Predicting Subjective Features of Questions of QA Websites using BERT | Feb 24, 2020 | Community Question Answering, Question Answering
Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models | Feb 12, 2024 | Hallucination, Object Localization
OmniDrive: A Holistic Vision-Language Dataset for Autonomous Driving with Counterfactual Reasoning | May 2, 2024 | Autonomous Driving, counterfactual
OmniMedVQA: A New Large-Scale Comprehensive Evaluation Benchmark for Medical LVLM | Feb 14, 2024 | Medical Visual Question Answering, Question Answering
G-Retriever: Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering | Feb 12, 2024 | Common Sense Reasoning, Graph Classification
BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text | Mar 27, 2024 | Articles, Language Modeling
N-Grammer: Augmenting Transformers with latent n-grams | Jul 13, 2022 | Common Sense Reasoning, Coreference Resolution
OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning | Dec 31, 2024 | Benchmarking, Logical Reasoning
OpenDriveVLA: Towards End-to-end Autonomous Driving with Large Vision Language Action Model | Mar 30, 2025 | Autonomous Driving, Decision Making
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking | Mar 14, 2024 | GSM8K, Language Modelling
Mixtral of Experts | Jan 8, 2024 | Code Generation, Common Sense Reasoning
A Survey on Vision-Language-Action Models for Embodied AI | May 23, 2024 | Image Captioning, Instruction Following
Improving Retrieval-Augmented Generation in Medicine with Iterative Follow-up Questions | Aug 1, 2024 | Medical Question Answering, MedQA
MMScan: A Multi-Modal 3D Scene Dataset with Hierarchical Grounded Language Annotations | Jun 13, 2024 | 3D visual grounding, Attribute
MEDITRON-70B: Scaling Medical Pretraining for Large Language Models | Nov 27, 2023 | Articles, Conditional Text Generation
EasyRAG: Efficient Retrieval-Augmented Generation Framework for Automated Network Operations | Oct 14, 2024 | Answer Generation, Question Answering