SynthVLM: High-Efficiency and High-Quality Synthetic Data for Vision Language Models Jul 30, 2024 Caption Generation Question Answering
Advancing Multimodal Large Language Models in Chart Question Answering with Visualization-Referenced Instruction Tuning Jul 29, 2024 Chart Question Answering Question Answering
Towards A Generalizable Pathology Foundation Model via Unified Knowledge Distillation Jul 26, 2024 Knowledge Distillation Question Answering
Retrieval with Learned Similarities Jul 22, 2024 Question Answering Recommendation Systems
MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Diversity Jul 22, 2024 Diversity Multiple-choice
LongVideoBench: A Benchmark for Long-context Interleaved Video-Language Understanding Jul 22, 2024 Multiple-choice Question Answering
RAG-QA Arena: Evaluating Domain Robustness for Long-form Retrieval Augmented Question Answering Jul 19, 2024 Domain Generalization Form
SPINACH: SPARQL-Based Information Navigation for Challenging Real-World Questions Jul 16, 2024 In-Context Learning Knowledge Base Question Answering
Scientific QA System with Verifiable Answers Jul 16, 2024 Articles Information Retrieval
SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers Jul 12, 2024 Articles Question Answering
PersonaRAG: Enhancing Retrieval-Augmented Generation Systems with User-Centric Agents Jul 12, 2024 Information Retrieval Question Answering
GOFA: A Generative One-For-All Model for Joint Graph Language Modeling Jul 12, 2024 All Language Modeling
WSI-VQA: Interpreting Whole Slide Images by Generative Visual Question Answering Jul 8, 2024 Diagnostic Generative Visual Question Answering
How do you know that? Teaching Generative Language Models to Reference Answers to Biomedical Questions Jul 6, 2024 Question Answering RAG
AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents Jul 5, 2024 Decision Making Multi-hop Question Answering
ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models Jul 5, 2024 Hallucination Long Form Question Answering
ChartGemma: Visual Instruction-tuning for Chart Reasoning in the Wild Jul 4, 2024 Chart Understanding Decision Making
MiniGPT-Med: Large Language Model as a General Interface for Radiology Diagnosis Jul 4, 2024 Diagnostic Language Modeling
A Bounding Box is Worth One Token: Interleaving Layout and Text in a Large Language Model for Document Understanding Jul 2, 2024 Document Understanding Key Information Extraction
Efficient Large Multi-modal Models via Visual Context Compression Jun 28, 2024 Question Answering Visual Question Answering
Understand What LLM Needs: Dual Preference Alignment for Retrieval-Augmented Generation Jun 26, 2024 Hallucination Knowledge Base Question Answering
Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA Jun 25, 2024 Benchmarking Long-Context Understanding
TorchSpatial: A Location Encoding Framework and Benchmark for Spatial Representation Learning Jun 21, 2024 Fairness Geographic Question Answering
TAGLAS: An atlas of text-attributed graph datasets in the era of large graph and language models Jun 20, 2024 Graph Question Answering Node Classification
Evaluating RAG-Fusion with RAGElo: an Automated Elo-based Framework Jun 20, 2024 Hallucination Question Answering
VRSBench: A Versatile Vision-Language Benchmark Dataset for Remote Sensing Image Understanding Jun 18, 2024 Image Captioning Question Answering
Breaking the Ceiling of the LLM Community by Treating Token Generation as a Classification for Ensembling Jun 18, 2024 Arithmetic Reasoning Language Modeling
ISR-DPO: Aligning Large Multimodal Models for Videos by Iterative Self-Retrospective DPO Jun 17, 2024 Language Modelling Question Answering
GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities Jun 17, 2024 Audio Question Answering Instruction Following
Task Me Anything Jun 17, 2024 2k Attribute
Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning Jun 17, 2024 Data Augmentation Mathematical Reasoning
MedCalc-Bench: Evaluating Large Language Models for Medical Calculations Jun 17, 2024 Descriptive Medical Diagnosis
CHiSafetyBench: A Chinese Hierarchical Safety Benchmark for Large Language Models Jun 14, 2024 Multiple-choice Question Answering
Needle In A Video Haystack: A Scalable Synthetic Evaluator for Video MLLMs Jun 13, 2024 Benchmarking Question Answering
Yo'LLaVA: Your Personalized Language and Vision Assistant Jun 13, 2024 Image Captioning Question Answering
Towards Vision-Language Geo-Foundation Model: A Survey Jun 13, 2024 Earth Observation Image Captioning
Explore the Limits of Omni-modal Pretraining at Scale Jun 13, 2024 Language Modeling Language Modelling
Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs Jun 13, 2024 Arithmetic Reasoning Fact Verification
RS-Agent: Automating Remote Sensing Tasks through Intelligent Agent Jun 11, 2024 AI Agent Descriptive
Recurrent Context Compression: Efficiently Expanding the Context Window of LLM Jun 10, 2024 Long-Context Understanding Question Answering
F-LMM: Grounding Frozen Large Multimodal Models Jun 9, 2024 General Knowledge Instruction Following
CorDA: Context-Oriented Decomposition Adaptation of Large Language Models for Task-Aware Parameter-Efficient Fine-tuning Jun 7, 2024 Instruction Following Math
From Redundancy to Relevance: Information Flow in LVLMs Across Reasoning Tasks Jun 4, 2024 Image Captioning Language Modelling
TCMBench: A Comprehensive Benchmark for Evaluating Large Language Models in Traditional Chinese Medicine Jun 3, 2024 Benchmarking Question Answering
TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy Jun 3, 2024 Language Modelling Question Answering
ANAH: Analytical Annotation of Hallucinations in Large Language Models May 30, 2024 Generative Question Answering Hallucination
Empowering Large Language Models to Set up a Knowledge Retrieval Indexer via Self-Learning May 27, 2024 Question Answering RAG
Reason3D: Searching and Reasoning 3D Segmentation via Large Language Model May 27, 2024 Decoder Language Modeling
Crafting Interpretable Embeddings by Asking LLMs Questions May 26, 2024 Question Answering
Enhancing Visual-Language Modality Alignment in Large Vision Language Models via Self-Improvement May 24, 2024 Hallucination Image Comprehension