BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions Aug 19, 2023 MME Optical Character Recognition (OCR)
MindMap: Knowledge Graph Prompting Sparks Graph of Thoughts in Large Language Models Aug 17, 2023 Decision Making Hallucination
TeCH: Text-guided Reconstruction of Lifelike Clothed Humans Aug 16, 2023 Descriptive Question Answering
Large Language Models for Information Retrieval: A Survey Aug 14, 2023 Information Retrieval Question Answering
3D-VisTA: Pre-trained Transformer for 3D Vision and Text Alignment Aug 8, 2023 3D Question Answering (3D-QA) Dense Captioning
EduChat: A Large-Scale Language Model-based Chatbot System for Intelligent Education Aug 5, 2023 Chatbot Language Modeling
Towards Generalist Foundation Model for Radiology by Leveraging Web-scale 2D&3D Medical Data Aug 4, 2023 Question Answering Visual Question Answering
ConceptLab: Creative Concept Generation using VLM-Guided Diffusion Prior Constraints Aug 3, 2023 Image Generation Language Modelling
The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World Aug 3, 2023 All Question Answering
MovieChat: From Dense Token to Sparse Memory for Long Video Understanding Jul 31, 2023 Multiple-choice Question Answering
RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control Jul 28, 2023 Object Question Answering
Med-Flamingo: a Multimodal Medical Few-shot Learner Jul 27, 2023 Medical Visual Question Answering Question Answering
Think-on-Graph: Deep and Responsible Reasoning of Large Language Model on Knowledge Graph Jul 15, 2023 Hallucination Knowledge Graphs
Lost in the Middle: How Language Models Use Long Contexts Jul 6, 2023 Language Modelling Position
JourneyDB: A Benchmark for Generative Image Understanding Jul 3, 2023 Image Captioning Image Comprehension
BatGPT: A Bidirectional Autoregessive Talker from Generative Pre-trained Transformer Jul 1, 2023 Language Modeling Language Modelling
ToolQA: A Dataset for LLM Question Answering with External Tools Jun 23, 2023 Hallucination Question Answering
LVLM-eHub: A Comprehensive Evaluation Benchmark for Large Vision-Language Models Jun 15, 2023 Hallucination Image Captioning
PIXIU: A Large Language Model, Instruction Data and Evaluation Benchmark for Finance Jun 8, 2023 Conversational Question Answering Language Modeling
Fine-Grained Human Feedback Gives Better Rewards for Language Model Training Jun 2, 2023 Language Modeling Language Modelling
Contextual Object Detection with Multimodal Large Language Models May 29, 2023 Cloze Test Decoder
BiomedGPT: A Generalist Vision-Language Foundation Model for Diverse Biomedical Tasks May 26, 2023 Image Captioning Medical Visual Question Answering
Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models May 24, 2023 Chatbot Natural Language Understanding
NuScenes-QA: A Multi-modal Visual Question Answering Benchmark for Autonomous Driving Scenario May 24, 2023 Autonomous Driving Question Answering
The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning May 23, 2023 Common Sense Reasoning Common Sense Reasoning (Zero-Shot)
LLMs for Knowledge Graph Construction and Reasoning: Recent Capabilities and Future Opportunities May 22, 2023 Event Extraction graph construction
Pengi: An Audio Language Model for Audio Tasks May 19, 2023 Audio captioning Audio Question Answering
ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings May 19, 2023 In-Context Learning Question Answering
StructGPT: A General Framework for Large Language Model to Reason over Structured Data May 16, 2023 Language Modeling Language Modelling
OCRBench: On the Hidden Mystery of OCR in Large Multimodal Models May 13, 2023 Key Information Extraction Nutrition
WebCPM: Interactive Web Search for Chinese Long-form Question Answering May 11, 2023 Form Information Retrieval
RetroMAE-2: Duplex Masked Auto-Encoder For Pre-Training Retrieval-Oriented Language Models May 4, 2023 Information Retrieval Open-Domain Question Answering
Huatuo-26M, a Large-scale Chinese Medical QA Dataset May 2, 2023 Language Modeling Language Modelling
LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions Apr 27, 2023 Common Sense Reasoning Coreference Resolution
PMC-LLaMA: Towards Building Open-source Language Models for Medicine Apr 27, 2023 Language Modeling Language Modelling
Unlocking Context Constraints of LLMs: Enhancing Context Efficiency of LLMs with Self-Information-Based Content Filtering Apr 24, 2023 Articles Question Answering
VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset Apr 17, 2023 Audio captioning Audio-Video Question Answering (AVQA)
LongForm: Effective Instruction Tuning with Reverse Instructions Apr 17, 2023 Long Form Question Answering News Generation
ChatGPT Asks, BLIP-2 Answers: Automatic Questioning Towards Enriched Visual Descriptions Mar 12, 2023 Image Captioning Question Answering
PaLM-E: An Embodied Multimodal Language Model Mar 6, 2023 Language Modeling Language Modelling
Prophet: Prompting Large Language Models with Complementary Answer Heuristics for Knowledge-based Visual Question Answering Mar 3, 2023 Language Modelling Large Language Model
Hyena Hierarchy: Towards Larger Convolutional Language Models Feb 21, 2023 2k 8k
ChatIE: Zero-Shot Information Extraction via Chatting with ChatGPT Feb 20, 2023 Event Extraction named-entity-recognition
MQAG: Multiple-choice Question Answering and Generation for Assessing Information Consistency in Summarization Jan 28, 2023 Hallucination Multiple-choice
PrimeQA: The Prime Repository for State-of-the-Art Multilingual Question Answering Research and Development Jan 23, 2023 Question Answering Reading Comprehension
Hungry Hungry Hippos: Towards Language Modeling with State Space Models Dec 28, 2022 8k Coreference Resolution
Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions Dec 20, 2022 Hallucination Question Answering
Discovering Latent Knowledge in Language Models Without Supervision Dec 7, 2022 Imitation Learning Language Modelling
Visual Programming: Compositional visual reasoning without training Nov 18, 2022 In-Context Learning Question Answering
RetroMAE v2: Duplex Masked Auto-Encoder For Pre-Training Retrieval-Oriented Language Models Nov 16, 2022 Dimensionality Reduction Information Retrieval