| EtC: Temporal Boundary Expand then Clarify for Weakly Supervised Video Grounding with Multimodal Large Language Model | Dec 5, 2023 | Boundary Detection, Language Modeling | Unverified | 0 |
| MedXChat: A Unified Multimodal Large Language Model Framework towards CXRs Understanding and Generation | Dec 4, 2023 | Instruction Following, Language Modeling | Unverified | 0 |
| TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding | Dec 4, 2023 | Dense Captioning, Highlight Detection | Code Available | 2 |
| mPLUG-PaperOwl: Scientific Diagram Analysis with the Multimodal Large Language Model | Nov 30, 2023 | Language Modeling, Language Modelling | Unverified | 0 |
| CoDi-2: In-Context, Interleaved, and Interactive Any-to-Any Generation | Nov 30, 2023 | Image Generation, In-Context Learning | Unverified | 0 |
| LLMGA: Multimodal Large Language Model based Generation Assistant | Nov 27, 2023 | Image Generation, Language Modeling | Code Available | 2 |
| GPT4Video: A Unified Multimodal Large Language Model for Instruction-Followed Understanding and Safety-Aware Generation | Nov 25, 2023 | Instruction Following, Language Modeling | Unverified | 0 |
| LION: Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge | Nov 20, 2023 | Language Modeling, Language Modelling | Code Available | 1 |
| How to Bridge the Gap between Modalities: Survey on Multimodal Large Language Model | Nov 10, 2023 | Image Captioning, Language Modeling | Unverified | 0 |
| Chain of Images for Intuitively Reasoning | Nov 9, 2023 | Common Sense Reasoning, Language Modelling | Code Available | 1 |