| HGRN2: Gated Linear RNNs with State Expansion | Apr 11, 2024 | Image ClassificationLanguage Modeling | CodeCode Available | 2 |
| LaVy: Vietnamese Multimodal Large Language Model | Apr 11, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 2 |
| UMBRAE: Unified Multimodal Brain Decoding | Apr 10, 2024 | Brain DecodingLanguage Modeling | CodeCode Available | 2 |
| Optimization Methods for Personalizing Large Language Models through Retrieval Augmentation | Apr 9, 2024 | Knowledge DistillationLanguage Modeling | CodeCode Available | 2 |
| MotionChain: Conversational Motion Controllers via Multimodal Prompts | Apr 2, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 2 |
| ARAGOG: Advanced RAG Output Grading | Apr 1, 2024 | Document EmbeddingLanguage Modeling | CodeCode Available | 2 |
| Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward | Apr 1, 2024 | Instruction FollowingLanguage Modeling | CodeCode Available | 2 |
| VHM: Versatile and Honest Vision Language Model for Remote Sensing Image Analysis | Mar 29, 2024 | HallucinationImage Captioning | CodeCode Available | 2 |
| Multi-Frame, Lightweight & Efficient Vision-Language Models for Question Answering in Autonomous Driving | Mar 28, 2024 | Autonomous DrivingLanguage Modeling | CodeCode Available | 2 |
| Can Language Beat Numerical Regression? Language-Based Multimodal Trajectory Prediction | Mar 27, 2024 | Image CaptioningLanguage Modeling | CodeCode Available | 2 |