| LaVy: Vietnamese Multimodal Large Language Model | Apr 11, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 2 |
| HGRN2: Gated Linear RNNs with State Expansion | Apr 11, 2024 | Image ClassificationLanguage Modeling | CodeCode Available | 2 |
| UMBRAE: Unified Multimodal Brain Decoding | Apr 10, 2024 | Brain DecodingLanguage Modeling | CodeCode Available | 2 |
| Optimization Methods for Personalizing Large Language Models through Retrieval Augmentation | Apr 9, 2024 | Knowledge DistillationLanguage Modeling | CodeCode Available | 2 |
| MotionChain: Conversational Motion Controllers via Multimodal Prompts | Apr 2, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 2 |
| Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward | Apr 1, 2024 | Instruction FollowingLanguage Modeling | CodeCode Available | 2 |
| ARAGOG: Advanced RAG Output Grading | Apr 1, 2024 | Document EmbeddingLanguage Modeling | CodeCode Available | 2 |
| VHM: Versatile and Honest Vision Language Model for Remote Sensing Image Analysis | Mar 29, 2024 | HallucinationImage Captioning | CodeCode Available | 2 |
| Multi-Frame, Lightweight & Efficient Vision-Language Models for Question Answering in Autonomous Driving | Mar 28, 2024 | Autonomous DrivingLanguage Modeling | CodeCode Available | 2 |
| Can Language Beat Numerical Regression? Language-Based Multimodal Trajectory Prediction | Mar 27, 2024 | Image CaptioningLanguage Modeling | CodeCode Available | 2 |
| An Image Grid Can Be Worth a Video: Zero-shot Video Question Answering Using a VLM | Mar 27, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 2 |
| Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance | Mar 25, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 2 |
| LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models | Mar 20, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 2 |
| Advancing Time Series Classification with Multimodal Language Modeling | Mar 19, 2024 | ClassificationLanguage Modeling | CodeCode Available | 2 |
| LLM3:Large Language Model-based Task and Motion Planning with Motion Failure Reasoning | Mar 18, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 2 |
| SelfIE: Self-Interpretation of Large Language Model Embeddings | Mar 16, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 2 |
| Generative Region-Language Pretraining for Open-Ended Object Detection | Mar 15, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 2 |
| VideoAgent: Long-form Video Understanding with Large Language Model as Agent | Mar 15, 2024 | EgoSchemaForm | CodeCode Available | 2 |
| What Was Your Prompt? A Remote Keylogging Attack on AI Assistants | Mar 14, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 2 |
| SOTOPIA-π: Interactive Learning of Socially Intelligent Language Agents | Mar 13, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 2 |
| Generative Pretrained Structured Transformers: Unsupervised Syntactic Language Models at Scale | Mar 13, 2024 | Constituency Grammar InductionLanguage Modeling | CodeCode Available | 2 |
| CoIN: A Benchmark of Continual Instruction tuNing for Multimodel Large Language Model | Mar 13, 2024 | General KnowledgeInstruction Following | CodeCode Available | 2 |
| LLM-Assisted Light: Leveraging Large Language Model Capabilities for Human-Mimetic Traffic Signal Control in Complex Urban Environments | Mar 13, 2024 | Decision MakingLanguage Modeling | CodeCode Available | 2 |
| Beyond Text: Frozen Large Language Models in Visual Signal Comprehension | Mar 12, 2024 | DeblurringDecoder | CodeCode Available | 2 |
| VLKEB: A Large Vision-Language Model Knowledge Editing Benchmark | Mar 12, 2024 | knowledge editingLanguage Modeling | CodeCode Available | 2 |