| OmAgent: A Multi-modal Agent Framework for Complex Video Understanding with Task Divide-and-Conquer | Jun 24, 2024 | AI Agent, Large Language Model | Code Available | 2 |
| EDGE-LLM: Enabling Efficient Large Language Model Adaptation on Edge Devices via Layerwise Unified Compression and Adaptive Layer Tuning and Voting | Jun 22, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Language Alignment via Nash-learning and Adaptive feedback | Jun 22, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| MR-MLLM: Mutual Reinforcement of Multimodal Comprehension and Vision Perception | Jun 22, 2024 | Common Sense Reasoning, Language Modelling | Unverified | 0 |
| video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models | Jun 22, 2024 | Diversity, Language Modeling | Code Available | 0 |
| Unveiling Entity-Level Unlearning for Large Language Models: A Comprehensive Analysis | Jun 22, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Automated radiotherapy treatment planning guided by GPT-4Vision | Jun 21, 2024 | In-Context Learning, Language Modelling | Unverified | 0 |
| Open-Vocabulary Temporal Action Localization using Multimodal Guidance | Jun 21, 2024 | Action Localization, Language Modelling | Unverified | 0 |
| Inferring Pluggable Types with Machine Learning | Jun 21, 2024 | 16k, Language Modeling | Unverified | 0 |
| LLM2FEA: Discover Novel Designs with Generative Evolutionary Multitasking | Jun 21, 2024 | Language Modelling, Large Language Model | Unverified | 0 |