| Explore the Limits of Omni-modal Pretraining at Scale | Jun 13, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 2 |
| Discovering Preference Optimization Algorithms with and for Large Language Models | Jun 12, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 2 |
| RS-Agent: Automating Remote Sensing Tasks through Intelligent Agent | Jun 11, 2024 | AI AgentDescriptive | CodeCode Available | 2 |
| LocLLM: Exploiting Generalizable Human Keypoint Localization via Large Language Model | Jun 7, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 2 |
| Tool-Planner: Task Planning with Clusters across Multiple Tools | Jun 6, 2024 | Language ModellingLarge Language Model | CodeCode Available | 2 |
| Jailbreak Vision Language Models via Bi-Modal Adversarial Prompt | Jun 6, 2024 | Language ModellingLarge Language Model | CodeCode Available | 2 |
| PosterLLaVa: Constructing a Unified Multi-modal Layout Generator with LLM | Jun 5, 2024 | Language ModellingLarge Language Model | CodeCode Available | 2 |
| Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models | Jun 5, 2024 | Few-Shot LearningLanguage Modeling | CodeCode Available | 2 |
| From Redundancy to Relevance: Information Flow in LVLMs Across Reasoning Tasks | Jun 4, 2024 | Image CaptioningLanguage Modelling | CodeCode Available | 2 |
| Helix: Serving Large Language Models over Heterogeneous GPUs and Network via Max-Flow | Jun 3, 2024 | GPULanguage Modeling | CodeCode Available | 2 |