| The Condition Number as a Scale-Invariant Proxy for Information Encoding in Neural Units | Jun 19, 2025 | Large Language ModelMultimodal Large Language Model | CodeCode Available | 1 |
| un^2CLIP: Improving CLIP's Visual Detail Capturing Ability via Inverting unCLIP | May 30, 2025 | Large Language ModelMultimodal Large Language Model | CodeCode Available | 1 |
| Period-LLM: Extending the Periodic Capability of Multimodal Large Language Model | May 30, 2025 | Language ModelingLanguage Modelling | CodeCode Available | 1 |
| GeoLLaVA-8K: Scaling Remote-Sensing Multimodal Large Language Models to 8K Resolution | May 27, 2025 | 8kAvg | CodeCode Available | 1 |
| Multimodal LLM-Guided Semantic Correction in Text-to-Image Diffusion | May 26, 2025 | DenoisingImage Generation | CodeCode Available | 1 |
| Unifying Multimodal Large Language Model Capabilities and Modalities via Model Merging | May 26, 2025 | Language ModelingLanguage Modelling | CodeCode Available | 1 |
| ChemMLLM: Chemical Multimodal Large Language Model | May 22, 2025 | Language ModelingLanguage Modelling | CodeCode Available | 1 |
| BusterX: MLLM-Powered AI-Generated Video Forgery Detection and Explanation | May 19, 2025 | Binary ClassificationDeepFake Detection | CodeCode Available | 1 |
| Unifying Segment Anything in Microscopy with Multimodal Large Language Model | May 16, 2025 | Language ModelingLanguage Modelling | CodeCode Available | 1 |
| SmartFreeEdit: Mask-Free Spatial-Aware Image Editing with Complex Instruction Understanding | Apr 17, 2025 | Image GenerationLarge Language Model | CodeCode Available | 1 |