| LLaQo: Towards a Query-Based Coach in Expressive Music Performance Assessment | Sep 13, 2024 | Language Modeling | Code Available | 2 |
| MiniDrive: More Efficient Vision-Language Models with Multi-Level 2D Features as Text Tokens for Autonomous Driving | Sep 11, 2024 | Autonomous Driving, Feature Engineering | Code Available | 2 |
| DetailCLIP: Detail-Oriented CLIP for Fine-Grained Tasks | Sep 10, 2024 | Contrastive Learning, Image Reconstruction | Code Available | 2 |
| TransformerRanker: A Tool for Efficiently Finding the Best-Suited Language Models for Downstream Classification Tasks | Sep 9, 2024 | Classification, Language Modeling | Code Available | 2 |
| The AdEMAMix Optimizer: Better, Faster, Older | Sep 5, 2024 | Image Classification | Code Available | 2 |
| Language Model Powered Digital Biology with BRAD | Sep 4, 2024 | Chatbot, Code Generation | Code Available | 2 |
| SAM4MLLM: Enhance Multi-Modal Large Language Model for Referring Expression Segmentation | Sep 1, 2024 | Language Modeling | Code Available | 2 |
| Sample-Efficient Diffusion for Text-To-Speech Synthesis | Sep 1, 2024 | Language Modeling | Code Available | 2 |
| MemLong: Memory-Augmented Retrieval for Long Text Modeling | Aug 30, 2024 | 4k, Decoder | Code Available | 2 |
| Law of Vision Representation in MLLMs | Aug 29, 2024 | cross-modal alignment, Language Modeling | Code Available | 2 |