| OtterHD: A High-Resolution Multi-modality Model | Nov 7, 2023 | modelVisual Question Answering | CodeCode Available | 4 |
| ChatDoctor: A Medical Chat Model Fine-Tuned on a Large Language Model Meta-AI (LLaMA) Using Medical Domain Knowledge | Mar 24, 2023 | Information RetrievalLanguage Modeling | CodeCode Available | 4 |
| Multimodal Whole Slide Foundation Model for Pathology | Nov 29, 2024 | Cross-Modal Retrievalmodel | CodeCode Available | 4 |
| EdgeTAM: On-Device Track Anything Model | Jan 13, 2025 | modelVideo Segmentation | CodeCode Available | 4 |
| LLM Inference Unveiled: Survey and Roofline Model Insights | Feb 26, 2024 | Knowledge DistillationLanguage Modelling | CodeCode Available | 4 |
| Molecular-driven Foundation Model for Oncologic Pathology | Jan 28, 2025 | BenchmarkingDiagnostic | CodeCode Available | 4 |
| Diffusion Model-Based Image Editing: A Survey | Feb 27, 2024 | DenoisingImage Generation | CodeCode Available | 4 |
| DiffuEraser: A Diffusion Model for Video Inpainting | Jan 17, 2025 | modelOptical Flow Estimation | CodeCode Available | 4 |
| DiffusionDet: Diffusion Model for Object Detection | Nov 17, 2022 | Denoisingmodel | CodeCode Available | 4 |
| LLM2CLIP: Powerful Language Model Unlocks Richer Visual Representation | Nov 7, 2024 | Contrastive LearningImage Captioning | CodeCode Available | 4 |