| InstructPix2Pix: Learning to Follow Image Editing Instructions | Nov 17, 2022 | Image Editing | Code Available | 5 | 5 |
| R1-Omni: Explainable Omni-Multimodal Emotion Recognition with Reinforcement Learning | Mar 7, 2025 | Emotion Recognition, Language Modeling | Code Available | 5 | 5 |
| HealthGPT: A Medical Large Vision-Language Model for Unifying Comprehension and Generation via Heterogeneous Knowledge Adaptation | Feb 14, 2025 | Language Modeling, Language Modelling | Code Available | 5 | 5 |
| DeTikZify: Synthesizing Graphics Programs for Scientific Figures and Sketches with TikZ | May 24, 2024 | Language Modeling, Language Modelling | Code Available | 5 | 5 |
| Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities | Feb 2, 2024 | Acoustic Scene Classification, Audio Captioning | Code Available | 5 | 5 |
| Ovis: Structural Embedding Alignment for Multimodal Large Language Model | May 31, 2024 | Language Modeling, Multimodal Large Language Model | Code Available | 5 | 5 |
| PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU | Dec 16, 2023 | CPU, GPU | Code Available | 5 | 5 |
| Randomized Autoregressive Visual Generation | Nov 1, 2024 | Image Generation, Language Modeling | Code Available | 5 | 5 |
| Codec-SUPERB @ SLT 2024: A lightweight benchmark for neural audio codec models | Sep 21, 2024 | Language Modeling, Language Modelling | Code Available | 5 | 5 |
| NotaGen: Advancing Musicality in Symbolic Music Generation with Large Language Model Training Paradigms | Feb 25, 2025 | Language Modeling, Language Modelling | Code Available | 5 | 5 |