| FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs | Jul 4, 2024 | Emotion Recognition, Event Detection | Code Available | 11 | 5 |
| CosyVoice 2: Scalable Streaming Speech Synthesis with Large Language Models | Dec 13, 2024 | In-Context Learning, Quantization | Code Available | 11 | 5 |
| RWKV-7 "Goose" with Expressive Dynamic State Evolution | Mar 18, 2025 | In-Context Learning, Language Modeling | Code Available | 9 | 5 |
| YuE: Scaling Open Foundation Models for Long-Form Music Generation | Mar 11, 2025 | Form, In-Context Learning | Code Available | 9 | 5 |
| Natural language guidance of high-fidelity text-to-speech with synthetic annotations | Feb 2, 2024 | In-Context Learning, Language Modeling | Code Available | 9 | 5 |
| Do Large Language Models Need a Content Delivery Network? | Sep 16, 2024 | In-Context Learning | Code Available | 9 | 5 |
| SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers | Oct 14, 2024 | Decoder, GPU | Code Available | 9 | 5 |
| Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers | Jan 5, 2023 | In-Context Learning, Language Modeling | Code Available | 7 | 5 |
| HuixiangDou: Overcoming Group Chat Scenarios with LLM-based Technical Assistance | Jan 16, 2024 | In-Context Learning | Code Available | 7 | 5 |
| Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP | Dec 28, 2022 | In-Context Learning, Language Modelling | Code Available | 7 | 5 |