| FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs | Jul 4, 2024 | Emotion Recognition, Event Detection | Code Available | 11 |
| CosyVoice 2: Scalable Streaming Speech Synthesis with Large Language Models | Dec 13, 2024 | In-Context Learning, Quantization | Code Available | 11 |
| Natural language guidance of high-fidelity text-to-speech with synthetic annotations | Feb 2, 2024 | In-Context Learning, Language Modeling | Code Available | 9 |
| YuE: Scaling Open Foundation Models for Long-Form Music Generation | Mar 11, 2025 | Form, In-Context Learning | Code Available | 9 |
| SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers | Oct 14, 2024 | Decoder, GPU | Code Available | 9 |
| RWKV-7 "Goose" with Expressive Dynamic State Evolution | Mar 18, 2025 | In-Context Learning, Language Modeling | Code Available | 9 |
| Do Large Language Models Need a Content Delivery Network? | Sep 16, 2024 | In-Context Learning | Code Available | 9 |
| When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models | May 16, 2024 | In-Context Learning, Question Answering | Code Available | 7 |
| HuixiangDou: Overcoming Group Chat Scenarios with LLM-based Technical Assistance | Jan 16, 2024 | In-Context Learning | Code Available | 7 |
| Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP | Dec 28, 2022 | In-Context Learning, Language Modelling | Code Available | 7 |
| Seed-TTS: A Family of High-Quality Versatile Speech Generation Models | Jun 4, 2024 | In-Context Learning, Language Modelling | Code Available | 7 |
| Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers | Jan 5, 2023 | In-Context Learning, Language Modeling | Code Available | 7 |
| Zero-shot Voice Conversion with Diffusion Transformers | Nov 15, 2024 | In-Context Learning, Voice Conversion | Code Available | 7 |
| ECCO: Can We Improve Model-Generated Code Efficiency Without Sacrificing Functional Correctness? | Jul 19, 2024 | Benchmarking, Code Generation | Code Available | 7 |
| Large Language Diffusion Models | Feb 14, 2025 | In-Context Learning, Instruction Following | Code Available | 7 |
| SymbolicAI: A framework for logic-based approaches combining generative models and solvers | Feb 1, 2024 | Few-Shot Learning, In-Context Learning | Code Available | 5 |
| Long-context LLMs Struggle with Long In-context Learning | Apr 2, 2024 | 2k, In-Context Learning | Code Available | 5 |
| Fundamental Components of Deep Learning: A category-theoretic approach | Mar 13, 2024 | Deep Learning, Descriptive | Code Available | 5 |
| LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models | Oct 9, 2023 | GSM8K, In-Context Learning | Code Available | 5 |
| TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second | Jul 5, 2022 | AutoML, Bayesian Inference | Code Available | 5 |
| KBLaM: Knowledge Base augmented Language Model | Oct 14, 2024 | 8k, GPU | Code Available | 5 |
| LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks | Dec 19, 2024 | 8k, In-Context Learning | Code Available | 5 |
| Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities | Feb 2, 2024 | Acoustic Scene Classification, Audio captioning | Code Available | 5 |
| Sample Design Engineering: An Empirical Study of What Makes Good Downstream Fine-Tuning Samples for LLMs | Apr 19, 2024 | Event Extraction, In-Context Learning | Code Available | 5 |
| Speak Foreign Languages with Your Own Voice: Cross-Lingual Neural Codec Language Modeling | Mar 7, 2023 | In-Context Learning, Language Modeling | Code Available | 5 |