| CosyVoice 2: Scalable Streaming Speech Synthesis with Large Language Models | Dec 13, 2024 | In-Context Learning, Quantization | Code Available | 11 |
| FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs | Jul 4, 2024 | Emotion Recognition, Event Detection | Code Available | 11 |
| RWKV-7 "Goose" with Expressive Dynamic State Evolution | Mar 18, 2025 | In-Context Learning, Language Modeling | Code Available | 9 |
| YuE: Scaling Open Foundation Models for Long-Form Music Generation | Mar 11, 2025 | Form, In-Context Learning | Code Available | 9 |
| SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers | Oct 14, 2024 | Decoder, GPU | Code Available | 9 |
| Do Large Language Models Need a Content Delivery Network? | Sep 16, 2024 | In-Context Learning | Code Available | 9 |
| Natural language guidance of high-fidelity text-to-speech with synthetic annotations | Feb 2, 2024 | In-Context Learning, Language Modeling | Code Available | 9 |
| Large Language Diffusion Models | Feb 14, 2025 | In-Context Learning, Instruction Following | Code Available | 7 |
| Zero-shot Voice Conversion with Diffusion Transformers | Nov 15, 2024 | In-Context Learning, Voice Conversion | Code Available | 7 |
| ECCO: Can We Improve Model-Generated Code Efficiency Without Sacrificing Functional Correctness? | Jul 19, 2024 | Benchmarking, Code Generation | Code Available | 7 |
| Seed-TTS: A Family of High-Quality Versatile Speech Generation Models | Jun 4, 2024 | In-Context Learning, Language Modelling | Code Available | 7 |
| When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models | May 16, 2024 | In-Context Learning, Question Answering | Code Available | 7 |
| HuixiangDou: Overcoming Group Chat Scenarios with LLM-based Technical Assistance | Jan 16, 2024 | In-Context Learning | Code Available | 7 |
| Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers | Jan 5, 2023 | In-Context Learning, Language Modeling | Code Available | 7 |
| Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP | Dec 28, 2022 | In-Context Learning, Language Modelling | Code Available | 7 |
| LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks | Dec 19, 2024 | 8k, In-Context Learning | Code Available | 5 |
| KBLaM: Knowledge Base augmented Language Model | Oct 14, 2024 | 8k, GPU | Code Available | 5 |
| Sample Design Engineering: An Empirical Study of What Makes Good Downstream Fine-Tuning Samples for LLMs | Apr 19, 2024 | Event Extraction, In-Context Learning | Code Available | 5 |
| Long-context LLMs Struggle with Long In-context Learning | Apr 2, 2024 | 2k, In-Context Learning | Code Available | 5 |
| Fundamental Components of Deep Learning: A category-theoretic approach | Mar 13, 2024 | Deep Learning, Descriptive | Code Available | 5 |
| Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities | Feb 2, 2024 | Acoustic Scene Classification, Audio captioning | Code Available | 5 |
| SymbolicAI: A framework for logic-based approaches combining generative models and solvers | Feb 1, 2024 | Few-Shot Learning, In-Context Learning | Code Available | 5 |
| LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models | Oct 9, 2023 | GSM8K, In-Context Learning | Code Available | 5 |
| Speak Foreign Languages with Your Own Voice: Cross-Lingual Neural Codec Language Modeling | Mar 7, 2023 | In-Context Learning, Language Modeling | Code Available | 5 |
| TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second | Jul 5, 2022 | AutoML, Bayesian Inference | Code Available | 5 |
| A Preview of XiYan-SQL: A Multi-Generator Ensemble Framework for Text-to-SQL | Nov 13, 2024 | Diversity, In-Context Learning | Code Available | 4 |
| Zero-shot forecasting of chaotic systems | Sep 24, 2024 | Attribute, In-Context Learning | Code Available | 4 |
| WavCraft: Audio Editing and Generation with Large Language Models | Mar 14, 2024 | In-Context Learning | Code Available | 4 |
| InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning | Feb 9, 2024 | Data Augmentation, GSM8K | Code Available | 4 |
| VILA: On Pre-training for Visual Language Models | Dec 12, 2023 | In-Context Learning, Language Modelling | Code Available | 4 |
| Eureka: Human-Level Reward Design via Coding Large Language Models | Oct 19, 2023 | Decision Making, In-Context Learning | Code Available | 4 |
| AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining | Aug 10, 2023 | Audio Generation, In-Context Learning | Code Available | 4 |
| MIMIC-IT: Multi-Modal In-Context Instruction Tuning | Jun 8, 2023 | In-Context Learning, Visual Question Answering | Code Available | 4 |
| Otter: A Multi-Modal Model with In-Context Instruction Tuning | May 5, 2023 | GPU, In-Context Learning | Code Available | 4 |
| SegGPT: Segmenting Everything In Context | Apr 6, 2023 | Few-Shot Semantic Segmentation, In-Context Learning | Code Available | 4 |
| Images Speak in Images: A Generalist Painter for In-Context Visual Learning | Dec 5, 2022 | In-Context Learning, Keypoint Detection | Code Available | 4 |
| Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning | May 11, 2022 | Few-Shot Text Classification, In-Context Learning | Code Available | 4 |
| TiRex: Zero-Shot Forecasting Across Long and Short Horizons with Enhanced In-Context Learning | May 29, 2025 | In-Context Learning, State Space Models | Code Available | 3 |
| Token Reduction Should Go Beyond Efficiency in Generative Models -- From Vision, Language to Multimodality | May 23, 2025 | In-Context Learning, Token Reduction | Code Available | 3 |
| PromptDresser: Improving the Quality and Controllability of Virtual Try-On via Generative Textual Prompt and Prompt-aware Mask | Dec 22, 2024 | In-Context Learning, Virtual Try-on | Code Available | 3 |
| Embodied CoT Distillation From LLM To Off-the-shelf Agents | Dec 16, 2024 | Decision Making, In-Context Learning | Code Available | 3 |
| Inspiring the Next Generation of Segment Anything Models: Comprehensively Evaluate SAM and SAM 2 with Diverse Prompts Towards Context-Dependent Concepts under Different Scenes | Dec 2, 2024 | In-Context Learning, Video Segmentation | Code Available | 3 |
| The Surprising Effectiveness of Test-Time Training for Few-Shot Learning | Nov 11, 2024 | ARC, Few-Shot Learning | Code Available | 3 |
| Scaling Diffusion Language Models via Adaptation from Autoregressive Models | Oct 23, 2024 | In-Context Learning, Language Modeling | Code Available | 3 |
| Foundation Models for Music: A Survey | Aug 26, 2024 | In-Context Learning, Representation Learning | Code Available | 3 |
| Revisiting VerilogEval: A Year of Improvements in Large-Language Models for Hardware Code Generation | Aug 20, 2024 | Code Completion, Code Generation | Code Available | 3 |
| A Survey on Mixture of Experts | Jun 26, 2024 | In-Context Learning, Mixture-of-Experts | Code Available | 3 |
| OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text | Jun 12, 2024 | In-Context Learning | Code Available | 3 |
| Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks | Apr 2, 2024 | In-Context Learning | Code Available | 3 |
| QuRating: Selecting High-Quality Data for Training Language Models | Feb 15, 2024 | In-Context Learning | Code Available | 3 |