| MMSU: A Massive Multi-task Spoken Language Understanding and Reasoning Benchmark | Jun 5, 2025 | Rhythm, Spoken Language Understanding | Code Available | 7 | 5 |
| LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT | Oct 7, 2023 | Audio captioning, Automatic Speech Recognition | Code Available | 2 | 5 |
| Speech Model Pre-training for End-to-End Spoken Language Understanding | Apr 7, 2019 | Speech-to-Text, Spoken Language Understanding | Code Available | 2 | 5 |
| Using Speech Synthesis to Train End-to-End Spoken Language Understanding Models | Oct 21, 2019 | Data Augmentation, Natural Language Understanding | Code Available | 2 | 5 |
| SyllableLM: Learning Coarse Semantic Units for Speech Language Models | Oct 5, 2024 | Clustering, Language Modeling | Code Available | 2 | 5 |
| A Hierarchical Decoding Model For Spoken Language Understanding From Unaligned Data | Apr 9, 2019 | Spoken Language Understanding | Code Available | 1 | 5 |
| AISHELL-NER: Named Entity Recognition from Chinese Speech | Feb 17, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 | 5 |
| Bootstrapping meaning through listening: Unsupervised learning of spoken sentence embeddings | Oct 23, 2022 | Acoustic Unit Discovery, Contrastive Learning | Code Available | 1 | 5 |
| A Co-Interactive Transformer for Joint Slot Filling and Intent Detection | Oct 8, 2020 | Intent Detection, slot-filling | Code Available | 1 | 5 |
| Adapting Pretrained Transformer to Lattices for Spoken Language Understanding | Nov 2, 2020 | Natural Language Understanding, speech-recognition | Code Available | 1 | 5 |