| Telephone Surveys Meet Conversational AI: Evaluating a LLM-Based Telephone Survey System at Scale | Feb 27, 2025 | AI Agent, Large Language Model | Unverified | 0 |
| Nexus: An Omni-Perceptive And -Interactive Model for Language, Audio, And Vision | Feb 26, 2025 | Audio Synthesis, Automatic Speech Recognition | Unverified | 0 |
| Balancing Speech Understanding and Generation Using Continual Pre-training for Codec-based Speech LLM | Feb 24, 2025 | Automatic Speech Recognition, Language Modeling | Unverified | 0 |
| Low-Rank and Sparse Model Merging for Multi-Lingual Speech Recognition and Translation | Feb 24, 2025 | Automatic Speech Recognition, Diversity | Unverified | 0 |
| Measuring the Effect of Transcription Noise on Downstream Language Understanding Tasks | Feb 19, 2025 | Automatic Speech Recognition, speech-recognition | Code Available | 0 |
| DuplexMamba: Enhancing Real-time Speech Conversations with Duplex and Streaming Capabilities | Feb 16, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| SparQLe: Speech Queries to Text Translation Through LLMs | Feb 13, 2025 | Speech-to-Text, Speech-to-Text Translation | Code Available | 0 |
| Speech to Speech Translation with Translatotron: A State of the Art Review | Feb 9, 2025 | speech-recognition, Speech Recognition | Unverified | 0 |
| High-Fidelity Simultaneous Speech-To-Speech Translation | Feb 5, 2025 | Decoder, Simultaneous Speech-to-Speech Translation | Code Available | 5 |
| When End-to-End is Overkill: Rethinking Cascaded Speech-to-Text Translation | Feb 1, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |