| Identifying Speakers in Dialogue Transcripts: A Text-based Approach Using Pretrained Language Models | Jul 16, 2024 | Attribute, Speaker Identification | Code Available | 0 |
| VoxBlink2: A 100K+ Speaker Recognition Corpus and the Open-Set Speaker-Identification Benchmark | Jul 16, 2024 | Diversity, Speaker Identification | Code Available | 5 |
| CoMix: A Comprehensive Benchmark for Multi-Task Comic Understanding | Jul 4, 2024 | Dialogue Generation, object-detection | Code Available | 1 |
| DASB -- Discrete Audio and Speech Benchmark | Jun 20, 2024 | Benchmarking, Emotion Recognition | Unverified | 0 |
| Evaluating Speaker Identity Coding in Self-supervised Models and Humans | Jun 14, 2024 | Speaker Identification | Unverified | 0 |
| SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model | May 20, 2024 | Audio Classification, GPU | Code Available | 2 |
| TIMIT Speaker Profiling: A Comparison of Multi-task learning and Single-task learning Approaches | Apr 18, 2024 | Age Estimation, Classification | Unverified | 0 |
| Masked Modeling Duo: Towards a Universal Audio Pre-training Framework | Apr 9, 2024 | Audio Classification | Code Available | 0 |
| Removing Speaker Information from Speech Representation using Variable-Length Soft Pooling | Apr 1, 2024 | Speaker Identification, Speech Synthesis | Unverified | 0 |
| Hearing-Loss Compensation Using Deep Neural Networks: A Framework and Results From a Listening Test | Mar 15, 2024 | Music Classification, Speaker Identification | Unverified | 0 |