SOTAVerified

No Audiogram: Leveraging Existing Scores for Personalized Speech Intelligibility Prediction

2025-05-31Unverified0· sign in to hype

Haoshuai Zhou, Changgeng Mo, Boxuan Cao, Linkai Li, Shan Xiang Wang

Unverified — Be the first to reproduce this paper.

Reproduce

Abstract

Personalized speech intelligibility prediction is challenging. Previous approaches have mainly relied on audiograms, which are inherently limited in accuracy as they only capture a listener's hearing threshold for pure tones. Rather than incorporating additional listener features, we propose a novel approach that leverages an individual's existing intelligibility data to predict their performance on new audio. We introduce the Support Sample-Based Intelligibility Prediction Network (SSIPNet), a deep learning model that leverages speech foundation models to build a high-dimensional representation of a listener's speech recognition ability from multiple support (audio, score) pairs, enabling accurate predictions for unseen audio. Results on the Clarity Prediction Challenge dataset show that, even with a small number of support (audio, score) pairs, our method outperforms audiogram-based predictions. Our work presents a new paradigm for personalized speech intelligibility prediction.

Tasks

Reproductions