| Fine-Tuning Self-Supervised Learning Models for End-to-End Pronunciation Scoring | Sep 19, 2023 | Feature Engineering, Phone-level pronunciation scoring | Code Available | 1 |
| Allophant: Cross-lingual Phoneme Recognition with Articulatory Attributes | Jun 7, 2023 | Attribute, Cross-Lingual Transfer | Code Available | 1 |
| FitHuBERT: Going Thinner and Deeper for Knowledge Distillation of Speech Self-Supervised Learning | Jul 1, 2022 | Knowledge Distillation, Phoneme Recognition | Code Available | 1 |
| Text-Aware End-to-end Mispronunciation Detection and Diagnosis | Jun 15, 2022 | Contrastive Learning, Phoneme Recognition | Code Available | 1 |
| Improving Mispronunciation Detection with Wav2vec2-based Momentum Pseudo-Labeling for Accentedness and Intelligibility Assessment | Mar 29, 2022 | Phoneme Recognition, Pseudo Label | Code Available | 1 |
| Word Error Rate Estimation Without ASR Output: e-WER2 | Aug 8, 2020 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| WaveNet: A Generative Model for Raw Audio | Sep 12, 2016 | Audio Generation, model | Code Available | 1 |
| Attention-Based Models for Speech Recognition | Jun 24, 2015 | Machine Translation, Phoneme Recognition | Code Available | 1 |
| Using Neurogram Similarity Index Measure (NSIM) to Model Hearing Loss and Cochlear Neural Degeneration | Jun 15, 2025 | Phoneme Recognition | —Unverified | 0 |
| Speech-to-Text Translation with Phoneme-Augmented CoT: Enhancing Cross-Lingual Transfer in Low-Resource Scenarios | May 30, 2025 | Cross-Lingual Transfer, Phoneme Recognition | —Unverified | 0 |
| Towards disentangling the contributions of articulation and acoustics in multimodal phoneme recognition | May 29, 2025 | Phoneme Recognition | —Unverified | 0 |
| Topological Deep Learning for Speech Data | May 27, 2025 | Deep Learning, Phoneme Recognition | —Unverified | 0 |
| Self-Supervised Models for Phoneme Recognition: Applications in Children's Speech for Reading Learning | Mar 6, 2025 | Phoneme Recognition, Self-Supervised Learning | —Unverified | 0 |
| SyntheticPop: Attacking Speaker Verification Systems With Synthetic VoicePops | Feb 13, 2025 | Face Swapping, Phoneme Recognition | —Unverified | 0 |
| Improving Cross-Lingual Phonetic Representation of Low-Resource Languages Through Language Similarity Analysis | Jan 12, 2025 | Phoneme Recognition, Self-Supervised Learning | —Unverified | 0 |
| A Joint Spectro-Temporal Relational Thinking Based Acoustic Modeling Framework | Sep 17, 2024 | Phoneme Recognition, speech-recognition | —Unverified | 0 |
| SLiCK: Exploiting Subsequences for Length-Constrained Keyword Spotting | Sep 6, 2024 | Keyword Spotting, Multi-Task Learning | —Unverified | 0 |
| DeepSpeech models show Human-like Performance and Processing of Cochlear Implant Inputs | Jul 30, 2024 | EEG, Phoneme Recognition | —Unverified | 0 |
| An Adapter-Based Unified Model for Multiple Spoken Language Processing Tasks | Jun 20, 2024 | Automatic Speech Recognition, Decoder | —Unverified | 0 |
| LASER: Learning by Aligning Self-supervised Representations of Speech for Improving Content-related Tasks | Jun 13, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0 |
| TIPAA-SSL: Text Independent Phone-to-Audio Alignment based on Self-Supervised Learning and Knowledge Transfer | May 3, 2024 | Dimensionality Reduction, Phoneme Recognition | —Unverified | 0 |
| More than words: Advancements and challenges in speech recognition for singing | Mar 14, 2024 | Keyword Spotting, Language Identification | —Unverified | 0 |
| SCORE: Self-supervised Correspondence Fine-tuning for Improved Content Representations | Mar 10, 2024 | Automatic Speech Recognition, Data Augmentation | Code Available | 0 |
| Probing the Information Encoded in Neural-based Acoustic Models of Automatic Speech Recognition Systems | Feb 29, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| Analysis of Self-Supervised Speech Models on Children's Speech and Infant Vocalizations | Feb 10, 2024 | Phoneme Recognition, Self-Supervised Learning | —Unverified | 0 |
| Segment Boundary Detection via Class Entropy Measurements in Connectionist Phoneme Recognition | Jan 11, 2024 | Boundary Detection, Phoneme Recognition | —Unverified | 0 |
| Optimizing Two-Pass Cross-Lingual Transfer Learning: Phoneme Recognition and Phoneme to Grapheme Translation | Dec 6, 2023 | Cross-Lingual Transfer, Phoneme Recognition | —Unverified | 0 |
| Modeling of Speech-dependent Own Voice Transfer Characteristics for Hearables with In-ear Microphones | Oct 10, 2023 | Phoneme Recognition | —Unverified | 0 |
| Reduce, Reuse, Recycle: Is Perturbed Data better than Other Language augmentation for Low Resource Self-Supervised Speech Models | Sep 22, 2023 | Phoneme Recognition, Representation Learning | —Unverified | 0 |
| Boosting End-to-End Multilingual Phoneme Recognition through Exploiting Universal Speech Attributes Constraints | Sep 16, 2023 | Attribute, Automatic Speech Recognition | —Unverified | 0 |
| Speech-dependent Modeling of Own Voice Transfer Characteristics for In-ear Microphones in Hearables | Sep 15, 2023 | Bandwidth Extension, Phoneme Recognition | —Unverified | 0 |
| L1-aware Multilingual Mispronunciation Detection Framework | Sep 14, 2023 | Phoneme Recognition | —Unverified | 0 |
| Enhancing Child Vocalization Classification with Phonetically-Tuned Embeddings for Assisting Autism Diagnosis | Sep 13, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| Can We Trust Explainable AI Methods on ASR? An Evaluation on Phoneme Recognition | May 29, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| A Comparison of Speech Data Augmentation Methods Using S3PRL Toolkit | Feb 27, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| Ensemble knowledge distillation of self-supervised speech models | Feb 24, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| German Phoneme Recognition with Text-to-Phoneme Data Augmentation | Nov 24, 2022 | Data Augmentation, Phoneme Recognition | —Unverified | 0 |
| SAN: a robust end-to-end ASR model architecture | Oct 27, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| Multilingual Zero Resource Speech Recognition Base on Self-Supervise Pre-Trained Acoustic Models | Oct 13, 2022 | Language Modeling, Language Modelling | —Unverified | 0 |
| A Comparison of Transformer, Convolutional, and Recurrent Neural Networks on Phoneme Recognition | Oct 1, 2022 | Phoneme Recognition, speech-recognition | —Unverified | 0 |
| Mlphon: A Multifunctional Grapheme-Phoneme Conversion Tool Using Finite State Transducers | Sep 5, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0 |
| SpeechEQ: Speech Emotion Recognition based on Multi-scale Unified Datasets and Multitask Learning | Jun 27, 2022 | Emotion Recognition, Phoneme Recognition | —Unverified | 0 |
| Predicting within and across language phoneme recognition performance of self-supervised learning speech pre-trained models | Jun 24, 2022 | Phoneme Recognition, Self-Supervised Learning | Code Available | 0 |
| Speech Data Augmentation for Improving Phoneme Transcriptions of Aphasic Speech Using Wav2Vec 2.0 for the PSST Challenge | Jun 1, 2022 | Automatic Phoneme Recognition, Data Augmentation | —Unverified | 0 |
| Self-supervised Semantic-driven Phoneme Discovery for Zero-resource Speech Recognition | May 1, 2022 | Phoneme Recognition, Representation Learning | —Unverified | 0 |
| Phoneme transcription of endangered languages: an evaluation of recent ASR architectures in the single speaker scenario | May 1, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| STRATA: Word Boundaries & Phoneme Recognition From Continuous Urdu Speech using Transfer Learning, Attention, & Data Augmentation | Apr 16, 2022 | Data Augmentation, Phoneme Recognition | —Unverified | 0 |
| Deep Neural Convolutive Matrix Factorization for Articulatory Representation Decomposition | Apr 1, 2022 | Phoneme Recognition, Representation Learning | Code Available | 0 |
| Benchmarking Generative Latent Variable Models for Speech | Feb 22, 2022 | Benchmarking, Image Generation | Code Available | 0 |
| Spanish and English Phoneme Recognition by Training on Simulated Classroom Audio Recordings of Collaborative Learning Environments | Feb 21, 2022 | Data Augmentation, Phoneme Recognition | Code Available | 0 |