Vibravox: A Dataset of French Speech Captured with Body-conduction Audio Sensors

2024-07-16 · Code Available

Julien Hauret, Malo Olivier, Thomas Joubaud, Christophe Langrenne, Sarah Poirée, Véronique Zimpfer, Éric Bavu

Abstract

Vibravox is a General Data Protection Regulation (GDPR)-compliant dataset of audio recordings made with five different body-conduction audio sensors: two in-ear microphones, two bone-conduction vibration pickups, and a laryngophone. The dataset also includes audio from an airborne microphone used as a reference. The Vibravox corpus contains 45 hours per sensor of speech samples and physiological sounds recorded from 188 participants under different acoustic conditions imposed by a high-order ambisonics 3D spatializer. The corpus also includes linguistic transcriptions and annotations of the recording conditions. We conducted a series of experiments on various speech-related tasks, including speech recognition, speech enhancement, and speaker verification. These experiments used state-of-the-art models to evaluate and compare their performance on signals captured by the different audio sensors in the Vibravox dataset, with the aim of better understanding the individual characteristics of each sensor.

Benchmark Results

| Dataset | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| VibraVox (forehead accelerometer) | ECAPA2 | Test EER | 0.01 | | Unverified |
| VibraVox (headset microphone) | ECAPA2 | Test EER | 0 | | Unverified |
| VibraVox (rigid in-ear microphone) | ECAPA2 | Test EER | 0.03 | | Unverified |
| VibraVox (soft in-ear microphone) | ECAPA2 | Test EER | 0.02 | | Unverified |
| VibraVox (temple vibration pickup) | ECAPA2 | Test EER | 0.08 | | Unverified |
| VibraVox (throat microphone) | ECAPA2 | Test EER | 0.04 | | Unverified |
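
The metric reported above is the equal error rate (EER) used in speaker verification: the operating point at which the false acceptance rate equals the false rejection rate. For readers who want to check such a number on their own trial scores, here is a minimal sketch of one common way to estimate it. This is not the evaluation code used by the Vibravox authors; the function name `equal_error_rate`, the threshold sweep, and the toy scores are illustrative assumptions.

```python
import numpy as np


def equal_error_rate(scores, labels):
    """Estimate the EER from verification trial scores.

    scores: similarity scores, higher means "more likely the same speaker".
    labels: 1/True for target (same-speaker) trials, 0/False for non-target trials.
    Hypothetical helper written for illustration only.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)

    # Candidate thresholds: every observed score, plus one above the maximum
    # so that the "reject everything" operating point is also considered.
    thresholds = np.unique(scores)
    thresholds = np.concatenate([thresholds, [thresholds[-1] + 1.0]])

    # A trial is accepted when its score is >= the threshold.
    # False acceptance rate: fraction of non-target trials accepted.
    far = np.array([(scores[~labels] >= t).mean() for t in thresholds])
    # False rejection rate: fraction of target trials rejected.
    frr = np.array([(scores[labels] < t).mean() for t in thresholds])

    # Take the threshold where the two error rates are closest and
    # report their average as the EER estimate.
    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2.0


if __name__ == "__main__":
    # Toy trial list (made-up values, not Vibravox data).
    toy_scores = [0.9, 0.4, 0.6, 0.1]
    toy_labels = [1, 1, 0, 0]
    print(equal_error_rate(toy_scores, toy_labels))
```

The function returns a fraction between 0 and 1. The table does not state whether its EER values are fractions or percentages, so compare against it with that caveat in mind.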
