SOTAVerified

Lip to Speech Synthesis

Given a silent video of a speaker, generate the corresponding speech that matches the lip movements.

Papers

Showing 1-13 of 13 papers

Title | Status | Hype
NaturalL2S: End-to-End High-quality Multispeaker Lip-to-Speech Synthesis with Differential Digital Signal Processing | - | 0
Towards Accurate Lip-to-Speech Synthesis in-the-Wild | - | 0
RobustL2S: Speaker-Specific Lip-to-Speech Synthesis exploiting Self-Supervised Representations | - | 0
Intelligible Lip-to-Speech Synthesis with Speech Units | Code | 1
Zero-shot personalized lip-to-speech synthesis with face image based voice control | - | 0
On the Audio-visual Synchronization for Lip-to-Speech Synthesis | - | 0
Lip-to-Speech Synthesis in the Wild with Multi-task Learning | Code | 1
Lip-to-Speech Synthesis for Arbitrary Speakers in the Wild | - | 0
FastLTS: Non-Autoregressive End-to-End Unconstrained Lip-to-Speech Synthesis | Code | 0
Show Me Your Face, And I'll Tell You How You Speak | Code | 1
Lip to Speech Synthesis with Visual Context Attentional GAN | Code | 1
Towards a practical lip-to-speech conversion system using deep neural networks and mobile application frontend | - | 0
Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Lip2Wav | ESTOI | 0.34 | - | Unverified