
Automated Dubbing and Facial Synchronization using Deep Learning

2022-05-17 · 2nd International Conference on Artificial Intelligence (ICAI) 2022

Saad Ahmed Bazaz, AbdurRehman Subhani, Syed Zohair Abbas Hadi

Abstract

With the recent global boom in video content creation and consumption during the pandemic, linguistics remains the only barrier in producing immersive content for global communities. To solve this, content creators use a manual dubbing process, where voice actors are hired to produce a “voiceover” over the video. We aim to break down the language barrier and thus make “videos for everyone”. We propose an end-to-end architecture that automatically translates videos and produces synchronized dubbed voices using deep learning models, in a specified target language. Our architecture takes a modular approach, allowing the user to tweak each component or replace it with a better one. We present our results from said architecture, and describe possible future motivations to scale this to accommodate multiple languages and multiple use cases. A sample of our results can be found here: https://youtu.be/eGB-gL6bDr4
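
The modular approach described in the abstract can be illustrated with a minimal sketch. The abstract does not name the individual components, so the four stages here (transcribe, translate, synthesize, sync_face) are assumptions, and every function and class name is hypothetical. Each stage is a placeholder that only tags a shared state dictionary; none of this is the authors' implementation, only an illustration of how a pipeline can let one component be replaced without touching the others.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, str]
Stage = Callable[[State], State]


class DubbingPipeline:
    """Ordered, named stages; any stage can be swapped independently."""

    def __init__(self) -> None:
        self._stages: List[Tuple[str, Stage]] = []

    def add(self, name: str, stage: Stage) -> "DubbingPipeline":
        self._stages.append((name, stage))
        return self

    def replace(self, name: str, stage: Stage) -> "DubbingPipeline":
        # Swap one component for another while keeping its position.
        for i, (existing, _) in enumerate(self._stages):
            if existing == name:
                self._stages[i] = (name, stage)
                return self
        raise KeyError(f"no stage named {name!r}")

    def run(self, state: State) -> State:
        # Each stage receives a copy of the state and returns an updated one.
        for _, stage in self._stages:
            state = stage(dict(state))
        return state


# Hypothetical placeholder stages; a real system would call models here.
def transcribe(s: State) -> State:
    s["transcript"] = f"<transcript of {s['video']}>"
    return s


def translate(s: State) -> State:
    s["translation"] = f"<{s['target_lang']} translation of {s['transcript']}>"
    return s


def synthesize(s: State) -> State:
    s["dubbed_audio"] = f"<speech for {s['translation']}>"
    return s


def sync_face(s: State) -> State:
    s["output_video"] = f"<{s['video']} synchronized to {s['dubbed_audio']}>"
    return s


def build_default_pipeline() -> DubbingPipeline:
    return (
        DubbingPipeline()
        .add("transcribe", transcribe)
        .add("translate", translate)
        .add("synthesize", synthesize)
        .add("sync_face", sync_face)
    )


if __name__ == "__main__":
    pipeline = build_default_pipeline()
    result = pipeline.run({"video": "talk.mp4", "target_lang": "ur"})
    print(result["output_video"])
```

Under these assumptions, replacing a component is a single call such as `pipeline.replace("translate", better_translate)`, where `better_translate` is any function with the same state-in, state-out shape.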
