SOTAVerified

Talking Face Generation

Talking face generation aims to synthesize a sequence of face images that correspond to the semantics of a given speech input.

(Image credit: Talking Face Generation by Adversarially Disentangled Audio-Visual Representation)

Papers

Showing 1-50 of 110 papers

Title | Status | Hype
Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis | Code | 5
Deepfake Generation and Detection: A Benchmark and Survey | Code | 4
GeneFace: Generalized and High-Fidelity Audio-Driven 3D Talking Face Synthesis | Code | 4
JoyGen: Audio-Driven 3D Depth-Aware Talking-Face Video Editing | Code | 3
SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis | Code | 3
DINet: Deformation Inpainting Network for Realistic Face Visually Dubbing on High Resolution Video | Code | 3
A Lip Sync Expert Is All You Need for Speech to Lip Generation In The Wild | Code | 3
HyperLips: Hyper Control Lips with High Resolution Decoder for Talking Face Generation | Code | 2
Identity-Preserving Talking Face Generation with Landmark and Appearance Priors | Code | 2
Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert | Code | 2
Emotionally Enhanced Talking Face Generation | Code | 2
DPE: Disentanglement of Pose and Expression for General Video Portrait Editing | Code | 2
StyleTalk: One-shot Talking Head Generation with Controllable Speaking Styles | Code | 2
Real-time Neural Radiance Talking Portrait Synthesis via Audio-spatial Decomposition | Code | 2
Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis | Code | 2
StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pre-trained StyleGAN | Code | 2
Live Speech Portraits: Real-Time Photorealistic Talking-Head Animation | Code | 2
MakeItTalk: Speaker-Aware Talking-Head Animation | Code | 2
Joint Co-Speech Gesture and Expressive Talking Face Generation using Diffusion with Adapters | Code | 1
KAN-Based Fusion of Dual-Domain for Audio-Driven Facial Landmarks Generation | Code | 1
Controllable Talking Face Generation by Implicit Facial Keypoints Editing | Code | 1
Neural Text to Articulate Talk: Deep Text to Audiovisual Speech Synthesis achieving both Auditory and Photo-realism | Code | 1
DiffDub: Person-generic Visual Dubbing Using Inpainting Renderer with Diffusion Auto-encoder | Code | 1
HDTR-Net: A Real-Time High-Definition Teeth Restoration Network for Arbitrary Talking Face Generation Methods | Code | 1
FNeVR: Neural Volume Rendering for Face Animation | Code | 1
Merkel Podcast Corpus: A Multimodal Dataset Compiled from 16 Years of Angela Merkel’s Weekly Video Podcasts | Code | 1
Merkel Podcast Corpus: A Multimodal Dataset Compiled from 16 Years of Angela Merkel's Weekly Video Podcasts | Code | 1
FACIAL: Synthesizing Dynamic Talking Face with Implicit Attribute Learning | Code | 1
Parallel and High-Fidelity Text-to-Lip Generation | Code | 1
Txt2Vid: Ultra-Low Bitrate Compression of Talking-Head Videos via Text | Code | 1
Flow-Guided One-Shot Talking Face Generation With a High-Resolution Audio-Visual Dataset | Code | 1
Text2Video: Text-driven Talking-head Video Synthesis with Personalized Phoneme-Pose Dictionary | Code | 1
Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation | Code | 1
Speech Driven Talking Face Generation from a Single Image and an Emotion Condition | Code | 1
MEAD: A Large-scale Audio-visual Dataset for Emotional Talking-face Generation | Code | 1
DisentTalk: Cross-lingual Talking Face Generation via Semantic Disentangled Diffusion Model | | 0
UniSync: A Unified Framework for Audio-Visual Synchronization | | 0
PC-Talk: Precise Facial Animation Control for Audio-Driven Talking Face Generation | | 0
Playmate: Flexible Control of Portrait Animation via 3D-Implicit Space Guided Diffusion | | 0
GLCF: A Global-Local Multimodal Coherence Analysis Framework for Talking Face Generation Detection | | 0
VQTalker: Towards Multilingual Talking Avatars through Facial Motion Tokenization | | 0
PortraitTalk: Towards Customizable One-Shot Audio-to-Talking Face Generation | | 0
Sonic: Shifting Focus to Global Audio Perception in Portrait Animation | | 0
MimicTalk: Mimicking a personalized and expressive 3D talking face in minutes | | 0
JEAN: Joint Expression and Audio-guided NeRF-based Talking Face Generation | | 0
StyleTalk++: A Unified Framework for Controlling the Speaking Styles of Talking Heads | | 0
SegTalker: Segmentation-based Talking Face Generation with Mask-guided Local Editing | | 0
High-fidelity and Lip-synced Talking Face Synthesis via Landmark-based Diffusion Model | | 0
Emotional Conversation: Empowering Talking Faces with Cohesive Expression, Gaze and Pose Generation | | 0
OpFlowTalker: Realistic and Natural Talking Face Generation via Optical Flow Guidance | | 0
Page 1 of 3

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | EmoGen | EmoAcc | 83.2 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LipGAN | LMD | 0.6 | | Unverified