SOTAVerified

Talking Face Generation

Talking face generation aims to synthesize a sequence of face images that correspond to given speech semantics.

(Image credit: Talking Face Generation by Adversarially Disentangled Audio-Visual Representation)

Papers

Showing 51–100 of 110 papers

| Title | Status | Hype |
|---|---|---|
| SegTalker: Segmentation-based Talking Face Generation with Mask-guided Local Editing | | 0 |
| Sonic: Shifting Focus to Global Audio Perception in Portrait Animation | | 0 |
| StableFace: Analyzing and Improving Motion Stability for Talking Face Generation | | 0 |
| Style2Talker: High-Resolution Talking Head Generation with Emotion Style and Art Style | | 0 |
| StyleTalk++: A Unified Framework for Controlling the Speaking Styles of Talking Heads | | 0 |
| Superior and Pragmatic Talking Face Generation with Teacher-Student Framework | | 0 |
| SwapTalk: Audio-Driven Talking Face Generation with One-Shot Customization in Latent Space | | 0 |
| SyncTalkFace: Talking Face Generation with Precise Lip-Syncing via Audio-Lip Memory | | 0 |
| Taiwanese-Accented Mandarin and English Multi-Speaker Talking-Face Synthesis System | | 0 |
| Talking Face Generation with Multilingual TTS | | 0 |
| That's What I Said: Fully-Controllable Talking Face Generation | | 0 |
| ToonTalker: Cross-Domain Face Reenactment | | 0 |
| UniFLG: Unified Facial Landmark Generator from Text or Speech | | 0 |
| UniSync: A Unified Framework for Audio-Visual Synchronization | | 0 |
| VAST: Vivify Your Talking Avatar via Zero-Shot Expressive Facial Style Transfer | | 0 |
| VectorTalker: SVG Talking Face Generation with Progressive Vectorisation | | 0 |
| VQTalker: Towards Multilingual Talking Avatars through Facial Motion Tokenization | | 0 |
| MimicTalk: Mimicking a personalized and expressive 3D talking face in minutes | | 0 |
| An Audio-Visual Attention Based Multimodal Network for Fake Talking Face Videos Detection | | 0 |
| Attention-Based Lip Audio-Visual Synthesis for Talking Face Generation in the Wild | | 0 |
| Audio-Driven Talking Face Generation with Diverse yet Realistic Facial Animations | | 0 |
| Audio-Visual Speech Representation Expert for Enhanced Talking Face Video Generation and Evaluation | | 0 |
| A Unified Compression Framework for Efficient Speech-Driven Talking-Face Generation | | 0 |
| AVI-Talking: Learning Audio-Visual Instructions for Expressive 3D Talking Face Generation | | 0 |
| CP-EB: Talking Face Generation with Controllable Pose and Eye Blinking Embedding | | 0 |
| CPNet: Exploiting CLIP-based Attention Condenser and Probability Map Guidance for High-fidelity Talking Face Generation | | 0 |
| Cut Inner Layers: A Structured Pruning Strategy for Efficient U-Net GANs | | 0 |
| DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion Autoencoder | | 0 |
| Diffused Heads: Diffusion Models Beat GANs on Talking-Face Generation | | 0 |
| DisentTalk: Cross-lingual Talking Face Generation via Semantic Disentangled Diffusion Model | | 0 |
| DREAM-Talk: Diffusion-based Realistic Emotional Audio-driven Method for Single Image Talking Face Generation | | 0 |
| EAMM: One-Shot Emotional Talking Face via Audio-Based Emotion-Aware Motion Model | | 0 |
| EMMN: Emotional Motion Memory Network for Audio-driven Emotional Talking Face Generation | | 0 |
| EmoSpeaker: One-shot Fine-grained Emotion-Controlled Talking Face Generation | | 0 |
| EmoTalker: Emotionally Editable Talking Face Generation via Diffusion Model | | 0 |
| Emotional Conversation: Empowering Talking Faces with Cohesive Expression, Gaze and Pose Generation | | 0 |
| Emotional Talking Faces: Making Videos More Expressive and Realistic | | 0 |
| Emotion-Controllable Generalized Talking Face Generation | | 0 |
| Exploring Phonetic Context-Aware Lip-Sync For Talking Face Generation | | 0 |
| Faces that Speak: Jointly Synthesising Talking Face and Speech from Text | | 0 |
| FlowVQTalker: High-Quality Emotional Talking Face Generation through Normalizing Flow and Quantization | | 0 |
| FT2TF: First-Person Statement Text-To-Talking Face Generation | | 0 |
| FTFDNet: Learning to Detect Talking Face Video Manipulation with Tri-Modality Interaction | | 0 |
| G4G:A Generic Framework for High Fidelity Talking Face Generation with Fine-grained Intra-modal Alignment | | 0 |
| GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation | | 0 |
| GLCF: A Global-Local Multimodal Coherence Analysis Framework for Talking Face Generation Detection | | 0 |
| GSTalker: Real-time Audio-Driven Talking Face Generation via Deformable Gaussian Splatting | | 0 |
| Hierarchical Cross-Modal Talking Face Generation With Dynamic Pixel-Wise Loss | | 0 |
| High-fidelity and Lip-synced Talking Face Synthesis via Landmark-based Diffusion Model | | 0 |
| High-fidelity Generalized Emotional Talking Face Generation with Multi-modal Emotion Space Learning | | 0 |

Benchmark Results

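| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | EmoGen | EmoAcc | 83.2 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | LipGAN | LMD | 0.6 | | Unverified |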