SOTAVerified

Talking Face Generation

Talking face generation aims to synthesize a sequence of face images that correspond to given speech semantics.

(Image credit: Talking Face Generation by Adversarially Disentangled Audio-Visual Representation)

Papers

Showing 1-50 of 110 papers

| Title | Status | Hype |
| --- | --- | --- |
| DisentTalk: Cross-lingual Talking Face Generation via Semantic Disentangled Diffusion Model | | 0 |
| UniSync: A Unified Framework for Audio-Visual Synchronization | | 0 |
| PC-Talk: Precise Facial Animation Control for Audio-Driven Talking Face Generation | | 0 |
| Playmate: Flexible Control of Portrait Animation via 3D-Implicit Space Guided Diffusion | | 0 |
| JoyGen: Audio-Driven 3D Depth-Aware Talking-Face Video Editing | Code | 3 |
| Joint Co-Speech Gesture and Expressive Talking Face Generation using Diffusion with Adapters | Code | 1 |
| GLCF: A Global-Local Multimodal Coherence Analysis Framework for Talking Face Generation Detection | | 0 |
| VQTalker: Towards Multilingual Talking Avatars through Facial Motion Tokenization | | 0 |
| PortraitTalk: Towards Customizable One-Shot Audio-to-Talking Face Generation | | 0 |
| Sonic: Shifting Focus to Global Audio Perception in Portrait Animation | | 0 |
| MimicTalk: Mimicking a personalized and expressive 3D talking face in minutes | | 0 |
| JEAN: Joint Expression and Audio-guided NeRF-based Talking Face Generation | | 0 |
| StyleTalk++: A Unified Framework for Controlling the Speaking Styles of Talking Heads | | 0 |
| KAN-Based Fusion of Dual-Domain for Audio-Driven Facial Landmarks Generation | Code | 1 |
| SegTalker: Segmentation-based Talking Face Generation with Mask-guided Local Editing | | 0 |
| High-fidelity and Lip-synced Talking Face Synthesis via Landmark-based Diffusion Model | | 0 |
| Emotional Conversation: Empowering Talking Faces with Cohesive Expression, Gaze and Pose Generation | | 0 |
| Controllable Talking Face Generation by Implicit Facial Keypoints Editing | Code | 1 |
| OpFlowTalker: Realistic and Natural Talking Face Generation via Optical Flow Guidance | | 0 |
| Faces that Speak: Jointly Synthesising Talking Face and Speech from Text | | 0 |
| SPEAK: Speech-Driven Pose and Emotion-Adjustable Talking Head Generation | | 0 |
| SwapTalk: Audio-Driven Talking Face Generation with One-Shot Customization in Latent Space | | 0 |
| Audio-Visual Speech Representation Expert for Enhanced Talking Face Video Generation and Evaluation | | 0 |
| GSTalker: Real-time Audio-Driven Talking Face Generation via Deformable Gaussian Splatting | | 0 |
| Superior and Pragmatic Talking Face Generation with Teacher-Student Framework | | 0 |
| Deepfake Generation and Detection: A Benchmark and Survey | Code | 4 |
| Style2Talker: High-Resolution Talking Head Generation with Emotion Style and Art Style | | 0 |
| FlowVQTalker: High-Quality Emotional Talking Face Generation through Normalizing Flow and Quantization | | 0 |
| G4G:A Generic Framework for High Fidelity Talking Face Generation with Fine-grained Intra-modal Alignment | | 0 |
| AVI-Talking: Learning Audio-Visual Instructions for Expressive 3D Talking Face Generation | | 0 |
| EmoSpeaker: One-shot Fine-grained Emotion-Controlled Talking Face Generation | | 0 |
| Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis | Code | 5 |
| EmoTalker: Emotionally Editable Talking Face Generation via Diffusion Model | | 0 |
| DREAM-Talk: Diffusion-based Realistic Emotional Audio-driven Method for Single Image Talking Face Generation | | 0 |
| VectorTalker: SVG Talking Face Generation with Progressive Vectorisation | | 0 |
| GSmoothFace: Generalized Smooth Talking Face Generation via Fine Grained 3D Face Guidance | Code | 0 |
| Neural Text to Articulate Talk: Deep Text to Audiovisual Speech Synthesis achieving both Auditory and Photo-realism | Code | 1 |
| FT2TF: First-Person Statement Text-To-Talking Face Generation | | 0 |
| SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis | Code | 3 |
| CP-EB: Talking Face Generation with Controllable Pose and Eye Blinking Embedding | | 0 |
| DiffDub: Person-generic Visual Dubbing Using Inpainting Renderer with Diffusion Auto-encoder | Code | 1 |
| HyperLips: Hyper Control Lips with High Resolution Decoder for Talking Face Generation | Code | 2 |
| HDTR-Net: A Real-Time High-Definition Teeth Restoration Network for Arbitrary Talking Face Generation Methods | Code | 1 |
| ToonTalker: Cross-Domain Face Reenactment | | 0 |
| VAST: Vivify Your Talking Avatar via Zero-Shot Expressive Facial Style Transfer | | 0 |
| Audio-driven Talking Face Generation with Stabilized Synchronization Loss | | 0 |
| FTFDNet: Learning to Detect Talking Face Video Manipulation with Tri-Modality Interaction | | 0 |
| Instruct-NeuralTalker: Editing Audio-Driven Talking Radiance Fields with Instructions | | 0 |
| Exploring Phonetic Context-Aware Lip-Sync For Talking Face Generation | | 0 |
| CPNet: Exploiting CLIP-based Attention Condenser and Probability Map Guidance for High-fidelity Talking Face Generation | | 0 |
Page 1 of 3

Benchmark Results

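| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | EmoGen | EmoAcc | 83.2 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | LipGAN | LMD | 0.6 | | Unverified |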