SOTAVerified

Talking Head Generation

Talking head generation is the task of generating a talking face from a set of images of a person.

(Image credit: Few-Shot Adversarial Learning of Realistic Neural Talking Head Models)

Papers

Showing 26-50 of 119 papers

Title | Status | Hype
AnimeCeleb: Large-Scale Animation CelebHeads Dataset for Head Reenactment | Code | 1
Audio2Head: Audio-driven One-shot Talking-head Generation with Natural Head Motion | Code | 1
Talking-head Generation with Rhythmic Head Motion | Code | 1
DiffDub: Person-generic Visual Dubbing Using Inpainting Renderer with Diffusion Auto-encoder | Code | 1
Face Animation with an Attribute-Guided Diffusion Model | Code | 1
Fast Bi-layer Neural Synthesis of One-Shot Realistic Head Avatars | Code | 1
Silence is Golden: Leveraging Adversarial Examples to Nullify Audio Control in LDM-based Talking-Head Generation | Code | 1
Perceptual Conversational Head Generation with Regularized Driver and Enhanced Renderer | Code | 1
Write-a-speaker: Text-based Emotional and Rhythmic Talking-head Generation | Code | 1
What comprises a good talking-head video generation?: A Survey and Benchmark | Code | 1
MEAD: A Large-scale Audio-visual Dataset for Emotional Talking-face Generation | Code | 1
Comparative Analysis of Audio Feature Extraction for Real-Time Talking Portrait Synthesis | Code | 1
Autoregressive GAN for Semantic Unconditional Head Motion Generation | Code | 1
Learning Landmarks Motion from Speech for Speaker-Agnostic 3D Talking Heads Generation | Code | 1
AI-generated characters for supporting personalized learning and well-being | Code | 1
Joint Co-Speech Gesture and Expressive Talking Face Generation using Diffusion with Adapters | Code | 1
DisCoHead: Audio-and-Video-Driven Talking Head Generation by Disentangled Control of Head Pose and Facial Expressions | Code | 1
GoHD: Gaze-oriented and Highly Disentangled Portrait Animation with Rhythmic Poses and Realistic Expression | Code | 1
A Comparative Study of Perceptual Quality Metrics for Audio-driven Talking Head Videos | Code | 1
EmoVOCA: Speech-Driven Emotional 3D Talking Heads | Code | 1
Text-to-Video: a Two-stage Framework for Zero-shot Identity-agnostic Talking-head Generation | Code | 0
Neural Voice Puppetry: Audio-driven Facial Reenactment | Code | 0
A Comprehensive Multi-scale Approach for Speech and Dynamics Synchrony in Talking Head Generation | Code | 0
ReenactGAN: Learning to Reenact Faces via Boundary Transfer | Code | 0
EMOdiffhead: Continuously Emotional Control in Talking Head Generation via Diffusion | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Few-shot Adversarial Model | FID | 48.5 |  | Unverified
2 | CainGAN | FID | 35 |  | Unverified
3 | Fast Bi-layer Avatars (medium size) | CSIM | 0.65 |  | Unverified
4 | First Order Motion Model (medium size) | CSIM | 0.64 |  | Unverified
5 | Few-shot Vid-to-vid (medium size) | CSIM | 0.6 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | X2Face | FID | 45.8 |  | Unverified
2 | Few-shot Adversarial Model | FID | 43 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | X2Face | FID | 56.5 |  | Unverified
2 | Few-shot Adversarial Model | FID | 29.5 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | X2Face | FID | 51.5 |  | Unverified
2 | Few-shot Adversarial Model | FID | 38 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Few-shot Adversarial Model | FID | 42.2 |  | Unverified
2 | CainGAN | FID | 24.9 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Ashok | 10% | 12 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Few-shot Adversarial Model | FID | 30.6 |  | Unverified