From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations
Project page: ~evonne_ng/projects/audio2photoreal/
Code and data:
arXiv: coming soon!
Abstract:
We present a framework for generating full-bodied photorealistic avatars that gesture according to the conversational dynamics of a dyadic interaction. Given speech audio, we output multiple possibilities of gestural motion for an individual, including face, body, and hands. The key to our method is combining the sample diversity of vector quantization with the high-frequency detail obtained through diffusion to generate more dynamic, expressive motion. We visualize the generated motion using highly photorealistic avatars that can express crucial nuances in gestures (e.g., sneers and smirks). To facilitate this line of research, we introduce a first-of-its-kind multi-view conversational dataset that allows for photorealistic reconstruction. Experiments show that our model generates appropriate and diverse gestures, outperforming both diffusion-only and VQ-only methods. Furthermore, our perceptual evaluation highlights the importance of photorealism (vs. meshes) in accurately assessing subtle motion details in conversational gestures. Code and dataset will be publicly released.
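For readers unfamiliar with the term, vector quantization maps each continuous vector to the nearest entry of a finite codebook, so a motion sequence becomes a sequence of discrete indices. The sketch below illustrates only that general idea in plain Python. It is not the model described in the abstract: the 2-D codebook, the function names, and the toy "pose" vectors are all hypothetical and chosen purely for illustration.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(vector: Vector, codebook: Sequence[Vector]) -> int:
    """Return the index of the codebook entry closest to `vector`."""
    return min(
        range(len(codebook)),
        key=lambda i: squared_distance(vector, codebook[i]),
    )


def quantize_sequence(frames: Sequence[Vector], codebook: Sequence[Vector]) -> List[int]:
    """Quantize every frame of a sequence independently."""
    return [quantize(frame, codebook) for frame in frames]


# Hypothetical toy codebook with three 2-D entries (an assumption, not real data).
CODEBOOK = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]

if __name__ == "__main__":
    frames = [(0.1, 0.1), (0.9, 0.1), (0.2, 0.9)]
    print(quantize_sequence(frames, CODEBOOK))
```

Because each frame is reduced to one of a small number of codes, fine detail is lost in this step, which is consistent with the abstract's point that diffusion is used to recover high-frequency detail.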
Key parts:
00:15 project overview
00:40 dataset
00:47 method overview
00:55 face motion model
01:10 guide pose predictor
01:26 pose motion model
01:45 avatar renderer
02:31 results: guide poses, diffusion outputs, avatar
03:16 results: multi-sample results
04:15 results: ours vs. LDA vs. Random
04:53 results: ours vs. SHOW vs. KNN
05:43 results: generalization to “Friends” audio
06:10 results: motion editing