This repository contains code to run MeshTalk for face animation from audio. If you use MeshTalk, please cite:
```
@inproceedings{richard2021meshtalk,
  author = {Richard, Alexander and Zollh\"ofer, Michael and Wen, Yandong and de la Torre, Fernando and Sheikh, Yaser},
  title = {MeshTalk: 3D Face Animation From Speech Using Cross-Modality Disentanglement},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  month = {October},
  year = {2021},
  pages = {1173-1182}
}
```
Dependencies:
- ffmpeg
- numpy
- torch (tested with v1.10.0)
- pytorch3d (tested with v0.4.0)
- torchaudio (tested with v0.10.0)
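
Before running anything, it can help to confirm that these packages import and that ffmpeg is on your PATH; a minimal check, assuming the dependencies were installed as listed above:

```python
# Sanity check: all Python dependencies import, and ffmpeg is on the PATH.
import shutil

import numpy
import torch
import torchaudio
import pytorch3d

print("numpy     ", numpy.__version__)
print("torch     ", torch.__version__)
print("torchaudio", torchaudio.__version__)
print("pytorch3d ", pytorch3d.__version__)
print("ffmpeg    ", shutil.which("ffmpeg") or "NOT FOUND")
```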
Download the pretrained models and unzip them.
Make sure your Python path contains the root directory (`export PYTHONPATH=<your_meshtalk_root_directory>`).
Then, run
```
python animate_face.py --model_dir <your_pretrained_model_dir> --audio_file <your_speech_snippet.wav> --output <your_output_file.mp4>
```
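
To animate several speech snippets in one go, you can wrap this command in a small script; a minimal sketch, assuming only the command line interface shown above (the directory names are hypothetical placeholders):

```python
# Batch-animate every .wav file in a directory by invoking the CLI above.
import subprocess
from pathlib import Path

MODEL_DIR = "pretrained_models"  # placeholder: your unzipped pretrained model dir
AUDIO_DIR = Path("audio_clips")  # placeholder: directory containing .wav snippets

for wav in sorted(AUDIO_DIR.glob("*.wav")):
    out = wav.with_suffix(".mp4")
    subprocess.run(
        ["python", "animate_face.py",
         "--model_dir", MODEL_DIR,
         "--audio_file", str(wav),
         "--output", str(out)],
        check=True,  # raise if animate_face.py exits with an error
    )
```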
See a description of all command line arguments via `python animate_face.py --help`. We provide a neutral face template mesh in `assets/face_template.obj`. Note that the rendered results look slightly different from those in the paper and the supplemental video because this repository uses a different (open source) rendering engine.
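
If you want to inspect the provided template mesh, it can be loaded with pytorch3d's standard OBJ loader; a minimal sketch:

```python
# Load the provided neutral face template and report its size.
from pytorch3d.io import load_obj

verts, faces, _aux = load_obj("assets/face_template.obj")
print("vertices:", tuple(verts.shape))            # (V, 3) float tensor
print("faces:   ", tuple(faces.verts_idx.shape))  # (F, 3) long tensor of vertex indices
```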
We are in the process of releasing high-quality 3D face captures of 16 subjects (a subset of the dataset used in the paper). We will link to the dataset here once it is available.
The code and dataset are released under the CC-NC 4.0 International license.