LSTM Training Notes

Data Preparation

  • Prepare more than 10,000 short audio clips from diverse speakers, each at least 5 seconds long, in 16 kHz single-channel WAV format.
  • Place these files in the train_audio/train_data folder.
  • Download wav2lip.pth from the wav2lip project (https://github.com/Rudrabha/Wav2Lip) and place it in the train_audio/checkpoints folder.

Notes

All audio files must be in 16 kHz, single-channel WAV format; a conversion sketch is shown below.
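
If your source clips are not already in that format, the following is a minimal conversion sketch, assuming librosa and soundfile are installed; the raw_audio folder name and the numbering scheme are examples, not part of this repo.

import librosa
import soundfile as sf
from pathlib import Path

src_dir = Path("raw_audio")    # example source folder (an assumption, not part of this repo)
dst_dir = Path("train_data")
dst_dir.mkdir(parents=True, exist_ok=True)

for i, path in enumerate(sorted(src_dir.glob("*.wav")), start=1):
    # Resample to 16 kHz and downmix to a single channel.
    audio, sr = librosa.load(path, sr=16000, mono=True)
    if len(audio) / sr < 5.0:  # skip clips shorter than 5 seconds
        continue
    sf.write(dst_dir / f"{i:06d}.wav", audio, sr)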

Steps

1. Face Video Generation

python preparation_step0.py <face_path> <wav_16K_path>
# Example: python preparation_step0.py face.jpg train_data

2. Mouth Region Extraction and PCA Modeling

python preparation_step1.py <data_path>
# Example: python preparation_step1.py train_data

After these two steps, make sure the directory layout looks like this:

|--/train_audio
|  |--/checkpoints
|  |  |--/wav2lip.pth
|  |  |--/pca.pkl
|  |  |--/wav2lip_pca_all.gif
|  |--/train_data
|  |  |--/000001.wav
|  |  |--/000001.avi
|  |  |--/000001.txt
|  |  |--/000002.wav
|  |  |--/000002.avi
|  |  |--/000002.txt
|  |  |--/000003.wav
|  |  |--/000003.avi
|  |  |--/000003.txt
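
For intuition, pca.pkl is presumably a PCA basis fitted over the extracted mouth-region crops. The sketch below illustrates that idea with scikit-learn; it is not the actual preparation_step1.py, and the crop size, component count, and intermediate file name are all assumptions.

import pickle
import numpy as np
from sklearn.decomposition import PCA

# mouth_crops: (num_frames, H, W) mouth-region crops collected from the videos
# (the array shape and the .npy intermediate file are assumptions for illustration).
mouth_crops = np.load("mouth_crops.npy")
flat = mouth_crops.reshape(len(mouth_crops), -1).astype(np.float32)

pca = PCA(n_components=20)          # assumed number of components
coeffs = pca.fit_transform(flat)    # per-frame PCA coefficients (presumably the LSTM's regression targets)

with open("checkpoints/pca.pkl", "wb") as f:
    pickle.dump(pca, f)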

3. Training the LSTM Model

python train_lstm.py <data_path>
# Example: python train_lstm.py train_data
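
The real training loop lives in train_lstm.py; as a rough mental model only, the sketch below assumes the network is an LSTM that maps per-frame audio features to the PCA mouth coefficients from step 2. The feature dimension, hidden size, and number of components are assumptions.

import torch
import torch.nn as nn

class AudioToPCA(nn.Module):
    def __init__(self, feat_dim=80, hidden_dim=256, pca_dim=20):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden_dim, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden_dim, pca_dim)

    def forward(self, audio_feats):        # (batch, time, feat_dim)
        out, _ = self.lstm(audio_feats)
        return self.head(out)              # (batch, time, pca_dim)

model = AudioToPCA()
dummy = torch.randn(2, 100, 80)            # 2 clips, 100 frames, 80-dim features
pred = model(dummy)                        # shape: (2, 100, 20)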

4. Testing Audio Accuracy

python test.py <wav_path> <ckpt_path>
# Example: python test.py D:/Code/py/test_wav/0013.wav checkpoints/audio.pkl
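
To run the accuracy test over a whole folder of 16 kHz wav files, a small convenience loop could look like the sketch below (not part of the repo; the test_wav folder name is an example).

import subprocess
import sys
from pathlib import Path

for wav in sorted(Path("test_wav").glob("*.wav")):
    # Call test.py once per clip with the trained checkpoint.
    subprocess.run(
        [sys.executable, "test.py", str(wav), "checkpoints/audio.pkl"],
        check=True,
    )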