DDSP Singing Vocoders

Authors: Da-Yi Wu*, Wen-Yi Hsiao*, Fu-Rong Yang*, Oscar Friedman, Warren Jackson, Scott Bruzenak, Yi-Wen Liu, Yi-Hsuan Yang

*equal contribution

Paper | Demo

Official PyTorch implementation of the ISMIR 2022 paper "DDSP-based Singing Vocoders: A New Subtractive-based Synthesizer and A Comprehensive Evaluation".

In this repository:

We propose SawSing, a novel singing vocoder based on a subtractive synthesizer.
We present a collection of different DDSP singing vocoders.
We demonstrate that DDSP singing vocoders have relatively small model sizes yet can generate satisfying results with limited resources (1 GPU, 3 hours of training data). We also report results for an even more stringent case: training the vocoders on only 3 minutes of recordings for only 3 hours of training time.

A. Installation

pip install -r requirements.txt

B. Dataset

Please refer to dataset.md for more details.

C. Training

Train vocoders from scratch.

Modify the configuration file ./configs/<model_name>.yaml
Run the following command:

# SawSing as an example
python main.py --config ./configs/sawsinsub.yaml \
               --stage  training \
               --model SawSinSub

Change the --model argument to try different vocoders. Five models are currently available: SawSinSub (SawSing), Sins (DDSP-Add), DWS (DWTS), Full, and SawSub. For more details, please refer to our documentation, DDSP Vocoders.
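To try several vocoders in turn, the command above can be assembled programmatically. The following is a minimal sketch, not part of this repository: `build_command` is a hypothetical helper, and it uses only the `--config`, `--stage`, and `--model` flags shown above. Config file names for models other than SawSinSub are not listed here, so the caller has to supply the path.

```python
import shlex

# Model names as listed in this README.
MODELS = ("SawSinSub", "Sins", "DWS", "Full", "SawSub")


def build_command(config, stage, model):
    """Build the argument list for main.py (hypothetical helper)."""
    if model not in MODELS:
        raise ValueError(f"unknown model {model!r}; expected one of {MODELS}")
    return [
        "python", "main.py",
        "--config", config,
        "--stage", stage,
        "--model", model,
    ]


cmd = build_command("./configs/sawsinsub.yaml", "training", "SawSinSub")
# Print a shell-quoted version of the command for inspection.
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root to launch the run.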

Our training resources: a single NVIDIA RTX 3090 Ti GPU.

D. Validation

Run validation: compute loss and real-time factor (RTF).

Modify the configuration file ./configs/<model_name>.yaml
Run the following command:

# SawSing as an example
python main.py --config ./configs/sawsinsub.yaml  \
              --stage validation \
              --model SawSinSub \
              --model_ckpt ./exp/f1-full/sawsinsub-256/ckpts/vocoder_27740_70.0_params.pt \
              --output_dir ./test_gen
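For reference, the real-time factor is commonly defined as synthesis time divided by the duration of the generated audio, so values below 1 mean faster than real time. The sketch below illustrates that definition only; it is not the measurement code used in this repository. `real_time_factor`, the `synthesize` callable, and the 24 kHz sample rate in the usage lines are assumptions for illustration.

```python
import time


def real_time_factor(synthesize, n_samples, sample_rate, n_runs=3):
    """Mean wall-clock synthesis time divided by the audio duration."""
    audio_seconds = n_samples / sample_rate
    elapsed = []
    for _ in range(n_runs):
        start = time.perf_counter()
        # On a GPU, the callable should synchronize the device before
        # returning; otherwise the timing only covers kernel launches.
        synthesize()
        elapsed.append(time.perf_counter() - start)
    return (sum(elapsed) / len(elapsed)) / audio_seconds


# Stand-in for a vocoder call: pretend synthesis takes about 10 ms
# for one second of audio (assumed 24 kHz sample rate).
rtf = real_time_factor(lambda: time.sleep(0.01), n_samples=24000, sample_rate=24000)
print(f"RTF: {rtf:.4f}")
```

Averaging over several runs reduces the effect of one-off costs such as warm-up on the first call.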

E. Inference

Synthesize audio files from existing mel-spectrograms. The code and specification for extracting mel-spectrograms can be found in preprocess.py.

# SawSing as an example
python main.py --config ./configs/sawsinsub.yaml  \
              --stage inference \
              --model SawSinSub \
              --model_ckpt ./exp/f1-full/sawsinsub-256/ckpts/vocoder_27740_70.0_params.pt \
              --input_dir  ./path/to/mel \
              --output_dir ./test_gen

F. Post-Processing

In SawSing, we found buzzing artifacts in the harmonic part signals, so we developed post-processing code to remove them. The method is simple yet effective: applying a voiced/unvoiced mask. For more details, please refer to here.
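The idea of a voiced/unvoiced mask can be sketched as follows. This is an illustrative example, not the code in the postprocessing folder: it assumes the harmonic and noise parts are available as separate waveforms, that a frame-level f0 track marks unvoiced frames with 0, and that `hop_size` is the number of samples per frame. The function name `apply_uv_mask` is hypothetical.

```python
import numpy as np


def apply_uv_mask(harmonic, noise, f0_frames, hop_size):
    """Zero the harmonic part in unvoiced frames, then add the noise part."""
    harmonic = np.asarray(harmonic, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    # Frame-level mask: 1 where f0 > 0 (voiced), 0 otherwise (unvoiced).
    voiced = (np.asarray(f0_frames) > 0).astype(np.float64)
    # Hold each frame value for hop_size samples to reach sample rate.
    mask = np.repeat(voiced, hop_size)[: len(harmonic)]
    # Treat any samples beyond the last frame as unvoiced.
    if len(mask) < len(harmonic):
        mask = np.pad(mask, (0, len(harmonic) - len(mask)))
    return harmonic * mask + noise


# Two frames of four samples each: the first unvoiced, the second voiced.
out = apply_uv_mask(np.ones(8), np.zeros(8), f0_frames=[0.0, 220.0], hop_size=4)
print(out)
```

A hard 0/1 mask can produce abrupt transitions at frame boundaries; smoothing the mask edges is one possible refinement.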

G. More Information

Checkpoints
- Sins (DDSP-Add): ./exp/f1-full/sins/ckpts/
- SawSinSub (Sawsing): ./exp/f1-full/sawsinsub-256/ckpts/
- The full experimental records, reports and checkpoints can be found under the exp folder.
Documentation
- DDSP Vocoders
- Synthesizer Design

H. Citation

@article{sawsing,
  title={DDSP-based Singing Vocoders: A New Subtractive-based Synthesizer and A Comprehensive Evaluation},
  author={Da-Yi Wu and Wen-Yi Hsiao and Fu-Rong Yang and Oscar Friedman and Warren Jackson and Scott Bruzenak and Yi-Wen Liu and Yi-Hsuan Yang},
  journal = {Proc. International Society for Music Information Retrieval},
  year    = {2022},
}
