PCS-FIR-Filter

Based on the spectral perceptual gains from the official PCS repo, this project derives the equivalent linear time-invariant (LTI) finite impulse response (FIR) filter coefficients so that Perceptual Contrast Stretching (PCS) can be performed directly on waveforms.

FIR filtering is a differentiable operation, which makes it well suited to deep learning applications that work directly on waveforms. The FIR filtering example in this project uses a PyTorch 1-D convolution layer. The derived filter coefficients (in NumPy format) can also be applied with other backends.
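As a minimal sketch of the idea above (not the code in PCS_filter.py), FIR filtering can be written as a 1-D convolution in PyTorch. The function name `fir_filter` and the 3-tap coefficients are placeholders chosen for illustration; real coefficients would come from the generated *.npy file.

```python
import numpy as np
import torch
import torch.nn.functional as F


def fir_filter(wave: torch.Tensor, coeffs: np.ndarray) -> torch.Tensor:
    """Causal FIR filtering of a (batch, time) waveform via 1-D convolution."""
    taps = torch.as_tensor(coeffs, dtype=wave.dtype, device=wave.device)
    # conv1d computes cross-correlation, so flip the taps to get true convolution.
    kernel = taps.flip(0).view(1, 1, -1)
    # Left-pad with len(taps) - 1 zeros so the output is causal and keeps the input length.
    padded = F.pad(wave.unsqueeze(1), (kernel.shape[-1] - 1, 0))
    return F.conv1d(padded, kernel).squeeze(1)


# Placeholder 3-tap filter (assumption); not the actual PCS coefficients.
coeffs = np.array([0.5, 0.3, 0.2], dtype=np.float32)
wave = torch.randn(2, 16000, requires_grad=True)
filtered = fir_filter(wave, coeffs)

# Any loss computed on the filtered waveform back-propagates to the input.
filtered.pow(2).mean().backward()
```

Because the filter is an ordinary convolution, it can sit inside a training graph, for example between a model's output waveform and its loss.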

Requirements

torch >= 1.8
torchaudio
matplotlib
Soundfile
numpy
scipy

These are listed in requirements.txt.

Usage

Filter design:

python PCS_coeffs_generate.py --mode='manual' generates the FIR filter coefficients (in *.npy format) and an impulse response plot under the directory generated_freq_response/, using the default spectral PCS coefficients.
Because the original PCS (spectral PCS) operates on log-1-p spectrograms, its nonlinearity cannot be reproduced exactly by an LTI FIR filter. PCS_coeffs_generate.py therefore provides two additional statistical filter design methods that approximate the behavior of spectral PCS:
- python PCS_coeffs_generate.py --mode='statistical' --stat_mode='gaussian' measures and approximates the equivalent LTI impulse response of spectral PCS using Gaussian signals of varying standard deviations.
- python PCS_coeffs_generate.py --mode='statistical' --stat_mode='wav' --wav_dir='*' measures and approximates the equivalent LTI impulse response of spectral PCS using the .wav files placed in wav_dir.
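The Gaussian-based statistical approach can be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation in PCS_coeffs_generate.py: the frame size, the per-bin gains in `pcs_gains`, the form of `spectral_pcs_mag` (scale the log-1-p magnitude, then invert with expm1), and the frequency-sampling design in `gains_to_fir` are all assumed for the example.

```python
import numpy as np

N_FFT = 512                      # assumed frame size
N_BINS = N_FFT // 2 + 1
rng = np.random.default_rng(0)

# Hypothetical per-bin PCS gains; the real values come from the official PCS repo.
pcs_gains = np.linspace(1.0, 1.2, N_BINS)


def spectral_pcs_mag(mag: np.ndarray) -> np.ndarray:
    """Assumed form of spectral PCS: scale the log1p magnitude, then invert with expm1."""
    return np.expm1(pcs_gains * np.log1p(mag))


def measure_equivalent_gain(stds, n_frames: int = 200) -> np.ndarray:
    """Average the per-bin output/input magnitude ratio over windowed Gaussian frames."""
    ratios = []
    for std in stds:
        frames = rng.normal(0.0, std, size=(n_frames, N_FFT)) * np.hanning(N_FFT)
        mag = np.abs(np.fft.rfft(frames, axis=-1))
        ratios.append(np.mean(spectral_pcs_mag(mag) / (mag + 1e-12), axis=0))
    return np.mean(ratios, axis=0)


def gains_to_fir(gains: np.ndarray) -> np.ndarray:
    """Frequency-sampling design: zero-phase inverse FFT, then a circular shift."""
    h = np.fft.irfft(gains, n=N_FFT)      # symmetric around index 0
    return np.roll(h, N_FFT // 2)         # linear phase, centered at tap N_FFT // 2


eq_gain = measure_equivalent_gain(stds=[0.01, 0.03, 0.1])
fir = gains_to_fir(eq_gain)
```

The measured gain depends on the signal level because the log-1-p nonlinearity is level dependent, which is why the averaging runs over several standard deviations. The wav-based mode follows the same idea with frames taken from audio files instead of Gaussian noise.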

FIR Filtering with wave-PCS:

python test_PCS_wave.py performs wave-PCS with the FIR filter coefficients derived by PCS_coeffs_generate.py and outputs the filtered audio.
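Outside PyTorch, the same filtering step needs only NumPy. The sketch below is illustrative and is not taken from test_PCS_wave.py: the helper `wave_pcs` and the 3-tap coefficients are placeholders, and the .npy round trip happens in memory so no file path has to be assumed.

```python
import io

import numpy as np

# Placeholder coefficients (assumption); real ones come from the generated *.npy file.
coeffs = np.array([0.5, 0.3, 0.2])

# Round-trip through the .npy format in memory so the sketch needs no files on disk.
buffer = io.BytesIO()
np.save(buffer, coeffs)
buffer.seek(0)
loaded = np.load(buffer)


def wave_pcs(wave: np.ndarray, taps: np.ndarray) -> np.ndarray:
    """Causal FIR filtering that keeps the input length."""
    return np.convolve(wave, taps)[: len(wave)]


# Stand-in for a mono waveform; real audio would be read with soundfile or torchaudio.
wave = np.random.default_rng(0).standard_normal(16000)
filtered = wave_pcs(wave, loaded)
```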

Quick comparison to spectral PCS:

python test_PCS_spectral.py performs spectral PCS with the official repo's PCS functions. It is provided so that the output of FIR wave-PCS can be compared against the original spectral PCS.

Example Results

Frequency response of the FIR filter coefficients derived from the default PCS settings with GAIN_SMOOTHING = 0.2:

- Spectra comparison before and after PCS:

Frequency response of the FIR filter coefficients derived with the wav-based statistical method using the Mpop600 Mandarin singing voice dataset:

- Spectra comparison before and after PCS:

Reference

The official repo of PCS (https://github.com/RoyChao19477/PCS).
The original PCS paper: Rong Chao, Cheng Yu, Szu-Wei Fu, Xugang Lu, Yu Tsao, "Perceptual Contrast Stretching on Target Feature for Speech Enhancement," (http://arxiv.org/abs/2203.17152)
Mpop600 Mandarin singing voice dataset: C. -C. Chu, F. -R. Yang, Y. -J. Lee, Y. -W. Liu and S. -H. Wu, "MPop600: A Mandarin Popular Song Database with Aligned Audio, Lyrics, and Musical Scores for Singing Voice Synthesis," 2020 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2020, pp. 1647-1652.
