Description
In the example for mel_to_stft we have the following code:
y, sr = librosa.load(librosa.ex('trumpet'))
S = np.abs(librosa.stft(y))
mel_spec = librosa.feature.melspectrogram(S=S, sr=sr)
S_inv = librosa.feature.inverse.mel_to_stft(mel_spec, sr=sr)
S is the magnitude, so mel_spec is a mel magnitude spectrogram (not a mel power spectrogram). However, the code for librosa.feature.inverse.mel_to_stft does the following:
M : np.ndarray [shape=(..., n_mels, n), non-negative]
The spectrogram as produced by `feature.melspectrogram`
inverse = nnls(mel_basis, M)
return np.power(inverse, 1.0 / power, out=inverse)
The power parameter is not passed in the example, so it takes its default value, power = 2.0. This would be correct if the input were a power spectrogram, which is not the case here.
My understanding is that S_inv is now approximately the square root of S. However, the example then goes on to compare the two as if they should be equal apart from the approximation error.
IMO we either need to convert the magnitude into power with S = np.abs(librosa.stft(y))**2 or pass the power parameter, as in S_inv = librosa.feature.inverse.mel_to_stft(mel_spec, sr=sr, power=1).
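
To make the mismatch concrete without depending on the audio file, here is a minimal NumPy-only sketch. It is an illustration under assumptions, not librosa's implementation: the "mel basis" is a small, square, well-conditioned toy matrix, the nnls step is replaced by an exact linear solve, and toy_mel_to_stft is a hypothetical stand-in that only mimics the two quoted lines (invert, then raise to 1.0 / power).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins (assumptions, not librosa objects): a square, diagonally
# dominant non-negative "mel basis" and a strictly positive magnitude spectrogram.
n_bins, n_frames = 8, 5
mel_basis = np.eye(n_bins) + 0.1 * rng.random((n_bins, n_bins))
S = rng.random((n_bins, n_frames)) + 0.1  # plays the role of np.abs(stft)


def toy_mel_to_stft(M, basis, power=2.0):
    """Hypothetical simplification of the quoted code path:
    invert the basis (exact solve instead of nnls), then apply 1/power."""
    inverse = np.linalg.solve(basis, M)
    inverse = np.clip(inverse, 0, None)  # keep the result non-negative
    return np.power(inverse, 1.0 / power)


# As in the docs example: mel spectrogram built from the *magnitude*.
mel_spec = mel_basis @ S

# Default power=2.0 on a magnitude input yields sqrt(S), not S.
S_inv_default = toy_mel_to_stft(mel_spec, mel_basis)

# Proposed fix 2: tell the inverse that the input is a magnitude spectrogram.
S_inv_power1 = toy_mel_to_stft(mel_spec, mel_basis, power=1.0)

# Proposed fix 1: build the mel spectrogram from the power spectrogram instead.
S_inv_from_power = toy_mel_to_stft(mel_basis @ S**2, mel_basis)

print(np.allclose(S_inv_default, np.sqrt(S)))  # default path matches sqrt(S)
print(np.allclose(S_inv_power1, S))            # power=1 recovers the magnitude
print(np.allclose(S_inv_from_power, S))        # power input recovers the magnitude
```

In this toy setting both proposed fixes recover the magnitude. Note that with the first fix the recovered array corresponds to the magnitude, so, if I read the quoted code correctly, the comparison in the example would have to be made against np.abs(librosa.stft(y)) rather than against its square.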