Inverse Mel-transform - Not getting original audio back #2541

### 🐛 Describe the bug

I am computing a Mel-spectrogram of an audio signal and then applying the inverse operation to its output, to check that I get the original audio back.  
`assert orig_audio == inverse_mel_spectrogram(mel_spectrogram(orig_audio))`  # Not actual code, just for illustration.
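
Since the signals are floating point, the comparison would in practice be a tolerance check rather than `==`. As a baseline, here is a minimal sketch of the STFT/ISTFT round trip on its own, with no mel step. The random signal, `n_fft=512`, `hop=128` and the Hann window are assumptions chosen for illustration, not my actual settings.

```python
import torch

torch.manual_seed(0)
# Placeholder signal (assumption): 1 second of noise at 16 kHz.
orig_audio = torch.randn(16000, dtype=torch.float64)

n_fft, hop = 512, 128  # illustrative values only
window = torch.hann_window(n_fft, dtype=torch.float64)

# Forward STFT, keeping the complex output (magnitude and phase).
spec = torch.stft(orig_audio, n_fft, hop_length=hop, window=window,
                  return_complex=True)

# Inverse STFT with the same parameters; `length` trims to the input size.
recon = torch.istft(spec, n_fft, hop_length=hop, window=window,
                    length=orig_audio.shape[-1])

# Compare with a tolerance instead of exact equality.
max_err = (recon - orig_audio).abs().max().item()
print(f"max abs error of STFT -> ISTFT: {max_err:.2e}")
```

If this part already reconstructs the signal within tolerance, any remaining mismatch would have to come from the mel-scale and inverse mel-scale steps.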

I use torch's standard [STFT](https://pytorch.org/docs/stable/generated/torch.stft.html) and [mel-scale](https://pytorch.org/audio/stable/transforms.html#melscale) transforms to get the mel-spectrogram. For the inverse mel-spectrogram, however, I can't find an approach in torch similar to Librosa's [mel_to_stft](https://github.com/librosa/librosa/blob/25538adb3aed3485a06e60b6dad88be3d540f0c2/librosa/feature/inverse.py#L20).    
The reason I am looking at [Librosa's](https://github.com/librosa/librosa/blob/25538adb3aed3485a06e60b6dad88be3d540f0c2/librosa/feature/inverse.py#L20) implementation instead of torch's offerings such as [InverseMelScale](https://pytorch.org/audio/stable/transforms.html#inversemelscale) is that   
- Librosa is cheaper (I guess it uses L-BFGS)  
- I don't need an accurate estimate of the phase, because I already have the complex STFT output (the original magnitude and phase). For my use case, all I care about is converting from mel scale back to linear scale. Hence, I am trying to avoid costlier methods such as [SGD](https://pytorch.org/audio/stable/transforms.html#inversemelscale) or [Griffin-Lim](https://pytorch.org/audio/stable/transforms.html#griffinlim). 

I tried to port Librosa's mel_to_stft code to torch, but I am not getting the original audio signal back. Librosa's nnls helper uses [np.linalg.lstsq](https://librosa.org/doc/0.7.2/_modules/librosa/util/_nnls.html#nnls); the PyTorch equivalent is [torch.linalg.lstsq()](https://pytorch.org/docs/stable/generated/torch.linalg.lstsq.html#torch.linalg.lstsq). The code I tried is available in this [colab notebook](https://colab.research.google.com/drive/10Iex_6WlQfEiIzT4oIZ0M_AocA_mv5rf?usp=sharing).
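
For readers without access to the notebook, here is a minimal sketch of the least-squares idea. It is not the notebook code and not Librosa's implementation. `triangular_mel_fbanks` and `inverse_mel_lstsq` are hypothetical helpers, the filterbank is a simple hand-rolled stand-in, and clamping the solution to zero only roughly approximates a true non-negative solve. The `gelsd` driver is used on the assumption that the tensors live on the CPU.

```python
import math
import torch

def triangular_mel_fbanks(n_freqs, n_mels, sample_rate):
    """Hypothetical stand-in filterbank, shape (n_freqs, n_mels)."""
    max_mel = 2595.0 * math.log10(1.0 + (sample_rate / 2) / 700.0)
    mel_pts = torch.linspace(0.0, max_mel, n_mels + 2, dtype=torch.float64)
    hz_pts = 700.0 * (torch.pow(10.0, mel_pts / 2595.0) - 1.0)  # band edges in Hz
    freqs = torch.linspace(0.0, sample_rate / 2, n_freqs, dtype=torch.float64)
    rising = (freqs[:, None] - hz_pts[None, :-2]) / (hz_pts[1:-1] - hz_pts[:-2])
    falling = (hz_pts[None, 2:] - freqs[:, None]) / (hz_pts[2:] - hz_pts[1:-1])
    return torch.minimum(rising, falling).clamp(min=0.0)

def inverse_mel_lstsq(mel_spec, fbanks, clamp=True):
    """Solve fbanks.T @ X = mel_spec for X in the least-squares sense.

    mel_spec: (n_mels, frames), fbanks: (n_freqs, n_mels) -> X: (n_freqs, frames)
    """
    A = fbanks.T.contiguous()  # (n_mels, n_freqs)
    # 'gelsd' is a CPU-only driver that also copes with rank-deficient A.
    sol = torch.linalg.lstsq(A, mel_spec, driver="gelsd").solution
    # Clamping is a crude substitute for a proper non-negative solver.
    return sol.clamp(min=0.0) if clamp else sol

torch.manual_seed(0)
n_freqs, n_mels, frames = 257, 64, 10  # illustrative sizes only
fb = triangular_mel_fbanks(n_freqs, n_mels, sample_rate=16000)
spec = torch.rand(n_freqs, frames, dtype=torch.float64)  # placeholder magnitudes
mel = fb.T @ spec

raw = inverse_mel_lstsq(mel, fb, clamp=False)
spec_hat = inverse_mel_lstsq(mel, fb)

mel_residual = (fb.T @ raw - mel).abs().max().item()
lin_error = (spec_hat - spec).abs().max().item()
print(f"mel-domain residual: {mel_residual:.2e}, linear-domain error: {lin_error:.2e}")
```

One thing this sketch makes visible: with fewer mel bands than frequency bins, the system is underdetermined, so a least-squares solution can match the mel-spectrogram closely while still differing from the original linear spectrogram bin by bin.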

Please let me know if I have made a mistake in [the notebook](https://colab.research.google.com/drive/10Iex_6WlQfEiIzT4oIZ0M_AocA_mv5rf?usp=sharing); I will update it and raise a PR.   

Why am I doing all this?   
I work on speech enhancement, so I need to compute mel features from the audio signal, pass them to my model, get a TF mask, multiply the mask with the mel features, apply the inverse mel scale, and reconstruct the audio with ISTFT to get the denoised signal.
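
The pipeline above can be sketched roughly as follows. Everything here is an assumption for illustration: `enhance` and `mask_fn` are hypothetical names, the random non-negative matrix only stands in for a real mel filterbank, the all-ones mask stands in for the model, and the clamped pseudo-inverse is a placeholder for whichever inverse mel step ends up being used.

```python
import torch

def enhance(audio, fbanks, mask_fn, n_fft=512, hop=128):
    """Sketch: STFT -> mel -> mask -> inverse mel -> ISTFT with original phase."""
    window = torch.hann_window(n_fft, dtype=audio.dtype)
    stft = torch.stft(audio, n_fft, hop_length=hop, window=window,
                      return_complex=True)          # (n_freqs, frames)
    mag, phase = stft.abs(), stft.angle()

    mel = fbanks.T @ mag                            # (n_mels, frames)
    masked_mel = mask_fn(mel) * mel                 # apply the TF mask

    # Placeholder inverse mel step (assumption): minimum-norm pseudo-inverse,
    # clamped so the estimated magnitudes stay non-negative.
    mag_hat = (torch.linalg.pinv(fbanks.T) @ masked_mel).clamp(min=0.0)

    # Reuse the original phase instead of estimating a new one.
    stft_hat = torch.polar(mag_hat, phase)
    return torch.istft(stft_hat, n_fft, hop_length=hop, window=window,
                       length=audio.shape[-1])

torch.manual_seed(0)
audio = torch.randn(16000, dtype=torch.float64)     # placeholder noisy signal
n_fft, n_mels = 512, 40
# Placeholder for a real mel filterbank, shape (n_freqs, n_mels).
fbanks = torch.rand(n_fft // 2 + 1, n_mels, dtype=torch.float64)

# Hypothetical model: a mask of ones, i.e. no enhancement at all.
denoised = enhance(audio, fbanks, mask_fn=torch.ones_like, n_fft=n_fft)
print(denoised.shape)
```

Because the phase comes straight from the original STFT, the only lossy step in this sketch is the mel to linear conversion.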


### Versions

torch 1.11.0+cu115
torchaudio 0.11.0+cu115
