Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

recipe for Multi-Decoder DPRNN #463

Merged
merged 57 commits into from
Jul 30, 2021
Merged

recipe for Multi-Decoder DPRNN #463

merged 57 commits into from
Jul 30, 2021

Conversation

JunzheJosephZhu
Copy link
Contributor

Don't have the bash script and separate function yet

@JunzheJosephZhu
Copy link
Contributor Author

I just finished writing the eval script. Now only bash script is left. (I don't know how to write bash, btw, so I'd be nice if I could get some help)

@JunzheJosephZhu
Copy link
Contributor Author

Also, I'm not entirely sure how to share models on huggingface. Can someone send me a tutorial somewhere? I currently have both .ckpt and .pth

@mpariente
Copy link
Collaborator

You just need to create a repo there, pull, add your model, change the readme and push.
You can rename the last/best model to pytorch.bin and then fill in the README with some tags on the top. Have a look at this for example.

Thanks for your work on this recipe

Copy link
Collaborator

@mpariente mpariente left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. Again, some changes before it can be merged.

Comment on lines 34 to 35
train_dir: "../../../dataset/{}speakers/wav8k/min/tr"
valid_dir: "../../../dataset/{}speakers/wav8k/min/cv"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I said this before, but relative path in the config file doesn't seem right.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. I'll change this to data/{}speakers/wav8k/min/tr. I just wrote relative paths because how it was on my system

egs/wsj0-mix-var/Multi-Decoder-DPRNN/run.sh Show resolved Hide resolved

if [[ $stage -le 3 ]]; then
echo "Stage 3: Training"
echo "If you want to change n_srcs, please change the config file"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Printing this every time doesn't seem useful.
Also, why n_srcs is in the file if you can't change it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think bash supports adding an integer list as argument, it just gets processed as strings automatically.

@@ -0,0 +1,214 @@
import os
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clean unused imports

self.log("avg_sdr", avg_sdr)


class WeightedPITLoss(nn.Module):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a docstring and some comments for understanding what this does on the high level?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this fixed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you can see the docstring above the init line.

egs/wsj0-mix-var/Multi-Decoder-DPRNN/eval.py Outdated Show resolved Hide resolved
@JunzheJosephZhu
Copy link
Contributor Author

I fixed the stuff you mentioned, could we pass this PR now?

@mpariente
Copy link
Collaborator

Have you managed to share a model on HF Hub?

Copy link
Collaborator

@mpariente mpariente left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are getting close, thanks again for your dedication.

est_sources = model.separate(mix[None])
p_si_snr = Penalized_PIT_Wrapper(pairwise_neg_sisdr_loose)(est_sources, sources)
utt_metrics = {
"P-Si-SNR": p_si_snr.item(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's P-Si-SNR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's defined in the paper.

p_si_snr = Penalized_PIT_Wrapper(pairwise_neg_sisdr_loose)(est_sources, sources)
utt_metrics = {
"P-Si-SNR": p_si_snr.item(),
"Accuracy": float(sources.size(0) == est_sources.size(0)),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is a counting accuracy, then counting_accuracy is better.

from scipy.optimize import linear_sum_assignment


class PairwiseNegSDR_Loose(_Loss):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Loss intead of Loose

self.take_log = take_log
self.EPS = EPS

def forward(self, est_targets, targets):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way not to duplicate all this code, but subclass it, with minor modif.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main problem is the line assert targets.size() == est_targets.size() in your original code, as it does not support variable number of speakers.

@JunzheJosephZhu
Copy link
Contributor Author

Do we have any further issue to fix, or should we pass this PR?

@mpariente
Copy link
Collaborator

We're close.

Have you managed to share a model on HF Hub?

@JunzheJosephZhu
Copy link
Contributor Author

Hi Pariente,
Sorry for the late reply. I added the model to huggingface.
https://huggingface.co/JunzheJosephZhu/MultiDecoderDPRNN

@mpariente mpariente merged commit 98d1241 into asteroid-team:master Jul 30, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants