[ready]show dominant parameters #705

Merged 3 commits on Nov 29, 2022

glynpu (Collaborator) commented Nov 26, 2022

Fixes issue #697.

The main idea is to pass the names of the parameters in each batch through the functions that process them.

An option "--show-dominant-parameters" is added: set it to True for debugging and False for normal use.
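A boolean flag of this kind could be declared roughly as follows. This is a minimal sketch, not the code from this PR: the `str2bool` helper and the `get_parser` function are defined here purely for illustration, and the default of False is an assumption based on the description above.

```python
import argparse


def str2bool(value: str) -> bool:
    """Parse common textual booleans (hypothetical helper for this sketch)."""
    lowered = value.lower()
    if lowered in ("true", "t", "yes", "y", "1"):
        return True
    if lowered in ("false", "f", "no", "n", "0"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def get_parser() -> argparse.ArgumentParser:
    """Build a parser that exposes only the debugging flag."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--show-dominant-parameters",
        type=str2bool,
        default=False,  # assumed default: off for normal training runs
        help="If true, log which parameter dominates the gradient statistics.",
    )
    return parser
```

With this sketch, `get_parser().parse_args(["--show-dominant-parameters", "True"])` yields a namespace whose `show_dominant_parameters` attribute is `True`.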

param_rms is updated with p in the function ScaledAdam::_show_gradient_dominating_parameter.

Does this design meet the requirement? @danpovey


for p in param_group:
for p, named_p in zip(param_group, group_params_names):
Collaborator

What if group_params_names is None?

Collaborator Author

It cannot be None. The most important part of this PR is tracking the name of each parameter.

Collaborator

The default value was None.
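The concern in this thread is concrete: if the names argument were left at None, `zip(param_group, None)` would raise a TypeError, because None is not iterable. One possible way to guard against that is sketched below. The `resolve_param_names` helper and the placeholder naming scheme are hypothetical and are not taken from this PR; the other option, suggested by the author's reply, is simply to make the names a required argument.

```python
from typing import List, Optional, Sequence


def resolve_param_names(
    params: Sequence[object], names: Optional[Sequence[str]] = None
) -> List[str]:
    """Return one name per parameter, inventing placeholders if none are given."""
    if names is None:
        # Assumption: placeholder names are acceptable when the caller has none.
        return [f"param_{i}" for i in range(len(params))]
    if len(names) != len(params):
        # zip() would silently truncate to the shorter input, so fail loudly.
        raise ValueError(f"got {len(names)} names for {len(params)} parameters")
    return list(names)


# Iterate over parameters together with their names, as in the diff above.
params = [0.1, 0.2]  # stand-ins for tensors in this sketch
for p, named_p in zip(params, resolve_param_names(params)):
    print(named_p, p)
```

Raising on a length mismatch is a deliberate choice here: a silently truncated `zip` would attach diagnostics to the wrong parameters or skip some entirely.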

glynpu (Collaborator, Author) commented Nov 28, 2022

Example log output from the command grep Dom log-train-2022-11-28-17-33-21:

2022-11-28 17:40:31,158 INFO [optim.py:451] Parameter Dominanting tot_sumsq encoder.encoders.2.out_combiner.weight1 with proportion 0.52, where dominant_sumsq=(grad_sumsq*orig_rms_sq)=1.601e+09, grad_sumsq = 1.601e+09, orig_rms_sq=1.000e+00
2022-11-28 17:40:40,443 INFO [optim.py:451] Parameter Dominanting tot_sumsq encoder.encoders.1.out_combiner.weight1 with proportion 0.98, where dominant_sumsq=(grad_sumsq*orig_rms_sq)=8.300e+06, grad_sumsq = 8.300e+06, orig_rms_sq=1.000e+00
2022-11-28 17:40:41,284 INFO [optim.py:451] Parameter Dominanting tot_sumsq encoder.encoder_embed.conv.0.weight with proportion 0.35, where dominant_sumsq=(grad_sumsq*orig_rms_sq)=6.137e+09, grad_sumsq = 3.982e+10, orig_rms_sq=1.541e-01
2022-11-28 17:40:51,205 INFO [optim.py:451] Parameter Dominanting tot_sumsq encoder.encoder_embed.conv.0.weight with proportion 0.44, where dominant_sumsq=(grad_sumsq*orig_rms_sq)=1.116e+07, grad_sumsq = 4.690e+07, orig_rms_sq=2.380e-01
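The fields in these log lines can be checked by hand: in the third line, 3.982e+10 * 1.541e-01 is about 6.136e+09, which matches the logged dominant_sumsq of 6.137e+09 up to rounding of the printed inputs. The sketch below reproduces that arithmetic in plain Python. It assumes that tot_sumsq is the sum of grad_sumsq * orig_rms_sq over all parameters and that the proportion is the largest contribution divided by that total; the actual code in optim.py operates on tensors and may differ. The `find_dominant` function and the parameter names in the example are hypothetical.

```python
from typing import Dict, Tuple


def find_dominant(
    stats: Dict[str, Tuple[float, float]]
) -> Tuple[str, float, float]:
    """Find the parameter contributing most to the total.

    stats maps a parameter name to (grad_sumsq, orig_rms_sq).
    Returns (name, proportion, dominant_sumsq).
    """
    # Per-parameter contribution, mirroring the logged formula
    # dominant_sumsq = grad_sumsq * orig_rms_sq.
    contributions = {name: g * r for name, (g, r) in stats.items()}
    tot_sumsq = sum(contributions.values())
    name = max(contributions, key=contributions.get)
    dominant_sumsq = contributions[name]
    # Guard against division by zero when all gradients are zero.
    proportion = dominant_sumsq / tot_sumsq if tot_sumsq > 0 else 0.0
    return name, proportion, dominant_sumsq


# Made-up statistics for two hypothetical parameters.
stats = {"layer_a.weight": (8.0, 1.0), "layer_b.weight": (40.0, 0.05)}
name, proportion, dominant_sumsq = find_dominant(stats)
print(f"{name} proportion={proportion:.2f} dominant_sumsq={dominant_sumsq:.3e}")
```

Note that the parameter with the largest raw grad_sumsq is not necessarily the dominant one: in the example, layer_b.weight has the larger gradient sum of squares, but its small orig_rms_sq scales its contribution down.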

glynpu changed the title from "[draft]show dominant parameters" to "[ready]show dominant parameters" on Nov 29, 2022
danpovey merged commit 1d5c03f into k2-fsa:master on Nov 29, 2022
3 participants