
Fix missing scale attributes for GPTJ #3256

Merged 7 commits into master from cholmes/gptj-weight-scale-fix on Apr 21, 2023

Conversation

@cmikeh2 (Contributor) commented Apr 15, 2023

This PR fixes two regressions introduced in the DeepSpeed-Chat release for GPT-J:

  1. Checks for the scale attribute on all parameters before accessing it (a minimal sketch follows this list).
  2. Changes the workspace offsets to avoid a scenario where we double-use a buffer and overwrite data.
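
A minimal sketch of the kind of guard described in item 1, assuming quantized weights carry a `scale` attribute while fp16/fp32 weights do not; `get_weight_scale` is a hypothetical helper, not the PR's exact code:

```python
import torch

def get_weight_scale(weight, default=None):
    # Hypothetical helper: check for the `scale` attribute instead of
    # assuming every parameter was quantized and always carries one.
    return getattr(weight, "scale", default)

# Usage: only dequantize when a scale is actually attached to the weight.
w = torch.nn.Parameter(torch.randn(8, 8))
scale = get_weight_scale(w)
dequantized = w.data * scale if scale is not None else w.data
```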

@mrwyattii mrwyattii enabled auto-merge (squash) April 20, 2023 21:59
@mrwyattii mrwyattii disabled auto-merge April 20, 2023 23:36
@mrwyattii mrwyattii enabled auto-merge (squash) April 21, 2023 00:00
@jeffra jeffra disabled auto-merge April 21, 2023 00:33
@jeffra jeffra merged commit 145c3a7 into master Apr 21, 2023
@jeffra jeffra deleted the cholmes/gptj-weight-scale-fix branch April 21, 2023 00:38
@conglongli conglongli added and later removed the deepspeed-chat (Related to DeepSpeed-Chat) label Apr 30, 2023
@heroes999 commented Sep 15, 2023

@cmikeh2 I'm now using TP=8 with Hugging Face to run LLaMA inference, and I found a memory overlap between the input and output buffers. I think it is due to the workspace offset? I see it is now 4 * buf_size, but that may not be enough for TP=8?
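
One rough way to confirm a suspected overlap like this is to compare the raw byte ranges of the input and output tensors carved out of the shared workspace; `buffers_overlap` below is a hypothetical diagnostic helper, not part of DeepSpeed or this PR, and it assumes both tensors are contiguous views:

```python
import torch

def buffers_overlap(a: torch.Tensor, b: torch.Tensor) -> bool:
    # Hypothetical diagnostic: two tensors allocated from the same
    # workspace should occupy disjoint byte ranges; an intersection
    # would explain the output clobbering the input.
    a_start = a.data_ptr()
    a_end = a_start + a.numel() * a.element_size()
    b_start = b.data_ptr()
    b_end = b_start + b.numel() * b.element_size()
    return a_start < b_end and b_start < a_end
```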
