
Fix missing scale attributes for GPTJ #3256

Merged 7 commits into master from cholmes/gptj-weight-scale-fix on Apr 21, 2023

Conversation

@cmikeh2 (Contributor) commented Apr 15, 2023

This PR fixes two regressions introduced in the DeepSpeed-Chat release for GPT-J:

  1. Checks for the scale attribute on all parameters before accessing it (a minimal sketch follows this list).
  2. Changes the workspace offsets to avoid a scenario where we double-use a buffer and overwrite data.
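
A minimal sketch of the kind of guard described in item 1, assuming quantized weights carry a `scale` attribute while fp16/fp32 weights do not; `get_weight_scale` is a hypothetical helper, not the PR's exact code:

```python
import torch

def get_weight_scale(weight, default=None):
    # Hypothetical helper: check for the `scale` attribute instead of
    # assuming every parameter was quantized and always carries one.
    return getattr(weight, "scale", default)

# Usage: only dequantize when a scale is actually attached to the weight.
w = torch.nn.Parameter(torch.randn(8, 8))
scale = get_weight_scale(w)
dequantized = w.data * scale if scale is not None else w.data
```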

@mrwyattii mrwyattii enabled auto-merge (squash) April 20, 2023 21:59
@mrwyattii mrwyattii disabled auto-merge April 20, 2023 23:36
@mrwyattii mrwyattii enabled auto-merge (squash) April 21, 2023 00:00
@jeffra jeffra disabled auto-merge April 21, 2023 00:33
@jeffra jeffra merged commit 145c3a7 into master Apr 21, 2023
@jeffra jeffra deleted the cholmes/gptj-weight-scale-fix branch April 21, 2023 00:38
@conglongli conglongli added and later removed the deepspeed-chat (Related to DeepSpeed-Chat) label Apr 30, 2023
@heroes999 commented Sep 15, 2023

@cmikeh2 I'm now using TP=8 with Hugging Face to run LLaMA inference, and I found a memory overlap between the input and output buffers. I think it is due to the workspace offset? I see it is now 4 * buf_size, but that may not be enough for TP=8?
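
One rough way to confirm a suspected overlap like this is to compare the raw byte ranges of the input and output tensors carved out of the shared workspace; `buffers_overlap` below is a hypothetical diagnostic helper, not part of DeepSpeed or this PR, and it assumes both tensors are contiguous views:

```python
import torch

def buffers_overlap(a: torch.Tensor, b: torch.Tensor) -> bool:
    # Hypothetical diagnostic: two tensors allocated from the same
    # workspace should occupy disjoint byte ranges; an intersection
    # would explain the output clobbering the input.
    a_start = a.data_ptr()
    a_end = a_start + a.numel() * a.element_size()
    b_start = b.data_ptr()
    b_end = b_start + b.numel() * b.element_size()
    return a_start < b_end and b_start < a_end
```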
