Hotfix/Colossalai layers #92
Conversation
kurisusnowdeng commented Dec 28, 2021 (edited)
- Optimized 1D layer APIs
- Reorganized `nn.layer` modules
- Added a Colossalai version of the dropout layer
- Fixed tests
Force-pushed from 56d6523 to cf6b1e3
Hi @kurisusnowdeng , I have reviewed the code and found some problems.
colossalai/engine/schedule/_base_schedule.py

- `split_batch` only splits the batch dimension. How can the user split other dimensions? e.g. the user needs to split the last dimension when using 2D parallelism.
- The current implementation assumes that the batch data consists of a single input and a single output, and does not support multiple inputs and outputs like those in BERT. If the user wants to use BERT inputs and outputs, they have to build a new schedule class and override the `load_batch` method. I think this might cause inconvenience for the user.
- @ver217 is updating the `load_batch` function as well; you might want to talk to him to avoid conflicts.
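To illustrate the first two points, here is a minimal sketch (not the actual Colossalai API; nested Python lists stand in for tensors) of a `split_batch` that takes a configurable split dimension and accepts tuple- or dict-shaped batches, so BERT-style multi-field inputs would not require a new schedule class:

```python
# Illustrative sketch only -- NOT the actual Colossalai API.  Nested
# Python lists stand in for tensors; real code would use
# tensor.chunk(world_size, dim=dim)[rank] instead of list slicing.

def shard(seq, rank, world_size, dim=0):
    """Return this rank's shard of a nested-list 'tensor' along `dim`."""
    if dim == 0:
        size = len(seq) // world_size
        return seq[rank * size:(rank + 1) * size]
    # recurse one level down until we reach the target dimension
    return [shard(sub, rank, world_size, dim - 1) for sub in seq]

def split_batch(batch, rank, world_size, dim=0):
    """Split every tensor in a batch along `dim`.

    Accepts a single tensor, a tuple of tensors, or a dict of tensors
    (the dict case covers multi-field batches such as BERT's
    input_ids / attention_mask / labels).
    """
    if isinstance(batch, dict):
        return {k: split_batch(v, rank, world_size, dim) for k, v in batch.items()}
    if isinstance(batch, tuple):
        return tuple(split_batch(t, rank, world_size, dim) for t in batch)
    return shard(batch, rank, world_size, dim)
```

With `dim=-1`-style control exposed at the schedule level, a 2D-parallel user could shard the last dimension without subclassing the schedule.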
colossalai/nn/layer/parallel_2p5d/_operation.py

- In lines 123 and 125, we need to call `contiguous()` on the reshaped tensors.
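For context: a transposed or permuted tensor is a strided view over the same memory, so later `view`/reshape calls can fail or silently misread the layout until the data is made contiguous. NumPy exposes the same notion, so the effect can be sketched without PyTorch (in PyTorch the fix is calling `.contiguous()` on the result):

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
b = a.T                          # transposed view: same memory, strided access
print(b.flags['C_CONTIGUOUS'])   # False -- the view is not row-major

c = np.ascontiguousarray(b)      # analogous to tensor.contiguous() in PyTorch
print(c.flags['C_CONTIGUOUS'])   # True -- data copied into row-major order
```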
colossalai/nn/layer/parallel_1d/layers.py

- In line 382, `conditional_context` is still used. I would suggest using a simple if statement, as suggested in the last comment. Do look out for similar cases too.
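The suggested change might look like the following sketch (the `seed_guard` context manager is a hypothetical stand-in for whatever `conditional_context` wraps, e.g. an RNG-state guard):

```python
from contextlib import contextmanager

events = []

@contextmanager
def seed_guard():
    # hypothetical stand-in for the wrapped context (e.g. RNG state)
    events.append("enter")
    yield
    events.append("exit")

# Before (the pattern under review, roughly):
#     with conditional_context(seed_guard(), enable=flag):
#         result = do_work_inner()
#
# After -- a plain if statement, which reads more directly:
def do_work(flag):
    if flag:
        with seed_guard():
            return "guarded"
    return "plain"
```

The explicit branch makes it obvious at the call site which path enters the context, at the cost of a little duplication.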
colossalai/nn/layer/parallel_1d/layers.py

- In line 48, if `gather_output` is True for `Linear1D_Col`, the argument passed to `set_parallel_input` should be False, not True.