
Hotfix/Colossalai layers #92

Merged · 3 commits merged into hpcaitech:main on Dec 29, 2021

Conversation

@kurisusnowdeng (Member) commented Dec 28, 2021

  • Optimized 1D layer APIs
  • Reorganized nn.layer modules
  • Added a ColossalAI version of the dropout layer
  • Fixed tests

@FrankLeeeee (Contributor) commented:

colossalai/nn/layer/colossalai_layer/dropout.py

  1. Why do we need to import these modules in lines 7-13?

colossalai/nn/layer/parallel_1d/_utils.py

  1. In set_parallel_input, the environment variable is set to either an empty string or true, which seems a bit strange to me (see the sketch below).
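A minimal sketch of the pattern being questioned, assuming the flag is stored in an environment variable; the name PARALLEL_INPUT_1D and both signatures are illustrative, not the PR's exact code:

```python
import os

def set_parallel_input(input_is_parallel: bool) -> None:
    # Writes 'true' or an empty string, as the reviewer observed. An empty
    # string is falsy, so the round trip works, but explicit 'true'/'false'
    # values would be less surprising to read.
    os.environ['PARALLEL_INPUT_1D'] = 'true' if input_is_parallel else ''

def get_parallel_input() -> bool:
    return bool(os.environ.get('PARALLEL_INPUT_1D', ''))
```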

colossalai/nn/layer/parallel_1d/layers.py

  1. In line 68, since the input argument is the same for both the column and row linear layers, we can just write def forward(self, input_). Using *args does not help with linting (see the sketch below).
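A short sketch of the suggestion (class body elided; Linear1D_Col stands in for both layers):

```python
import torch
from torch import Tensor

class Linear1D_Col(torch.nn.Module):
    # Naming the parameter explicitly lets linters and IDEs check call
    # sites, which `def forward(self, *args)` cannot do.
    def forward(self, input_: Tensor) -> Tensor:
        ...
```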

colossalai/nn/layer/parallel_1d/layers.py

  1. Changing RNG states on CUDA is expensive; it is better to use a simple if statement instead of the conditional context in line 381 (see the sketch below).
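A hedged sketch of the suggested rewrite; the import paths and the seed/ParallelMode names are assumptions about where the seed manager lives in this version of the codebase:

```python
import torch
import torch.nn.functional as F
from colossalai.context import ParallelMode
from colossalai.context.random import seed  # assumed import path

class Dropout1D(torch.nn.Module):
    """Enter the seeded context (which swaps CUDA RNG state) only when
    dropout is actually applied."""

    def __init__(self, p: float = 0.5):
        super().__init__()
        self.p = p

    def forward(self, input_):
        if self.training:
            with seed(ParallelMode.TENSOR):
                return F.dropout(input_, self.p, training=True)
        return input_  # eval mode: no RNG-state switch at all
```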

tests/test_context/test_2d_init.py, tests/test_context/test_2p5d_init.py, tests/test_context/test_3d_init.py and some other testing scripts

  1. Why was the port changed?

@kurisusnowdeng kurisusnowdeng requested review from FrankLeeeee and removed request for FrankLeeeee December 28, 2021 15:25
@kurisusnowdeng kurisusnowdeng force-pushed the main branch 2 times, most recently from 56d6523 to cf6b1e3 on December 29, 2021 08:34
@kurisusnowdeng kurisusnowdeng requested review from FrankLeeeee and removed request for FrankLeeeee December 29, 2021 08:37
@FrankLeeeee (Contributor) left a comment:

Hi @kurisusnowdeng, I have reviewed the code and found some problems.

colossalai/engine/schedule/_base_schedule.py

  1. split_batch only splits along the batch dimension. How can the user split other dimensions? For example, the user needs to split the last dimension when using 2D parallelism.
  2. The current implementation assumes that the batch data consists of one input and one output, and does not support multiple inputs and outputs like those in BERT. If the user wants to use BERT-style inputs and outputs, they have to build a new schedule class and override the load_batch method. I think this might cause inconvenience for the user (see the sketch after this list).
  3. @ver217 is updating the load_batch function as well; you may want to talk to him to avoid conflicts.
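A hedged sketch of what a more general load_batch could look like; the function name follows the PR, but the dict handling and the 'labels' key are illustrative assumptions, not the actual API:

```python
from typing import Any, Tuple

def load_batch(batch) -> Tuple[Any, Any]:
    if isinstance(batch, dict):
        # BERT-style batch, e.g. {'input_ids': ..., 'attention_mask': ...,
        # 'labels': ...}: keep named inputs together, split out the labels.
        data = {k: v for k, v in batch.items() if k != 'labels'}
        label = batch['labels']
    else:
        # The current (input, label) pair format.
        data, label = batch
    return data, label
```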

colossalai/nn/layer/parallel_2p5d/_operation.py

  1. In lines 123 and 125, we need to call contiguous() on the reshaped tensors (see the illustration below).

[Screenshot attached: 2021-12-29, 9:08 PM]
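A standalone illustration of the point (not the PR's code): transpose-style shape ops return non-contiguous views, and collectives and .view() downstream expect contiguous memory:

```python
import torch

x = torch.randn(2, 3, 4)
y = x.transpose(0, 1)        # a non-contiguous view into x's storage
assert not y.is_contiguous()
y = y.contiguous()           # dense copy, safe for collectives and .view()
assert y.is_contiguous()
```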

colossalai/nn/layer/parallel_1d/layers.py

  1. In line 382, conditional_context is still used. I would suggest using a simple if statement, as suggested in the previous comment. Do look out for similar cases too.

colossalai/nn/layer/parallel_1d/layers.py

  1. In line 48, if gather_output is True for Linear1D_Col, the call should be set_parallel_input(False) rather than set_parallel_input(True) (see the sketch below).
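A hedged two-line sketch of the fix inside Linear1D_Col.__init__ (surrounding code elided; names follow the PR's 1D utilities):

```python
# With gather_output=True the layer emits a full, replicated output, so the
# next layer's input is not partitioned; the flag must be the negation.
self.gather_output = gather_output
set_parallel_input(not self.gather_output)  # False when outputs are gathered
```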

@FrankLeeeee FrankLeeeee merged commit 01a80cd into hpcaitech:main Dec 29, 2021