Fix missing scale attributes for GPTJ #3256

Merged
merged 7 commits into from
Apr 21, 2023
Fix double usage of scratch space
cmikeh2 committed Apr 17, 2023
commit a4884888321d1be052d32e21033788bba18ba8d0
4 changes: 2 additions & 2 deletions csrc/transformer/inference/csrc/pt_binding.cpp
@@ -462,9 +462,9 @@ std::vector<at::Tensor> ds_softmax_context(at::Tensor& query_key_value,

T* workspace = (T*)InferenceContext::Instance().GetWorkSpace();
size_t buf_size = bsz * seq_len * hidden_dim;
- auto output = torch::from_blob(workspace + 3 * buf_size, {bsz, seq_len, hidden_dim}, options);
+ auto output = torch::from_blob(workspace + 4 * buf_size, {bsz, seq_len, hidden_dim}, options);

- auto query_cont = workspace + 4 * buf_size;
+ auto query_cont = workspace + 5 * buf_size;
size_t offset =
10 * (hidden_dim * bsz * InferenceContext::Instance().GetMaxTokenLenght()) +
layer_id * 2 * bsz * InferenceContext::Instance().GetMaxTokenLenght() * hidden_dim;
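
The change above moves `output` from slot 3 to slot 4 of the shared workspace and `query_cont` from slot 4 to slot 5, where each slot spans `buf_size = bsz * seq_len * hidden_dim` elements. The commit title suggests slot 3 was already in use by another buffer; which buffer that is does not appear in this hunk, so the sketch below models it as a hypothetical `other` region. `Region`, `slot_region`, `overlaps`, and `new_layout_is_disjoint` are illustrative names, not DeepSpeed APIs. The sketch only checks that the slot arithmetic yields disjoint ranges:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type: a half-open element range [begin, end)
// inside the shared workspace.
struct Region {
    std::size_t begin;
    std::size_t end;
};

// Range covered by slot number `slot` when every slot holds buf_size elements.
constexpr Region slot_region(std::size_t slot, std::size_t buf_size) {
    return Region{slot * buf_size, (slot + 1) * buf_size};
}

// Two half-open ranges overlap when each starts before the other ends.
constexpr bool overlaps(Region a, Region b) {
    return a.begin < b.end && b.begin < a.end;
}

// Checks the post-fix layout: output in slot 4, query_cont in slot 5,
// and an assumed other consumer in slot 3 (not shown in the diff).
bool new_layout_is_disjoint(std::size_t bsz, std::size_t seq_len, std::size_t hidden_dim) {
    const std::size_t buf_size = bsz * seq_len * hidden_dim;
    const Region other = slot_region(3, buf_size);       // assumption
    const Region output = slot_region(4, buf_size);      // from the diff
    const Region query_cont = slot_region(5, buf_size);  // from the diff
    return !overlaps(other, output) && !overlaps(other, query_cont) &&
           !overlaps(output, query_cont);
}

// Pre-fix layout: output in slot 3 collides with the assumed slot-3 consumer.
static_assert(overlaps(slot_region(3, 256), slot_region(3, 256)),
              "old output shares slot 3 with the assumed consumer");
// Post-fix layout: slots 4 and 5 are clear of slot 3 and of each other.
static_assert(!overlaps(slot_region(4, 256), slot_region(3, 256)),
              "new output does not touch slot 3");
static_assert(!overlaps(slot_region(4, 256), slot_region(5, 256)),
              "new output and query_cont are disjoint");
```

Adjacent slots never overlap because the ranges are half-open, so shifting both pointers by one `buf_size` is enough to free slot 3, provided the workspace allocation is large enough to hold the extra slot.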