bugfix: torch.export failure caused by _make_causal_mask
#35291
Recent changes in torch dynamo prevent mutations on tensors converted with aten::_to_copy. To address this, we can clone such a tensor before performing the in-place operation `masked_fill_`, and do so only when the code is being compiled by torch dynamo. (relevant issue: pytorch/pytorch#127571)
This seems legit to me, but cc @ArthurZucker for core maintainer review.
Hi @jiwoong-choi, thanks for the update! I faced a similar issue with vision models and can confirm that this should fix `torch.export`. Alternatively, we could use a non-in-place operation for `masked_fill`, but your solution seems better because it does not change the original behavior.
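To make the two options in this comment concrete, here is a minimal sketch comparing them in eager mode. The tensor shapes and the `diagonal=-2` window are illustrative assumptions, not values taken from the PR; only `clone`, `masked_fill_`, `masked_fill`, and `torch.finfo` are real PyTorch calls.

```python
import torch

dtype = torch.float32
min_value = torch.finfo(dtype).min

# Small stand-ins for the causal mask and the sliding-window context mask
# (shapes and diagonal are assumptions chosen for illustration).
mask = torch.zeros(4, 4, dtype=dtype)
context_mask = torch.tril(torch.ones(4, 4, dtype=torch.bool), diagonal=-2)

# Option A (the approach in this PR): clone, then mutate the clone in place.
cloned = mask.clone()
cloned.masked_fill_(context_mask, min_value)

# Option B (the alternative mentioned above): the out-of-place masked_fill
# returns a new tensor and leaves its input untouched.
filled = mask.masked_fill(context_mask, min_value)

same_result = torch.equal(cloned, filled)
original_untouched = bool((mask == 0).all())
```

Both options produce the same values here. The difference is that Option B always allocates a new tensor, whereas the PR clones only while compiling and keeps the in-place path otherwise.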
Thanks for fixing
# Recent changes in PyTorch prevent mutations on tensors converted with aten::_to_copy
# See https://github.com/pytorch/pytorch/issues/127571
if is_torchdynamo_compiling():
    mask = mask.clone()
mask.masked_fill_(context_mask, torch.finfo(dtype).min)
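The snippet above depends on names from its enclosing function. As a rough, self-contained sketch of the same pattern, the example below wraps it in a hypothetical helper, `apply_sliding_window`, and replaces `is_torchdynamo_compiling` with a hypothetical stand-in, `_is_compiling`, that falls back to `False` when `torch.compiler.is_compiling` is unavailable. Neither helper is part of the PR.

```python
import torch


def _is_compiling() -> bool:
    """Hypothetical stand-in for transformers' is_torchdynamo_compiling()."""
    compiler = getattr(torch, "compiler", None)
    check = getattr(compiler, "is_compiling", None)
    return bool(check()) if callable(check) else False


def apply_sliding_window(mask: torch.Tensor, context_mask: torch.Tensor) -> torch.Tensor:
    """Fill positions selected by context_mask with the dtype's minimum value."""
    if _is_compiling():
        # While tracing/compiling, mutate a fresh copy rather than the original tensor.
        mask = mask.clone()
    mask.masked_fill_(context_mask, torch.finfo(mask.dtype).min)
    return mask


# Eager-mode usage with illustrative shapes (assumptions, not values from the PR).
mask = torch.zeros(3, 3)
context_mask = torch.tril(torch.ones(3, 3, dtype=torch.bool), diagonal=-1)
result = apply_sliding_window(mask, context_mask)
```

In eager mode the guard is skipped, so the input tensor is modified in place exactly as before the change. The extra clone is paid only during compilation or export.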
I agree with @qubvel, tho I don't mind modifying to have an inplace operation!
What does this PR do?
Fix the `torch.export` failure caused by `AttentionMaskConverter._make_causal_mask`.

Recent changes in torch dynamo prevent mutations on tensors converted with aten::_to_copy. To address this, we can clone such a tensor before performing the in-place operation `masked_fill_`, and do so only when the code is being compiled by torch dynamo. (relevant issue on PyTorch: pytorch/pytorch#127571)
Who can review?
PyTorch: @gante @Rocketknight1