Fix illegal memory access with multi_tensor_apply size above INT_MAX (#1825)

Currently, multi_tensor_apply causes an illegal memory access due to
an overflow in the `sizes` field of `TensorListMetadata`. This can be
reproduced using the following standalone script:

```python
import torch, amp_C
from apex.multi_tensor_apply import multi_tensor_applier
multi_tensor_adam = amp_C.multi_tensor_adam

size = 2**32 + 1  # element count that no longer fits in a 32-bit int
g_32 = [torch.zeros(size, dtype=torch.float32, device='cuda')]
p_32 = [torch.zeros(size, dtype=torch.float32, device='cuda')]
m_32 = [torch.zeros(size, dtype=torch.float32, device='cuda')]
v_32 = [torch.zeros(size, dtype=torch.float32, device='cuda')]
_dummy_overflow_buf = torch.zeros(1, dtype=torch.int32, device='cuda')

multi_tensor_applier(multi_tensor_adam, _dummy_overflow_buf, [g_32, p_32, m_32, v_32], 0.0, 0.9, 0.95, 1e-08, 1, 1, 1, 0.1)
print(g_32)
```
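The root cause is a narrowing conversion: `numel()` returns a 64-bit count, but the old `sizes` field is a 32-bit `int`, so any count past INT_MAX wraps. A minimal standalone C++ sketch of the truncation (illustrative only, not apex code; converting an out-of-range value is implementation-defined before C++20, but wraps modulo 2^32 on typical targets):

```cpp
#include <cstdint>
#include <cstdio>

int main() {
    int64_t numel  = (int64_t{1} << 32) + 1;   // 2**32 + 1, as in the repro script
    int     narrow = static_cast<int>(numel);  // old field type: keeps only the low 32 bits
    int64_t wide   = numel;                    // new field type: value preserved

    std::printf("numel      = %lld\n", (long long)numel);  // 4294967297
    std::printf("as int     = %d\n",   narrow);            // 1 (wrapped)
    std::printf("as int64_t = %lld\n", (long long)wide);   // 4294967297
}
```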
gdb authored Aug 17, 2024
1 parent 59b80ee commit 79e3dc4
Showing 1 changed file with 1 addition and 1 deletion.
csrc/multi_tensor_apply.cuh (1 addition, 1 deletion):

```diff
@@ -19,7 +19,7 @@ constexpr int depth_to_max_blocks[6] = {320, 320, 320, 320, 320, 320};
 template<int n> struct TensorListMetadata
 {
   void* addresses[n][depth_to_max_tensors[n-1]];
-  int sizes[depth_to_max_tensors[n-1]];
+  int64_t sizes[depth_to_max_tensors[n-1]];
   unsigned char block_to_tensor[depth_to_max_blocks[n-1]];
   int block_to_chunk[depth_to_max_blocks[n-1]]; // I fear this needs to be a full int.
   int start_tensor_this_launch;
```
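For intuition on why a wrapped size breaks the kernel, here is a simplified sketch of the kind of per-chunk bounds arithmetic that `sizes` feeds (an assumed structure for illustration, not the actual apex kernel; the chunk size and chunk index are made up):

```cpp
#include <cstdint>
#include <cstdio>

int main() {
    const int64_t numel      = (int64_t{1} << 32) + 1;  // true element count
    const int64_t chunk_size = 65536;                   // hypothetical chunk size
    const int64_t chunk_idx  = 3;                       // some chunk well inside the tensor

    const int     size32 = static_cast<int>(numel);     // old field: wraps to 1
    const int64_t size64 = numel;                       // new field: exact

    // Elements remaining at the start of this chunk: size - chunk_idx * chunk_size
    std::printf("32-bit bound: %lld\n", (long long)(size32 - chunk_idx * chunk_size)); // -196607: garbage
    std::printf("64-bit bound: %lld\n", (long long)(size64 - chunk_idx * chunk_size)); // 4294770689: correct
}
```

With the wrapped value, every chunk past the first sees a bogus bound, which is how out-of-range indexing can follow. Widening `sizes` to `int64_t`, as this commit does, makes the stored count match `numel()` exactly; `block_to_tensor` and `block_to_chunk` index tensors and chunks rather than elements, so they are left as-is.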
