ResourceWarning shows up when doing multigpu graphbolt example #7777

Open
TristonC opened this issue Sep 5, 2024 · 4 comments
Labels
bug:unconfirmed May be a bug. Need further investigation. stale-issue

Comments

@TristonC
Collaborator

TristonC commented Sep 5, 2024

Running the example under dgl/examples/multigpu/graphbolt prints a somewhat annoying ResourceWarning, once per temporary directory, at the end of the run.

 python node_classification.py --gpu 0,1,2,3,4,5,6,7
16it [00:01, 12.86it/s]
Epoch 00009 | Average Loss 1.1217 | Accuracy 0.6525 | Time 14.4035
Testing...
25it [00:02, 11.18it/s]/usr/lib/python3.10/tempfile.py:1008: ResourceWarning: Implicitly cleaning up <TemporaryDirectory '/tmp/tmpnqgx4sch'>
  _warnings.warn(warn_message, ResourceWarning)
27it [00:02, 11.99it/s]
Test Accuracy 0.6272
/usr/lib/python3.10/tempfile.py:1008: ResourceWarning: Implicitly cleaning up <TemporaryDirectory '/tmp/tmpns5n0f16'>
  _warnings.warn(warn_message, ResourceWarning)
/usr/lib/python3.10/tempfile.py:1008: ResourceWarning: Implicitly cleaning up <TemporaryDirectory '/tmp/tmp4sbe1b2l'>
  _warnings.warn(warn_message, ResourceWarning)
/usr/lib/python3.10/tempfile.py:1008: ResourceWarning: Implicitly cleaning up <TemporaryDirectory '/tmp/tmpgtm6h_av'>
  _warnings.warn(warn_message, ResourceWarning)
/usr/lib/python3.10/tempfile.py:1008: ResourceWarning: Implicitly cleaning up <TemporaryDirectory '/tmp/tmpbebuzjfx'>
  _warnings.warn(warn_message, ResourceWarning)
/usr/lib/python3.10/tempfile.py:1008: ResourceWarning: Implicitly cleaning up <TemporaryDirectory '/tmp/tmpnp2bys02'>
  _warnings.warn(warn_message, ResourceWarning)
/usr/lib/python3.10/tempfile.py:1008: ResourceWarning: Implicitly cleaning up <TemporaryDirectory '/tmp/tmp0mbq_2c8'>
  _warnings.warn(warn_message, ResourceWarning)
/usr/lib/python3.10/tempfile.py:1008: ResourceWarning: Implicitly cleaning up <TemporaryDirectory '/tmp/tmpcirawy77'>
  _warnings.warn(warn_message, ResourceWarning)
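
For context, this message is emitted by Python's `tempfile.TemporaryDirectory` when an instance is finalized without `cleanup()` having been called and without a `with` block. The sketch below is independent of DGL and only illustrates that mechanism; which library creates the directories in the log above is not established in this thread. The helper names (`leak_tempdir`, `managed_tempdir`, `collect_warnings`) are hypothetical and exist only for this illustration.

```python
import gc
import os
import tempfile
import warnings


def leak_tempdir():
    """Create a TemporaryDirectory and drop it without explicit cleanup."""
    tmp = tempfile.TemporaryDirectory()
    path = tmp.name
    del tmp        # last reference dropped; the finalizer removes the directory
    gc.collect()   # harmless on CPython, helps on other interpreters
    return path


def managed_tempdir():
    """Use the context manager, so cleanup is explicit and silent."""
    with tempfile.TemporaryDirectory() as path:
        assert os.path.isdir(path)
    return path


def collect_warnings(fn):
    """Run fn and return (its result, ResourceWarnings raised meanwhile)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # ResourceWarning is ignored by default
        result = fn()
    return result, [w for w in caught if issubclass(w.category, ResourceWarning)]


if __name__ == "__main__":
    _, leaked = collect_warnings(leak_tempdir)
    _, managed = collect_warnings(managed_tempdir)
    print("leaked:", [str(w.message) for w in leaked])
    print("managed:", [str(w.message) for w in managed])
```

Note that CPython ignores `ResourceWarning` under its default filters, so seeing it at all suggests that a filter was changed somewhere, for example through `-W`, `PYTHONWARNINGS`, `python -X dev`, or a library calling `warnings.simplefilter`. Which of these applies to the run above is unknown.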
@mfbalin
Collaborator

mfbalin commented Sep 5, 2024

I will get to this issue if I can find the time. Meanwhile, let me unassign myself so that anyone else who is free can take care of it.

My priority right now is to finish the remaining features.

@mfbalin mfbalin removed their assignment Sep 5, 2024
@mfbalin mfbalin added bug:confirmed Something isn't working Work Item Work items tracked in project tracker labels Sep 5, 2024
@mfbalin
Collaborator

mfbalin commented Sep 5, 2024

I can't reproduce the issue on the pytorch 24-07 container.

root@a100cse:/localscratch/dgl-3/examples/graphbolt/pyg/labor# python ../../../multigpu/graphbolt/node_classification.py --gpu 0,1
Training with 2 gpus.
The dataset is already preprocessed.
[W905 21:16:53.916054495 CUDAAllocatorConfig.h:28] Warning: expandable_segments not supported on this platform (function operator())
[W905 21:16:53.918450461 CUDAAllocatorConfig.h:28] Warning: expandable_segments not supported on this platform (function operator())
Training...
96it [00:02, 37.52it/s]
Validating...
20it [00:00, 48.91it/s]
Epoch 00000 | Average Loss 1.6953 | Accuracy 0.8335 | Time 3.0184
96it [00:02, 45.63it/s]
Validating...
20it [00:00, 50.92it/s]
Epoch 00001 | Average Loss 0.7198 | Accuracy 0.8611 | Time 2.5459
96it [00:02, 45.87it/s]
Validating...
20it [00:00, 51.33it/s]
Epoch 00002 | Average Loss 0.5773 | Accuracy 0.8751 | Time 2.5282
96it [00:02, 45.91it/s]
Validating...
20it [00:00, 51.38it/s]
Epoch 00003 | Average Loss 0.5099 | Accuracy 0.8815 | Time 2.5286
96it [00:02, 46.05it/s]
Validating...
20it [00:00, 51.62it/s]
Epoch 00004 | Average Loss 0.4715 | Accuracy 0.8868 | Time 2.5248
96it [00:02, 45.91it/s]
Validating...
20it [00:00, 51.77it/s]
Epoch 00005 | Average Loss 0.4428 | Accuracy 0.8908 | Time 2.5248
96it [00:02, 45.82it/s]
Validating...
20it [00:00, 51.51it/s]
Epoch 00006 | Average Loss 0.4214 | Accuracy 0.8927 | Time 2.5373
96it [00:02, 45.92it/s]
Validating...
20it [00:00, 51.54it/s]
Epoch 00007 | Average Loss 0.4086 | Accuracy 0.8960 | Time 2.5373
96it [00:02, 45.86it/s]
Validating...
20it [00:00, 51.94it/s]
Epoch 00008 | Average Loss 0.3919 | Accuracy 0.8970 | Time 2.5347
96it [00:02, 45.57it/s]
Validating...
20it [00:00, 53.51it/s]
Epoch 00009 | Average Loss 0.3831 | Accuracy 0.8984 | Time 2.5382
Testing...
1081it [00:18, 57.49it/s]
Test Accuracy 0.7647

@mfbalin mfbalin added bug:unconfirmed May be a bug. Need further investigation. and removed bug:confirmed Something isn't working labels Sep 5, 2024
@mfbalin
Collaborator

mfbalin commented Sep 5, 2024

@TristonC Can you provide more details on your environment? I can't reproduce the issue on my machines.
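
One way to gather such details is a short script like the sketch below. It uses only the standard library, so it runs even when a package is missing. The package list (`torch`, `dgl`) and the function name `environment_report` are assumptions made for this illustration, not something the maintainers asked for specifically. The warning-related fields are included because they influence whether a `ResourceWarning` is displayed.

```python
import platform
import sys
from importlib import metadata


def environment_report(packages=("torch", "dgl")):
    """Return a dict describing the interpreter and selected package versions."""
    report = {
        "python": platform.python_version(),
        "platform": platform.platform(),
        # Options passed via -W or PYTHONWARNINGS.
        "warning_options": list(sys.warnoptions),
        # True when started with `python -X dev` or PYTHONDEVMODE=1.
        "dev_mode": sys.flags.dev_mode,
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

A source build may be registered under a different distribution name than the one used here, in which case it shows up as "not installed" and the version has to be read from the package itself.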

@mfbalin mfbalin removed the Work Item Work items tracked in project tracker label Sep 5, 2024

github-actions bot commented Oct 6, 2024

This issue has been automatically marked as stale due to lack of activity. It will be closed if no further activity occurs. Thank you.
