(CogAgent) (.conda) (base) wpg@node7gpu:/workspace/kkkjr/Item/CogVLM/basic_demo$ torchrun --standalone --nnodes=1 --nproc-per-node=2 cli_demo_sat.py --from_pretrained cogagent-chat --version chat --bf16
/home/wpg/.local/lib/python3.10/site-packages/torch/_subclasses/functional_tensor.py:295: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:84.)
cpu = _conversion_method_template(device=torch.device("cpu"))
W1129 05:48:09.364000 64664 torch/distributed/run.py:793]
W1129 05:48:09.364000 64664 torch/distributed/run.py:793] *****************************************
W1129 05:48:09.364000 64664 torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
W1129 05:48:09.364000 64664 torch/distributed/run.py:793] *****************************************
/home/wpg/.local/lib/python3.10/site-packages/torch/_subclasses/functional_tensor.py:295: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:84.)
cpu = _conversion_method_template(device=torch.device("cpu"))
/home/wpg/.local/lib/python3.10/site-packages/torch/_subclasses/functional_tensor.py:295: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:84.)
cpu = _conversion_method_template(device=torch.device("cpu"))
Traceback (most recent call last):
File "/workspace/kkkjr/Item/CogVLM/basic_demo/cli_demo_sat.py", line 7, in <module>
from sat.model.mixins import CachedAutoregressiveMixin
File "/home/wpg/.local/lib/python3.10/site-packages/sat/__init__.py", line 1, in <module>
from .arguments import get_args, update_args_with_file
File "/home/wpg/.local/lib/python3.10/site-packages/sat/arguments.py", line 23, in <module>
import numpy as np
ModuleNotFoundError: No module named 'numpy'
Traceback (most recent call last):
File "/workspace/kkkjr/Item/CogVLM/basic_demo/cli_demo_sat.py", line 7, in <module>
from sat.model.mixins import CachedAutoregressiveMixin
File "/home/wpg/.local/lib/python3.10/site-packages/sat/__init__.py", line 1, in <module>
from .arguments import get_args, update_args_with_file
File "/home/wpg/.local/lib/python3.10/site-packages/sat/arguments.py", line 23, in <module>
import numpy as np
ModuleNotFoundError: No module named 'numpy'
E1129 05:48:11.400000 64664 torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: 1) local_rank: 0 (pid: 64829) of binary: /usr/bin/python3
Traceback (most recent call last):
File "/home/wpg/.local/bin/torchrun", line 8, in <module>
sys.exit(main())
File "/home/wpg/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
return f(*args, **kwargs)
File "/home/wpg/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
run(args)
File "/home/wpg/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
elastic_launch(
File "/home/wpg/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/home/wpg/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
cli_demo_sat.py FAILED
Failures:
[1]:
time : 2024-11-29_05:48:11
host : node7gpu
rank : 1 (local_rank: 1)
exitcode : 1 (pid: 64830)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
Root Cause (first observed failure):
[0]:
time : 2024-11-29_05:48:11
host : node7gpu
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 64829)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
I already have numpy installed here, so I don't know why this error still occurs.
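One thing that stands out in the log: the failing workers run under `/usr/bin/python3` and load packages from `/home/wpg/.local/lib/python3.10/site-packages`, even though the prompt shows the `CogAgent` conda environment as active. It may be that numpy is installed in the conda environment but not for the interpreter that `torchrun` actually launches. A minimal diagnostic sketch to check this is below; the file name `check_env.py` and the helper `describe_environment` are hypothetical names chosen for illustration.

```python
import importlib.util
import sys


def describe_environment(module_name: str = "numpy") -> dict:
    """Report which interpreter is running and whether it can locate module_name."""
    # find_spec returns None for a missing top-level module instead of raising.
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,  # interpreter actually running this script
        "prefix": sys.prefix,          # environment root of that interpreter
        "module": module_name,
        "found": spec is not None,
        "location": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running this once with `python check_env.py` and once with `torchrun --standalone --nproc-per-node=1 check_env.py` should show whether the two launch paths use the same interpreter. If they differ, launching through the environment's own interpreter with `python -m torch.distributed.run` instead of the `torchrun` script in `~/.local/bin` is one option to try, assuming torch is installed in that environment.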