Dask fails with IPv6 network #11500

Open
nambar12 opened this issue Nov 6, 2024 · 0 comments
Labels
needs triage Needs a response from a contributor

nambar12 commented Nov 6, 2024

We've implemented a custom scheduler for dask and configured it with:

jobqueue:
    <ourscheduler>:
      interface: eth0
      protocol: "tcp://"

This works on machines whose eth0 has an IPv4 address, or both IPv4 and IPv6 addresses. It does not work on machines whose eth0 has only an IPv6 address.

Cluster startup fails in distributed/utils.get_ip_interface() with the message "interface 'eth0' doesn't have an IPv4 address":

Traceback (most recent call last):
  File "/infrastructure/nambar/pyapitest/test_dask.py", line 7, in <module>
    cluster = NetbatchCluster(queue='iil_critical', qslot="/admin/nambar", log_directory="/tmp",
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/infrastructure/nambar/pyapitest/.venv/lib/python3.12/site-packages/nbdask/nbdask.py", line 144, in __init__
    super().__init__(name=name, config_name="netbatch", log_directory=log_directory,
  File "/infrastructure/nambar/pyapitest/.venv/lib/python3.12/site-packages/dask_jobqueue/core.py", line 663, in __init__
    super().__init__(
  File "/infrastructure/nambar/pyapitest/.venv/lib/python3.12/site-packages/distributed/deploy/spec.py", line 284, in __init__
    self.sync(self._start)
  File "/infrastructure/nambar/pyapitest/.venv/lib/python3.12/site-packages/distributed/utils.py", line 364, in sync
    return sync(
           ^^^^^
  File "/infrastructure/nambar/pyapitest/.venv/lib/python3.12/site-packages/distributed/utils.py", line 440, in sync
    raise error
  File "/infrastructure/nambar/pyapitest/.venv/lib/python3.12/site-packages/distributed/utils.py", line 414, in f
    result = yield future
             ^^^^^^^^^^^^
  File "/infrastructure/nambar/pyapitest/.venv/lib/python3.12/site-packages/tornado/gen.py", line 766, in run
    value = future.result()
            ^^^^^^^^^^^^^^^
  File "/infrastructure/nambar/pyapitest/.venv/lib/python3.12/site-packages/distributed/deploy/spec.py", line 335, in _start
    raise RuntimeError(f"Cluster failed to start: {e}") from e
RuntimeError: Cluster failed to start: interface 'eth0' doesn't have an IPv4 address

Example of machine with both IPv4 and IPv6

Example of machine with IPv6 only:
2: eth0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP group default qlen 1000
    link/ether 30:3e:a7:00:67:62 brd ff:ff:ff:ff:ff:ff
    altname enp8s0f0
    altname ens2f0

The code in the distributed package does seem to check only for an IPv4 address, but the Dask documentation states that both IPv4 and IPv6 are supported. Is IPv6 expected to be supported for interface-based address lookup?
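
For illustration, here is a minimal sketch of the kind of fallback that would cover IPv6-only interfaces. It is not the code in distributed. It assumes the lookup receives per-interface entries carrying a `family` and an `address`, which is the shape `psutil.net_if_addrs()` returns. The `Addr` type, the `pick_address` name, and the sample addresses are hypothetical, and skipping link-local IPv6 addresses is a design assumption rather than documented Dask behaviour.

```python
import ipaddress
import socket
from typing import Iterable, NamedTuple


class Addr(NamedTuple):
    """Hypothetical stand-in for the entries psutil.net_if_addrs() returns."""
    family: int
    address: str


def pick_address(ifname: str, addrs: Iterable[Addr]) -> str:
    """Return an IPv4 address if present, else a non-link-local IPv6 address."""
    addrs = list(addrs)
    # Prefer IPv4 so hosts that work today keep the same address.
    for info in addrs:
        if info.family == socket.AF_INET:
            return info.address
    # Fall back to IPv6 instead of failing outright.
    for info in addrs:
        if info.family == socket.AF_INET6:
            # Entries may carry a scope suffix such as "%eth0"; strip it before parsing.
            host = info.address.split("%", 1)[0]
            if not ipaddress.ip_address(host).is_link_local:
                return host
    raise ValueError(
        f"interface {ifname!r} doesn't have a usable IPv4 or IPv6 address"
    )


# Sample data only (documentation address ranges), not taken from the affected machines.
dual = [Addr(socket.AF_INET, "10.0.0.5"), Addr(socket.AF_INET6, "2001:db8::5")]
v6_only = [Addr(socket.AF_INET6, "fe80::1%eth0"), Addr(socket.AF_INET6, "2001:db8::5")]

print(pick_address("eth0", dual))
print(pick_address("eth0", v6_only))
```

With real data, the second argument would be something like `psutil.net_if_addrs()["eth0"]`. An interface that carries no address entries of either family would still raise `ValueError` under this sketch.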

  • Dask version: 2024.10.0
  • dask-jobqueue 0.9.0
  • Python version: 3.12.3
  • Operating System: SUSE Linux Enterprise Server 15 SP4
  • Install method (conda, pip, source): pip
@github-actions github-actions bot added the needs triage Needs a response from a contributor label Nov 6, 2024