External cluster configuration script does not support IPv6 mgrs #11602

Closed
heliochronix opened this issue Feb 1, 2023 · 6 comments
@heliochronix
Contributor

heliochronix commented Feb 1, 2023

Is this a bug report or feature request?

  • Bug Report

Deviation from expected behavior:
create-external-cluster-resources.py fails to execute when the Ceph cluster returns IPv6 addresses from the mgr services call, because _convert_hostname_to_ip uses the IPv4-only socket.gethostbyname function. The _invalid_endpoint check also assumes IPv4 when parsing a returned address.
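A minimal sketch of the kind of change that would fix the lookup, assuming socket.getaddrinfo is substituted for socket.gethostbyname (the function name mirrors the script's, but the body is illustrative, not the actual code):

import socket

def _convert_hostname_to_ip(host_name):
    # socket.getaddrinfo resolves both IPv4 and IPv6 names, unlike
    # socket.gethostbyname, which is IPv4-only and raises
    # gaierror -9 for an IPv6 literal.
    addr_info = socket.getaddrinfo(host_name, None)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for AF_INET and AF_INET6 alike.
    return addr_info[0][4][0]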

Expected behavior:
The external cluster configuration script should handle IPv6 addresses returned for mgrs.

How to reproduce it (minimal and precise):
Create an IPv6 cluster with at least one mgr and the following configuration (set the IPv6 network appropriately):

ms bind ipv4 = false
ms bind ipv6 = true
cluster network = fd07:aaaa:bbbb:cccc::/64

Enable prometheus:

ceph mgr module enable prometheus

Create an RBD pool (rbd here) and attempt to configure the cluster for consumption by rook-ceph using the following:

python3 create-external-cluster-resources.py --rbd-data-pool-name rbd --verbose

File(s) to submit:

  • None

Logs to submit:

➜ cat create-external-cluster-resources.log
➜ python3 create-external-cluster-resources.py --cephfs-filesystem-name datastore --rbd-data-pool-name rbd --verbose
Command Input: {"format": "json", "prefix": "fs ls"}
Return Val: 0
Command Output: b'[{"name":"storage","metadata_pool":"cephfs.storage.meta","metadata_pool_id":5,"data_pool_ids":[4],"data_pools":["cephfs.storage.data"]},{"name":"datastore","metadata_pool":"cephfs.datastore.meta","metadata_pool_id":12,"data_pool_ids":[13],"data_pools":["cephfs.datastore.data"]}]\n'
Error Message:
----------

Command Input: {"format": "json", "prefix": "quorum_status"}
Return Val: 0
Command Output: b'{"election_epoch":8324,"quorum":[0,1,2],"quorum_names":["node0","node1","node2"],"quorum_leader_name":"node0","quorum_age":8074,"features":{"quorum_con":"4540138320759226367","quorum_mon":["kraken","luminous","mimic","osdmap-prune","nautilus","octopus","pacific","elector-pinging","quincy"]},"monmap":{"epoch":18,"fsid":"<REDACTED>","modified":"2022-12-16T07:21:44.770321Z","created":"2020-05-31T23:14:49.288853Z","min_mon_release":17,"min_mon_release_name":"quincy","election_strategy":1,"disallowed_leaders: ":"","stretch_mode":false,"tiebreaker_mon":"","features":{"persistent":["kraken","luminous","mimic","osdmap-prune","nautilus","octopus","pacific","elector-pinging","quincy"],"optional":[]},"mons":[{"rank":0,"name":"node0","public_addrs":{"addrvec":[{"type":"v2","addr":"[fd07:aaaa:bbbb:cccc::10]:3300","nonce":0},{"type":"v1","addr":"[fd07:aaaa:bbbb:cccc::10]:6789","nonce":0}]},"addr":"[fd07:aaaa:bbbb:cccc::10]:6789/0","public_addr":"[fd07:aaaa:bbbb:cccc::10]:6789/0","priority":0,"weight":0,"crush_location":"{}"},{"rank":1,"name":"node1","public_addrs":{"addrvec":[{"type":"v2","addr":"[fd07:aaaa:bbbb:cccc::11]:3300","nonce":0},{"type":"v1","addr":"[fd07:aaaa:bbbb:cccc::11]:6789","nonce":0}]},"addr":"[fd07:aaaa:bbbb:cccc::11]:6789/0","public_addr":"[fd07:aaaa:bbbb:cccc::11]:6789/0","priority":0,"weight":0,"crush_location":"{}"},{"rank":2,"name":"node2","public_addrs":{"addrvec":[{"type":"v2","addr":"[fd07:aaaa:bbbb:cccc::12]:3300","nonce":0},{"type":"v1","addr":"[fd07:aaaa:bbbb:cccc::12]:6789","nonce":0}]},"addr":"[fd07:aaaa:bbbb:cccc::12]:6789/0","public_addr":"[fd07:aaaa:bbbb:cccc::12]:6789/0","priority":0,"weight":0,"crush_location":"{}"}]}}\n'
Error Message:
----------

Command Input: {"caps": ["mon", "allow r, allow command quorum_status, allow command version", "mgr", "allow command config", "osd", "allow rwx pool=default.rgw.meta, allow r pool=.rgw.root, allow rw pool=default.rgw.control, allow rx pool=default.rgw.log, allow x pool=default.rgw.buckets.index"], "entity": "client.healthchecker", "format": "json", "prefix": "auth get-or-create"}
Return Val: 0
Command Output: b'[{"entity":"client.healthchecker","key":"<REDACTED>","caps":{"mgr":"allow command config","mon":"allow r, allow command quorum_status, allow command version","osd":"allow rwx pool=default.rgw.meta, allow r pool=.rgw.root, allow rw pool=default.rgw.control, allow rx pool=default.rgw.log, allow x pool=default.rgw.buckets.index"}}]'
Error Message:
----------

Command Input: {"format": "json", "prefix": "mgr services"}
Return Val: 0
Command Output: b'{"prometheus":"http://[fd07:aaaa:bbbb:cccc::11]:9283/"}'
Error Message:
----------

Command Input: {"entity": "client.csi-rbd-node", "format": "json", "prefix": "auth get"}
Return Val: 0
Command Output: b'[{"entity":"client.csi-rbd-node","key":"<REDACTED>","caps":{"mon":"profile rbd, allow command \'osd blocklist\'","osd":"profile rbd"}}]'
Error Message: exported keyring for client.csi-rbd-node
----------

Command Input: {"entity": "client.csi-rbd-provisioner", "format": "json", "prefix": "auth get"}
Return Val: 0
Command Output: b'[{"entity":"client.csi-rbd-provisioner","key":"<REDACTED>","caps":{"mgr":"allow rw","mon":"profile rbd, allow command \'osd blocklist\'","osd":"profile rbd"}}]'
Error Message: exported keyring for client.csi-rbd-provisioner
----------

Command Input: {"entity": "client.csi-cephfs-node", "format": "json", "prefix": "auth get"}
Return Val: 0
Command Output: b'[{"entity":"client.csi-cephfs-node","key":"<REDACTED>","caps":{"mds":"allow rw","mgr":"allow rw","mon":"allow r, allow command \'osd blocklist\'","osd":"allow rw tag cephfs *=*"}}]'
Error Message: exported keyring for client.csi-cephfs-node
----------

Command Input: {"entity": "client.csi-cephfs-provisioner", "format": "json", "prefix": "auth get"}
Return Val: 0
Command Output: b'[{"entity":"client.csi-cephfs-provisioner","key":"<REDACTED>","caps":{"mgr":"allow rw","mon":"allow r, allow command \'osd blocklist\'","osd":"allow rw tag cephfs metadata=*"}}]'
Error Message: exported keyring for client.csi-cephfs-provisioner
----------

Command Input: {"format": "json", "prefix": "status"}
Return Val: 0
Command Output: b'{"fsid":"<REDACTED>","health":{"status":"HEALTH_WARN","checks":{"POOL_TOO_MANY_PGS":{"severity":"HEALTH_WARN","summary":{"message":"3 pools have too many placement groups","count":3},"muted":false}},"mutes":[]},"election_epoch":8324,"quorum":[0,1,2],"quorum_names":["node0","node1","node2"],"quorum_age":8074,"monmap":{"epoch":18,"min_mon_release_name":"quincy","num_mons":3},"osdmap":{"epoch":78537,"num_osds":10,"num_up_osds":10,"osd_up_since":1674273594,"num_in_osds":10,"osd_in_since":1673688204,"num_remapped_pgs":0},"pgmap":{"pgs_by_state":[{"state_name":"active+clean","count":673}],"num_pgs":673,"num_pools":6,"num_objects":4760815,"data_bytes":19237301853037,"bytes_used":26394698907648,"bytes_avail":25611030036480,"bytes_total":52005728944128},"fsmap":{"epoch":21762,"id":1,"up":1,"in":1,"max":1,"id":3,"up":1,"in":1,"max":1,"by_rank":[{"filesystem_id":1,"rank":0,"name":"storage.node2.vrmjsf","status":"up:active","gid":17676143},{"filesystem_id":3,"rank":0,"name":"datastore.node2.fvqndp","status":"up:active","gid":17676194}],"up:standby":2},"mgrmap":{"available":true,"num_standbys":1,"modules":["cephadm","iostat","pg_autoscaler","prometheus","restful"],"services":{"prometheus":"http://[fd07:aaaa:bbbb:cccc::11]:9283/"}},"servicemap":{"epoch":505506,"modified":"2023-02-01T01:03:26.045333+0000","services":{}},"progress_events":{}}\n'
Error Message:
----------

Execution Failed: Conversion of host: fd07:aaaa:bbbb:cccc::11 to IP failed. Please enter the IP addresses of all the ceph-mgrs with the '--monitoring-endpoint' flag
Traceback (most recent call last):
  File "/root/ceph/create-external-cluster-resources.py", line 754, in get_active_and_standby_mgrs
    monitoring_endpoint_ip = self._convert_hostname_to_ip(
  File "/root/ceph/create-external-cluster-resources.py", line 694, in _convert_hostname_to_ip
    ip = socket.gethostbyname(host_name)
socket.gaierror: [Errno -9] Address family for hostname not supported

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/root/ceph/create-external-cluster-resources.py", line 1685, in <module>
    raise err
  File "/root/ceph/create-external-cluster-resources.py", line 1682, in <module>
    rjObj.main()
  File "/root/ceph/create-external-cluster-resources.py", line 1662, in main
    generated_output = self.gen_json_out()
  File "/root/ceph/create-external-cluster-resources.py", line 1387, in gen_json_out
    self._gen_output_map()
  File "/root/ceph/create-external-cluster-resources.py", line 1356, in _gen_output_map
    ) = self.get_active_and_standby_mgrs()
  File "/root/ceph/create-external-cluster-resources.py", line 763, in get_active_and_standby_mgrs
    raise ExecutionFailureException(
__main__.ExecutionFailureException: Conversion of host: fd07:aaaa:bbbb:cccc::11 to IP failed. Please enter the IP addresses of all the ceph-mgrs with the '--monitoring-endpoint' flag
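
The traceback above shows socket.gethostbyname rejecting the IPv6 literal; the _invalid_endpoint parsing mentioned earlier would hit a similar wall, since splitting host:port on ':' cannot handle bracketed IPv6 endpoints like the one returned by mgr services. A minimal sketch of family-agnostic endpoint parsing with urllib.parse (a hypothetical helper, not the script's actual code):

from urllib.parse import urlsplit

def parse_monitoring_endpoint(url):
    # urlsplit understands bracketed IPv6 hosts, e.g.
    # http://[fd07:aaaa:bbbb:cccc::11]:9283/
    parts = urlsplit(url)
    # .hostname strips the brackets; .port is parsed as an int.
    return parts.hostname, parts.port

host, port = parse_monitoring_endpoint("http://[fd07:aaaa:bbbb:cccc::11]:9283/")
# host == "fd07:aaaa:bbbb:cccc::11", port == 9283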

Cluster Status to submit:

➜ ceph -s
  cluster:
    id:     <REDACTED>
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum node0,node1,node2 (age 2h)
    mgr: node1(active, since 56m), standbys: node2
    mds: 2/2 daemons up, 2 standby
    osd: 10 osds: 10 up (since 10d), 10 in (since 2w)

  data:
    volumes: 2/2 healthy
    pools:   6 pools, 673 pgs
    objects: 4.76M objects, 17 TiB
    usage:   24 TiB used, 23 TiB / 47 TiB avail
    pgs:     673 active+clean

Environment:

  • IPv6 cephadm managed external cluster
@github-actions

github-actions bot commented Apr 2, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.

@heliochronix
Contributor Author

This still doesn't work

@parth-gr parth-gr removed the wontfix label Apr 3, 2023
@parth-gr
Member

parth-gr commented Apr 3, 2023

This still doesn't work

Yes, this is currently still not supported.
I will take it up as the next priority; it will take a little more time.
Until then, can you please use IPv4 if that is an option for you?
Thanks

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.

@travisn
Member

travisn commented Jul 6, 2023

@heliochronix Let us know if you see any issues with the updated script from #12143!

@parth-gr
Member

@heliochronix please re-open if you find any problems using it.
Closing for now, since #12143 adds this support.
