ceph osd status AssertionError #15222

Open
@reefland

Description

Is this a bug report or feature request?

  • Bug Report

Deviation from expected behavior:

Running ceph osd status fails with an AssertionError raised in the mgr status module.

Expected behavior:

Expected to see the status of the OSDs.

How to reproduce it (minimal and precise):

bash-5.1$ ceph health detail 
HEALTH_OK
bash-5.1$ ceph -s
  cluster:
    id:     cb82340a-2eaf-4597-b83e-cc0e62a9d019
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum aa,aq,as (age 3d)
    mgr: a(active, since 11d), standbys: b
    mds: 1/1 daemons up, 1 hot standby
    osd: 11 osds: 9 up (since 43h), 9 in (since 42h)
    rgw: 2 daemons active (2 hosts, 1 zones)
 
  data:
    volumes: 1/1 healthy
    pools:   13 pools, 225 pgs
    objects: 124.90k objects, 411 GiB
    usage:   1.1 TiB used, 5.8 TiB / 6.9 TiB avail
    pgs:     225 active+clean
 
  io:
    client:   852 B/s rd, 171 KiB/s wr, 9 op/s rd, 9 op/s wr

I know I have a node down and 2 OSDs are offline, but I still expected this to work:

bash-5.1$ ceph osd status
Error EINVAL: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/mgr_module.py", line 1864, in _handle_command
    return CLICommand.COMMANDS[cmd['prefix']].call(self, cmd, inbuf)
  File "/usr/share/ceph/mgr/mgr_module.py", line 499, in call
    return self.func(mgr, **kwargs)
  File "/usr/share/ceph/mgr/status/module.py", line 337, in handle_osd_status
    assert metadata
AssertionError
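The traceback ends at a bare assert metadata in handle_osd_status. One plausible reading, which is an assumption and not confirmed anywhere in this report, is that the metadata lookup returns nothing for the two OSDs that are down. The sketch below is not the Ceph source: osd_status_rows and its arguments are hypothetical names. It only illustrates how a per-OSD loop could tolerate missing metadata instead of asserting on it.

```python
from typing import Dict, List, Optional


def osd_status_rows(
    osd_ids: List[int],
    metadata_by_osd: Dict[int, Optional[dict]],
) -> List[dict]:
    """Build one status row per OSD, tolerating OSDs without metadata.

    Hypothetical helper for illustration only; it does not mirror the
    real mgr status module.
    """
    rows = []
    for osd_id in osd_ids:
        metadata = metadata_by_osd.get(osd_id)
        # A bare `assert metadata` at this point would raise AssertionError
        # for any OSD whose metadata is None, empty, or absent from the map.
        hostname = metadata.get("hostname", "") if metadata else ""
        rows.append(
            {"id": osd_id, "host": hostname, "has_metadata": bool(metadata)}
        )
    return rows


if __name__ == "__main__":
    # Placeholder data: OSD 0 has metadata, OSD 4 (down) has none.
    sample = {0: {"hostname": "k3s01"}, 4: None}
    for row in osd_status_rows([0, 4], sample):
        print(row)
```

With this shape, a down OSD would show up as a row with an empty host rather than aborting the whole command.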

Expected messages in operator log:

2024-12-28 17:14:51.957900 I | clusterdisruption-controller: osd is down in failure domain "k3s05". pg health: "all PGs in cluster are clean"
2024-12-28 17:15:22.185870 I | clusterdisruption-controller: osd "rook-ceph-osd-4" is down but no node drain is detected
2024-12-28 17:15:22.185966 I | clusterdisruption-controller: osd "rook-ceph-osd-8" is down but no node drain is detected
2024-12-28 17:15:22.670477 I | clusterdisruption-controller: osd is down in failure domain "k3s05". pg health: "all PGs in cluster are clean"
2024-12-28 17:15:52.910985 I | clusterdisruption-controller: osd "rook-ceph-osd-4" is down but no node drain is detected
2024-12-28 17:15:52.911092 I | clusterdisruption-controller: osd "rook-ceph-osd-8" is down but no node drain is detected
2024-12-28 17:15:53.193231 I | clusterdisruption-controller: osd is down in failure domain "k3s05". pg health: "all PGs in cluster are clean"
2024-12-28 17:16:23.463682 I | clusterdisruption-controller: osd "rook-ceph-osd-4" is down but no node drain is detected
2024-12-28 17:16:23.463794 I | clusterdisruption-controller: osd "rook-ceph-osd-8" is down but no node drain is detected
2024-12-28 17:16:23.869491 I | clusterdisruption-controller: osd is down in failure domain "k3s05". pg health: "all PGs in cluster are clean"

  • Ceph Version:

bash-5.1$ ceph versions
{
    "mon": {
        "ceph version 19.2.0 (16063ff2022298c9300e49a547a16ffda59baf13) squid (stable)": 3
    },
    "mgr": {
        "ceph version 19.2.0 (16063ff2022298c9300e49a547a16ffda59baf13) squid (stable)": 2
    },
    "osd": {
        "ceph version 19.2.0 (16063ff2022298c9300e49a547a16ffda59baf13) squid (stable)": 9
    },
    "mds": {
        "ceph version 19.2.0 (16063ff2022298c9300e49a547a16ffda59baf13) squid (stable)": 2
    },
    "rgw": {
        "ceph version 19.2.0 (16063ff2022298c9300e49a547a16ffda59baf13) squid (stable)": 2
    },
    "overall": {
        "ceph version 19.2.0 (16063ff2022298c9300e49a547a16ffda59baf13) squid (stable)": 18
    }
}
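For anyone triaging, the ceph versions output above can be checked mechanically. The sketch below uses hypothetical helper names and assumes the JSON layout shown above (one object per daemon type plus an "overall" entry). It counts reporting daemons per type and lists the distinct version strings, so a mixed-version cluster can be ruled out.

```python
import json
from typing import Dict, Set


def daemon_counts(versions_json: str) -> Dict[str, int]:
    """Return the number of reporting daemons per type, ignoring 'overall'."""
    data = json.loads(versions_json)
    return {
        daemon_type: sum(by_version.values())
        for daemon_type, by_version in data.items()
        if daemon_type != "overall"
    }


def distinct_versions(versions_json: str) -> Set[str]:
    """Return every version string reported by any daemon type."""
    data = json.loads(versions_json)
    return {
        version
        for daemon_type, by_version in data.items()
        if daemon_type != "overall"
        for version in by_version
    }


if __name__ == "__main__":
    import sys

    # Intended use (assumption): ceph versions | python3 this_script.py
    text = sys.stdin.read()
    print(daemon_counts(text))
    print(distinct_versions(text))
```

Applied to the output in this report, only 9 OSD daemons report a version, which matches the 9 of 11 OSDs shown as up in ceph -s, and every daemon type reports the same 19.2.0 build.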

Environment:

  • OS (e.g. from /etc/os-release): Ubuntu 22.04.5
  • Kernel (e.g. uname -a): 5.15.0-106-generic
  • Cloud provider or hardware configuration: bare metal
  • Rook version (use rook version inside of a Rook Pod): rook-version=v1.15.6
  • Storage backend version (e.g. for ceph do ceph -v): ceph-version=19.2.0-0
  • Kubernetes version (use kubectl version): v1.30.8+k3s1
  • Kubernetes cluster type (e.g. Tectonic, GKE, OpenShift): K3S
