
ansible.builtin.service_facts state is not accurate when the source is systemd #84606

Closed
@NomakCooper

Description

Summary

As indicated in the title, the ansible.builtin.service_facts module does not accurately report the state of services when the source is systemd.

On hosts using systemd, the state exited or dead indicates that the service is inactive.
However, this does not confirm that the service is properly in the stopped state.

The relevant code in the SystemctlScanService class is:

class SystemctlScanService(BaseService):

    BAD_STATES = frozenset(['not-found', 'masked', 'failed'])

    def systemd_enabled(self):
        return is_systemd_managed(self.module)

    def _list_from_units(self, systemctl_path, services):

        # list units as systemd sees them
        rc, stdout, stderr = self.module.run_command("%s list-units --no-pager --type service --all --plain" % systemctl_path, use_unsafe_shell=True)
        if rc != 0:
            self.module.warn("Could not list units from systemd: %s" % stderr)
        else:
            for line in [svc_line for svc_line in stdout.split('\n') if '.service' in svc_line]:

                state_val = "stopped"
                status_val = "unknown"
                fields = line.split()

                # systemd sometimes gives misleading status
                # check all fields for bad states
                for bad in self.BAD_STATES:
                    # except description
                    if bad in fields[:-1]:
                        status_val = bad
                        break
                else:
                    # active/inactive
                    status_val = fields[2]

                service_name = fields[0]
                if fields[3] == "running":
                    state_val = "running"

                services[service_name] = {"name": service_name, "state": state_val, "status": status_val, "source": "systemd"}

The service state on hosts is determined by the SUB column in the output of the following command.

$ systemctl list-units --no-pager --type service --all --plain

Unfortunately, the SUB value, referenced as fields[3], is only compared against running. For any other value, state_val keeps the default of stopped that is set at the beginning of the loop.

As a result, the module returns only two possible states, stopped or running, rather than the actual SUB state of each service.
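
The collapse described above can be reproduced in isolation. The following is a minimal sketch, not the module's code: it copies only the running/stopped decision from the excerpt, and the rows in SAMPLE_OUTPUT are illustrative stand-ins for systemctl output, not captured from a real host.

```python
# Illustrative rows only; not captured from a real host.
SAMPLE_OUTPUT = """\
rsyslog.service loaded active running System Logging Service
kdump.service loaded active exited Crash recovery kernel arming
chronyd.service loaded inactive dead NTP client/server
"""


def current_state(line):
    """Mirror the running/stopped decision from the excerpt above."""
    fields = line.split()
    state_val = "stopped"        # default, as in the excerpt
    if fields[3] == "running":   # fields[3] is the SUB column
        state_val = "running"
    return fields[0], fields[3], state_val


rows = [current_state(l) for l in SAMPLE_OUTPUT.splitlines() if ".service" in l]
for name, sub, state in rows:
    print(f"{name}: SUB={sub} -> state={state}")
```

In this sketch, both exited and dead end up reported as stopped, so the two cannot be told apart from the resulting fact.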

Issue Type

Bug Report

Component Name

lib/ansible/modules/service_facts.py

Ansible Version

$ ansible --version
ansible [core 2.15.12]
  config file = /root/.ansible/ansible.cfg
  configured module search path = ['/root/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /root/.local/lib/python3.9/site-packages/ansible
  ansible collection location = /root/.ansible/collections:/usr/share/ansible/collections
  executable location = /root/.local/bin/ansible
  python version = 3.9.19 (main, Jul 18 2024, 00:00:00) [GCC 11.4.1 20231218 (Red Hat 11.4.1-3)] (/usr/bin/python3)
  jinja version = 3.1.4
  libyaml = True

Configuration

# if using a version older than ansible-core 2.12 you should omit the '-t all'
$ ansible-config dump --only-changed -t all
CONFIG_FILE() = /root/.ansible/ansible.cfg
DEFAULT_BECOME(/root/.ansible/ansible.cfg) = False
DEFAULT_HOST_LIST(/root/.ansible/ansible.cfg) = ['/etc/ansible/hosts']
DEFAULT_REMOTE_USER(/root/.ansible/ansible.cfg) = root

OS / Environment

  • CentOS Linux 7
  • CentOS Linux 8
  • CentOS Stream 9
  • Red Hat Enterprise Linux 7
  • Red Hat Enterprise Linux 7
  • Red Hat Enterprise Linux 9

Steps to Reproduce

This playbook captures the state of services both before and after killing the rsyslog service.
After the kill, the service enters a dead state, but the service_facts module reports it as stopped.

# systemd_test.yml
- name: Test systemd service_facts
  hosts: all
  gather_facts: no

  # Global Vars
  vars:
    servname: "rsyslog"

  tasks:

  - name: state of rsyslog now
    command: systemctl list-units --no-pager --type service --all --plain
    register: service_list_pre

  - name: set fact
    set_fact:
      rsyslog_state_pre: "{{ service_list_pre.stdout_lines | select('search', servname) | list | first | regex_replace('\\s+', ' ') | split(' ') }}"

  - name: print state from command
    debug:
      msg: "rsyslog is {{ rsyslog_state_pre[4]}}"

  - name: kill rsyslog
    command: killall rsyslogd

  - name: new state of rsyslog
    command: systemctl list-units --no-pager --type service --all --plain
    register: service_list_post

  - name: set fact
    set_fact:
      rsyslog_state_now: "{{ service_list_post.stdout_lines | select('search', servname) | list | first | regex_replace('\\s+', ' ') | split(' ') }}"

  - name: print new state from command
    debug:
      msg: "rsyslog is {{ rsyslog_state_now[4]}}"

  - name: populate service facts
    service_facts:

  - name: print state from service_facts
    debug:
      msg: "rsyslog is {{ ansible_facts['services'].values() | selectattr('name', 'equalto', servname + '.service') | map(attribute='state') }}"

Unfortunately, I cannot provide an output with the -vvv option as it exceeds 65536 characters.

Expected Results

This module should report the true state of the services without overwriting it.

Accurately determining the state of a service is essential when using Ansible, particularly in enterprise environments.

Here is a suggested modification to the SystemctlScanService class to ensure that the value of fields[3] is directly passed to the state_val variable.

class SystemctlScanService(BaseService):

    BAD_STATES = frozenset(['not-found', 'masked', 'failed'])

    def systemd_enabled(self):
        return is_systemd_managed(self.module)

    def _list_from_units(self, systemctl_path, services):

        # list units as systemd sees them
        rc, stdout, stderr = self.module.run_command("%s list-units --no-pager --type service --all --plain" % systemctl_path, use_unsafe_shell=True)
        if rc != 0:
            self.module.warn("Could not list units from systemd: %s" % stderr)
        else:
            for line in [svc_line for svc_line in stdout.split('\n') if '.service' in svc_line]:

                state_val = "unknown"  # Default to unknown
                status_val = "unknown"
                fields = line.split()

                # systemd sometimes gives misleading status
                # check all fields for bad states
                for bad in self.BAD_STATES:
                    # except description
                    if bad in fields[:-1]:
                        status_val = bad
                        break
                else:
                    # active/inactive
                    status_val = fields[2]

                service_name = fields[0]
                state_val = fields[3]  # Set state_val to the real value of fields[3]

                services[service_name] = {"name": service_name, "state": state_val, "status": status_val, "source": "systemd"}
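
The effect of the suggested change can be sketched outside the module. This is a simplified illustration under stated assumptions, not the module itself: proposed_state is a hypothetical helper that keeps only the fields[3] pass-through, and the rows in SAMPLE_OUTPUT are illustrative, not captured from a real host.

```python
# Illustrative rows only; not captured from a real host.
SAMPLE_OUTPUT = """\
rsyslog.service loaded active running System Logging Service
kdump.service loaded active exited Crash recovery kernel arming
chronyd.service loaded inactive dead NTP client/server
"""


def proposed_state(line):
    """Return (unit name, state) with the SUB column passed through unchanged."""
    fields = line.split()
    return fields[0], fields[3]


states = dict(
    proposed_state(l) for l in SAMPLE_OUTPUT.splitlines() if ".service" in l
)
print(states)
```

With this behaviour, a condition such as state == 'stopped' would no longer match a unit whose SUB value is dead, so existing consumers of the fact may need to adapt.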

Actual Results

PLAY [Test systemd service_facts] **********************************************************************************************************************************************************************************

TASK [state of rsyslog now] ****************************************************************************************************************************************************************************************
changed: [localhost]

TASK [set fact] ****************************************************************************************************************************************************************************************************
ok: [localhost]

TASK [print state from command] ************************************************************************************************************************************************************************************
ok: [localhost] => {
    "msg": "rsyslog is System"
}

TASK [kill rsyslog] ************************************************************************************************************************************************************************************************
changed: [localhost]

TASK [new state of rsyslog] ****************************************************************************************************************************************************************************************
changed: [localhost]

TASK [set fact] ****************************************************************************************************************************************************************************************************
ok: [localhost]

TASK [print new state from command] ********************************************************************************************************************************************************************************
ok: [localhost] => {
    "msg": "rsyslog is System"
}

TASK [populate service facts] **************************************************************************************************************************************************************************************
ok: [localhost]

TASK [print state from service_facts] ******************************************************************************************************************************************************************************
ok: [localhost] => {
    "msg": "rsyslog is ['stopped']"
}

PLAY RECAP *********************************************************************************************************************************************************************************************************
localhost                  : ok=9    changed=3    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
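
The value System printed by the two command-based debug tasks is consistent with index 4 holding the first word of the unit description rather than the SUB column. Here is a minimal sketch of the same split, assuming a plain-format row shaped like the illustrative one below (the description text is an assumption, not taken from the host above):

```python
import re

# Illustrative row; the description text is an assumption.
line = "rsyslog.service  loaded active   running System Logging Service"

# Same steps as the playbook's regex_replace('\\s+', ' ') | split(' ')
fields = re.sub(r"\s+", " ", line).split(" ")

unit, load, active, sub = fields[:4]
description = " ".join(fields[4:])

print("index 3 (SUB):", fields[3])
print("index 4:", fields[4])
```

Under that assumption, index 3 is the element that corresponds to fields[3] in the module code.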

Code of Conduct

  • I agree to follow the Ansible Code of Conduct

Labels: affects_2.15, bug, module
