Docker Swarm healthcheck.test property does not validate for CMD/CMD-SHELL/NONE, results in strange behavior when invalid. #49034
Description
healthcheck.test takes an array where the first element needs to be CMD, CMD-SHELL, or NONE. And in docker compose at least, if you try to compose up with an invalid value, it gives you a nice error message:
healthcheck.test must start either by "CMD", "CMD-SHELL" or "NONE"
HOWEVER, if you try to deploy an invalid value with the docker stack command, it does NOT give any sort of error message. Not only is that the case, but it also results in some unusual behavior.
The deploy successfully creates the containers asked for (and always considers them healthy, regardless of what you put in the health test), but never finishes starting the service (presumably the service defaults to considering it unhealthy). As a result, docker service ls shows 0 replicas, and the service is not made available to other services networked to it. This contradictory behavior made things very confusing to debug.
Reproduce
1.) Create the following docker-compose.yml file:
services:
  db:
    image: postgres:alpine
    environment:
      POSTGRES_PASSWORD: password
    healthcheck:
      test: ["SOMETHINGINVALID", "pg_isready"]
      start_period: 3s
      interval: 1s
      timeout: 5s
      retries: 3
1.5.) Try running docker compose up, notice how it gives an error message.
2.) Initialize a swarm if you don't already have one with docker swarm init.
3.) Use docker stack up -c docker-compose.yml healthtest --detach=false, notice how it gives no error message, and hangs when trying to create the service.
4.) Exit out from the attached log to view what got created.
5.) Docker Desktop (or other methods) should show a container for the postgres image has been created, and is not being restarted or recreated.
6.) Use docker service ls, and note how it has 0/1 replicas started for the service, despite the clearly running container.
7.) (clean up) Use docker stack down healthtest when finished.
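The mismatch in step 6 can also be spotted from a script. The following is only a rough sketch: it assumes the text it receives comes from docker service ls --format "{{.Name}} {{.Replicas}}", and the helper name find_stalled_services and the sample lines are made up for illustration.

```python
def find_stalled_services(service_ls_output):
    """Return names of services running fewer replicas than desired.

    Assumes each line looks like "<name> <running>/<desired>", as produced by
    docker service ls --format "{{.Name}} {{.Replicas}}".
    """
    stalled = []
    for line in service_ls_output.splitlines():
        parts = line.split()
        if len(parts) < 2 or "/" not in parts[1]:
            continue  # skip blank or unexpected lines
        name, replicas = parts[0], parts[1]
        running, desired = replicas.split("/", 1)
        if running.isdigit() and desired.isdigit() and int(running) < int(desired):
            stalled.append(name)
    return stalled


# Hypothetical sample input, not captured from a real run.
sample = "healthtest_db 0/1\nother_web 2/2"
print(find_stalled_services(sample))  # ['healthtest_db']
```

A service that stays in this list long after its container is up is a hint that something like the invalid healthcheck above may be involved.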
Expected behavior
When deploying to the swarm it should ideally validate the first healthcheck.test parameter like docker compose already does. If this is not viable for whatever reason, then it should AT LEAST give more consistent behavior for the invalid test condition, such as having the container fail its (invalid) health tests and see constant restarts (a clearer sign that something is wrong).
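To make the requested check concrete, here is a minimal sketch of the validation as a pre-deploy lint. It is not how docker compose or docker stack implement it; it assumes the compose file has already been parsed into a plain dict, and the function names check_healthcheck_test and check_compose are hypothetical.

```python
VALID_TEST_KINDS = ("CMD", "CMD-SHELL", "NONE")


def check_healthcheck_test(test):
    """Return an error message for an invalid healthcheck.test value, else None."""
    # Compose also accepts a plain string, which is run as a shell command.
    if isinstance(test, str):
        return None
    if not isinstance(test, list) or not test:
        return "healthcheck.test must be a string or a non-empty list"
    if test[0] not in VALID_TEST_KINDS:
        return 'healthcheck.test must start either by "CMD", "CMD-SHELL" or "NONE"'
    return None


def check_compose(compose):
    """Map service name -> error message for every invalid healthcheck.test."""
    errors = {}
    for name, service in (compose.get("services") or {}).items():
        healthcheck = (service or {}).get("healthcheck") or {}
        if "test" in healthcheck:
            error = check_healthcheck_test(healthcheck["test"])
            if error:
                errors[name] = error
    return errors


# The compose file from the reproduction steps, written out as a dict.
compose = {
    "services": {
        "db": {
            "image": "postgres:alpine",
            "healthcheck": {"test": ["SOMETHINGINVALID", "pg_isready"]},
        }
    }
}
print(check_compose(compose))
```

Running something like this before docker stack up would at least surface the mistake early, even if the swarm side keeps accepting the value.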
docker version
Client:
Version: 27.3.1
API version: 1.47
Go version: go1.22.7
Git commit: ce12230
Built: Fri Sep 20 11:38:18 2024
OS/Arch: darwin/arm64
Context: desktop-linux
Server: Docker Desktop 4.36.0 (175267)
Engine:
Version: 27.3.1
API version: 1.47 (minimum version 1.24)
Go version: go1.22.7
Git commit: 41ca978
Built: Fri Sep 20 11:41:19 2024
OS/Arch: linux/arm64
Experimental: false
containerd:
Version: 1.7.21
GitCommit: 472731909fa34bd7bc9c087e4c27943f9835f111
runc:
Version: 1.1.13
GitCommit: v1.1.13-0-g58aa920
docker-init:
Version: 0.19.0
GitCommit: de40ad0
docker info
Client:
Version: 27.3.1
Context: desktop-linux
Debug Mode: false
Plugins:
ai: Ask Gordon - Docker Agent (Docker Inc.)
Version: v0.1.0
Path: /Users/ryanhaney/.docker/cli-plugins/docker-ai
buildx: Docker Buildx (Docker Inc.)
Version: v0.18.0-desktop.2
Path: /Users/ryanhaney/.docker/cli-plugins/docker-buildx
compose: Docker Compose (Docker Inc.)
Version: v2.30.3-desktop.1
Path: /Users/ryanhaney/.docker/cli-plugins/docker-compose
debug: Get a shell into any image or container (Docker Inc.)
Version: 0.0.37
Path: /Users/ryanhaney/.docker/cli-plugins/docker-debug
desktop: Docker Desktop commands (Alpha) (Docker Inc.)
Version: v0.0.15
Path: /Users/ryanhaney/.docker/cli-plugins/docker-desktop
dev: Docker Dev Environments (Docker Inc.)
Version: v0.1.2
Path: /Users/ryanhaney/.docker/cli-plugins/docker-dev
extension: Manages Docker extensions (Docker Inc.)
Version: v0.2.27
Path: /Users/ryanhaney/.docker/cli-plugins/docker-extension
feedback: Provide feedback, right in your terminal! (Docker Inc.)
Version: v1.0.5
Path: /Users/ryanhaney/.docker/cli-plugins/docker-feedback
init: Creates Docker-related starter files for your project (Docker Inc.)
Version: v1.4.0
Path: /Users/ryanhaney/.docker/cli-plugins/docker-init
sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc.)
Version: 0.6.0
Path: /Users/ryanhaney/.docker/cli-plugins/docker-sbom
scout: Docker Scout (Docker Inc.)
Version: v1.15.0
Path: /Users/ryanhaney/.docker/cli-plugins/docker-scout
Server:
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 24
Server Version: 27.3.1
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Using metacopy: false
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
Swarm: active
NodeID: pjuuugw26u7nlvsrsr628oms1
Is Manager: true
ClusterID: vizrmfw3ddm5sqmztzmfpxu9g
Managers: 1
Nodes: 1
Default Address Pool: 10.0.0.0/8
SubnetSize: 24
Data Path Port: 4789
Orchestration:
Task History Retention Limit: 5
Raft:
Snapshot Interval: 10000
Number of Old Snapshots to Retain: 0
Heartbeat Tick: 1
Election Tick: 10
Dispatcher:
Heartbeat Period: 5 seconds
CA Configuration:
Expiry Duration: 3 months
Force Rotate: 0
Autolock Managers: false
Root Rotation In Progress: false
Node Address: 192.168.65.3
Manager Addresses:
192.168.65.3:2377
Runtimes: io.containerd.runc.v2 runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 472731909fa34bd7bc9c087e4c27943f9835f111
runc version: v1.1.13-0-g58aa920
init version: de40ad0
Security Options:
seccomp
Profile: unconfined
cgroupns
Kernel Version: 6.10.14-linuxkit
Operating System: Docker Desktop
OSType: linux
Architecture: aarch64
CPUs: 8
Total Memory: 3.827GiB
Name: docker-desktop
ID: 3ab93200-4d61-4aaf-9261-7f848b076cf2
Docker Root Dir: /var/lib/docker
Debug Mode: false
HTTP Proxy: http.docker.internal:3128
HTTPS Proxy: http.docker.internal:3128
No Proxy: hubproxy.docker.internal
Labels:
com.docker.desktop.address=unix:///Users/ryanhaney/Library/Containers/com.docker.docker/Data/docker-cli.sock
Experimental: false
Insecure Registries:
hubproxy.docker.internal:5555
127.0.0.0/8
Live Restore Enabled: false
WARNING: daemon is not using the default seccomp profile
Additional Info
- I originally encountered this bug when I mistyped CMD-SHELL as CMD_SHELL. I feel like this is an easy enough mistake to make, and would be much harder to debug if such a mistake is made in a larger environment, especially given the odd behavior that results from it.
- "pg_isready" in the above example could be anything; even something guaranteed to fail, like "cat notafile", results in the same behavior.
- I don't think the other healthcheck parameters matter much; those are just what I was using.
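For typos like CMD_SHELL, an error message could also point at the nearest valid keyword. This is purely an illustrative sketch using Python's standard difflib module; the helper name suggest_test_kind and the 0.6 cutoff are assumptions, not anything Docker does today.

```python
import difflib

VALID_TEST_KINDS = ["CMD", "CMD-SHELL", "NONE"]


def suggest_test_kind(value):
    """Return the closest valid keyword for a mistyped value, or None."""
    matches = difflib.get_close_matches(
        value.upper(), VALID_TEST_KINDS, n=1, cutoff=0.6
    )
    return matches[0] if matches else None


print(suggest_test_kind("CMD_SHELL"))         # CMD-SHELL
print(suggest_test_kind("SOMETHINGINVALID"))  # None
```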