Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

command execution timed out in go-client when Docker container is restarted #405

Open
mikhailbolshakov opened this issue Jun 11, 2023 · 2 comments

Comments

@mikhailbolshakov
Copy link

mikhailbolshakov commented Jun 11, 2023

Hello,
I have deployed aerospike:ce-6.3.0.1 under docker-compose with the aerospike.conf

service {
	
}

logging {
	file /var/log/aerospike/aerospike.log {
            context any info
        }
	console {
		context any info
	}
}

network {
	service {
		address any
		port 3000
	}

	heartbeat {
		mode mesh
		address local
		port 3002
		interval 150
		timeout 10
	}

	fabric {
		address local
		port 3001
	}

}

namespace cache {
	default-ttl 30d # use 0 to never expire/evict.
	memory-size 4G
	nsup-period 120
	replication-factor 1
	storage-engine memory
}

I'm using go-client v6.12.0 in my go aplication also deployed as docker container in the same docker network.
When my go application container restarts due to redeployment I start getting the following error on go-client side on just creating a new client connection

clientPolicy := aero.NewClientPolicy()
	client, err := aero.NewClientWithPolicy(clientPolicy, cfg.Host, cfg.Port)
	if err != nil {
		return ErrAeroConn(err, ctx)
	}

client logs look like this

2023-06-11T17:04:55.930+0000 Alias 172.18.0.3:3000 failed: ResultCode: NETWORK_ERROR, Iteration: 0, InDoubt: false, Node: <nil>: network error. Checked the wrapped error for detail
ResultCode: TIMEOUT, Iteration: 0, InDoubt: false, Node: <nil>: Timeout
read tcp 172.18.0.30:55270->172.18.0.3:3000: i/o timeout
2023-06-11T17:04:55.930+0000 Seed 172.18.0.3:3000 failed: ResultCode: NETWORK_ERROR, Iteration: 0, InDoubt: false, Node: <nil>: network error. Checked the wrapped error for detail
ResultCode: TIMEOUT, Iteration: 0, InDoubt: false, Node: <nil>: Timeout
read tcp 172.18.0.30:55270->172.18.0.3:3000: i/o timeout
2023-06-11T17:04:55.930+0000 ResultCode: INVALID_NODE_ERROR, Iteration: 0, InDoubt: false, Node: <nil>: Failed to connect to hosts: [172.18.0.3:3000]
ResultCode: NETWORK_ERROR, Iteration: 0, InDoubt: false, Node: <nil>: network error. Checked the wrapped error for detail
ResultCode: TIMEOUT, Iteration: 0, InDoubt: false, Node: <nil>: Timeout
read tcp 172.18.0.30:55270->172.18.0.3:3000: i/o timeout
2023-06-11T17:04:55.931+0000 No nodes available; seeding...
2023-06-11T17:04:55.932+0000 Seeding the cluster. Seeds count: 1
2023-06-11T17:04:55.932+0000 Node Validator has 1 nodes and they are: [172.18.0.3:3000]
2023-06-11T17:05:10.922+0000 Failed to connect to host(s): [aerospike:3000]; error: ResultCode: TIMEOUT, Iteration: 0, InDoubt: false, Node: <nil>: command execution timed out on client: See 
ResultCode: TIMEOUT, Iteration: 0, InDoubt: false, Node: <nil>: command execution timed out on client: See Policy.Timeout

Error is reproduced until aerospike container is restarted.

@khaf
Copy link
Collaborator

khaf commented Jun 13, 2023

What is the IP of the newly restarted container? Is that IP the same as the seed you provided? Can you ssh into your client container and ping the server container?

@mikhailbolshakov
Copy link
Author

here docker inspect of aerospike container

[
    {
        "Id": "d1d67eeb96050773be3cdd94aa17db0bc162269be7c35a49a705e8223a00e097",
        "Created": "2023-06-07T09:22:06.585535635Z",
        "Path": "/usr/bin/as-tini-static",
        "Args": [
            "-r",
            "SIGUSR1",
            "-t",
            "SIGTERM",
            "--",
            "/entrypoint.sh",
            "--config-file",
            "/opt/aerospike/etc/aerospike.conf"
        ],
        "State": {
            "Status": "running",
            "Running": true,
            "Paused": false,
            "Restarting": false,
            "OOMKilled": false,
            "Dead": false,
            "Pid": 3128355,
            "ExitCode": 0,
            "Error": "",
            "StartedAt": "2023-06-13T09:37:37.889941354Z",
            "FinishedAt": "2023-06-13T09:36:53.955858771Z"
        },
        "Image": "sha256:dc5db6293cbe39fa7edfdc14cb4cf3998e5d6a48a871d06187df8f2147aaf3b6",
        "ResolvConfPath": "/var/lib/docker/containers/d1d67eeb96050773be3cdd94aa17db0bc162269be7c35a49a705e8223a00e097/resolv.conf",
        "HostnamePath": "/var/lib/docker/containers/d1d67eeb96050773be3cdd94aa17db0bc162269be7c35a49a705e8223a00e097/hostname",
        "HostsPath": "/var/lib/docker/containers/d1d67eeb96050773be3cdd94aa17db0bc162269be7c35a49a705e8223a00e097/hosts",
        "LogPath": "",
        "Name": "/prod-aerospike",
        "RestartCount": 0,
        "Driver": "overlay2",
        "Platform": "linux",
        "MountLabel": "",
        "ProcessLabel": "",
        "AppArmorProfile": "docker-default",
        "ExecIDs": null,
        "HostConfig": {
            "Binds": [
                "compose_spike-logs:/var/log/aerospike:rw",
                "/opt/services/compose/aerospike/aerospike.conf:/opt/aerospike/etc/aerospike.conf:rw",
                "/var/ev2go/docker/volumes/aerospike/data:/opt/aerospike/data:rw"
            ],
            "ContainerIDFile": "",
            "LogConfig": {
                "Type": "syslog",
                "Config": {
                    "syslog-address": "tcp://localhost:5141"
                }
            },
            "NetworkMode": "prod",
            "PortBindings": {},
            "RestartPolicy": {
                "Name": "always",
                "MaximumRetryCount": 0
            },
            "AutoRemove": false,
            "VolumeDriver": "",
            "VolumesFrom": [],
            "ConsoleSize": [
                0,
                0
            ],
            "CapAdd": null,
            "CapDrop": null,
            "CgroupnsMode": "private",
            "Dns": [],
            "DnsOptions": [],
            "DnsSearch": [],
            "ExtraHosts": null,
            "GroupAdd": null,
            "IpcMode": "private",
            "Cgroup": "",
            "Links": null,
            "OomScoreAdj": 0,
            "PidMode": "",
            "Privileged": false,
            "PublishAllPorts": false,
            "ReadonlyRootfs": false,
            "SecurityOpt": null,
            "UTSMode": "",
            "UsernsMode": "",
            "ShmSize": 67108864,
            "Runtime": "runc",
            "Isolation": "",
            "CpuShares": 0,
            "Memory": 0,
            "NanoCpus": 0,
            "CgroupParent": "",
            "BlkioWeight": 0,
            "BlkioWeightDevice": null,
            "BlkioDeviceReadBps": null,
            "BlkioDeviceWriteBps": null,
            "BlkioDeviceReadIOps": null,
            "BlkioDeviceWriteIOps": null,
            "CpuPeriod": 0,
            "CpuQuota": 0,
            "CpuRealtimePeriod": 0,
            "CpuRealtimeRuntime": 0,
            "CpusetCpus": "",
            "CpusetMems": "",
            "Devices": null,
            "DeviceCgroupRules": null,
            "DeviceRequests": null,
            "MemoryReservation": 0,
            "MemorySwap": 0,
            "MemorySwappiness": null,
            "OomKillDisable": null,
            "PidsLimit": null,
            "Ulimits": null,
            "CpuCount": 0,
            "CpuPercent": 0,
            "IOMaximumIOps": 0,
            "IOMaximumBandwidth": 0,
            "MaskedPaths": [
                "/proc/asound",
                "/proc/acpi",
                "/proc/kcore",
                "/proc/keys",
                "/proc/latency_stats",
                "/proc/timer_list",
                "/proc/timer_stats",
                "/proc/sched_debug",
                "/proc/scsi",
                "/sys/firmware"
            ],
            "ReadonlyPaths": [
                "/proc/bus",
                "/proc/fs",
                "/proc/irq",
                "/proc/sys",
                "/proc/sysrq-trigger"
            ]
        },
        "GraphDriver": {
            "Data": {
                "LowerDir": "/var/lib/docker/overlay2/a09570f596feb69e3eb1b31fbb41dd4136b9be3b9a55789c88f649514e7cd1c6-init/diff:/var/lib/docker/overlay2/11e226e4130e48c7b825974e8d33a94f6b4d5f1eb1c7d355a044c04326506a4a/diff:/var/lib/docker/overlay2/d9a53a3e2ad2ffa735a285e4e76100d5643c1c61a9aaf889de9feeb93b1c9703/diff:/var/lib/docker/overlay2/5b6d25baf452df16024fca8c0751e848c32a84d1e78e99ca43bb850b2e5f3b84/diff:/var/lib/docker/overlay2/73c8091e9d806858c31290df93c752366cc8fefe608e2f6f2f3a209489a611a6/diff",
                "MergedDir": "/var/lib/docker/overlay2/a09570f596feb69e3eb1b31fbb41dd4136b9be3b9a55789c88f649514e7cd1c6/merged",
                "UpperDir": "/var/lib/docker/overlay2/a09570f596feb69e3eb1b31fbb41dd4136b9be3b9a55789c88f649514e7cd1c6/diff",
                "WorkDir": "/var/lib/docker/overlay2/a09570f596feb69e3eb1b31fbb41dd4136b9be3b9a55789c88f649514e7cd1c6/work"
            },
            "Name": "overlay2"
        },
        "Mounts": [
            {
                "Type": "bind",
                "Source": "/var/ev2go/docker/volumes/aerospike/data",
                "Destination": "/opt/aerospike/data",
                "Mode": "rw",
                "RW": true,
                "Propagation": "rprivate"
            },
            {
                "Type": "bind",
                "Source": "/opt/services/compose/aerospike/aerospike.conf",
                "Destination": "/opt/aerospike/etc/aerospike.conf",
                "Mode": "rw",
                "RW": true,
                "Propagation": "rprivate"
            },
            {
                "Type": "volume",
                "Name": "compose_spike-logs",
                "Source": "/var/lib/docker/volumes/compose_spike-logs/_data",
                "Destination": "/var/log/aerospike",
                "Driver": "local",
                "Mode": "rw",
                "RW": true,
                "Propagation": ""
            }
        ],
        "Config": {
            "Hostname": "d1d67eeb9605",
            "Domainname": "",
            "User": "",
            "AttachStdin": false,
            "AttachStdout": false,
            "AttachStderr": false,
            "ExposedPorts": {
                "3000/tcp": {},
                "3001/tcp": {},
                "3002/tcp": {}
            },
            "Tty": false,
            "OpenStdin": false,
            "StdinOnce": false,
            "Env": [
                "NAMESPACE=cache",
                "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
            ],
            "Cmd": [
                "--config-file",
                "/opt/aerospike/etc/aerospike.conf"
            ],
            "Image": "prod.dev:5050/reg/aerospike:ce-6.3.0.1",
            "Volumes": {
                "/opt/aerospike/data": {},
                "/opt/aerospike/etc/aerospike.conf": {},
                "/var/log/aerospike": {}
            },
            "WorkingDir": "",
            "Entrypoint": [
                "/usr/bin/as-tini-static",
                "-r",
                "SIGUSR1",
                "-t",
                "SIGTERM",
                "--",
                "/entrypoint.sh"
            ],
            "OnBuild": null,
            "Labels": {
                "com.docker.compose.config-hash": "a4fd0c71d15d60d8c1b49745120f12a003a7ea7d842fa55ffbb7a4cbfa3cb4dd",
                "com.docker.compose.container-number": "1",
                "com.docker.compose.oneoff": "False",
                "com.docker.compose.project": "compose",
                "com.docker.compose.project.config_files": "docker-compose.infra.yml",
                "com.docker.compose.project.environment_file": "env/prod/infra.env",
                "com.docker.compose.project.working_dir": "/opt/services/compose",
                "com.docker.compose.service": "aerospike",
                "com.docker.compose.version": "1.28.6"
            }
        },
        "NetworkSettings": {
            "Bridge": "",
            "SandboxID": "1660a77ff6a58cb73c3f00c611ecfca11926b0b5e92e3d3a66a2304c96f8bc54",
            "HairpinMode": false,
            "LinkLocalIPv6Address": "",
            "LinkLocalIPv6PrefixLen": 0,
            "Ports": {
                "3000/tcp": null,
                "3001/tcp": null,
                "3002/tcp": null
            },
            "SandboxKey": "/var/run/docker/netns/1660a77ff6a5",
            "SecondaryIPAddresses": null,
            "SecondaryIPv6Addresses": null,
            "EndpointID": "",
            "Gateway": "",
            "GlobalIPv6Address": "",
            "GlobalIPv6PrefixLen": 0,
            "IPAddress": "",
            "IPPrefixLen": 0,
            "IPv6Gateway": "",
            "MacAddress": "",
            "Networks": {
                "prod": {
                    "IPAMConfig": null,
                    "Links": null,
                    "Aliases": [
                        "aerospike",
                        "d1d67eeb9605"
                    ],
                    "NetworkID": "1992039b92f260505173794e1328e89ca4be6e9ffca0721d96da96c21e7d3c20",
                    "EndpointID": "1c9ca009d77d02edcd991dd7f58b6f35ffe0b2ebab42bb0fbba24e96c8fa7cb0",
                    "Gateway": "172.18.0.1",
                    "IPAddress": "172.18.0.32",
                    "IPPrefixLen": 16,
                    "IPv6Gateway": "",
                    "GlobalIPv6Address": "",
                    "GlobalIPv6PrefixLen": 0,
                    "MacAddress": "**********",
                    "DriverOpts": null
                }
            }
        }
    }
]

here docker inspect of our app container

[
    {
        "Id": "88440de5cfb14a28945c81a8a2dc717b06ae69c53d0bd6531ffa9a715a556f22",
        "Created": "2023-06-13T09:29:34.024007382Z",
        "Path": "./bin/main",
        "Args": [],
        "State": {
            "Status": "running",
            "Running": true,
            "Paused": false,
            "Restarting": false,
            "OOMKilled": false,
            "Dead": false,
            "Pid": 3131351,
            "ExitCode": 0,
            "Error": "",
            "StartedAt": "2023-06-13T09:38:00.840956627Z",
            "FinishedAt": "2023-06-13T09:37:58.303078528Z"
        },
        "Image": "sha256:4087882da24758d99ab6b63f0e71c32e5f6078828f9dcba2633f706f20d39803",
        "ResolvConfPath": "/var/lib/docker/containers/88440de5cfb14a28945c81a8a2dc717b06ae69c53d0bd6531ffa9a715a556f22/resolv.conf",
        "HostnamePath": "/var/lib/docker/containers/88440de5cfb14a28945c81a8a2dc717b06ae69c53d0bd6531ffa9a715a556f22/hostname",
        "HostsPath": "/var/lib/docker/containers/88440de5cfb14a28945c81a8a2dc717b06ae69c53d0bd6531ffa9a715a556f22/hosts",
        "LogPath": "/var/lib/docker/containers/88440de5cfb14a28945c81a8a2dc717b06ae69c53d0bd6531ffa9a715a556f22/88440de5cfb14a28945c81a8a2dc717b06ae69c53d0bd6531ffa9a715a556f22-json.log",
        "Name": "/ev2go-charging",
        "RestartCount": 0,
        "Driver": "overlay2",
        "Platform": "linux",
        "MountLabel": "",
        "ProcessLabel": "",
        "AppArmorProfile": "docker-default",
        "ExecIDs": null,
        "HostConfig": {
            "Binds": [],
            "ContainerIDFile": "",
            "LogConfig": {
                "Type": "json-file",
                "Config": {}
            },
            "NetworkMode": "ev2go",
            "PortBindings": {},
            "RestartPolicy": {
                "Name": "always",
                "MaximumRetryCount": 0
            },
            "AutoRemove": false,
            "VolumeDriver": "",
            "VolumesFrom": [],
            "ConsoleSize": [
                0,
                0
            ],
            "CapAdd": null,
            "CapDrop": null,
            "CgroupnsMode": "private",
            "Dns": [],
            "DnsOptions": [],
            "DnsSearch": [],
            "ExtraHosts": null,
            "GroupAdd": null,
            "IpcMode": "private",
            "Cgroup": "",
            "Links": null,
            "OomScoreAdj": 0,
            "PidMode": "",
            "Privileged": false,
            "PublishAllPorts": false,
            "ReadonlyRootfs": false,
            "SecurityOpt": null,
            "UTSMode": "",
            "UsernsMode": "",
            "ShmSize": 67108864,
            "Runtime": "runc",
            "Isolation": "",
            "CpuShares": 0,
            "Memory": 0,
            "NanoCpus": 0,
            "CgroupParent": "",
            "BlkioWeight": 0,
            "BlkioWeightDevice": null,
            "BlkioDeviceReadBps": null,
            "BlkioDeviceWriteBps": null,
            "BlkioDeviceReadIOps": null,
            "BlkioDeviceWriteIOps": null,
            "CpuPeriod": 0,
            "CpuQuota": 0,
            "CpuRealtimePeriod": 0,
            "CpuRealtimeRuntime": 0,
            "CpusetCpus": "",
            "CpusetMems": "",
            "Devices": null,
            "DeviceCgroupRules": null,
            "DeviceRequests": null,
            "MemoryReservation": 0,
            "MemorySwap": 0,
            "MemorySwappiness": null,
            "OomKillDisable": null,
            "PidsLimit": null,
            "Ulimits": null,
            "CpuCount": 0,
            "CpuPercent": 0,
            "IOMaximumIOps": 0,
            "IOMaximumBandwidth": 0,
            "MaskedPaths": [
                "/proc/asound",
                "/proc/acpi",
                "/proc/kcore",
                "/proc/keys",
                "/proc/latency_stats",
                "/proc/timer_list",
                "/proc/timer_stats",
                "/proc/sched_debug",
                "/proc/scsi",
                "/sys/firmware"
            ],
            "ReadonlyPaths": [
                "/proc/bus",
                "/proc/fs",
                "/proc/irq",
                "/proc/sys",
                "/proc/sysrq-trigger"
            ]
        },
        "GraphDriver": {
            "Data": {
                "LowerDir": "/var/lib/docker/overlay2/b7181f937109308de48b9dc4374b346219618872005dd188fd638da2b60f8be0-init/diff:/var/lib/docker/overlay2/c6ad0d26880be3a153195cb09ae0c54dc86f94a015b50135e989c789d3fd7006/diff:/var/lib/docker/overlay2/449acbcf25e7bdd9fbd524bc9732f677a843e9502ade8ccb231f1680ee9447c5/diff:/var/lib/docker/overlay2/3b81eddb7cae2c9bdcee1b597c5c2925b725b30448a28744e7c1c7621d6cc39f/diff:/var/lib/docker/overlay2/91d410e7ceb084faac50ca05bd2cfcf8d0172d397b97dcaf1dcda02561ad4c72/diff:/var/lib/docker/overlay2/8e65a8ebbffa204cf8e19d3a944576735eb9d02fd2c7c243a6873fe45dcc54e4/diff:/var/lib/docker/overlay2/319ecfff57b841c1c44bcee9f831fb36efe1b51fef144609c8f73aa419a0ec3c/diff",
                "MergedDir": "/var/lib/docker/overlay2/b7181f937109308de48b9dc4374b346219618872005dd188fd638da2b60f8be0/merged",
                "UpperDir": "/var/lib/docker/overlay2/b7181f937109308de48b9dc4374b346219618872005dd188fd638da2b60f8be0/diff",
                "WorkDir": "/var/lib/docker/overlay2/b7181f937109308de48b9dc4374b346219618872005dd188fd638da2b60f8be0/work"
            },
            "Name": "overlay2"
        },
        "Mounts": [],
        "Config": {
            "Hostname": "*******",
            "Domainname": "",
            "User": "",
            "AttachStdin": false,
            "AttachStdout": false,
            "AttachStderr": false,
            "Tty": false,
            "OpenStdin": false,
            "StdinOnce": false,
            "Env": [
                "AERO_HOST=aerospike",
                "AERO_PORT=3000",
            ],
            "Cmd": null,
            "Image": "prod.dev:5050/prod/server/app:v1.1.4",
            "Volumes": null,
            "WorkingDir": "/usr/local/root/app",
            "Entrypoint": [
                "./bin/main"
            ],
            "OnBuild": null,
            "Labels": {
                "com.docker.compose.config-hash": "cedfa1cc76acedc141be90334ed503fc33665621f85635b71233e3f7a190ec94",
                "com.docker.compose.container-number": "1",
                "com.docker.compose.oneoff": "False",
                "com.docker.compose.project": "compose",
                "com.docker.compose.project.config_files": "docker-compose.services.yml",
                "com.docker.compose.project.working_dir": "/opt/services/compose",
                "com.docker.compose.service": "charging",
                "com.docker.compose.version": "1.28.6",
                "logging": "promtail",
                "logging_jobname": "containerlogs"
            }
        },
        "NetworkSettings": {
            "Bridge": "",
            "SandboxID": "8b5bfc8bfdf558e9611563c54646a14e65d89d17ee5a9dc4393407940a97d2ef",
            "HairpinMode": false,
            "LinkLocalIPv6Address": "",
            "LinkLocalIPv6PrefixLen": 0,
            "Ports": {},
            "SandboxKey": "/var/run/docker/netns/8b5bfc8bfdf5",
            "SecondaryIPAddresses": null,
            "SecondaryIPv6Addresses": null,
            "EndpointID": "",
            "Gateway": "",
            "GlobalIPv6Address": "",
            "GlobalIPv6PrefixLen": 0,
            "IPAddress": "",
            "IPPrefixLen": 0,
            "IPv6Gateway": "",
            "MacAddress": "",
            "Networks": {
                "prod": {
                    "IPAMConfig": null,
                    "Links": null,
                    "Aliases": [
                        "app",
                        "88440de5cfb1"
                    ],
                    "NetworkID": "1992039b92f260505173794e1328e89ca4be6e9ffca0721d96da96c21e7d3c20",
                    "EndpointID": "0be7434bfa5b623b41ae0a468c469aaae7ca39cd1ebc05a15417039e80924577",
                    "Gateway": "172.18.0.1",
                    "IPAddress": "172.18.0.30",
                    "IPPrefixLen": 16,
                    "IPv6Gateway": "",
                    "GlobalIPv6Address": "",
                    "GlobalIPv6PrefixLen": 0,
                    "MacAddress": "************",
                    "DriverOpts": null
                }
            }
        }
    }
]

IP of a new container might be different of course.

Here some more points of investigation which might be helpful:
the problem is reproduced on 2 of our 3 environments. On those environments where problem exists we witness very weird scenario:

  • have aerospike, aerospike-tools and app (connected to aerospike via go client) containers running
  • restart app container 2 times in a row
  • app container crashes on open connection with the stack above
  • aerospike-tools also no longer able to connect to aerospike and crushes with timeout error
  • no longer able to stop aerospike container with docker-compose stop
  • docker daemon restart fixes all the problem unless the app container is restarted again

On the environment where there is no problem the app container can be restarted without any problems

Unfortunately, I cannot play around with this error because it's reproduced on prod environment while not reproduced on dev.
But if you have some ideas to check you could try them in the downtime

Can you ssh into your client container and ping the server container
nc shows open connection is ok

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants