WebSocketConnectionClosedException
Connection to remote host was lost. #38
Description
Migrated from internal repo.
Complete stack trace and logs (sensitive) https://github.com/AI-Safety-Institute/aisi-inspect-tools/issues/142
Original date: 23 Oct 2024
Originally raised by @willpayne23
Traceback (most recent call last):

  File "redacted/.venv/lib/python3.12/site-packages/inspect_ai/_eval/task/run.py", line 260, in task_run
    sample_results = await asyncio.gather(*sample_coroutines)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "redacted/.venv/lib/python3.12/site-packages/inspect_ai/_eval/task/run.py", line 424, in task_run_sample
    error = sample_error(ex)
            ^^^^^^^^^^^^^^^^

  File "redacted/.venv/lib/python3.12/site-packages/inspect_ai/_eval/task/error.py", line 22, in __call__
    raise ex

  File "redacted/.venv/lib/python3.12/site-packages/inspect_ai/_eval/task/run.py", line 416, in task_run_sample
    state = await plan(state, generate)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "redacted/.venv/lib/python3.12/site-packages/inspect_ai/solver/_plan.py", line 105, in __call__
    state = await solver(state, generate)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "//.venv/lib/python3.12/site-packages/inspect_ai/solver/_basic_agent.py", line 159, in solve
    tool_results = await call_tools(state.output.message, state.tools)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "redacted/.venv/lib/python3.12/site-packages/inspect_ai/model/_call_tools.py", line 149, in call_tools
    results = await asyncio.gather(*tasks)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "redacted/.venv/lib/python3.12/site-packages/inspect_ai/model/_call_tools.py", line 75, in call_tool_task
    result = await call_tool(tdefs, call)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "/home/ubu/.venv/lib/python3.12/site-packages/inspect_ai/model/_call_tools.py", line 203, in call_tool
    result = await tool_def.tool(**arguments)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "redacted/run/src/agents/tools/python.py", line 29, in execute
    result = await sandbox().exec(
             ^^^^^^^^^^^^^^^^^^^^^

  File "redacted/.venv/lib/python3.12/site-packages/aisitools/k8s_sandbox/sandbox_environment.py", line 105, in exec
    return await self._pod.exec(cmd, input, timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "redacted/.venv/lib/python3.12/site-packages/aisitools/k8s_sandbox/pod.py", line 56, in exec
    result = await self._run_asynchronously(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "redacted/.venv/lib/python3.12/site-packages/aisitools/k8s_sandbox/pod.py", line 100, in _run_asynchronously
    return await loop.run_in_executor(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "/usr/lib/python3.12/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "redacted/.venv/lib/python3.12/site-packages/aisitools/k8s_sandbox/pod.py", line 57, in <lambda>
    lambda: executor.exec(cmd, stdin, timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "redacted/.venv/lib/python3.12/site-packages/aisitools/k8s_sandbox/pod.py", line 175, in exec
    result = self._handle_stream_output(response, timeout is not None)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "redacted/.venv/lib/python3.12/site-packages/aisitools/k8s_sandbox/pod.py", line 208, in _handle_stream_output
    response.run_forever()

  File "redacted/.venv/lib/python3.12/site-packages/kubernetes/stream/ws_client.py", line 229, in run_forever
    self.update(timeout=None)

  File "redacted/.venv/lib/python3.12/site-packages/kubernetes/stream/ws_client.py", line 197, in update
    op_code, frame = self.sock.recv_data_frame(True)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "redacted/.venv/lib/python3.12/site-packages/websocket/_core.py", line 437, in recv_data_frame
    frame = self.recv_frame()
            ^^^^^^^^^^^^^^^^^

  File "redacted/.venv/lib/python3.12/site-packages/websocket/_core.py", line 478, in recv_frame
    return self.frame_buffer.recv_frame()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "redacted/.venv/lib/python3.12/site-packages/websocket/_abnf.py", line 377, in recv_frame
    payload = self.recv_strict(length)
              ^^^^^^^^^^^^^^^^^^^^^^^^

  File "/hom/.venv/lib/python3.12/site-packages/websocket/_abnf.py", line 398, in recv_strict
    bytes_ = self.recv(min(16384, shortage))
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "redacted/.venv/lib/python3.12/site-packages/websocket/_core.py", line 563, in _recv
    return recv(self.sock, bufsize)
           ^^^^^^^^^^^^^^^^^^^^^^^^

  File "redacted/.venv/lib/python3.12/site-packages/websocket/_socket.py", line 132, in recv
    raise WebSocketConnectionClosedException("Connection to remote host was lost.")

websocket._exceptions.WebSocketConnectionClosedException: Connection to remote host was lost.
Another instance of this, with the improved logging below (26 Oct 2024)
WebSocketConnectionClosedException: Connection to remote host was lost.
...
K8sError: Error during: Execute command in pod. {"pod": "agent-env-nqmhh6q4-default-0", ...
With timestamps (from the Python log file). Why did nearly an hour elapse between starting the command and the failure?
2024-10-26 23:53:24,982 - SANDBOX - K8S: Starting: Execute command in pod. {"pod": "agent-env-nqmhh6q4-default-0", ...
2024-10-27 00:48:42,974 - ERROR - K8S: Error during: Execute command in pod. {"cause": "Connection to remote host was lost.", "pod": "agent-env-nqmhh6q4-default-0", ...
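For reference, the gap between the two log lines above can be computed directly from their timestamps. This is a minimal sketch, assuming the `%Y-%m-%d %H:%M:%S,%f` layout shown in the log file and that both lines were written against the same clock; `elapsed_between` is a hypothetical helper name.

```python
from datetime import datetime, timedelta

LOG_TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def elapsed_between(start_line: str, end_line: str) -> timedelta:
    """Return the time between two log lines that begin with a timestamp."""

    def parse(line: str) -> datetime:
        # The timestamp is the first 23 characters, e.g. "2024-10-26 23:53:24,982".
        return datetime.strptime(line[:23], LOG_TS_FORMAT)

    return parse(end_line) - parse(start_line)


start = "2024-10-26 23:53:24,982 - SANDBOX - K8S: Starting: Execute command in pod."
end = "2024-10-27 00:48:42,974 - ERROR - K8S: Error during: Execute command in pod."
print(elapsed_between(start, end))  # 0:55:17.992000
```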
Kubernetes cluster events. Note the NodeNotReady warning at 23:55:46Z.
2024-10-26T23:50:36Z Normal agent-env-nqmhh6q4-default-0 Scheduled Successfully assigned agent/agent-env-nqmhh6q4-default-0 to ip-192-168-102-178.eu-west-2.compute.internal
2024-10-26T23:50:37Z Normal agent-env-nqmhh6q4-default-0 Started Started container resolve-coredns-ip
2024-10-26T23:50:37Z Normal agent-env-nqmhh6q4-default-0 Created Created container resolve-coredns-ip
2024-10-26T23:50:37Z Normal agent-env-nqmhh6q4-default-0 Pulled Container image "toolbelt/dig:2024-09-23" already present on machine
2024-10-26T23:50:38Z Normal agent-env-nqmhh6q4-default-0 Pulled Container image "redacted" already present on machine
2024-10-26T23:50:39Z Normal agent-env-nqmhh6q4-default-0 Created Created container default
2024-10-26T23:50:39Z Normal agent-env-nqmhh6q4-default-0 Started Started container default
2024-10-26T23:55:46Z Warning agent-env-nqmhh6q4-default-0 NodeNotReady Node is not ready
2024-10-27T00:00:51Z Normal agent-env-nqmhh6q4-default SuccessfulCreate create Pod agent-env-nqmhh6q4-default-0 in StatefulSet agent-env-nqmhh6q4-default successful
2024-10-27T00:00:51Z Normal agent-env-nqmhh6q4-default-0 TaintManagerEviction Marking for deletion Pod agent/agent-env-nqmhh6q4-default-0
2024-10-27T00:00:51Z Normal agent-env-nqmhh6q4-default-0 Scheduled Successfully assigned agent/agent-env-nqmhh6q4-default-0 to ip-192-168-108-64.eu-west-2.compute.internal
2024-10-27T00:00:52Z Normal agent-env-nqmhh6q4-default-0 Pulled Container image "toolbelt/dig:2024-09-23" already present on machine
2024-10-27T00:00:53Z Normal agent-env-nqmhh6q4-default-0 Created Created container resolve-coredns-ip
2024-10-27T00:00:53Z Normal agent-env-nqmhh6q4-default-0 Started Started container resolve-coredns-ip
2024-10-27T00:00:54Z Normal agent-env-nqmhh6q4-default-0 Pulled Container image "redacted" already present on machine
2024-10-27T00:00:54Z Normal agent-env-nqmhh6q4-default-0 Started Started container default
2024-10-27T00:00:54Z Normal agent-env-nqmhh6q4-default-0 Created Created container default
2024-10-27T00:48:44Z Normal agent-env-nqmhh6q4-default-0 Killing Stopping container default
Another instance of this (02 Nov 2024)
│ redacted/.venv/lib/python3.12/site-packages/websocket/_socket.py:132 in recv │
│ │
│ 129 │ │ │ raise │
│ 130 │ │
│ 131 │ if not bytes_: │
│ > 132 │ │ raise WebSocketConnectionClosedException("Connection to remote host was lost.") │
│ 133 │ │
│ 134 │ return bytes_ │
│ 135 │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
WebSocketConnectionClosedException: Connection to remote host was lost.
...
K8sError: Error during: Execute command in pod. {"pod": "agent-env-z2z8u7np-default-0", ...
2024-11-02 12:31:25,541 - ERROR - K8S: Error during: Execute command in pod. {"cause": "Connection to remote host was lost.", "pod": "agent-env-z2z8u7np-default-0",...
Cluster events:
2024-11-02T11:36:42Z Normal agent-env-z2z8u7np-default-0 Scheduled Successfully assigned agent/agent-env-z2z8u7np-default-0 to ip-192-168-156-230.eu-west-2.compute.internal
2024-11-02T11:36:43Z Normal agent-env-z2z8u7np-default-0 Created Created container resolve-coredns-ip
2024-11-02T11:36:43Z Normal agent-env-z2z8u7np-default-0 Started Started container resolve-coredns-ip
2024-11-02T11:36:43Z Normal agent-env-z2z8u7np-default-0 Pulled Container image "toolbelt/dig:2024-09-23" already present on machine
2024-11-02T11:36:44Z Normal agent-env-z2z8u7np-default-0 Pulled Container image "redacted" already present on machine
2024-11-02T11:36:44Z Normal agent-env-z2z8u7np-default-0 Started Started container default
2024-11-02T11:36:44Z Normal agent-env-z2z8u7np-default-0 Created Created container default
2024-11-02T11:39:42Z Warning agent-env-z2z8u7np-default-0 NodeNotReady Node is not ready
2024-11-02T11:44:47Z Normal agent-env-z2z8u7np-default-0 TaintManagerEviction Marking for deletion Pod agent/agent-env-z2z8u7np-default-0
2024-11-02T11:44:48Z Normal agent-env-z2z8u7np-default SuccessfulCreate create Pod agent-env-z2z8u7np-default-0 in StatefulSet agent-env-z2z8u7np-default successful
2024-11-02T11:44:48Z Normal agent-env-z2z8u7np-default-0 Scheduled Successfully assigned agent/agent-env-z2z8u7np-default-0 to ip-192-168-129-237.eu-west-2.compute.internal
2024-11-02T11:44:49Z Normal agent-env-z2z8u7np-default-0 Pulled Container image "toolbelt/dig:2024-09-23" already present on machine
2024-11-02T11:44:49Z Normal agent-env-z2z8u7np-default-0 Created Created container resolve-coredns-ip
2024-11-02T11:44:49Z Normal agent-env-z2z8u7np-default-0 Started Started container resolve-coredns-ip
2024-11-02T11:44:50Z Normal agent-env-z2z8u7np-default-0 Created Created container default
2024-11-02T11:44:50Z Normal agent-env-z2z8u7np-default-0 Started Started container default
2024-11-02T11:44:50Z Normal agent-env-z2z8u7np-default-0 Pulled Container image "redacted" already present on machine
Another instance of this (06 Nov 2024)
│ redacted/.venv/lib/python3.12/site-packages/websocket/_core.py:563 in _recv │
│ │
│ 560 │ │
│ 561 │ def _recv(self, bufsize): │
│ 562 │ │ try: │
│ > 563 │ │ │ return recv(self.sock, bufsize) │
│ 564 │ │ except WebSocketConnectionClosedException: │
│ 565 │ │ │ if self.sock: │
│ 566 │ │ │ │ self.sock.close() │
│ │
│ redacted/.venv/lib/python3.12/site-packages/websocket/_socket.py:132 in recv │
│ │
│ 129 │ │ │ raise │
│ 130 │ │
│ 131 │ if not bytes_: │
│ > 132 │ │ raise WebSocketConnectionClosedException("Connection to remote host was lost.") │
│ 133 │ │
│ 134 │ return bytes_ │
│ 135 │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
WebSocketConnectionClosedException: Connection to remote host was lost.
...
K8sError: Error during: Execute command in pod. {"pod": "agent-env-r8s8i9gs-default-0", ...
2024-11-06 01:32:23,797 - SANDBOX - K8S: Starting: Write file to pod. {"pod": "agent-env-r8s8i9gs-default-0", ...
2024-11-06 01:32:23,944 - SANDBOX - K8S: Completed: Write file to pod. {"pod": "agent-env-r8s8i9gs-default-0", ...
2024-11-06 01:32:23,944 - SANDBOX - K8S: Starting: Write file to pod. {"pod": "agent-env-r8s8i9gs-default-0", ...
2024-11-06 01:32:24,107 - SANDBOX - K8S: Completed: Write file to pod. {"pod": "agent-env-r8s8i9gs-default-0", ...
2024-11-06 01:32:25,185 - SANDBOX - K8S: Starting: Execute command in pod. {"pod": "agent-env-r8s8i9gs-default-0", ...
2024-11-06 01:32:25,398 - SANDBOX - K8S: Completed: Execute command in pod. {"result": "ExecResult(success=True, returncode=0, ...
2024-11-06 01:32:25,398 - SANDBOX - K8S: Starting: Read file from pod. {"pod": "agent-env-r8s8i9gs-default-0", ...
2024-11-06 01:32:25,533 - SANDBOX - K8S: Completed: Read file from pod. {"pod": "agent-env-r8s8i9gs-default-0", ...
2024-11-06 01:32:26,484 - SANDBOX - K8S: Starting: Execute command in pod. {"pod": "agent-env-r8s8i9gs-default-0",...
2024-11-06 01:32:26,649 - SANDBOX - K8S: Completed: Execute command in pod. {"result": "ExecResult(success=True, returncode=0, ...
2024-11-06 01:32:26,649 - SANDBOX - K8S: Starting: Read file from pod. {"pod": "agent-env-r8s8i9gs-default-0", ...
2024-11-06 01:32:26,775 - SANDBOX - K8S: Completed: Read file from pod. {"pod": "agent-env-r8s8i9gs-default-0", ...
2024-11-06 01:32:39,921 - SANDBOX - K8S: Starting: Execute command in pod. {"pod": "agent-env-r8s8i9gs-default-0", ...
2024-11-06 01:39:47,174 - SANDBOX - K8S: Error during: Execute command in pod. {"cause": "Command timed out after 300s. ExecResult(success=False, returncode=124, ...
2024-11-06 01:39:52,744 - SANDBOX - K8S: Starting: Execute command in pod. {"pod": "agent-env-r8s8i9gs-default-0", ...
2024-11-06 01:39:52,922 - SANDBOX - K8S: Completed: Execute command in pod. {"result": "ExecResult(success=False, returncode=1, ...
2024-11-06 01:40:07,160 - SANDBOX - K8S: Starting: Execute command in pod. {"pod": "agent-env-r8s8i9gs-default-0", ...
2024-11-06 02:32:15,986 - ERROR - K8S: Error during: Execute command in pod. {"cause": "Connection to remote host was lost.", "pod": "agent-env-r8s8i9gs-default-0", ...
Note the roughly 50 minutes between starting (or, technically, queueing) the command and when the error was actually raised.
2024-11-06T01:32:19Z Normal agent-env-r8s8i9gs-coredns ScalingReplicaSet Scaled up replica set agent-env-r8s8i9gs-coredns-84dcf44548 to 1
2024-11-06T01:32:19Z Normal agent-env-r8s8i9gs-coredns-84dcf44548 SuccessfulCreate Created pod: agent-env-r8s8i9gs-coredns-84dcf44548-vxb8l
2024-11-06T01:32:19Z Normal agent-env-r8s8i9gs-shared-volume Provisioning External provisioner is provisioning volume for claim "agent/agent-env-r8s8i9gs-shared-volume"
2024-11-06T01:32:19Z Normal agent-env-r8s8i9gs-ghidra SuccessfulCreate create Pod agent-env-r8s8i9gs-ghidra-0 in StatefulSet agent-env-r8s8i9gs-ghidra successful
2024-11-06T01:32:19Z Normal agent-env-r8s8i9gs-shared-volume ExternalProvisioning Waiting for a volume to be created either by the external provisioner 'nfs.csi.k8s.io' or manually by the system administrator. If volume creation is delayed, please verify that the provisioner is running and correctly registered.
2024-11-06T01:32:19Z Normal agent-env-r8s8i9gs-shared-volume ProvisioningSucceeded Successfully provisioned volume pvc-13d0b8e9-9e15-4ced-b627-69ea8416d136
2024-11-06T01:32:19Z Normal agent-env-r8s8i9gs-ghidra-0 Scheduled Successfully assigned agent/agent-env-r8s8i9gs-ghidra-0 to ip-192-168-104-100.eu-west-2.compute.internal
2024-11-06T01:32:19Z Normal agent-env-r8s8i9gs-coredns-84dcf44548-vxb8l Scheduled Successfully assigned agent/agent-env-r8s8i9gs-coredns-84dcf44548-vxb8l to ip-192-168-104-100.eu-west-2.compute.internal
2024-11-06T01:32:20Z Normal agent-env-r8s8i9gs-default-0 Scheduled Successfully assigned agent/agent-env-r8s8i9gs-default-0 to ip-192-168-181-54.eu-west-2.compute.internal
2024-11-06T01:32:20Z Normal agent-env-r8s8i9gs-ghidra-0 Started Started container resolve-coredns-ip
2024-11-06T01:32:20Z Normal agent-env-r8s8i9gs-ghidra-0 Created Created container resolve-coredns-ip
2024-11-06T01:32:20Z Normal agent-env-r8s8i9gs-ghidra-0 Pulled Container image "toolbelt/dig:2024-09-23" already present on machine
2024-11-06T01:32:20Z Normal agent-env-r8s8i9gs-coredns-84dcf44548-vxb8l Pulled Container image "coredns/coredns:1.8.3" already present on machine
2024-11-06T01:32:20Z Normal agent-env-r8s8i9gs-coredns-84dcf44548-vxb8l Created Created container coredns
2024-11-06T01:32:20Z Normal agent-env-r8s8i9gs-coredns-84dcf44548-vxb8l Started Started container coredns
2024-11-06T01:32:21Z Normal agent-env-r8s8i9gs-default-0 Created Created container resolve-coredns-ip
2024-11-06T01:32:21Z Normal agent-env-r8s8i9gs-default-0 Pulled Container image "toolbelt/dig:2024-09-23" already present on machine
2024-11-06T01:32:21Z Normal agent-env-r8s8i9gs-ghidra-0 Started Started container ghidra
2024-11-06T01:32:21Z Normal agent-env-r8s8i9gs-ghidra-0 Created Created container ghidra
2024-11-06T01:32:21Z Normal agent-env-r8s8i9gs-default-0 Started Started container resolve-coredns-ip
2024-11-06T01:32:21Z Normal agent-env-r8s8i9gs-ghidra-0 Pulled Container image "redacted" already present on machine
2024-11-06T01:32:22Z Normal agent-env-r8s8i9gs-default-0 Created Created container default
2024-11-06T01:32:22Z Normal agent-env-r8s8i9gs-default-0 Started Started container default
2024-11-06T01:32:22Z Normal agent-env-r8s8i9gs-default-0 Pulled Container image "redacted" already present on machine
2024-11-06T01:33:37Z Warning agent-env-r8s8i9gs-default-0 NodeNotReady Node is not ready
2024-11-06T01:38:42Z Normal agent-env-r8s8i9gs-default SuccessfulCreate create Pod agent-env-r8s8i9gs-default-0 in StatefulSet agent-env-r8s8i9gs-default successful
2024-11-06T01:38:42Z Normal agent-env-r8s8i9gs-default-0 TaintManagerEviction Marking for deletion Pod agent/agent-env-r8s8i9gs-default-0
2024-11-06T01:38:42Z Normal agent-env-r8s8i9gs-default-0 Scheduled Successfully assigned agent/agent-env-r8s8i9gs-default-0 to ip-192-168-129-171.eu-west-2.compute.internal
2024-11-06T01:38:42Z Normal agent-env-r8s8i9gs-default-0 TaintManagerEviction Cancelling deletion of Pod agent/agent-env-r8s8i9gs-default-0
2024-11-06T01:38:43Z Normal agent-env-r8s8i9gs-default-0 Created Created container resolve-coredns-ip
2024-11-06T01:38:43Z Normal agent-env-r8s8i9gs-default-0 Pulled Container image "toolbelt/dig:2024-09-23" already present on machine
2024-11-06T01:38:43Z Normal agent-env-r8s8i9gs-default-0 Started Started container resolve-coredns-ip
2024-11-06T01:38:44Z Normal agent-env-r8s8i9gs-default-0 Started Started container default
2024-11-06T01:38:44Z Normal agent-env-r8s8i9gs-default-0 Created Created container default
2024-11-06T01:38:44Z Normal agent-env-r8s8i9gs-default-0 Pulled Container image "redacted" already present on machine
2024-11-06T01:39:48Z Normal agent-env-r8s8i9gs-default-0 Killing Stopping container default
2024-11-06T01:40:57Z Warning agent-env-r8s8i9gs-default-0 NodeNotReady Node is not ready
2024-11-06T02:32:16Z Normal agent-env-r8s8i9gs-ghidra-0 Killing Stopping container ghidra
2024-11-06T02:32:16Z Normal agent-env-r8s8i9gs-coredns-84dcf44548-vxb8l Killing Stopping container coredns
It looks like the default container was started at 01:32:22, and the node was then marked as not ready at 01:33:37 (by which point some write_files had already taken place). Note that the exec started at 01:32:39 only errored at 01:39:47, about seven minutes later, despite its 300s timeout.
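To check that reading, the sandbox log and the cluster events can be merged into a single timeline. This is a rough sketch, assuming the log file timestamps are UTC (the final error at 02:32:15 lines up with the `Killing` events at 02:32:16Z, but the issue does not state the timezone); `merged_timeline` and the two parsers are hypothetical helpers.

```python
from datetime import datetime, timezone


def parse_log_ts(line: str) -> datetime:
    # e.g. "2024-11-06 01:32:39,921 - SANDBOX - ..."; assumed to be UTC.
    naive = datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f")
    return naive.replace(tzinfo=timezone.utc)


def parse_event_ts(line: str) -> datetime:
    # e.g. "2024-11-06T01:33:37Z Warning ..."; the trailing Z marks UTC.
    naive = datetime.strptime(line[:20], "%Y-%m-%dT%H:%M:%SZ")
    return naive.replace(tzinfo=timezone.utc)


def merged_timeline(
    log_lines: list[str], event_lines: list[str]
) -> list[tuple[datetime, str, str]]:
    """Return (timestamp, source, line) tuples from both sources, oldest first."""
    rows = [(parse_log_ts(line), "log", line) for line in log_lines]
    rows += [(parse_event_ts(line), "event", line) for line in event_lines]
    return sorted(rows, key=lambda row: row[0])


log_lines = [
    '2024-11-06 01:32:39,921 - SANDBOX - K8S: Starting: Execute command in pod. {"pod": "agent-env-r8s8i9gs-default-0", ...',
    '2024-11-06 01:39:47,174 - SANDBOX - K8S: Error during: Execute command in pod. {"cause": "Command timed out after 300s. ...',
]
event_lines = [
    "2024-11-06T01:32:22Z Normal agent-env-r8s8i9gs-default-0 Started Started container default",
    "2024-11-06T01:33:37Z Warning agent-env-r8s8i9gs-default-0 NodeNotReady Node is not ready",
    "2024-11-06T01:38:42Z Normal agent-env-r8s8i9gs-default-0 TaintManagerEviction Marking for deletion Pod agent/agent-env-r8s8i9gs-default-0",
]
for ts, source, line in merged_timeline(log_lines, event_lines):
    print(f"{ts:%H:%M:%S} {source:<5} {line}")
```

With these five lines the exec start falls between the container start and the NodeNotReady warning, and the timeout error is only logged after the pod has been marked for deletion.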
06 Nov 2024
│ redacted/.venv/lib/python3.12/site-packages/websocket/_socket.py:132 in recv │
│ │
│ 129 │ │ │ raise │
│ 130 │ │
│ 131 │ if not bytes_: │
│ > 132 │ │ raise WebSocketConnectionClosedException("Connection to remote host was lost.") │
│ 133 │ │
│ 134 │ return bytes_ │
│ 135 │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
WebSocketConnectionClosedException: Connection to remote host was lost.
...
K8sError: Error during: Execute command in pod. {"pod": "agent-env-san7tnhu-default-0", ...
2024-11-05T22:40:18Z Warning agent-env-san7tnhu-default-0 FailedScheduling 0/46 nodes are available: 2 node(s) had untolerated taint {CriticalAddonsOnly: true}, 2 node(s) had untolerated taint {aisi.gov.uk/dev: true}, 2 node(s) had untolerated taint {aisi.gov.uk/devpods: true}, 2 node(s) had untolerated taint {node.kubernetes.io/unreachable: }, 38 Insufficient memory. preemption: 0/46 nodes are available: 38 No preemption victims found for incoming pod, 8 Preemption is not helpful for scheduling.
2024-11-05T22:40:22Z Normal agent-env-san7tnhu-default-0 Scheduled Successfully assigned agent/agent-env-san7tnhu-default-0 to ip-192-168-160-39.eu-west-2.compute.internal
2024-11-05T22:40:23Z Normal agent-env-san7tnhu-default-0 Pulled Container image "toolbelt/dig:2024-09-23" already present on machine
2024-11-05T22:40:23Z Normal agent-env-san7tnhu-default-0 Created Created container resolve-coredns-ip
2024-11-05T22:40:24Z Normal agent-env-san7tnhu-default-0 Created Created container default
2024-11-05T22:40:24Z Normal agent-env-san7tnhu-default-0 Started Started container resolve-coredns-ip
2024-11-05T22:40:24Z Normal agent-env-san7tnhu-default-0 Pulled Container image "redacted" already present on machine
2024-11-05T22:40:25Z Normal agent-env-san7tnhu-default-0 Started Started container default
2024-11-05T22:56:46Z Warning agent-env-san7tnhu-default-0 NodeNotReady Node is not ready
2024-11-05T23:01:51Z Normal agent-env-san7tnhu-default-0 TaintManagerEviction Cancelling deletion of Pod agent/agent-env-san7tnhu-default-0
2024-11-05T23:01:51Z Normal agent-env-san7tnhu-default-0 TaintManagerEviction Marking for deletion Pod agent/agent-env-san7tnhu-default-0
2024-11-05T23:01:51Z Normal agent-env-san7tnhu-default SuccessfulCreate create Pod agent-env-san7tnhu-default-0 in StatefulSet agent-env-san7tnhu-default successful
2024-11-05T23:08:20Z Warning agent-env-san7tnhu-default-0 FailedScheduling 0/46 nodes are available: 2 node(s) had untolerated taint {CriticalAddonsOnly: true}, 2 node(s) had untolerated taint {aisi.gov.uk/dev: true}, 2 node(s) had untolerated taint {aisi.gov.uk/devpods: true}, 36 Insufficient memory, 4 node(s) had untolerated taint {node.kubernetes.io/unreachable: }. preemption: 0/46 nodes are available: 10 Preemption is not helpful for scheduling, 36 No preemption victims found for incoming pod.
2024-11-05T23:09:01Z Normal agent-env-san7tnhu-default-0 Pulled Container image "toolbelt/dig:2024-09-23" already present on machine
2024-11-05T23:09:01Z Normal agent-env-san7tnhu-default-0 Started Started container resolve-coredns-ip
2024-11-05T23:09:01Z Normal agent-env-san7tnhu-default-0 Created Created container resolve-coredns-ip
2024-11-05T23:09:02Z Normal agent-env-san7tnhu-default-0 Created Created container default
2024-11-05T23:09:02Z Normal agent-env-san7tnhu-default-0 Pulled Container image "redacted" already present on machine
2024-11-05T23:09:02Z Normal agent-env-san7tnhu-default-0 Started Started container default
2024-11-05T23:40:06Z Normal agent-env-san7tnhu-default-0 Killing Stopping container default