Cilium does not restore some endpoints and leaves them stale #37077
Description
Is there an existing issue for this?
- I have searched the existing issues
Version
equal or higher than v1.14.18 and lower than v1.15.0
What happened?
In my environment, after the cilium-agent pod was rebuilt on a node, some pods on that node entered a crash state. I found that some local pods could not be pinged from the node, and the ping output looked similar to the following:
```
# ping 172.16.0.13
PING 172.16.0.13 (172.16.0.13) 56(84) bytes of data.
From 172.16.0.1 icmp_seq=1 Time to live exceeded
From 172.16.0.1 icmp_seq=2 Time to live exceeded
```
I searched for the corresponding Cilium endpoint in the cilium-agent container on the node with commands such as `cilium endpoint list | grep $podIP`, but no matching endpoint could be found. However, I can see the log of the Cilium endpoint being created when the pod was created:
```
level=debug msg="Endpoint successfully created" containerID=a7fe558a0ace4be4e6bf06895272cf5eecdc9b1adb41b52b0a230a7b12943cf0 eventUUID=e7a3754e-2222-48ad-af43-733d0db20b1a subsys=cilium-cni
```
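For reference, this is roughly how I checked for the endpoint from inside the cilium-agent container; the pod IP below is one of the affected pods in my environment, and the commands are a sketch rather than a full transcript:

```bash
# IP of one of the affected pods (from the ping output above).
POD_IP=172.16.0.13

# List all endpoints known to the running agent and filter by the pod IP.
# For the broken pods this returned no match at all.
cilium endpoint list | grep "$POD_IP"
```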
I also found a restore log entry in cilium-agent:
```
2025-01-15T01:35:04.106977887+08:00 stdout F level=info msg="Restored endpoint" endpointID=3718 ipAddr="[172.16.0.13 ]" subsys=endpoint
```
However, after the cilium-agent container was rebuilt, a large number of the following errors appeared in the logs:
When I check the endpoint state under the /var/run/cilium/state directory, the directory contents are as follows, and the endpoint cannot be restored from them:
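As a rough sketch of how I compared the on-disk state with what the agent restored (directory names under /var/run/cilium/state correspond to endpoint IDs; the exact IDs from my environment are omitted here):

```bash
# Per-endpoint state directories that the agent is supposed to restore from on startup.
ls /var/run/cilium/state

# Endpoints the running agent actually knows about. Any state directory whose
# endpoint ID does not appear in this list has been left stale on disk.
cilium endpoint list
```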
How can we reproduce the issue?
It is not clear exactly what triggers this problem, but a restart of the cilium-agent container seems to be an important precondition; a rough reproduction sketch is shown below.
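This is only a sketch of how I would try to reproduce it; the namespace, label selector, and placeholder names assume a typical Cilium installation and are not taken from my environment verbatim:

```bash
# Restart the cilium-agent pod on the affected node (assumes the default
# kube-system namespace and the standard k8s-app=cilium label).
NODE=<node-name>
kubectl -n kube-system delete pod -l k8s-app=cilium \
  --field-selector spec.nodeName="$NODE"

# Once the new agent pod is running, check whether all local pod IPs were restored.
CILIUM_POD=<cilium pod on the affected node>
kubectl -n kube-system exec "$CILIUM_POD" -- cilium endpoint list
```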
Cilium Version
v1.12.5
Kernel Version
4.14.105-19-0019
Kubernetes Version
v1.22.5
Regression
No response
Sysdump
No response
Relevant log output
Anything else?
No response
Cilium Users Document
- Are you a user of Cilium? Please add yourself to the Users doc
Code of Conduct
- I agree to follow this project's Code of Conduct