integration: regression test for issue 10589 #10649
Conversation
Skipping CI for Draft Pull Request.
@samuelkarp: GitHub didn't allow me to request PR reviews from the following users: corhere. Note that only containerd members and repo collaborators can review this PR, and authors cannot review their own PRs. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
How to run and output
Force-pushed from 04ebfeb to 7627176.
```go
if err != nil {
	return err
}
if err := ready.Trigger(); err != nil {
```
Using fifo as controller looks great!
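For readers unfamiliar with the pattern, here is a minimal, hypothetical sketch of a fifo-based gate in the spirit of the ready.Trigger() call above. The names (fifoGate, Wait, Trigger) and the wiring are illustrative assumptions, not the PR's actual helper: the side wrapping "runc exec" blocks opening the fifo for reading until the test opens it for writing.

```go
// Sketch only (assumed names): a fifo used as a synchronization gate between
// a test and a wrapped "runc exec".
package gate

import (
	"os"
	"path/filepath"

	"golang.org/x/sys/unix"
)

type fifoGate struct{ path string }

// newFifoGate creates the fifo both sides will rendezvous on.
func newFifoGate(dir string) (*fifoGate, error) {
	p := filepath.Join(dir, "exec-ready.fifo")
	if err := unix.Mkfifo(p, 0o600); err != nil {
		return nil, err
	}
	return &fifoGate{path: p}, nil
}

// Wait is called by the wrapped side before it execs: opening a fifo
// read-only blocks until some writer opens the other end.
func (g *fifoGate) Wait() error {
	f, err := os.OpenFile(g.path, os.O_RDONLY, 0)
	if err != nil {
		return err
	}
	return f.Close()
}

// Trigger is called by the test when it wants the exec to proceed: opening
// the fifo write-only unblocks the pending Wait.
func (g *fifoGate) Trigger() error {
	f, err := os.OpenFile(g.path, os.O_WRONLY, 0)
	if err != nil {
		return err
	}
	return f.Close()
}
```

Whichever side arrives at the fifo first simply blocks until the other opens it, so no polling or timeouts are needed to control when the exec actually starts.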
Thank you! Will try to run it and see if I can repro. Update: I can reproduce with this 🥳
FWIW, testing with #10647 (and with #10651), I see:
sudo make integration EXTRA_TESTFLAGS='-run TestIssue10589'
+ integration
INFO[0000] Using the following image list: {Alpine:ghcr.io/containerd/alpine:3.14.0 BusyBox:ghcr.io/containerd/busybox:1.36 Pause:registry.k8s.io/pause:3.10 ResourceConsumer:registry.k8s.io/e2e-test-images/resource-consumer:1.10 VolumeCopyUp:ghcr.io/containerd/volume-copy-up:2.2 VolumeOwnership:ghcr.io/containerd/volume-ownership:2.1 ArgsEscaped:cplatpublic.azurecr.io/args-escaped-test-image-ns:1.0 DockerSchema1:registry.k8s.io/busybox@sha256:4bdd623e848417d96127e16037743f0cd8b528c026e9175e22a84f639eca58ff}
INFO[0000] running tests against containerd revision=0574a010dd5afc1e98babe10911407ead1b095d7 runtime= snapshotter= version=v2.0.0-rc.3-215-g0574a010d
INFO[0000] start to pull seed image image="ghcr.io/containerd/busybox:1.36"
=== RUN TestIssue10589
container_linux_test.go:1597: 1. Create a sleeping container
log_hook.go:47: time="2024-08-28T14:41:38.190740914+01:00" level=debug msg="remote introspection plugin filters" func="proxy.(*introspectionRemote).Plugins" file="/home/laurabrehm/go/src/github.com/containerd/containerd/core/introspection/proxy/remote.go:57" filters="[type==io.containerd.snapshotter.v1, id==overlayfs]" testcase=TestIssue10589
container_linux_test.go:1633: 2. Create exec1
container_linux_test.go:1658: 3. Wait until exec1 is waiting to start
container_linux_test.go:1651: Starting exec1
container_linux_test.go:1663: 4. Kill the container init process
container_linux_test.go:1668: Killing main pid (871) of container TestIssue10589
container_linux_test.go:1674: container status running
container_linux_test.go:1677: 5. Create exec2
container_linux_test.go:1705: 6. Wait until exec2 is waiting to start
container_linux_test.go:1698: Starting exec2
container_linux_test.go:1701: error starting exec2: container TestIssue10589 init process is not running: failed precondition
which is expected: it means the shim prevented the new exec from being started since the init has exited.
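To make the "expected" part concrete, here is a hedged sketch of how a test could assert that starting exec2 fails with containerd's failed-precondition error once init has exited. The function and variable names are illustrative, not the PR's actual assertions, and the import path is the 1.x client's errdefs package.

```go
// Sketch only: assert the exec start error is the expected
// "failed precondition" rather than some other failure.
package integration

import (
	"testing"

	"github.com/containerd/containerd/errdefs"
)

func assertExecRejected(t *testing.T, startErr error) {
	t.Helper()
	if startErr == nil {
		t.Fatal("expected starting exec2 to fail after init exited")
	}
	if !errdefs.IsFailedPrecondition(startErr) {
		t.Fatalf("expected a failed-precondition error, got: %v", startErr)
	}
}
```

The next comment confirms that the test indeed expects an error at this point.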
Yay! Edit: The test already expects an error here.
Force-pushed from 0c7d310 to e70eefc.
@laurazard I've updated the logic here to not block when exec2 fails to start, and ran it against the latest in #10651, but I'm still seeing the task have an incorrect status:
Do I need to delay checking the task status here to give the shim some time to process the init exit? Edit: adding a 5-second sleep did not help. Full log
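For context, the "delay checking the task status" idea being discussed would typically be a poll with a deadline rather than a fixed sleep. A hedged sketch follows; the helper name and timings are illustrative, and as noted above the extra delay did not actually help here, since the bug was in the shim's state tracking rather than the test's timing.

```go
// Sketch only: poll the task status until it reaches the wanted state or a
// deadline passes, instead of sleeping for a fixed amount of time.
package integration

import (
	"context"
	"testing"
	"time"

	"github.com/containerd/containerd"
)

func waitForStatus(ctx context.Context, t *testing.T, task containerd.Task, want containerd.ProcessStatus, timeout time.Duration) {
	t.Helper()
	deadline := time.Now().Add(timeout)
	for {
		st, err := task.Status(ctx)
		if err != nil {
			t.Fatal(err)
		}
		if st.Status == want {
			return
		}
		if time.Now().After(deadline) {
			t.Fatalf("task status is %q, wanted %q after %v", st.Status, want, timeout)
		}
		time.Sleep(100 * time.Millisecond)
	}
}
```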
The latest in #10651 might've been bad, sorry! I made some changes and pushed what was missing, but I hadn't tried the no-timeout code yet, so it might be broken somewhere. I'll take a better look tomorrow.
This PR will remain WIP until #10651 is merged, at which point I will rebase it and mark it ready. Please feel free to review it now, however.
Signed-off-by: Samuel Karp <samuelkarp@google.com>
Force-pushed from e70eefc to 376cb86.
Rebased now that #10651 merged.
Force-pushed from e768237 to 0c00392.
Latest push should fix the issue (mixing …).
This issue was caused by a race between init exits and new exec process tracking inside the shim. The test operates by controlling the time between when the shim invokes "runc exec" and when the actual "runc exec" is triggered. This allows validating that races for shim state tracking between pre- and post-start of the exec process do not exist.

Relates to containerd#10589

Signed-off-by: Samuel Karp <samuelkarp@google.com>
Force-pushed from 0c00392 to 18725f0.
LGTM
@samuelkarp Sorry to bother you, but it still blocks when exec2 fails to start. Could you help take a look?
https://github.com/containerd/containerd/releases/tag/v1.7.22 and https://github.com/containerd/containerd/releases/tag/v1.6.36 contain the fix for this issue.
@samuelkarp Do you mean that I should get the test from v1.6.36 or v1.7.22?
The test is expected to fail (and can block) on versions prior to 1.7.22. The test should pass on 1.7.22.
@samuelkarp Thanks for the reply.
The descriptions in the issue and PR are a little hard to grasp, so let me explain the issue a bit more for those who are interested; please correct me if I'm wrong. The code I dove into is 1.7.21.
So that's the race: handleStarted produces the event, s.processExits is the consumer, and cleanup is the cleaner. They share a lock; if cleanup acquires the lock before s.processExits, the issue occurs.
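To make the subscribe/deliver/cleanup ordering easier to picture, here is a deliberately simplified toy model of the kind of race being described. It is not the shim's actual code, and later comments in this thread refine the analysis.

```go
// Toy model only: a starter subscribes to exit events, processExits delivers
// them to subscribers, and a cleanup unsubscribes. If cleanup takes the lock
// before processExits delivers, the starter never sees the exit.
package main

import (
	"fmt"
	"sync"
)

type exit struct{ pid, status int }

type shim struct {
	mu          sync.Mutex
	subscribers map[*[]exit]struct{}
}

// subscribe registers a buffer for exit events and returns the cleanup that
// unregisters it, mirroring the exitSubscribers pattern discussed above.
func (s *shim) subscribe() (*[]exit, func()) {
	buf := new([]exit)
	s.mu.Lock()
	s.subscribers[buf] = struct{}{}
	s.mu.Unlock()
	return buf, func() {
		s.mu.Lock()
		delete(s.subscribers, buf) // if this wins the race, the exit below is lost
		s.mu.Unlock()
	}
}

// processExits delivers an exit to every currently registered subscriber.
func (s *shim) processExits(e exit) {
	s.mu.Lock()
	defer s.mu.Unlock()
	for buf := range s.subscribers {
		*buf = append(*buf, e)
	}
}

func main() {
	s := &shim{subscribers: map[*[]exit]struct{}{}}
	buf, cleanup := s.subscribe()

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		s.processExits(exit{pid: 871, status: 137}) // init exit arrives concurrently
	}()
	cleanup() // races with processExits for the lock
	wg.Wait()

	fmt.Printf("exits seen by the starter: %d\n", len(*buf)) // 0 or 1, depending on who won
}
```

Running this repeatedly prints 0 or 1 depending on which side takes the lock first, which is the flavor of nondeterminism the regression test is designed to pin down.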
Do you have a way to reproduce this, @linsite?
cleanup runs after handleStarted
handleStarted will make sure exits = nil
so delete(s.exitSubscribers, &exits) may not happen?
Yes, you're right, my previous analysis was wrong. The true problem is that concurrent exec calls can make …
The log from the failing 1.7.21 case validates the issue; the logging was added in the s.send func.
We've noticed an odd thing: this test case fails 100% of the time on 1.7.21 on a physical PC with a 2.8GHz CPU, but never fails on any virtual machine with a 2.6 or 2.2GHz vCPU. The OS is Ubuntu 18.04 or 20.04, and the Go version is 1.22.
1.7.22 fixes this; you can try this PR: #10675
Thanks :). I know the fix. All I want is to reproduce the issue and figure out the root cause. FYI, after some study, I found that by adding a time.Sleep(50 * time.Millisecond) after the 4th step, the issue can be triggered 100% of the time on 1.7.21. That ensures the init process exits while s.pendingExecs[cp.Container] is non-zero. Best wishes.
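As a concrete illustration of where that sleep would go, here is a sketch; the helper below is illustrative, while the real test simply inlines these steps between step 4 (killing init) and step 5 (creating exec2), and the import path is the 1.x client.

```go
// Sketch only: kill the container's init process, then widen the race window
// so the shim observes the init exit while an exec is still pending.
package integration

import (
	"context"
	"syscall"
	"testing"
	"time"

	"github.com/containerd/containerd"
)

func killInitAndWidenRaceWindow(ctx context.Context, t *testing.T, task containerd.Task) {
	t.Helper()
	// Step 4 from the test log: kill the container init process.
	if err := task.Kill(ctx, syscall.SIGKILL); err != nil {
		t.Fatal(err)
	}
	// The tweak described above: on 1.7.21 this reliably lets the init exit be
	// processed while s.pendingExecs for the container is still non-zero.
	time.Sleep(50 * time.Millisecond)
}
```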
We observe a very similar phenomenon with containerd v1.6.33. It happens while running "kubectl drain node". I only have part of the logs at hand. From the logs, the node drain hangs for hours and one pod/container keeps failing to be killed. It seems there is a race between the init exit and the exec process. Neither the PreStop hook nor the container exit ever completes.
The containerd log also reports the same for hours.
Can you run ps -auxf to show the process status (containerd main process and exec process status)?
Hi @ningmingxiao, unfortunately our customer upgraded all their clusters with the workaround anyhow, so we didn't get the chance to collect more of this data. Thanks for the guidance anyway!
This issue was caused by a race between init exits and new exec process tracking inside the shim. The test operates by controlling the time between when the shim invokes "runc exec" and when the actual "runc exec" is triggered. This allows validating that races for shim state tracking between pre- and post-start of the exec process do not exist.
Relates to #10589
/cc @laurazard @corhere