-
Notifications
You must be signed in to change notification settings - Fork 40.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CRI changes to support in-place pod resize #111645
CRI changes to support in-place pod resize #111645
Conversation
@vinaykul: This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the The Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Hi @vinaykul. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/sig-node |
@vinaykul: Cannot trigger testing until a trusted user reviews the PR and leaves an In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/ok-to-test |
klog.V(10).InfoS("[RemoteRuntimeService] UpdateContainerResources", "containerID", containerID, "timeout", r.timeout) | ||
ctx, cancel := getContextWithTimeout(r.timeout) | ||
defer cancel() | ||
|
||
if r.useV1API() { | ||
_, err = r.runtimeClient.UpdateContainerResources(ctx, &runtimeapi.UpdateContainerResourcesRequest{ | ||
ContainerId: containerID, | ||
Linux: resources, | ||
Linux: resources.GetLinux(), | ||
Windows: resources.GetWindows(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks to me the ContainerResources object declared below is unused. I think we should specify Linux, Windows and ContainerResources{Linux, Windows} to allow for some backwards compatibliity for containerd (whose old version expect just the Linux and Windows fields, and not yet the ContainerResources field). Am I interpreting this right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, originally I had a separate ContainerResources with Linux and windows - it is still in the KEP , but WindowsContainerResources was added to UCRR and made that unnecessary, and I removed it recently after @mikebrow spotted it Please see comment
Did I misunderstand you?
}) | ||
} else { | ||
_, err = r.runtimeClientV1alpha2.UpdateContainerResources(ctx, &runtimeapiV1alpha2.UpdateContainerResourcesRequest{ | ||
ContainerId: containerID, | ||
Linux: v1alpha2LinuxContainerResources(resources), | ||
Linux: v1alpha2LinuxContainerResources(resources.GetLinux()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We updated v1alpha2 with the new ContainerResources structure, so I think we should do the same we do above. If we don't update v1alpha2, then we can leave this as it is now IMO
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, adding..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no conversion function for v1alpha2LinuxContainerResources.
Is it wise to just add an unsafe pointer conversion func? I don't have a way to test/verify.
I'm not very sure about this below. Does it look right to you?
diff --git a/pkg/kubelet/cri/remote/conversion.go b/pkg/kubelet/cri/remote/conversion.go
index 35539397755..120b718cbf5 100644
--- a/pkg/kubelet/cri/remote/conversion.go
+++ b/pkg/kubelet/cri/remote/conversion.go
@@ -113,6 +113,10 @@ func v1alpha2LinuxContainerResources(from *runtimeapi.LinuxContainerResources) *
return (*v1alpha2.LinuxContainerResources)(unsafe.Pointer(from))
}
+func v1alpha2WindowsContainerResources(from *runtimeapi.WindowsContainerResources) *v1alpha2.WindowsContainerResources {
+ return (*v1alpha2.WindowsContainerResources)(unsafe.Pointer(from))
+}
+
func v1alpha2ExecRequest(from *runtimeapi.ExecRequest) *v1alpha2.ExecRequest {
// If this function changes, also adapt the corresponding Exec code in
// pkg/kubelet/cri/remote/remote_runtime.go
diff --git a/pkg/kubelet/cri/remote/remote_runtime.go b/pkg/kubelet/cri/remote/remote_runtime.go
index a09879b346a..976f845e011 100644
--- a/pkg/kubelet/cri/remote/remote_runtime.go
+++ b/pkg/kubelet/cri/remote/remote_runtime.go
@@ -639,6 +639,7 @@ func (r *remoteRuntimeService) UpdateContainerResources(containerID string, reso
_, err = r.runtimeClientV1alpha2.UpdateContainerResources(ctx, &runtimeapiV1alpha2.UpdateContainerResourcesRequest{
ContainerId: containerID,
Linux: v1alpha2LinuxContainerResources(resources.GetLinux()),
+ Windows: v1alpha2WindowsContainerResources(resources.GetWindows()),
})
}
if err != nil {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps @marosset can comment if this is needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah that looks correct. this could also take the approach of other recent CRI changes and just drop v1alpha2 support
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SG. Adding with rebase fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like containerd has had support updating Windows resources in UpdateContainerResources
since the 1.6 release (which I'm pretty sure still used v1alpha2
).
Because of this I think we should include the Windows changes in v1alpha2
and v1
.
x-ref containerd/containerd#5778
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually the changes I just linked use the v1
API.
I don't think we need to support v1alpha2
but since the changes are already in might as well leave them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can remove it with the main feature changes in 1.26. I've added a TODO tracking item for this.
/assign @ruiwen-zhao @Random-Liu @thockin @liggitt - please take a quick look, your sharp eyes needed if you can spare them, mine are blurry right now. This is a drag-and-drop of the CRI changes from #102884 |
@@ -42,7 +42,7 @@ import ( | |||
type ActivePodsFunc func() []*v1.Pod | |||
|
|||
type runtimeService interface { | |||
UpdateContainerResources(id string, resources *runtimeapi.LinuxContainerResources) error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not a blocker but rather just to confirm, changing this parameter from LinuxContainerResources
to ContainerResources
could be decoupled from the CRI changes right? remote_runtime would just need to call UpdateContainerResources with LinuxContainerResources directly (windows would not be supported tho).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does runtime call this? IIUC, runtime responds to it, and yes it does not have to change. The UpdateContainerResourcesRequest.Linux field will contain what it needs same as before. Windows runtime looks at UpdateContainerResourcesRequest.Windows.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because of the way the changes went in (not changing the CRI api except for the additional status field and some additional context) the client side manager proto should only impact cri-tools implementing the manager and container runtimes doing integration by implementing a manager for test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be good with decoupling the client side manager changes as well.. if desired..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
noting windows was there in the request before, but was not being used:
message UpdateContainerResourcesRequest {
// ID of the container to update.
string container_id = 1;
// Resource configuration specific to Linux containers.
LinuxContainerResources linux = 2;
// Resource configuration specific to Windows containers.
WindowsContainerResources windows = 3;
// Unstructured key-value map holding arbitrary additional information for
// container resources updating. This can be used for specifying experimental
// resources to update or other options to use when updating the container.
map<string, string> annotations = 4;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, runtime responds to it.
That I understand. I was just trying to confirm the parameters here did not need to change.
the client side manager proto should only impact cri-tools implementing the manager and container runtimes doing integration by implementing a manager for test
Gotcha. Yeah in that case this should have very little risk. I am okay with leaving this as the current PR is.
@@ -515,8 +515,10 @@ func (m *manager) updateContainerCPUSet(containerID string, cpus cpuset.CPUSet) | |||
// this patch-like partial resources. | |||
return m.containerRuntime.UpdateContainerResources( | |||
containerID, | |||
&runtimeapi.LinuxContainerResources{ | |||
CpusetCpus: cpus.String(), | |||
&runtimeapi.ContainerResources{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note to self: need to carry these param changes forward to the main/containerd/containerd/integration bucket for local testing
/assign @marosset |
KEP: /enhancements/keps/sig-node/1287-in-place-update-pod-resources
c6d97b8
to
09fb5da
Compare
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: mikebrow, mrunalp, vinaykul The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
&runtimeapi.LinuxContainerResources{ | ||
CpusetCpus: cpus.String(), | ||
&runtimeapi.ContainerResources{ | ||
Linux: &runtimeapi.LinuxContainerResources{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add a Windows
field in here too for completeness?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mark, thanks for the quick response. Does Windows have an equivalent for Cpuset? Unless I missed something, I don't see an equivalent in WindowsContainerResources.
In any case, we are less than 2 hours away from code freeze and I don't want to reset CI unless I have to. If this can come in with changes for adding windows support in or before beta, lets do it at that point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There isn't an equivalent for cpuset on Windows (today)
CRI changes LGTM. Had one open question about adding in WindowsContainerResousces tot he fake runtime objects for completely but this is non-blocking. |
LGTM thanks! |
/lgtm |
/lgtm |
Since code freeze is in effect, we need to add the milestone if this should be merged. |
/milestone 1.25 |
@mrunalp: You must be a member of the kubernetes/milestone-maintainers GitHub team to set the milestone. If you believe you should be able to issue the /milestone command, please contact your Milestone Maintainers Team and have them propose you as an additional delegate for this responsibility. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@dims could you help add milestone? |
looks like this had the /milestone v1.25 |
What type of PR is this?
/kind feature
/kind api-change
What this PR does / why we need it:
It implements CRI API changes to support In-Place Pod Vertical Scaling. It is needed to unblock the implementation of CRI support needed for the main feature to work end-to-end.
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
The changes in here are extraction of CRI changes from the main PR #102884
Does this PR introduce a user-facing change? Yes
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: