Support Spot Fleet for instance groups #1784
I agree this would be a great feature. There were some limitations around tagging of instances previously which made this non-trivial. An interesting option (IMO) is to have the autoscaler be "spot fleet" aware, either by directly using Spot Fleet or by reimplementing it. The advantage of reimplementing Spot Fleet is that the autoscaler may know better what resources are required, for example whether we need CPU, memory, GPUs, etc. It would also be cloud agnostic.
The user mumoshu in the public kubernetes-incubator/kube-aws project has created an implementation that works around the tagging limitations (kubernetes-retired/kube-aws#112). As someone interested in running Kubernetes in a production context, the target-capacity limitation of CloudFormation with that implementation, documented at https://github.com/kubernetes-incubator/kube-aws/blob/master/Documentation/kubernetes-on-aws-node-pool.md#known-limitations, is troubling.
I do believe that a re-implementation of Spot Fleet would be beneficial, especially if it means I can mix nodes of different lifecycle types and define pods that target them. In a production context, zone coverage of application pods is also important. Ideally this should be adjustable so you can get the biggest bang for your buck if you are not running in production. Also, draining the pods when a termination is signaled (http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-interruptions.html#spot-instance-termination-notices) will make the disruption less visible to end users of services running in the cluster.
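For illustration, a minimal sketch of such a drain hook, assuming kubectl credentials are available on the node and `NODE_NAME` is set in the environment (both assumptions; kops does not ship this):

```bash
#!/bin/bash
# Minimal sketch: poll the EC2 spot termination-notice endpoint and drain
# the node when a notice appears. Assumes kubectl credentials on the node
# and NODE_NAME set by the environment; not an existing kops addon.
while true; do
  # The endpoint returns HTTP 200 with a timestamp roughly two minutes
  # before the instance is reclaimed, and 404 otherwise.
  if curl -sf http://169.254.169.254/latest/meta-data/spot/termination-time > /dev/null; then
    kubectl drain "${NODE_NAME}" --ignore-daemonsets --force
    break
  fi
  sleep 5
done
```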
As mentioned in kubernetes/kubernetes#24472, AWS now supports propagating tags on Spot Fleets!
Would this also allow supporting https://github.com/cristim/autospotting? Or would it just be a matter of that tool having to set the right tag on each instance it attaches to the autoscaling group?
So Spot Fleet actually works with Kubernetes, now that we have tagging support. Changing a Spot Fleet configuration, though, is different from how ASGs work, so we need another approach. I'm hoping we can look at this in the 1.9/1.10 timeframe, probably using the machines API.
Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale
/remove-lifecycle stale
@justinsb are there any updated docs on how to configure kops to do this?
@cdenneen this still needs to be implemented; we only support spot instances via the ASG API calls. /lifecycle frozen
I am marking this with the help-wanted label since it is a bit of a project: not a great place to start, but a great feature to add.
@justinsb I have two comments about #1784 (comment) ...
@chrislovecnm And on #1784 (comment) ...
I'd posted on Slack that I'm looking to contribute by tackling this, but based on my research I'm convinced that the solution is not trivial, a bit premature to tackle, and probably best addressed at the autoscaler level. And if AWS indeed connects EC2 Fleets with ASGs, then even the autoscaler won't need to reinvent the wheel. I guess the question is when! 😅 FYI, there are two issues on autoscaler about fleets: kubernetes/autoscaler#519 requesting support for Spot Fleets and kubernetes/autoscaler#838 requesting support for EC2 Fleets.
Is there any documentation on how to manually make your own instance groups? In EKS, I made my own spot fleet, but that is because the userdata for nodes joining an EKS cluster is pretty simple and I could easily change it up for my needs. The userdata for nodes launched with kops is a bit more involved... :/
@cjbottaro how did you make your own spot fleet for EKS? I'm interested in how you implemented this and the benefits you see in doing so. Thanks
> the benefits

We currently run in ECS and we have two ASGs (using launch configs) populating our cluster: one for on-demand instances and one for spot. More than a few times, we've seen spot prices spike and wipe out 3/4 of our cluster, causing an outage before we realize what's going on and up our on-demand instances to take over. Spot fleets (which require launch templates instead of launch configs) let you specify multiple machine types, so if prices spike for one type and it gets wiped out, the fleet will automatically fulfill capacity with other machine types that you can afford. Furthermore, you can tell the spot fleet to diversify among the machine types / availability zones, so there is less chance of mass extinction.

> How in EKS

I followed the tutorial to get an EKS cluster up and running, which includes creating a launch config and ASG for nodes. I simply converted the launch config to a launch template (by hand in the AWS web console) and copy/pasted the userdata which configures and starts kubelet, then made a spot fleet using that launch template. The user data for EKS nodes is pretty small and easy to understand. To add taints and labels, you can simply add something like this to the end of the user data:
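A sketch of the kind of snippet meant here, assuming the EKS AMI's /etc/eks/bootstrap.sh; the cluster name, label, and taint values are hypothetical:

```bash
# Hypothetical example: pass node labels and taints to kubelet through the
# EKS AMI's bootstrap script; the cluster name and values are placeholders.
/etc/eks/bootstrap.sh my-eks-cluster \
  --kubelet-extra-args '--node-labels=lifecycle=spot --register-with-taints=dedicated=spot:NoSchedule'
```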
Issue with kops: kops has clusters and instance groups as first-class objects. It would be nice if there was a command to output the user data for a given instance group:
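Something along these lines (a hypothetical subcommand; kops does not provide it today):

```bash
# Hypothetical subcommand: render the user data kops would generate for an
# instance group (no such flag exists in kops at the time of this thread).
kops get instancegroup nodes --output userdata > nodes-userdata.sh
```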
So we can create our own launch configs / launch templates. Thanks!
I think this also needs some proper k8s core integration - I've filed kubernetes/kubernetes#70342 about that.
Related: #6277
@gambol99, I checked the PR; it looks great.
@disha1104 ...
yep :-) ..
That's really great.
It's amazing, I was thinking of using it. When is it planned to be merged?
So, I don't cut the releases ... you'd need to speak to @justinsb
The launch-configuration -> launch-template migration is done automatically (though admittedly it leaves the LC hanging, so it just needs to be deleted manually post-update).
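For example, the orphaned LC can be removed afterwards with the AWS CLI (the name below is illustrative):

```bash
# Clean up the launch configuration left behind after the migration;
# the launch configuration name is illustrative.
aws autoscaling delete-launch-configuration \
  --launch-configuration-name nodes.example.k8s.local-20190101120000
```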
This doesn't add a spot termination handler addon, as I'd rather let users choose how they want it done.
@justinsb which release is this planned for?
#6277 is merged; I think we can close this one 🎉
@wanghanlin: You can't close an active issue/PR unless you authored it or you are a collaborator.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale
Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle rotten
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /close
@fejta-bot: Closing this issue.
Spot fleets allow you to use a wide range of instance types to provide a certain amount of resources by requesting spot instances. This is great for providing capacity for workloads that require a large amount of resources, don't require real-time interaction, and are able to recover if a node disappears.
kubernetes/kubernetes#24472
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-fleet.html
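For reference, a minimal sketch of a diversified Spot Fleet request using the AWS CLI; the role ARN, AMI, subnets, and target capacity are all placeholders:

```bash
# Minimal diversified Spot Fleet request; every identifier is a placeholder.
cat > spot-fleet-config.json <<'EOF'
{
  "IamFleetRole": "arn:aws:iam::123456789012:role/aws-ec2-spot-fleet-role",
  "AllocationStrategy": "diversified",
  "TargetCapacity": 4,
  "LaunchSpecifications": [
    {"ImageId": "ami-0123456789abcdef0", "InstanceType": "m4.large", "SubnetId": "subnet-aaaa1111"},
    {"ImageId": "ami-0123456789abcdef0", "InstanceType": "c4.large", "SubnetId": "subnet-bbbb2222"}
  ]
}
EOF

# Submit the request; capacity is fulfilled across the listed types/subnets.
aws ec2 request-spot-fleet --spot-fleet-request-config file://spot-fleet-config.json
```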