Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Service objects should be annotated by cloud-providers to track underlying resources. #70159

Open
aoxn opened this issue Oct 24, 2018 · 28 comments
Assignees
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. sig/network Categorizes an issue or PR as relevant to SIG Network.

Comments

@aoxn
Copy link
Contributor

aoxn commented Oct 24, 2018

/kind feature

What we want to do

We need to add an extra field Service.Status.LoadBalancer.Ingress[0].ProviderID in Service object to identify the CloudProvider SLB id , like what we did in Node object.
This would benefit cloudprovider for supporting rename SLB.

Reason

Cloud Provider createed a LoadBalancer for service with Type equals LoadBalancer. However, Service object does not provide a way to record cloudprovider`s SLB Id . So, each cloudprovider create the loadbalancer with a name generated from service object use the code below which is meaningless string and also NOT UNIQ for the cloud service. People can change the name from cloud console which result in a failure in kubernetes cloudprovider.(CloudProvider would failed to find the SLB again after the name change)

We found that if we could add an extra field named ProviderID in the Service.Status.LoadBalancer field then we would allow user to rename their LB name freely.

Like what we did in Node Spec.

Hope to hear feedback from the community.

CloudProvider auto generate an SLB name with the following code .
// TODO(#6812): Use a shorter name that's less likely to be longer than cloud // providers' name length limits. func GetLoadBalancerName(service *v1.Service) string { //GCE requires that the name of a load balancer starts with a lower case letter. ret := "a" + string(service.UID) ret = strings.Replace(ret, "-", "", -1) //AWS requires that the name of a load balancer is shorter than 32 bytes. if len(ret) > 32 { ret = ret[:32] } return ret }
Which is meaningless string.

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Oct 24, 2018
@aoxn
Copy link
Contributor Author

aoxn commented Oct 24, 2018

/cc @andrewsykim /sig api /sig cloud-provider

@andrewsykim
Copy link
Member

/sig network cloud-provider apimachinery

@k8s-ci-robot k8s-ci-robot added sig/network Categorizes an issue or PR as relevant to SIG Network. sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Oct 24, 2018
@andrewsykim
Copy link
Member

/assign @thockin

@thockin thockin assigned bowei and thockin and unassigned thockin Oct 26, 2018
@thockin
Copy link
Member

thockin commented Oct 26, 2018

@etune for viz

I don't know that adding a field is the right thing here. Some implementations will have a single "backend" resource and others will have multiple.

I've been suggesting that folks consider provider-specific annotations to carry this data. Would that be enough for you?

@andrewsykim
Copy link
Member

andrewsykim commented Oct 26, 2018

Did you mean to ping @erictune?

I don't know that adding a field is the right thing here. Some implementations will have a single "backend" resource and others will have multiple.

Can you clarify this a little bit? Do you mean that some providers can have multiple types of LBs so a single provider ID format is not sufficient to cover those types or did you mean there are implementations that create multiple LBs so more than 1 provider ID is associated with a Service?

To add more to the issue description, this is in relation to #69293. Here we're talking about setting a (unique) provider ID for the LB associated with a Service with type LoadBalancer (equivalent to node.Spec.ProviderID for nodes). By doing this we don't need to rely on cryptic loadbalancer names or worry about naming conflicts on the cloud provider. I'm not aware of any cloud provider that does not have unique IDs attached to a LB resource.

I think ideally we want this pattern for all providers, but not sure if requiring it from all providers warrants an addition to the core API. Maybe an annotation that is required to be set by all cloud providers is sufficient. Even if we went with annotations we would still need to ensure it is backwards compatible similar to v1 API going forward. On the other hand, adding a field lets us add our own validations from apiserver which like node.Spec.ProviderID is nice because we can pre-define some of the behaviours around LBs that we might want. What do you think?

@thockin
Copy link
Member

thockin commented Oct 26, 2018

yes, I meant @erictune because he was doing some thinking in this area

What I meant is that there's no requirement than an external load-balancer be represented as a single resource. E.g. in GCP it's actually 3 resources (FwdRule, HealthCheck, TargetPool). An Ingress is more like 5 resources. Which backend thing would this carry, and what about the rest? Comparatively, I can define as many gcp-specific annotations as I need and nobody needs to care except people who WANT to dig into backends, in which case they need it all.

I think it's a good idea to make this linkage more explicit. I just don't know that a new field is the right model.

Maybe an annotation that is required to be set by all cloud providers is sufficient. Even if we went with annotations we would still need to ensure it is backwards compatible similar to v1 API going forward

What are you going to do with this information generically? What validation can you actually do? Better LB names is a great thing (I have wanted it for years) but that logic exists within provider-specific-space only, so why try to generalize it?

@andrewsykim
Copy link
Member

andrewsykim commented Oct 26, 2018

What I meant is that there's no requirement than an external load-balancer be represented as a single resource. E.g. in GCP it's actually 3 resources (FwdRule, HealthCheck, TargetPool). An Ingress is more like 5 resources. Which backend thing would this carry, and what about the rest? Comparatively, I can define as many gcp-specific annotations as I need and nobody needs to care except people who WANT to dig into backends, in which case they need it all.

This is not something I have considered yet 🤔 which definitely changes my perspective on this problem.

What are you going to do with this information generically? What validation can you actually do? Better LB names is a great thing (I have wanted it for years) but that logic exists within provider-specific-space only, so why try to generalize it?

This makes sense. I'm 👍 for annotations given there are some consistent best practices shared amongst providers. Will wait for @erictune and @aoxn to comment.

@aoxn
Copy link
Contributor Author

aoxn commented Oct 30, 2018

Annotation seems OK.

For annotation, Since CloudProvider implementation(aws, alibaba etc.) does not allowed to modify service object directly, this would require CloudProvider framework doing the provider id annotation writing back on service object.

And method signature might need to modified to contain the required providerid info. Something like below.
`
func (c *Cloud) EnsureLoadBalancer(clusterName string, service *v1.Service, nodes []*v1.Node) (*v1.LoadBalancerStatus, providerid, error) {

`

@andrewsykim @thockin

@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 28, 2019
@fejta-bot
Copy link

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 27, 2019
@thockin thockin added the triage/unresolved Indicates an issue that can not or will not be resolved. label Mar 8, 2019
@thockin
Copy link
Member

thockin commented Apr 4, 2019

/lifecycle frozen
/remove-lifecycle stale
/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. labels Apr 4, 2019
@thockin thockin removed the triage/unresolved Indicates an issue that can not or will not be resolved. label Apr 4, 2019
@thockin thockin changed the title Propose an extra field Service.Status.LoadBalancer.Ingress[0].ProviderID in Service object. Service objects should be annotated by cloud-providers to track underlying resources. Apr 4, 2019
@thockin
Copy link
Member

thockin commented Apr 4, 2019

I have re-titled this. The question is whether this should just be a documented "best practice" or whether we should actually go through all of the existing providers and try to make this happen? I'd love to see it done.

@andrewsykim
Copy link
Member

Thanks Tim, added this to the SIG CP backlog kubernetes/cloud-provider#27

@bowei
Copy link
Member

bowei commented May 3, 2019

cc: @MrHohn may be relevant to garbage collection issues

@MrHohn
Copy link
Member

MrHohn commented May 3, 2019

Thanks for adding me --- this could be more useful when pairing with service finalizer (kubernetes/enhancements#980).

@leilajal
Copy link
Contributor

@andrewsykim what do we need here?

@cheftako
Copy link
Member

cheftako commented Feb 3, 2021

@bowei Where are we with this?

@bowei
Copy link
Member

bowei commented Feb 4, 2021

I think someone has to drive a KEP here to add the field and work through the details.

@jpbetz
Copy link
Contributor

jpbetz commented Feb 4, 2021

I think someone has to drive a KEP here to add the field and work through the details.

My interpretation of the comment thread is that instead of adding a field, we should define best practices for how to use annotations to track underlying resources (#70159 (comment)) and then go through the existing cloud providers and get them transitioned over to using the best practices (#70159 (comment)).

@sftim
Copy link
Contributor

sftim commented Dec 21, 2022

This is more specific than #9106, but largely duplicates it.

@thockin
Copy link
Member

thockin commented Dec 21, 2022

Now that we have CRDs we should probably encourage a pattern like:

  • Create service
  • Create $cloudLB CR(s) with finalizers
  • annotate Service with info about CRs
  • cloudLB controller actuates

@thockin
Copy link
Member

thockin commented Dec 21, 2022

This is also probably something to implement in external cloud providers.

@sftim
Copy link
Contributor

sftim commented Dec 22, 2022

What might be on us is to document the approach with finalizers, and to promote that.

@shaneutt
Copy link
Member

shaneutt commented Mar 22, 2024

This one has gotten quite old, bringing this one up for our next meeting so we can review it.

@sftim
Copy link
Contributor

sftim commented Mar 22, 2024

Potentially worth a spike-level related discussion: what is Kubernetes' common architectural pattern for a cloud provider integration to be able to store unstructured / provider-dependent status data?

Reason I mention it is that Karpenter, a cluster autoscaler, has faced a similar question recently.

@thockin
Copy link
Member

thockin commented Mar 25, 2024

We don't have a pattern.

Options:

  1. Store a stringly-typed annotation
  2. Store a same-named CR
  3. Store a CR and name it in an annotation

Option 2 is the one I see most. We do not have a mechanism for managing 2 resources toghether, nor a standard way to denote "linked" objects.

@aojea
Copy link
Member

aojea commented Mar 26, 2024

@aojea
Copy link
Member

aojea commented Mar 28, 2024

/assign @robscott

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. sig/network Categorizes an issue or PR as relevant to SIG Network.
Projects
None yet
Development

No branches or pull requests