Dynamically Changing Pod Labels? #11954

Closed
MarkRx opened this issue Nov 17, 2016 · 7 comments

@MarkRx

MarkRx commented Nov 17, 2016

I am attempting to do blue-green (stage/active) deployments in OpenShift. The following topology is used:
stage route -> stage service -> dc X
active route -> active service -> dc X (when promoting).

When promoting, the existing active service is pointed at a new deployment that was previously staged. Staging allows for verification testing before live traffic is routed to the new deployment config.

I would like to dynamically change the labels on all existing (and new) pods after a promotion, from status: stage to status: active. This is needed both for service selectors and for metadata reflection, so the application knows what state it is in. I have been able to change the labels on the running pods. If, however, I change the label on the dc / replication controller, new pods continue to use the old label.

Note that I do not want to use the config change trigger (I have triggers set to an empty list), as it spins up new pods (because a new deployment is created). I want to continue using the old pods that have been verified, and thus to continue using the existing deployment object.

I have found that the replication controller has an embedded field "openshift.io/encoded-deployment-config" that appears to be the dc it came from, but even if I change the status field there, new pods continue to use the old label.

Attempt 1:

The deployment config and replication controller can be modified along with all active pods. This works but doesn't seem robust or safe.
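
Roughly, this sequence looks like the following (object names here are illustrative):

  # update the pod template on the dc so future deployments carry the new label
  oc patch dc/helloworld-dc -p '{"spec":{"template":{"metadata":{"labels":{"status":"active"}}}}}'

  # update the pod template on the current rc so pods it recreates carry the new label
  oc patch rc/helloworld-dc-1 -p '{"spec":{"template":{"metadata":{"labels":{"status":"active"}}}}}'

  # relabel the pods that are already running
  oc label pods -l deploymentconfig=helloworld-dc status=active --overwrite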

Attempt 2:

Instead of keeping the same dc across stage/active, I have tried having two dc's (dc-stage and dc-active). I am able to make the dc use "status" as part of its selector to choose which pods to pull in. The idea was to transfer pods between dc / rc owners by changing the "status" label on the running pods. This falls short, though, because something automatically adds "deployment" and "deploymentconfig" to the rc selector, causing it to only consider pods created by that rc:

  selector:
    deployment: helloworld-dc-active-1
    deploymentconfig: helloworld-dc-active
    status: active
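
The transfer itself would just be a relabel of the running pods, something like this (names again illustrative):

  # hand the verified pods from the stage side over to the active side
  oc label pods -l deploymentconfig=helloworld-dc-stage status=active --overwrite

The active rc still will not adopt them, though, because its generated selector also requires deployment=helloworld-dc-active-1 and deploymentconfig=helloworld-dc-active.
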
Current Result

It is not possible (or I have not found a way) to change labels on an existing deployment so that new pods from the same deployment pick up the new labels.

Expected Result

I am able to dynamically change labels on both the running pods and the replication controller while continuing to use the same pods (instead of tearing down the old ones and starting new ones).

Version

vagrant@vagrant-ubuntu-trusty-64:~$ oc version
oc v3.2.1.7
kubernetes v1.2.0-36-g4a3f9c5

@0xmichalis
Contributor

I've proposed a solution to this upstream: kubernetes/kubernetes#36897

@smarterclayton
Contributor

I think having two dc's and two services is the natural way to solve this. I think it's also the right mechanism for many isolation cases.

@MarkRx
Author

MarkRx commented Nov 18, 2016

Multiple dc's are used in our case when staging. We are switching at the service level instead of the route level for reasons explained to me by the networking and infrastructure teams (something to do with load balancing, wide IPs, and HAProxy).

Before:

stg rte -> stg svc -> null
act rte -> act svc -> dc X

Stage Y:

stg rte -> stg svc -> dc Y
act rte -> act svc -> dc X

Promote Y (delete dc X):

stg rte -> stg svc -> null
act rte -> act svc -> dc Y
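
The promote step itself is a single selector change on the active service, roughly like this (object names illustrative, assuming the service selects pods only by deploymentconfig):

  # point the active service at dc Y's pods instead of dc X's
  oc patch svc/act-svc -p '{"spec":{"selector":{"deploymentconfig":"dc-y"}}}'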

I understand another approach would be to use blue/green services and dc's that are agnostic of whether they are production items or not. It would look like this:

Before

stg rte -> blue svc -> null
act rte -> green svc -> dc green

Stage

stg rte -> blue svc -> dc blue
act rte -> green svc -> dc green

Promote

stg rte -> null
act rte -> blue svc -> dc blue

We ended up not taking this approach because of the potential confusion, from an operator's standpoint, about what the state of the system is. In particular, an OpenShift project could contain any number of components, some blue and some green. The additional stage/active abstraction makes it clear whether a pod is serving production traffic. What's more, labels could then be used by selectors to easily pick which pods to route to based on whether they carry a "stage" or "active" label. At least, they could if the labels could be dynamically updated.

@smarterclayton
Contributor

Generally, when people want to do blue-green, they're doing it so that the fewest possible moving pieces change between stage and production. Part of my concern with relabelling (to have one deployment take over another's pods) is that it involves a lot of individual moving-part actions (if one of those moves fails, you might break both production and staging). I think there are a lot of advantages to focusing on things that can be done as a single atomic operation in order to do swaps. Today that would only be the service label selector or the route backend selector. Even if we added relabelling, the potential for failure at any step goes up and rollback becomes much harder, whereas changing a service label selector or route backend selector back is pretty easy.
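
For example, either of those swaps is a single update that is just as easy to revert (object names here are illustrative):

  # swap which pods the service selects
  oc patch svc/myapp -p '{"spec":{"selector":{"deploymentconfig":"myapp-blue"}}}'

  # or swap which service the route sends traffic to
  oc patch route/myapp -p '{"spec":{"to":{"name":"myapp-blue"}}}'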

If I had to make a tradeoff personally, I think I'd be willing to deal with some potential confusion about which deployment is "live" if it meant that my rollout / rollback is a single atomic operation.

A canary approach has been proposed which looks like the active / stg concept you described:

  1. two deployments, active and next
  2. service pointing to active
  3. new deployments always happen on next first
  4. once stage is validated, switch service to next
  5. run for a while - once happy, redeploy active (which is hidden) to the same version as next
  6. switch service back to active
  7. scale next to zero.

That does mean "next" occasionally takes the traffic, but it has the advantage of always letting you know what the next state is (if next > 0 and the service points to it, your next step is to update active, etc.).
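
A sketch of the swap steps with the CLI (dc and service names are illustrative):

  # 4. once next is validated, switch the service to it
  oc patch svc/myapp -p '{"spec":{"selector":{"deploymentconfig":"myapp-next"}}}'

  # 5. redeploy active (now hidden) to the same version as next
  oc rollout latest dc/myapp-active    # `oc deploy --latest dc/myapp-active` on older clients

  # 6. switch the service back to active
  oc patch svc/myapp -p '{"spec":{"selector":{"deploymentconfig":"myapp-active"}}}'

  # 7. scale next down to zero
  oc scale dc/myapp-next --replicas=0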

@MarkRx
Author

MarkRx commented Nov 23, 2016

Having fewer moving parts does make sense.

In that canary approach, the pods that get deployed to active are never validated, since they are new; the pods from next are not reused. While I am of the opinion that this doesn't matter (it's the pod template that should be validated, not the actual pods themselves), some colleagues of mine insist that the pods that are validated be the same ones that are promoted and used in active. That's not hard to do (keep track of which dc's are being promoted), but it becomes an issue when they also want to know the state of a pod from its metadata (so that services always point to the same labels, and for internal business logic).

I get the impression that those two asks, put together, don't fit the platform model. It seems the platform is currently designed for RC's to be immutable (aside from replica counts) and to be recreated when changes are needed. Hence changing label information on RC's / pods doesn't fit, and neither does reusing pods across RC's. Instead, pods should all be the same, so in theory spinning them down/up in rolling deployments results in no net change. My colleagues insist that it could cause problems (such as database connections going astray), but I argue it's no different from pods becoming unhealthy and being restarted.

In any case, the question for this topic remains: are labels on pods meant to be immutable, or can they be modified? If they can be modified, does it make sense to update the RC as well to keep them in sync?

@smarterclayton
Contributor

Because we don't provide atomicity across RC and Pods, it's likely to remain fairly fragile to change labels while the controller is actively managing them. Pods are mostly immutable today, will likely stay that way for a while, and there are lots of good reasons to think about them that way. I don't think it's wrong to look at patterns that take advantage of the limited mutability we allow, but we're unlikely to be able to provide much support for that in the near term unless it's truly critical to enabling specific use cases.

Certainly the two-deployment "swap" model is likely to work much more cleanly in the near term, under the constraint of pod reuse, for the next 6 months to a year.


@0xmichalis
Contributor

Closing this in favor of kubernetes/kubernetes#36897
