-
Notifications
You must be signed in to change notification settings - Fork 40k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor pkg/health into more reusable pkg/probe #3695
Conversation
I get the Check/Probe rename. I am less sure about the rest. I feel like this is forcing commonality on things that aren't really the same. I'd rather see probes be more decoupled from kubernetes-concepts like pod. The RunInContainer could be passed in by kubelet as an implementation of util/exec.Interface or flip it around and define and receive an interface that exec implements. That wall all the pod and docker and whatever logic stays firmly in kubelet, and probes stay simple. Then I wonder if this Context type is really worthwhile for just two instances, as opposed to just making two calls to something like RegisterLivenessProbes(NewHTTPProbe(), NewTCPProbe(), NewExecProbe()) and RegisterReadinessProbes(...). This has the side effect that the two sets of allowed probes are not necessarily the same list. I'll chew on this a bit more, but thoughts? |
I agree with the need to decouple checks from kubernetes concepts. A "ProbeRunner" should not assume that the probe will always be accessed through a specific location in an api object, which is currently preventing the HealthCheck from being used in contexts other than Liveness. What I would like to see is a ProbeRunner interface that accepts a HealthyProbe (like a LivenessProbe but more generic) and some struct, possibly as per this TODO, that contains the information/args required to do the check. The current implementation of pkg/health is very tightly bound to preforming HealthChecks on a single container. I wasn't sure if that was the ultimate intention but I erred on leaving that intact as much as possible while still refactoring pkg/health to be useful for contexts other than Liveness. I also think that Context is a terrible and overloaded word but I'm struggling to find a word that fits well. I don't think that the type is necessary. I'm still trying to figure out how best to accomplish this. |
I am on-call today, so I won't have much time to think about this very On Thu, Jan 22, 2015 at 12:27 PM, Mike Danese notifications@github.com
|
No worries, take your time. I've tried factoring the api objects out of the pkg/health/health.go interfaces where the client passes in a Probe and a "context" (unrelated to my previous use of the word context) required to execute the Probe. This has the benefit of decoupling api types from the interfaces, isolates most of the mess in the PodeRunner implementations, and would allow a client to pass in HealthyProbes in settings other than Liveness. |
Ooh, I'd love to have readiness checks. I'll take a look. FYI, something similar also applies to nodes, though in that case we've focused on reporting the "condition" rather than specifying how it is determined: |
It does seem similar although NodeConditionKind is not mutually exclusive. ProbeRunner could probably be adapted to work for both uses, but I wouldn't want to shoehorn it. I'll think about whether it's possible to reconcile the two. It would be nice if Node checks used pkg/health. |
I spent a little time looking at the existing health check code and I feel What if we backed down on the plugin-style here? Nobody but kubelet uses pkg/health EXCEPT for client/kubelet and a few For example, the pkg/probe/exec.Probe type would have a method Run(exec pkg/probe would NOT treat probes as opaque plugins. To do that requires a
|
I agree that pkg/health should be rethought. Readiness is something I'll be working towards over the next couple weeks but it's low enough priority that I wouldn't put off doing any cleanup I can along the way. I'll think about your suggestions and work on a first pass at this today. |
3cdbb81
to
4402d46
Compare
@thockin I've taken a first pass at what I think a pkg/probe could look like and I've tried to address your concerns. PTAL and let me know if I'm on track. I can add tests and move things over that depend on pkg/health if you prefer this. |
b9b0841
to
52fd703
Compare
"github.com/golang/glog" | ||
) | ||
|
||
func (kl *Kubelet) makeLivenessProbeRunner(podFullName string, podUID types.UID, status api.PodStatus, container api.Container) probe.ProbeRunner { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd just pass in LivenessProbe and any other information from the container needed (e.g., name). That would make it easier to reuse this for readiness probes.
We should change the name of type LivenessProbe to Probe.
bc3147a
to
22f4207
Compare
@bgrant0607 rather than renaming LivenessProbe, we could break the Actions into a Probe type and maintain {Context}Probes that hold a Probe and context specific settings. e.g. type Probe struct {
HTTPGet *HTTPGetAction `json:"httpGet,omitempty"`
TCPSocket *TCPSocketAction `json:"tcpSocket,omitempty"`
Exec *ExecAction `json:"exec,omitempty"`
}
type LivenessProbe struct {
Probe `json:",inline"`
InitialDelaySeconds int64 `json:"initialDelaySeconds,omitempty"`
}
type ReadinessProbe struct {
Probe `json:",inline"`
TimeoutSeconds int64 `json:"timeoutSeconds,omitempty"`
FrequencySeconds int64 `json:"frequencySeconds,omitempty"`
UnhealthyThreshold int64 `json:"unhealthyThreshold,omitempty"`
HealthyThreshold int64 `json:"healthyThreshold,omitempty"`
} I'm not sure if this is over thinking it as most settings could make sense for most contexts, but I can see a few benefits to this approach. We could add configs to ContextProbes as they become implemented (i.e. we wouldn't have config fields that do things in some spots and are unimplemented in others). We could also have Context specific settings. We could also rename LivenessProbe to Probe and break out ContextProbes later on if needed. Either way, I'm happy to work on these api changes. I will probably do this in a separate PR as this one is already looking large. |
@thockin code and tests are moved over from pkg/health. Sorry about the size. I tried to break it up logically by commit. PTAL. |
if err != nil { | ||
return probe.Unknown, err | ||
} | ||
if strings.ToLower(string(data)) != defaultHealthyOutput { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we just be checking for a zero exit status?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My only concern with this is that we would be changing the behavior of the LivenessProbe api (although it's currently not documented). Some ExecActions that used to return Unhealthy would return Healthy, and some that used to return Unknown would return Unhealthy.
This seems reasonable but I wouldn't want to burry a change that potentially breaks the api. I'll make the change if this is not a concern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also for an Exec probe, there would be no way to differentiate a Failure on the part of the probe or on the part of the thing being probed. we'd just assume that it was a Failure on the thing being probed i.e. no Unknowns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well considered points. I am not sure I like it, but you;re right that this would be a change in semantic. Leave it as is. I'll file an issue to discuss
@bgrant0607 hadn't seen that TODO. I'll be playing around with this in #3818. |
Status on this? Is it ready for another final review and commit? |
Yes, thanks. I've addressed 2/3 of your comments. If you want me to address the third (0 status code check only for exec probe) I can but I've replied why I was hesitant. Otherwise this is ready for review. |
LGTM. I'm going to run e2e on this before commit. |
e2e passes with the exception of PD, which has been having trouble. @satnam6502 @brendandburns I am making a call to commit this - I don't think PD is related. |
refactor pkg/health into more reusable pkg/probe
This refactor rocks! @mikedanese FYI, I expect controller manager to also use this pkg for node health check. |
This is a proposal to add a concept of Context to the HealthCheck. The Context represents the setting in which a containers health is being queried. Right now the only context used is Liveness, but it's intended that Readiness will follow shortly. The Context is used to access the correct LivenessProbe from an api.Container.
To accommodate this change, some of the HealthCheck impls have been migrated to a separate interface called ProbeRunner so they are no longer responsible for accessing the LivenessProbe directly from the api.Container.
The ultimate goal of this work is to create a mechanism for supporting readiness checks as per #620. The next steps towards this goal might be:
I'll be working on the api changes next. I can either WIP this PR and append the commits or I can do that work in separate PRs. I am submitting so we can discuss whether this path is on track, or whether this is being worked on somewhere else.