-
Notifications
You must be signed in to change notification settings - Fork 4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extend cluster autoscaler to check AUTOSCALER_ENV_VARS in kube-env #710
Conversation
/assign @mwielgus @roberthbailey |
/kind bug |
/assign @bskiba |
Looks like wee need to handle possible empty values in parseKeyValueListToMap, sth along the lines of: bskiba@a15c29d Otherwise this looks good to me and I have successfully tested it on GCE together with kubernetes/kubernetes#61119 |
Do you have a test case that will trigger this? (It sounds like something that would come from a leading or trailing comma, which we might want to just disallow)
Great! |
I tested this on GCE with the new variables and when node-taints where empty (as is in the default case), this errored on me from here: https://github.com/mtaufen/autoscaler/blob/037467fad2c566d901f04a3f78024aff0a55aa54/cluster-autoscaler/cloudprovider/gce/templates.go#L452 I can write a unit-test for this case tomorrow. |
Here's a unit test to make it break: bskiba@faa2151 (tests for empty node taints). With bskiba/autoscaler@a15c29d this passes. Could you incorporate the fix? |
037467f
to
9020b95
Compare
Updated, PTAL :) |
if err != nil { | ||
return nil, err | ||
} | ||
return parseKeyValueListToMap(labels) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: you could remove this line and just exit the if block and let the line below return the map to reduce code duplication. up to you whether you think that's more confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
if err != nil { | ||
return nil, err | ||
} | ||
taintMap, err := parseKeyValueListToMap(taints) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above, but in this case there is more duplicated code. Once you get the taints extracted, you can leave the if block and parse them into a map in one place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
} | ||
resourcesRegexp := regexp.MustCompile(`--kube-reserved=([^ ]+)`) | ||
return kubeReserved[0], nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need a check here to verify that kubeReserved will have a zero-th entry and this won't cause a panic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extractAutoscalerVarFromKubeEnv
already does this:
if len(result) == 0 {
return nil, fmt.Errorf("var %s not found in %s: %v", name, autoscalerVars, autoscalerVals)
}
so once we check the error, we know we at least have a 0th element.
if err != nil { | ||
return nil, err | ||
} | ||
result := []string{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: i personally prefer to write this as var result []string
and then let the append calls do the allocation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
return nil, err | ||
} | ||
result := []string{} | ||
for _, val := range autoscalerVals { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we always expect a single val right? so this outer loop is expected to either execute once or not at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One would hope. But since this file seems to expect the possibility of duplicates in kube-env, I thought it best to handle that here too.
} | ||
result := []string{} | ||
for _, val := range autoscalerVals { | ||
for _, v := range strings.Split(val, ";") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can labels or taints contain a semi-colon? If so, we can't use it as the separator char.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so:
labels: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#syntax-and-character-set
Labels are key/value pairs. Valid label keys have two segments: an optional prefix and name, separated by a slash (/). The name segment is required and must be 63 characters or less, beginning and ending with an alphanumeric character ([a-z0-9A-Z]) with dashes (-), underscores (), dots (.), and alphanumerics between. The prefix is optional. If specified, the prefix must be a DNS subdomain: a series of DNS labels separated by dots (.), not longer than 253 characters in total, followed by a slash (/). If the prefix is omitted, the label Key is presumed to be private to the user. Automated system components (e.g. kube-scheduler, kube-controller-manager, kube-apiserver, kubectl, or other third-party automation) which add labels to end-user objects must specify a prefix. The kubernetes.io/ prefix is reserved for Kubernetes core components.
Valid label values must be 63 characters or less and must be empty or begin and end with an alphanumeric character ([a-z0-9A-Z]) with dashes (-), underscores (), dots (.), and alphanumerics between.
Taints use the same validation as labels for keys and values.
The effect part of the taint args to Kubelet (after the :
) is split out from the value part here when Kubelet parses the flag: https://github.com/kubernetes/kubernetes/blob/ca02c11887c65431af85f809606527be7f08062d/pkg/util/taints/taints.go#L46
@@ -297,26 +297,41 @@ func TestBuildAllocatableFromCapacity(t *testing.T) { | |||
} | |||
|
|||
func TestExtractLabelsFromKubeEnv(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are there test cases to check for expected errors during parsing (e.g. malformed vars, missing vars, etc)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added some additional cases for expected errors.
// In v1.10+, kube-reserved is only exposed for the autoscaler via AUTOSCALER_ENV_VARS | ||
// see kubernetes/kubernetes#61119. We try AUTOSCALER_ENV_VARS first, then | ||
// fall back to the old way. | ||
kubeReserved, err := extractAutoscalerVarFromKubeEnv(kubeEnv, "kube_reserved") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In kubernetes/kubernetes#61119 you have a comment that kube-reserved will need to be pulled from KUBELET_ARGS but this code only checks AUTOSCALER_ARGS with a fallback to KUBELET_TEST_ARGS so it won't find the parameter for 1.10 nodes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For GKE we will have kube-reserved in AUTOSCALER_ARGS.
On plain GCE kube-reserved is not added by default to KUBE_ENV so unless the user explicitly adds it to MIG template, we don't have it anyway (this is the current state, not a regression from this PR). So this might just be a question of removing the comment about kube-reserved from kubernetes/kubernetes#61119
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Users can just edit their MIG templates to include it in AUTOSCALER_ENV_VARS on GCE if they need it. We can probably just remove the comment. I'd recommend we have a release note in the next autoscaler version to that effect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, I can take care of adding to release notes.
Tested this once again after fixes from @mtaufen on GCE and GKE and still works good. Please fix comments from @roberthbailey and I'll be happy to merge. |
9020b95
to
59270dd
Compare
/retest |
Looks like we don't have github.com/stretchr/testify/require vendored |
/sigh |
This is the second part of the fix in kubernetes/kubernetes#61119 This provides a temporary way for the cluster autoscaler to get at values that were removed from kube-env in kubernetes/kubernetes#60020. Ideally this information will eventually be available via e.g. the Cluster API, because kube-env is an internal interface that carries no stability guarantees.
59270dd
to
ee890f5
Compare
/lgtm |
kubernetes/kubernetes#61119 merged, so this is all that's left |
…ge_jobs [preemption] Add LowerOrNewerEqualPriority preemption
This is the second part of the fix in kubernetes/kubernetes#61119
This provides a temporary way for the cluster autoscaler to get at
values that were removed from kube-env in kubernetes/kubernetes#60020.
Ideally this information will eventually be available via e.g. the Cluster API,
because kube-env is an internal interface that carries no stability
guarantees.
/cc @roberthbailey @bskiba @mwielgus