Feature/validating admission policy/metrics improvement #126124

Conversation

cici37
Contributor

@cici37 cici37 commented Jul 16, 2024

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

Improve the VAP metrics by wrapping the error clearly.

This PR merges the changes from #124330.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Current PR includes commits from #126123. Could be reviewed after #126123 is merged. Thanks.

Does this PR introduce a user-facing change?

The ValidatingAdmissionPolicy metrics have been redone to count and time all validations, including failures and admissions.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. area/apiserver sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jul 16, 2024
@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 16, 2024
@cici37
Contributor Author

cici37 commented Jul 16, 2024

/retest

@cici37
Contributor Author

cici37 commented Jul 16, 2024

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jul 16, 2024
@cici37
Contributor Author

cici37 commented Jul 16, 2024

/assign @jpbetz Thanks :)

@jpbetz
Contributor

jpbetz commented Jul 17, 2024

@logicalhan Could we get a SIG Instrumentation review for this one?

@jpbetz
Contributor

jpbetz commented Jul 17, 2024

cc @richabanker

@jpbetz
Contributor

jpbetz commented Jul 17, 2024

For API Machinery aspects: This looks like a useful improvement.

For SIG Instrumentation: Is this change acceptable for the stability level of this metric?

@richabanker
Contributor

Since the metrics being changed are currently ALPHA, I believe it should be okay to modify them (I see that the labels are being updated here). Had the metrics been beta or stable, modifying the labels would have been an issue.

@cici37
Contributor Author

cici37 commented Jul 17, 2024

Hi @richabanker, thanks for the comment. I do want to promote the stability level of the existing metrics from alpha to beta in a follow-up PR. Are there any requirements for metrics promotion, such as a required soak period or a related feature status? Thank you!

@richabanker
Contributor

I am not aware of any documented set of requirements for graduating a metric from alpha -> beta -> stable, but maybe @logicalhan has more information on this. I could only find this section on how to propose updating the stability level of a metric.

@cici37
Contributor Author

cici37 commented Jul 17, 2024

/sig instrumentation

@k8s-ci-robot k8s-ci-robot added the sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. label Jul 17, 2024
@cici37
Contributor Author

cici37 commented Jul 19, 2024

Hi @richabanker, would you have time to review this pr? Thanks

Contributor

@richabanker richabanker left a comment

/lgtm
with some nits.

The metric changes look good, especially since I don't see a possibility of high cardinality after the changes to the metric labels (the new label is bounded).

var ErrCompilation = fmt.Errorf("%w: compilation error", ErrInvalid)

// ErrOutOfBudget is the basic error that occurs when the expression fails due to
// exceeding budget.
Contributor

Is it worth expanding on what "budget" means here? Apologies if this is something tied to CEL that I don't know.

Contributor Author

Thank you so much for the review!
Yes, it is a CEL thing, and we use it in the returned error message as well :)


const (
// ValidationCompileError indicates that the expression fails to compile.
ValidationCompileError ValidationErrorType = "compile_error"
Contributor

Does it make sense to move these also to staging/src/k8s.io/apiserver/pkg/cel/errors.go ?

Contributor Author

I added it here since it is only used by metrics. I could move it into metrics/errors.go in my next metrics PR if that makes sense :)

}

 	_, err = cel.AstToCheckedExpr(ast)
 	if err != nil {
 		// should be impossible since env.Compile returned no issues
-		return resultError("unexpected compilation error: "+err.Error(), apiservercel.ErrorTypeInternal)
+		return resultError("unexpected compilation error: "+err.Error(), apiservercel.ErrorTypeInternal, nil)
Contributor

@richabanker richabanker Jul 19, 2024

Since we are passing nil here, we should ensure that its usages do nil checks. I couldn't find any instance in this PR, but it stood out to me. Maybe replacing nil here with an "apiservercel.UnknownError" or something similar would be better? I'll leave it to you.

Contributor Author

This is code we never expect to hit :) So we don't have a specific error type for it :)
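
On the nil question, a small sketch of the trade-off, assuming the extra argument ends up stored as an `error` field: `errors.Is` tolerates a nil error and simply reports false, whereas calling a method on it directly would panic. The `result` type and both helpers are hypothetical and are not the PR's actual `resultError` implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrCompilation is a hypothetical sentinel used only for this sketch.
var ErrCompilation = errors.New("compilation error")

// result is a hypothetical stand-in for a compilation result carrying an optional cause.
type result struct {
	msg   string
	cause error // nil when no specific sentinel applies
}

// isCompileFailure is safe for a nil cause: errors.Is(nil, target) is false
// for any non-nil target.
func isCompileFailure(r result) bool {
	return errors.Is(r.cause, ErrCompilation)
}

// describe guards the nil case explicitly, because r.cause.Error() would panic on nil.
func describe(r result) string {
	if r.cause == nil {
		return r.msg
	}
	return r.msg + ": " + r.cause.Error()
}

func main() {
	internal := result{msg: "unexpected compilation error", cause: nil}
	compileErr := result{msg: "bad expression", cause: ErrCompilation}

	fmt.Println(isCompileFailure(internal), describe(internal))
	fmt.Println(isCompileFailure(compileErr), describe(compileErr))
}
```

If every consumer goes through `errors.Is` or an explicit nil guard like this, passing nil is safe; a dedicated sentinel would only be needed if callers had to distinguish the internal case.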

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 19, 2024
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: b6106815d334a6f96ea2c753521cf7f4a4bc039d

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cici37, richabanker

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit acaec0c into kubernetes:master Jul 19, 2024
14 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.31 milestone Jul 19, 2024
@cici37 cici37 deleted the feature/validating-admission-policy/metrics-improvement branch July 19, 2024 18:01
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
5 participants