KEP 4447: Promote PolicyReport API to Kubernetes SIG API #4448

Open: wants to merge 4 commits into base: master

anusha94

  • One-line PR description: Add KEP-4447 to promote the PolicyReport API to a Kubernetes SIG API
  • Other comments: None

/sig auth
/wg policy

cc @JimBugwadia

@k8s-ci-robot k8s-ci-robot added sig/auth Categorizes an issue or PR as relevant to SIG Auth. wg/policy Categorizes an issue or PR as relevant to WG Policy. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jan 27, 2024
@k8s-ci-robot k8s-ci-robot requested a review from deads2k January 27, 2024 10:09
@k8s-ci-robot k8s-ci-robot added the kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory label Jan 27, 2024
@k8s-ci-robot

Welcome @anusha94!

It looks like this is your first PR to kubernetes/enhancements 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/enhancements has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jan 27, 2024
@k8s-ci-robot

Hi @anusha94. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Jan 27, 2024
keps/sig-auth/4447-promote-policyreport-sig-api/README.md Outdated Show resolved Hide resolved
Comment on lines +302 to +304
Based on the producer and usage, it is possible to create lots of report objects.
For example, if a policy engine has 20 policy rules and a namespace has 1000 pods,
an implementation may produce 20,000 reports. This can overwhelm etcd.

Suggested change
- Based on the producer and usage, it is possible to create lots of report objects.
- For example, if a policy engine has 20 policy rules and a namespace has 1000 pods,
- an implementation may produce 20,000 reports. This can overwhelm etcd.
+ Based on the producer and usage, it is possible to create lots of report objects.
+ For example, if a policy engine has 20 policy rules and a namespace has 1000 pods,
+ an implementation may produce 20,000 reports. If a cluster operator deploys PolicyReport
+ into their cluster, using this API can overwhelm etcd.

(is this a risk? We already let people deploy any CRD they like.)
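
For a sense of scale, the arithmetic in the quoted text can be sketched in Go. This is only an illustration: it assumes the worst case the KEP text describes, one report object per rule per pod, and the function names and the 1 KiB average object size are hypothetical values picked for the example, not measurements.

```go
package main

import "fmt"

// estimateReports returns the worst-case number of report objects when a
// producer emits one report per (rule, pod) pair. That fan-out is an
// assumption about producer behavior, not something the API mandates.
func estimateReports(rules, pods int) int {
	return rules * pods
}

// estimateBytes gives a rough storage footprint. bytesPerReport is a
// hypothetical average object size supplied by the caller.
func estimateBytes(reports, bytesPerReport int) int {
	return reports * bytesPerReport
}

func main() {
	// The numbers from the KEP text: 20 rules, 1000 pods in a namespace.
	reports := estimateReports(20, 1000)
	fmt.Println("reports:", reports)
	// 1024 bytes per report is an illustrative guess only.
	fmt.Println("approx bytes:", estimateBytes(reports, 1024))
}
```

A producer that aggregates results into one report per namespace or per resource would change the first factor, which is why the outcome depends so heavily on the implementation.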

@sudermanjr sudermanjr Jan 31, 2024

It has definitely been reported as an issue by users of the API. Whether that constitutes a risk is a good question, but it should be highlighted somewhere.

@maxsmythe maxsmythe Mar 4, 2024

This is my core concern with the concept of reporting findings via the API server.

I think events are another example of a high-volume object? One salient difference between this and events is that events are understood to be subject to throttling/sampling. Security reports may not have the same luxury.

I like the idea of reports-server, but IMO it would need to be an expectation that all clusters have a similar scalable backend solution before reports could be reliably enabled without risking cluster stability.

I should also add that there is a difference between "users can deploy any CRD they like" and "K8s accepts using KRM/the API server this way as a valid practice"; the second statement has much stronger implications for supportability.

@ritazh ritazh Aug 14, 2024

@JimBugwadia can you comment on the cluster reliability and performance concerns brought up here?

Hi @ritazh - can you please help clarify what exactly is expected?

The proposal is for a uniform API for reporting, and reliability or performance will depend heavily on implementations. For example, the API as a contract between consumers and producers can be used as a bounded log for the last N results.

We can help document best practices, but it seems that a number of those would apply to any other API as well. For example, the standard size limits would apply, and resource limits can be configured.

Is there any prior work on testing the performance and reliability impact of other APIs that we can reference?

If specific tests or measurements are recommended, we are happy to help capture the data.

Comment on lines +405 to +406
We need approvals from the following stakeholders:
[TBD]

Will we target this API at a release?

No, it's decoupled from Kubernetes releases.


- Add `policy-report-api` as a new project under kubernetes-sigs, i.e., `github.com/kubernetes-sigs/policy-report-api`
- Provide guidance on building consumers and producers

Do we want to publish official artefacts for the API?

  • YAML manifest?
  • OCI image of Helm chart?
  • something else?

Here's what I suggest:

  • Golang client set to reuse in producers and consumers
  • Generated YAMLs
  • API spec
  • Docs

@nilekhc

nilekhc commented Feb 12, 2024

/assign @ritazh

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 2, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle rotten
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jul 2, 2024
@JimBugwadia

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jul 2, 2024
Co-authored-by: Andy Suderman <andy@suderman.dev>
@k8s-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: anusha94
Once this PR has been reviewed and has the lgtm label, please ask for approval from ritazh. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-triage-robot

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 24, 2024
@JimBugwadia

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 24, 2024
Labels
  • cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.)
  • kind/kep (Categorizes KEP tracking issues and PRs modifying the KEP directory)
  • needs-ok-to-test (Indicates a PR that requires an org member to verify it is safe to test.)
  • sig/auth (Categorizes an issue or PR as relevant to SIG Auth.)
  • size/XL (Denotes a PR that changes 500-999 lines, ignoring generated files.)
  • wg/policy (Categorizes an issue or PR as relevant to WG Policy.)
Projects
Status: In Review
9 participants