KEP 4447: Promote PolicyReport API to Kubernetes SIG API #4448
base: master
Conversation
Welcome @anusha94!
Hi @anusha94. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with the appropriate command. Once the patch is verified, the new status will be reflected by the corresponding label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Based on the producer and usage, it is possible to create lots of report objects.
For example, if a policy engine has 20 policy rules and a namespace has 1000 pods,
an implementation may produce 20,000 reports. This can overwhelm etcd.
Suggested change:

Original:
Based on the producer and usage, it is possible to create lots of report objects.
For example, if a policy engine has 20 policy rules and a namespace has 1000 pods,
an implementation may produce 20,000 reports. This can overwhelm etcd.

Suggested:
Based on the producer and usage, it is possible to create lots of report objects.
For example, if a policy engine has 20 policy rules and a namespace has 1000 pods,
an implementation may produce 20,000 reports. If a cluster operator deploys PolicyReport
into their cluster, using this API can overwhelm etcd.
(is this a risk? We already let people deploy any CRD they like.)
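A rough back-of-the-envelope check of the numbers quoted above (a sketch only; the per-result size is an assumption for illustration, not a figure from the KEP):

```go
package main

import "fmt"

func main() {
	const (
		rules          = 20   // policy rules, from the KEP example
		podsPerNS      = 1000 // pods in one namespace, from the KEP example
		bytesPerResult = 500  // assumed rough size of one serialized result entry
	)
	// One result per (rule, pod) pair in a namespace.
	results := rules * podsPerNS
	fmt.Printf("results per namespace: %d\n", results)
	fmt.Printf("approx storage per namespace: %.1f MiB\n",
		float64(results*bytesPerResult)/(1<<20))
}
```

At these assumed sizes, a single busy namespace would account for roughly 10 MB of etcd state, which is why the per-object and aggregate limits come up in this thread.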
It has definitely been reported as an issue for users of the API. Whether that constitutes a risk or not is a good question, but this should be highlighted somewhere
We have 2 projects we can link to and reference - perhaps in the docs?
This is my core concern with the concept of reporting findings via the API server.
I think events are another example of a high-volume object? One salient difference between this and events is that events are understood to be subject to throttling/sampling. Security reports may not have the same luxury.
I like the idea of reports-server, but IMO it would need to be an expectation that all clusters have a similar scalable backend solution before reports could be reliably enabled without risking cluster stability.
I should also add that there is a difference between "users can deploy any CRD they like" and "K8s accepts using KRM/the API server this way as a valid practice"; the second statement has much stronger implications around supportability.
@JimBugwadia can you comment on the cluster reliability and performance concerns brought up here?
Hi @ritazh - can you please help clarify what exactly is expected?
The proposal is for a uniform API for reporting, and reliability or performance will depend heavily on implementations. For example, the API as a contract between consumers and producers can be used as a bounded log for the last N results.
We can help document best practices, but it seems like a number of those may be applicable to any other API as well. For example, the standard size limits would apply, and resource limits can be configured.
Is there any prior work done to test performance and reliability impacts of other APIs that we can reference?
If there are specific tests or measurements that are recommended, happy to help capture the data.
We need approvals from the following stakeholders:
[TBD]
Will we target this API at a release?
No, it's decoupled from Kubernetes releases.
- Add `policy-report-api` as a new project under kubernetes-sigs, i.e. `github.com/kubernetes-sigs/policy-report-api`
- Provide guidance on building consumers and producers
Do we want to publish official artefacts for the API?
- YAML manifest?
- OCI image of Helm chart?
- something else?
Here's what I suggest:
- Golang client set to reuse in producers and consumers
- Generated YAMLs
- API spec
- Docs
/assign @ritazh
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
/remove-lifecycle rotten
Co-authored-by: Andy Suderman <andy@suderman.dev>
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: anusha94. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
/sig auth
/wg policy
cc @JimBugwadia