-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add AlertmanagerConfig CRD #3451
Add AlertmanagerConfig CRD #3451
Conversation
@grdryn Is this ready for final review? The build failing and I think we should add some more e2e tests here too. Let us know if you need further assistance :) |
@s-urbaniak no, this is not ready for review yet. There's still a lot to do! 🙂 I think GH added reviewers automatically. I will indeed be in touch if/when I need help, thanks! 🙂 |
25d6434
to
893c47c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@simonpasquier @s-urbaniak @brancz This currently "works" for the CRD type that exists so far (i.e. only the PagerDuty receiver type is implemented, but others should be easy to add after). Probably some tests need to be fixed up & some more added, I'll look in that direction next.
Could you please take a look when you have some time, to see if what I've done so far is generally in the right direction, or if there are things you'd like changed? It's very much inspired by how the prometheus controller handles ServiceMonitor (etc) CRs, and in many cases blatently plagiarised. At least that should make it somewhat familiar! 🙂
893c47c
to
b433122
Compare
Existing tests appear to work now! 🎉 However, the prometheus e2e tests are failing in travis with:
Is that a known issue, or is it caused by the changes here? I don't think these changes should affect those prometheus controller e2e tests that much 🤷♂️ |
ea2de4e
to
3907ed0
Compare
example/prometheus-operator-crd/monitoring.coreos.com_alertmanagerconfigs.yaml
Outdated
Show resolved
Hide resolved
afd1050
to
3082d0d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for doing this! I've made a first limited review. I'll add more later.
69e0b84
to
36f519b
Compare
36f519b
to
1d19b0d
Compare
I'm going to remove the WIP prefix for this PR now, as I think it's close to a state that could be merged. Only the PagerDuty receiver has been added so far. I can add the others here, or I can do them in follow-up changes if you'd prefer. |
I am fine in a follow-up as long as you can open the issues for maybe each one would be a good one for outside contributors to take, but up to @simonpasquier and @brancz on that one. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work on the PR! Thanks so much! Couple of comments, leaving up to Frederic and Simon to have a final look, no blockers on my side.
} | ||
} | ||
|
||
// TODO: Do we need to enque secrets just for the namespace or in general? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this TODO still valid before we merge the PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I plagiarized this TODO when I plagiarised this function from here 🙈
// TODO: Do we need to enque secrets just for the namespace or in general? |
I could be wrong, but here's how I see it:
-
We should enqueue reconciliation for an
Alertmanager
if any of theSecrets
referenced in its spec are deleted (it has asecrets
field which is a list,configSecret
field,containers
orinitContainers
fields may reference a Secret to mount, etc). These all have to be in the same namespace as the Alertmanager. This function, enqueuing all Alertmanagers in the namespace, covers these cases, I believe. -
We may want to also enqueue reconciliation for an
Alertmanager
, if there's an AlertmanagerConfig selected by it that references a Secret that's deleted (upon reconciliation, if the Secret is missing for an AlertmanagerConfig, then that AlertmanagerConfig will be excluded when building the config file). If that's what we want, then I think it would either mean enqueuing all Alertmanagers in all namespaces, for any Secret changes in any AlertmanagerConfig namespaces; or determining if an Alertmanager should be reconciled by checking if the Secret is referenced in either the Alertmanager spec, or in any AlertmanagerConfigs that it selects.
I'm starting to confuse myself. In any case, if we want the second option in addition to the first, I'm not quite sure how to achieve that here. 🤔 Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to dig deeper into this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome! Just a few comments, but it looks really good.
I won't get to doing a deep dive on this soon, so it's better for @simonpasquier to do so. Two general comments:
|
fd9e294
to
3c47587
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments for the Alertmanager controller. Otherwise this is looking really good. Given the amount of the code changes + the fact that the resource is v1alpha1
, 👍 for merging this after addressing my last remarks.
pkg/alertmanager/operator.go
Outdated
} | ||
} | ||
|
||
func (c *Operator) enqueueForMonitorNamespace(nsName string) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need the intermediate enqueueForMonitorNamespace()
function. We can have only func (c *Operator) enqueueForNamespace(nsName string)
which would use the c.nsAlrtCfgInf
's store.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I've done that now. I have to admit to being less familiar with this part of the changes, I think it comes from branczs earlier WIP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I completely understand, you've done an amazing work knowing that you didn't know extensively the code base before :)
} | ||
} | ||
|
||
// TODO: Do we need to enque secrets just for the namespace or in general? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to dig deeper into this.
a25808d
to
a9ba233
Compare
👍 for me. Lets give other @prometheus-operator/prometheus-operator-reviewers a chance to review it :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm 👍
Thanks again!!
The only thing I would ask if the commits can be squashed, just to make it easier in case we need to revert something. 25 seems a bit too much even for this much changes. Do you mind doing that @grdryn and then we can merge 🎉 |
We could also |
Personally, I like to have generated (API in this example) and actual code separated, but not a blocker for me, happy to do it via github UI as well. |
Leaving the final call to @grdryn if he's willing to squash the commits and split the generated code in a separate commit :) |
Command: make --always-make generate \ pkg/client/versioned/clientset.go \ pkg/client/listers/monitoring/v1/prometheus.go \ pkg/client/listers/monitoring/v1alpha1/prometheus.go \ pkg/client/informers/externalversions/monitoring/v1/prometheus.go \ pkg/client/informers/externalversions/monitoring/v1alpha1/prometheus.go
a9ba233
to
238b53e
Compare
I've squashed to 3 commits: brancz's original work, then the rest is the split between manual code changes and generated files. Is that the split you're looking for? |
Thanks sounds good! |
Thanks so much again for the amazing work and patience! 🎉 |
Thank you to everyone involved in getting this merged 🎉🙌 |
Hi, thanks ! |
This builds upon @brancz's work in #3265 and will hopefully eventually implement the core of the proposal at https://docs.google.com/document/d/1aVVttvocop8zNezwrNFbHh_HKRW_fhJUZqhmrr8fGK0/edit?usp=sharing
Closes #3261, #2766, #2927, #1528.