Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Audit use buffered backend #60237

Merged

Conversation

crassirostris
Copy link

This is the next step after #60076

This PR fixes #53020, to address #53006 later

In this PR buffered backend, introduced in #60076, is used to replace ad-hoc solution for webhook and add an ability to enable buffering for the log audit backend.

Log audit backend can now be configured to perform batching before writing events to disk.

/cc @sttts @tallclair @ericchiang @CaoShuFeng

@crassirostris crassirostris added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/auth Categorizes an issue or PR as relevant to SIG Auth. area/audit approved-for-milestone labels Feb 22, 2018
@crassirostris crassirostris added this to the v1.10 milestone Feb 22, 2018
@k8s-ci-robot k8s-ci-robot added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Feb 22, 2018
@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Feb 22, 2018
@crassirostris crassirostris force-pushed the audit-use-buffered-backend branch 2 times, most recently from 5d8bb64 to 502a2b1 Compare February 22, 2018 19:12
@lavalamp
Copy link
Member

/assign @yliaog

<-b.shutdownCh

// Wait until all sending routines exit.
b.wg.Wait()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the new wg shutdown logic is correct:

  • when b.shutdownCh is closed, we know that the go routine in Run has terminated.
  • this means that processIncomingEvents has terminated
  • which means that b.buffer is closed and cannot accept any new events anymore.
  • because processEvents is called synchronously from the Run go routine, the waitgroup has its final value.
  • Hence, wg.Wait will not miss any more outgoing batches.

Should we put this proof into a comment (follow-up is fine)?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, added a comment

// recover from.
defer func() {
if err := recover(); err != nil {
sendErr = fmt.Errorf("panic when processing events: %v", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I liked the old error message more. The new one sounds like something is broken.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, done

Copy link
Author

@crassirostris crassirostris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sttts Thanks for the review, comments addressed

<-b.shutdownCh

// Wait until all sending routines exit.
b.wg.Wait()
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, added a comment

// recover from.
defer func() {
if err := recover(); err != nil {
sendErr = fmt.Errorf("panic when processing events: %v", err)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, done

@crassirostris crassirostris changed the title [WIP] Audit use buffered backend Audit use buffered backend Feb 26, 2018
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 26, 2018
@crassirostris crassirostris force-pushed the audit-use-buffered-backend branch 2 times, most recently from 161d1a3 to 4a42acb Compare February 26, 2018 18:13
@sttts
Copy link
Contributor

sttts commented Feb 26, 2018

/approve
Lgtm, but would like to have some more eyes one it, especially a detailed review of the changes to the old backends. @tallclair are you available?

@crassirostris crassirostris force-pushed the audit-use-buffered-backend branch from 4a42acb to 503c0a7 Compare February 26, 2018 18:47
@crassirostris
Copy link
Author

/assign @thockin

Could you please approve this PR for the new dependency?

@thockin
Copy link
Member

thockin commented Feb 26, 2018

/approve

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 26, 2018
@thockin thockin removed their assignment Feb 26, 2018
@crassirostris crassirostris force-pushed the audit-use-buffered-backend branch from 68cfc48 to 2a54b3b Compare February 27, 2018 11:40
Copy link
Member

@tallclair tallclair left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, except the flag issue

fs.DurationVar(&o.BatchConfig.InitialBackoff, "audit-webhook-batch-initial-backoff",
o.BatchConfig.InitialBackoff, "The amount of time to wait before retrying the "+
"first failed requests. Only used in batch mode.")
fs.DurationVar(&o.InitialBackoff, "audit-webhook-initial-backoff",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately you can't just change flags (see deprecation policy, #5b). I suggest marking the old one as deprecated, and making it an alias for the new one.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, thanks, done

Copy link
Author

@crassirostris crassirostris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tallclair Thanks! PTAL

fs.DurationVar(&o.BatchConfig.InitialBackoff, "audit-webhook-batch-initial-backoff",
o.BatchConfig.InitialBackoff, "The amount of time to wait before retrying the "+
"first failed requests. Only used in batch mode.")
fs.DurationVar(&o.InitialBackoff, "audit-webhook-initial-backoff",
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, thanks, done

"first failed requests.")
fs.DurationVar(&o.InitialBackoff, "audit-webhook-batch-initial-backoff",
o.InitialBackoff, "The amount of time to wait before retrying the first failed "+
"requests. Deprecated, use --audit-webhook-initial-backoff instead.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Was deceived by an example in another file

Done

Signed-off-by: Mik Vyatskov <vmik@google.com>
@crassirostris crassirostris force-pushed the audit-use-buffered-backend branch from c870cab to 881e6d4 Compare March 1, 2018 13:31
@tallclair
Copy link
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 1, 2018
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: crassirostris, sttts, tallclair, thockin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-github-robot
Copy link

Automatic merge from submit-queue (batch tested with PRs 60542, 60237). If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit 209cdd9 into kubernetes:master Mar 1, 2018
@liggitt
Copy link
Member

liggitt commented Mar 2, 2018

audit e2e tests started flaking as soon as this merged. tracking in #60719

LogOptions: AuditLogOptions{
Format: pluginlog.FormatJson,
BatchOptions: AuditBatchOptions{
Mode: ModeBatch,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This changes default behavior of log backend.

#60773

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is already a pull request: #60739

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know, this is a conscious decision. I believe batching is always better by default than blocking

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/audit cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/auth Categorizes an issue or PR as relevant to SIG Auth. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[audit] Restore audit logging in the scalability tests
10 participants