Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HEAD & OPTIONS http verbs don’t seem to show up in the apiserver audit logs when we hit proxy endpoints. #95966

Closed
riaankleinhans opened this issue Oct 28, 2020 · 25 comments
Assignees
Labels
area/audit help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. priority/backlog Higher priority than priority/awaiting-more-evidence. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/security Categorizes an issue or PR as relevant to SIG Security. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@riaankleinhans
Copy link
Contributor

Description:

In the process of conformance testing of the 'PodProxyWithPath' Endpoints we have ran into issue with testing HEAD' and OPTION` endpoints.

Affected endpoints:

  • connectCoreV1OptionsNamespacedPodProxyWithPath
  • connectCoreV1HeadNamespacedPodProxyWithPath
  • connectCoreV1OptionsNamespacedServiceProxyWithPath
  • connectCoreV1HeadNamespacedServiceProxyWithPath

The http.Method(HEAD) is hard coded into RestquestInfo.Verb(get) making some of the API inaccessible from what we see.
When special verbs "Proxy" are hit it is suppose to check if the current part of the URL has proxy in it and override the request verb.
When running our test, we know that the http.Method is set to HEAD, but shows up in the audit logs as get.
image

In the audit log the K8s operation is not visible in the audit log. We rely on the a combination of the verb as stored by the audit logs and request URI to match up and find the operation ID. If the verb is not correct we identified the operation.
Request HEAD doesn’t make sense for APIs other than proxy. This might be an API Server bug.

Write-up:

HEAD & OPTIONS write-up give more details.

Conformance office hours discussion:

The topic was discussed in the Conformance office hours meeting on 20 Oct 2020 and can be reviewed in the video from 28min - 35min

Next steps:

Please review and help to move this forward.

@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Oct 28, 2020
@riaankleinhans
Copy link
Contributor Author

/assign @smarterclayton
/cc @hh @heyste

@riaankleinhans
Copy link
Contributor Author

/sig network

@k8s-ci-robot k8s-ci-robot added sig/network Categorizes an issue or PR as relevant to SIG Network. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Oct 28, 2020
@hh
Copy link
Member

hh commented Oct 28, 2020

/sig apimachinery

@k8s-ci-robot
Copy link
Contributor

@hh: The label(s) sig/apimachinery cannot be applied, because the repository doesn't have them

In response to this:

/sig apimachinery

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@hh
Copy link
Member

hh commented Oct 28, 2020

/sig api-machinery

@k8s-ci-robot k8s-ci-robot added the sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. label Oct 28, 2020
@roycaihw
Copy link
Member

/sig auth
/sig security
/remove-sig network
/help

@k8s-ci-robot
Copy link
Contributor

@roycaihw:
This request has been marked as needing help from a contributor.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

In response to this:

/sig auth
/sig security
/remove-sig network
/help

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/security Categorizes an issue or PR as relevant to SIG Security. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. and removed sig/network Categorizes an issue or PR as relevant to SIG Network. labels Oct 29, 2020
@fedebongio
Copy link
Contributor

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Nov 3, 2020
@roycaihw
Copy link
Member

roycaihw commented Nov 4, 2020

When running our test, we know that the http.Method is set to HEAD, but shows up in the audit logs as get.

by definition verb is the kubernetes verb associated with the request. See https://kubernetes.io/docs/reference/access-authn-authz/authorization/#determine-the-request-verb on how kubernetes verbs are determined.

I think the first question is for sig auth on whether this is working as intended.

(p.s. the code that does HEAD->get is here, changing that will have impact on places including authorization, where you probably only want to change the audit events).

@roycaihw
Copy link
Member

roycaihw commented Nov 5, 2020

/remove-sig api-machinery

I think sig auth owns both the definition of Kubernetes request verb (given this doc falls under auth), and the implementation of how audit log collects the verb from request info.

@k8s-ci-robot k8s-ci-robot removed the sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. label Nov 5, 2020
@hh
Copy link
Member

hh commented Dec 1, 2020

/sig auth

@liggitt
Copy link
Member

liggitt commented Feb 3, 2021

The http.Method(HEAD) is hard coded into RestquestInfo.Verb(get) making some of the API inaccessible from what we see.

The API is not inaccessible (the HEAD or OPTIONS requests should still be proxied through), but the audit event does not distinguish between them.

HEAD being treated as get for audit/authorization purposes is working as intended (they are authorized the same and according to the spec HEAD should be processed as a GET excluding body from the return)

@hh
Copy link
Member

hh commented Feb 3, 2021

You are right, it's not inaccessible. We are hitting it!

I wonder about the intention for auditing purposes.

Should we be logging enough information information to map back to the OpenAPI spec?
If we don't and log both as GET, it's not possible to distinguish the two API Operations from each other.

I started a thread in #sig-auth mailing list, but might be good to pull this back here.
https://groups.google.com/g/kubernetes-sig-auth/c/k-1xNWS9BdA/m/fw2tCuM4AwAJ

Our primary use case is to be able to map audit log entries back to the operationId of the openapi spec.

It might be useful to share the effort and work arounds we've taken to map audit log path and k8s verb back to http methods so we can lookup operationIds in the OpenAPI swagger definition.

We had to figure out this mapping manually, and found ourselves surprised sometimes on the logic:

https://github.com/cncf/apisnoop/blob/39c1030f37a7d5094d753f7312bdae1bce6eccd8/apps/snoopdb/postgres/snoopUtils.py#L23-L38

# The k8s VERB (aka action) mapping to http METHOD
# Our audit logs do NOT contain any mention of METHOD, only the VERB
VERB_TO_METHOD={
'get': 'get',
'list': 'get',
'proxy': 'proxy',
'create': 'post',
'post':'post',
'put':'post',
'update':'put',
'patch':'patch',
'connect':'connect',
'delete':'delete',
'deletecollection':'delete',
'watch':'get'
}

And for audit entries where the k8s 'verb' is missing, our current hack is to set it to "options":

https://github.com/cncf/apisnoop/blob/39c1030f37a7d5094d753f7312bdae1bce6eccd8/apps/snoopdb/postgres/snoopUtils.py#L198-L204

def find_operation_id(openapi_spec, event):
  try:
    method=VERB_TO_METHOD[event['verb']]
  except:
    method='options'
    #return 'bugNoVerb' # Our alternative method if we find this happens again
  url = urlparse(event['requestURI'])
#continues...

Note the rest of our 'find_operation_id' function tries to cache lookups as processing gigabytes of logs used to take hours before we settled on this bit of caching / operationId lookup logic.

As we look into correcting the k8s verb recording to the audit logs, it would be extremely helpful if the audit logging mechanisms where updated to include the OpenAPI operationId so these type of manual audit log->k8s verb->http method->swagger->OperationId mapping lookups and caching are unnecessary.

(@liggitt I think we explored this before, but our use case may have been deemed to much of an outlier)

I think we are at a point where we have no way to concretely map (for some of the Audited API) which OpenAPI operation was called.

@liggitt
Copy link
Member

liggitt commented Feb 3, 2021

Should we be logging enough information information to map back to the OpenAPI spec?

Mapping audit events back to the openapi spec is not a particular goal of the audit log.

Understanding frequency and type of API access is (especially distinguishing read vs write, and distinguishing different types of write requests).

I think head/get → get is reasonable from an audit perspective

I'm not sure what I would expect OPTIONS to map to

@dims
Copy link
Member

dims commented Feb 17, 2021

/cc

@liggitt
Copy link
Member

liggitt commented Feb 17, 2021

Discussed in sig-auth meeting on 2/17

From a requestinfo/audit/authorization perspective:

  • The "head/get → get" behavior is working as intended.
  • The "verbs other than get/put/post/patch/delete show up as "" in authorization and audit" behavior is unexpected and should be corrected.

To unblock conformance-testing:

  • explicit support for HEAD and OPTIONS methods in the openapi spec is limited to the proxy subresources; special-casing those to recognize coverage from the specific tests that are exercising those (looking for a specific path suffix in the audit log, etc), is reasonable.

@enj
Copy link
Member

enj commented May 17, 2021

/assign @liggitt

@liggitt liggitt added the priority/backlog Higher priority than priority/awaiting-more-evidence. label May 24, 2021
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 22, 2021
@riaankleinhans
Copy link
Contributor Author

@smarterclayton @liggitt should this be kept open?

@slaskawi
Copy link
Contributor

slaskawi commented Sep 3, 2021

Dear Sig Auth,

I'm a newcomer to the Kubernetes codebase and I would be more than happy to work on this as my first contribution.

I hope this issue is not time-critical as it might take some time until I learn the codebase.

Thanks!

/assign

@riaankleinhans
Copy link
Contributor Author

/remove-lifecycle stale
Thank you @slaskawi

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 5, 2021
@hh
Copy link
Member

hh commented Sep 5, 2021

You can take your time, it's been open for a long while.

It'll definitely be a good deep dive into the pool!

Once you wrap your head around what's happening, I'm sure you'll find an excellent solution.

Let us know if you need help.

@slaskawi
Copy link
Contributor

slaskawi commented Sep 6, 2021

Hey @hh @liggitt @Riaankl,

Here's a proposed solution to enhance the audit logs #104795

I skimmed through the testsuite and I think it should be fine. If there are any test failures caused by this chance, I'll fix it right away.

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 5, 2021
@riaankleinhans
Copy link
Contributor Author

This PR was fixed by #95128
/close

@enj enj moved this to Closed / Done in SIG Auth Dec 5, 2022
@enj enj added this to SIG Auth Dec 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/audit help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. priority/backlog Higher priority than priority/awaiting-more-evidence. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/security Categorizes an issue or PR as relevant to SIG Security. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.