Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert "Merge pull request #47353 from apelisse/http-cache" #50254

Merged
merged 1 commit into from
Aug 8, 2017

Conversation

liggitt
Copy link
Member

@liggitt liggitt commented Aug 7, 2017

Some issues were discovered with the caching merged in #47353:

  • uses a disk-based cache that is not safe between processes (does not use atomic fs operations)
  • writes get/list responses to disk that should not be cached (like kubectl get secrets)
  • is vulnerable to partially written cache responses being used as responses to future requests
  • breaks uses of the client transport that make use of websockets
  • defaults to enabling the cache for any client builder using RecommendedConfigOverrideFlags or DefaultClientConfig which affects more components than just kubectl

This reverts commit fc89743, reversing changes made to 29ab38e.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Aug 7, 2017
@k8s-github-robot k8s-github-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. release-note-label-needed labels Aug 7, 2017
@liggitt
Copy link
Member Author

liggitt commented Aug 7, 2017

cc @apelisse @mbohlool @pwittrock from original PR

@liggitt liggitt added release-note-none Denotes a PR that doesn't merit a release note. and removed release-note-label-needed labels Aug 7, 2017
@apelisse
Copy link
Member

apelisse commented Aug 7, 2017

Thanks @liggitt for working on that, and sorry for breaking your use-case. All good points,

Can you clarify what the difference is between:

uses a disk-based cache that is not safe between processes (does not use atomic fs operations)

and

is vulnerable to partially written cache responses being used as responses to future requests

Or maybe the question is, would a fix for the first point also fix that second point? If not, can you clarify what you meant for the latter.

@liggitt
Copy link
Member Author

liggitt commented Aug 7, 2017

uses a disk-based cache that is not safe between processes (does not use atomic fs operations)

this means that two successful writes by separate processes would stomp on each other (process 1 starts writing, process 2 starts writing, they both write to the file in an interleaved way)

is vulnerable to partially written cache responses being used as responses to future requests

this means that if a single process fails halfway through writing the file, the half-written file is left in place. a future invocation of the cache will scan existing files to determine what is in cache, and the half-written file gets put in the index of available cached responses.

would a fix for the first point also fix that second point?

possibly, depending on the fix... if the disk cache changed to always write to a tmp file and only on success do a move to a file that is used to serve future cached responses, I think that would likely address both issues.

@apelisse
Copy link
Member

apelisse commented Aug 7, 2017

if the disk cache changed to always write to a tmp file and only on success do a move to a file that is used to serve future cached responses, I think that would likely address both issues.

I think we're on the same page then, that's exactly what I had in mind. Thanks

@smarterclayton
Copy link
Contributor

/lgtm

Given that this impacts concurrent kubectl clients in unpredictable ways, and also caches secret data on disk.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 7, 2017
@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 7, 2017
@liggitt liggitt force-pushed the revert-disk-cache branch from 61b1c18 to 60f1828 Compare August 7, 2017 18:34
@liggitt
Copy link
Member Author

liggitt commented Aug 7, 2017

rebased to fix godep conflict

@k8s-github-robot k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 7, 2017
@liggitt liggitt added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 7, 2017
@liggitt liggitt force-pushed the revert-disk-cache branch from 60f1828 to 4ee72eb Compare August 7, 2017 20:21
@liggitt
Copy link
Member Author

liggitt commented Aug 7, 2017

fixed conflict in staging godep

@k8s-github-robot k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 7, 2017
@smarterclayton
Copy link
Contributor

/retest

@thockin
Copy link
Member

thockin commented Aug 7, 2017

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 7, 2017
@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: liggitt, smarterclayton, thockin

Associated issue: 47353

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:
  • OWNERS [smarterclayton,thockin]

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@smarterclayton
Copy link
Contributor

smarterclayton commented Aug 8, 2017 via email

@liggitt
Copy link
Member Author

liggitt commented Aug 8, 2017

unit flaked on #50022
/retest

@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to @fejta).

Review the full test history for this PR.

@k8s-github-robot
Copy link

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-ci-robot
Copy link
Contributor

k8s-ci-robot commented Aug 8, 2017

@liggitt: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
pull-kubernetes-unit 4ee72eb link /test pull-kubernetes-unit

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-github-robot
Copy link

Automatic merge from submit-queue (batch tested with PRs 50254, 50174, 50179)

@k8s-github-robot k8s-github-robot merged commit 187e6ab into kubernetes:master Aug 8, 2017
@liggitt liggitt deleted the revert-disk-cache branch August 9, 2017 21:04
k8s-github-robot pushed a commit that referenced this pull request Sep 1, 2017
Automatic merge from submit-queue (batch tested with PRs 51480, 49616, 50123, 50846, 50404)

Kubectl to use http caching to cache openapi responses from the server

**What this PR does / why we need it**: 

This PR is trying to address the problems raised in #50254 

> * uses a disk-based cache that is not safe between processes (does not use atomic fs operations)
> * writes get/list responses to disk that should not be cached (like kubectl get secrets)
> * is vulnerable to partially written cache responses being used as responses to future requests
> * breaks uses of the client transport that make use of websockets
> * defaults to enabling the cache for any client builder using RecommendedConfigOverrideFlags or DefaultClientConfig which affects more components than just kubectl

All of these points are addressed by this pull-request:
1. It now uses atomic fs operations
2. Doesn't cache by default, only if requested by the client (and it's only done by openapi client)
3. Fixed because of atomic fs operations
4. Found the reason for the bug: Cache wrapper couldn't be unwrapped. I implemented the `WrappedRoundTripper` interface.
5. Since 2. is fixed, I think that should be fine

@smarterclayton @liggitt 

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #50254

**Special notes for your reviewer**:

**Release note**:

```release-note
Allows kubectl to use http caching mechanism for the OpenAPI schema. The cache directory can be configured through `--cache-dir` command line flag to kubectl. If set to empty string, caching will be disabled.
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants