
Local Ephemeral Storage limit not working #78865

Closed
dashpole opened this issue Jun 10, 2019 · 30 comments
Labels
kind/bug · lifecycle/rotten · priority/important-longterm · sig/node · sig/storage

Comments

@dashpole
Contributor

@arunbpt7 opened an issue in kubernetes/enhancements. I am moving it here.
/kind bug
/priority important-longterm
/sig node

As discussed in #361, I am looking for a way to restrict pods' ephemeral storage usage. Ephemeral storage is shared across all pods, so the pods' writable layers and logs frequently fill up /var/lib/docker, causing high utilization of that filesystem. I would like to cap each pod at a defined size (say 20G), so that a pod can use at most 20G of ephemeral storage and must use persistent volumes for larger storage requirements. The remaining space on /var/lib/docker would then stay available to the other pods, each likewise capped at 20G.

I have defined an ephemeral-storage request and limit in the container resources of the deployment (resources.requests.ephemeral-storage, resources.limits.ephemeral-storage) and verified that evictionHard is enabled for imagefs and nodefs on the node. But after deploying the pod, the limit is not enforced: when creating large files inside the container, it can still write more than the defined ephemeral-storage request and limit.

evictionHard:
  imagefs.available: 15%
  memory.available: 100Mi
  nodefs.available: 10%
  nodefs.inodesFree: 5%
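
(For context, a minimal sketch of where this block lives, assuming the kubelet is driven by a KubeletConfiguration file — the file path varies by distribution:)

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
evictionHard:
  imagefs.available: "15%"
  memory.available: "100Mi"
  nodefs.available: "10%"
  nodefs.inodesFree: "5%"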

containers:
- name: busybox
  image:
  resources:
    requests:
      ephemeral-storage: "500Mi"
    limits:
      ephemeral-storage: "500Mi"
@dashpole added the kind/bug label Jun 10, 2019
@k8s-ci-robot added the priority/important-longterm and sig/node labels Jun 10, 2019
@dashpole
Contributor Author

Can you share monitoring that shows the pod exceeding its limit for an extended period of time (a few minutes)?

@arunbpt7

containers:
- name: busybox
  image:
  resources:
    requests:
      ephemeral-storage: "500Mi"
    limits:
      ephemeral-storage: "500Mi"

State:          Running
  Started:      Mon, 10 Jun 2019 12:48:57 -0400
Ready:          True
Restart Count:  0
Limits:
  ephemeral-storage:  500Mi
Requests:
  ephemeral-storage:  500Mi
Environment:

kubectl get po busybox-7cc68d968c-mb47z -n testns
NAME                       READY   STATUS    RESTARTS   AGE
busybox-7cc68d968c-mb47z   1/1     Running   0          82m

kubectl exec -it busybox-7cc68d968c-mb47z -n testns -- bash
bash-4.2$ fallocate -l 2G /var/tmp/test2
bash-4.2$ du -sh /var/tmp/*
1.0G /var/tmp/test
2.0G /var/tmp/test2

bash-4.2$ exit

kubectl get po busybox-7cc68d968c-mb47z -n testns
NAME                       READY   STATUS    RESTARTS   AGE
busybox-7cc68d968c-mb47z   1/1     Running   0          83m

@poidag-zz
Contributor

I was able to recreate this issue on a 1.13.5 cluster.

For the node under test, an EBS volume was attached to the instance and mounted as an XFS filesystem at /var/lib/docker.

The deployment with resources set had a pod scheduled to the node in question.

"Execing" into the pod and running.

fallocate -l 2G /var/tmp/test1

created a file larger than the configured ephemeral-storage limit of 500Mi. The pod was not evicted, even after waiting up to 10 minutes.

Starting again with a fresh volume and deployment.

Creating a 4G file within the new pod, with an underlying 5G volume mounted at /var/lib/docker,

fallocate -l 4G /var/tmp/test1

caused the imageGCManager to kick in due to the node DiskPressure condition, rather than the kubelet honouring the ephemeral-storage limit and evicting that one pod first:

Jun 13 02:47:35 ip-10-0-2-15 kubelet[27133]: W0613 02:47:35.771392 27133 eviction_manager.go:333] eviction manager: attempting to reclaim ephemeral-storage
Jun 13 02:47:35 ip-10-0-2-15 kubelet[27133]: I0613 02:47:35.771424 27133 container_gc.go:85] attempting to delete unused containers
Jun 13 02:47:35 ip-10-0-2-15 kubelet[27133]: I0613 02:47:35.782369 27133 image_gc_manager.go:317] attempting to delete unused images
Jun 13 02:47:35 ip-10-0-2-15 kubelet[27133]: I0613 02:47:35.794272 27133 eviction_manager.go:344] eviction manager: must evict pod(s) to reclaim ephemeral-storage
Jun 13 02:47:35 ip-10-0-2-15 kubelet[27133]: I0613 02:47:35.794493 27133 eviction_manager.go:362] eviction manager: pods ranked for eviction: debug-887cd4775-2brw9_test(d9f7c5ee-8d84-11e9-b987-02f54a20dc4c), canal-js4tm_kube-system(53c05e18-8d66-11e9-b987-02f54a20dc4c), debug-887cd4775-fwvhq_test(a7ac5932-8d84-11e9-b987-02f54a20dc4c), debug-887cd4775-r9zcc_test(0695b54b-8d85-11e9-b987-02f54a20dc4c), debug-887cd4775-l6kxx_test(d18c07b5-8d84-11e9-b987-02f54a20dc4c), debug-887cd4775-rvv

The ranking for eviction, however, was correct: debug-887cd4775-2brw9 was the pod in which the file had been created.

@msau42
Member

msau42 commented Jun 13, 2019

cc @kubernetes/sig-storage-bugs @jingxu97

@k8s-ci-robot added the sig/storage label Jun 13, 2019
@jingxu97
Contributor

@pickledrick @arunbpt7 Could you please share your pod yaml file? You can also email me jinxu at google.com if you prefer. Thanks!

@arunbpt7

arunbpt7 commented Jun 13, 2019

@jingxu97

apiVersion: apps/v1
kind: Deployment
metadata:
  name: busybox
spec:
  replicas: 1
  selector:
    matchLabels:
      app: busybox
  template:
    metadata:
      labels:
        app: busybox
    spec:
      securityContext:
        runAsUser: 99
        fsGroup: 99
      containers:
      - name: busybox
        image:
        resources:
          requests:
            ephemeral-storage: "500Mi"
          limits:
            ephemeral-storage: "500Mi"

@jingxu97
Contributor

@arunbpt7 did you omit part of the yaml file?

@poidag-zz
Contributor

poidag-zz commented Jun 13, 2019

@jingxu97

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    app: debug
  name: debug
spec:
  selector:
    matchLabels:
      app: debug
  template:
    metadata:
      labels:
        app: debug
    spec:
      containers:
      - image: quay.io/pickledrick/debug
        imagePullPolicy: Always
        name: debug
        resources:
          limits:
            ephemeral-storage: 500Mi
          requests:
            ephemeral-storage: 500Mi

@dashpole
Contributor Author

@arunbpt7 can you query the summary api (localhost:10255/stats/summary) from the node that the pod is running on, to make sure it is measuring disk space correctly?

@dashpole
Contributor Author

Our tests for this are not super consistent: https://k8s-testgrid.appspot.com/sig-node-kubelet#node-kubelet-serial&include-filter-by-regex=LocalStorageCapacityIsolationEviction, but are mostly green. I'll try and bump the timeout on the serial tests to see if we can get a clearer signal.

@arunbpt7

arunbpt7 commented Jun 17, 2019

I ran curl -s http://localhost:10255/stats/summary on the node where the pod is running, and it shows nothing.

@dashpole
Contributor Author

It sounds like that is probably your problem then. If you don't have any metrics, the kubelet can't do its monitoring or eviction. Can you share your kubelet logs, or see if there are any errors related to metrics?
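
A couple of ways to check (a sketch; substitute your node name, and the log command assumes a systemd-managed kubelet):

# Fetch the kubelet's stats summary through the API server proxy,
# which works even when the read-only port 10255 is disabled
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/stats/summary"

# Scan kubelet logs for stats/eviction-related errors on a systemd node
journalctl -u kubelet | grep -iE 'stats|summary|eviction'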

@arunbpt7

@pickledrick, can you share the kubelet logs?

@poidag-zz
Contributor

Hi all,

The insecure stats API appears to be deprecated.

#59666
kubernetes/kubeadm#732

@yastij
Member

yastij commented Jun 28, 2019

I'm not able to reproduce it on a 1.13.5 cluster; I'm seeing the following:

debug-887cd4775-ckp4l   0/1     Evicted   0          12m
debug-887cd4775-hlvxw   1/1     Running   0          112s

@pickledrick - if you have access to the generated certs, you can use the secure port.
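
For example (a sketch; 10250 is the kubelet's default secure port, and the cert paths below are kubeadm-style placeholders — adjust them for your PKI layout and serving-cert hostnames):

curl -s --cacert /etc/kubernetes/pki/ca.crt \
     --cert /etc/kubernetes/pki/apiserver-kubelet-client.crt \
     --key /etc/kubernetes/pki/apiserver-kubelet-client.key \
     https://localhost:10250/stats/summary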

@poidag-zz
Contributor

Hi @yastij, I'm no longer able to reproduce it either; it seems that in my environment evictions are happening eventually. I am gathering more information to see if there is something else set in the original reporter's configuration.

@arunbpt7

arunbpt7 commented Jul 9, 2019

@pickledrick

/var/lib/docker is a separate filesystem, apart from the node root fs.

@poidag-zz
Contributor

Hi @arunbpt7

Yes, my test environment reflects this. Can you confirm the Docker version you are using in this environment?

@cpearring

I ran into something similar as well, but on a 1.12.8 cluster and using emptyDir. Initially the pod never got evicted; in subsequent runs, however, evictions worked properly, and I haven't run into this since.

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: storage-test
  namespace: storage-test
spec:
  template:
    metadata:
      labels:
        app: storage-test
    spec:
      containers:
      - name: storage-test-container
        image: k8s.gcr.io/ubuntu-slim:0.1
        resources:
          requests:
            ephemeral-storage: 10Mi
          limits:
            ephemeral-storage: 10Mi
        command: ["/bin/sh"]
        args: ["-c", "dd if=/dev/urandom of=/cache/file.txt count=100 bs=1048576; sleep 1h"]
        volumeMounts:
        - mountPath: "/cache"
          name: cache-volume
      volumes:
      - name: cache-volume
        emptyDir: {}
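
(As an aside, for emptyDir volumes a per-volume cap can also be set via sizeLimit, which is enforced by the same local-storage eviction logic — a minimal sketch of the alternative volumes stanza:)

      volumes:
      - name: cache-volume
        emptyDir:
          sizeLimit: 10Mi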

@RobertKrawitz
Contributor

RobertKrawitz commented Aug 16, 2019

This happens with the writable layer when the runtime directory (e.g. /var/lib/crio, /var/lib/docker) is not on the Kubernetes root filesystem; it's due to this code in the eviction manager.

It's not apparent to me why this was done, and the author of the code (in changeset 27901ad) appears to have moved on. I'm planning to open a PR on it.
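
Paraphrasing the check in question (a simplified sketch of the fsStatsType selection, not a verbatim copy of eviction_manager.go): when a dedicated image filesystem is configured, the per-pod ephemeral-storage limit check drops the rootfs (writable layer) stats, so only logs and local volumes are counted against the limit.

// Simplified paraphrase of the kubelet eviction manager's stat selection
fsStatsSet := []fsStatsType{fsStatsRoot, fsStatsLogs, fsStatsLocalVolumeSource}
if dedicatedImageFs {
    // The writable layer (fsStatsRoot) is excluded here, so a pod whose
    // runtime directory is on a separate filesystem can exceed its
    // ephemeral-storage limit without being evicted.
    fsStatsSet = []fsStatsType{fsStatsLogs, fsStatsLocalVolumeSource}
}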

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Nov 14, 2019
@wu0407

wu0407 commented Nov 19, 2019

/remove-lifecycle stale

@k8s-ci-robot removed the lifecycle/stale label Nov 19, 2019
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Feb 17, 2020
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Mar 18, 2020
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@rushins

rushins commented Apr 19, 2020

Hello,

I got the same issue on K8s 1.16. My Docker image is 200GB, I have plenty of space in /var/lib/docker (450GB free out of 500GB), and I am still getting an "ephemeral storage" error. Can someone tell me what the fix should be?

@rushins

rushins commented Apr 19, 2020

I have this error: The node was low on resource: ephemeral-storage. Container k8stst was using 112619704Ki, which exceeds its request of 0

@andreamaruccia

andreamaruccia commented Apr 29, 2020

Hm, maybe not 100% related to this issue, but I had a problem caused by the fact that I had two filesystems, and K8s couldn't handle that. See kubernetes/enhancements#361 (comment).

So I ended up mounting the filesystem and making docker and kubelet use the same partition via symlinks, which solved the issue.
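
Roughly like this (a sketch with a placeholder device and mount point, not verbatim from my setup):

# Stop the services before relocating their data directories
systemctl stop kubelet docker
mount /dev/sdb1 /data
mv /var/lib/docker /data/docker && ln -s /data/docker /var/lib/docker
mv /var/lib/kubelet /data/kubelet && ln -s /data/kubelet /var/lib/kubelet
systemctl start docker kubelet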
