Quota 'SUBNETWORKS' exceeded in e2e tests #46713

Closed
crassirostris opened this issue May 31, 2017 · 27 comments
Labels
kind/failing-test: Categorizes issue or PR as related to a consistently or frequently failing test.
priority/critical-urgent: Highest priority. Must be actively worked on as someone's top priority right now.
sig/network: Categorizes an issue or PR as relevant to SIG Network.

Comments

@crassirostris

E.g.

https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/pr-logs/pull/46700/pull-kubernetes-e2e-gce-etcd3/33214/
https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/pr-logs/pull/46700/pull-kubernetes-kubemark-e2e-gce/32814/
https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/pr-logs/pull/46696/pull-kubernetes-kubemark-e2e-gce/32813/

W0531 10:20:34.781] ERROR: (gcloud.compute.networks.create) Could not fetch resource:
W0531 10:20:34.782]  - Quota 'SUBNETWORKS' exceeded.  Limit: 100.0
@crassirostris
Author

/cc @fejta Could you please assign it to an appropriate person?

crassirostris added the sig/testing label May 31, 2017
@krzyzacy
Member

/assign

@krzyzacy
Member

The kubemark project looks normal, but the etcd project is flooded. Is the etcd suite using excessive resources?

@fejta
Contributor

fejta commented May 31, 2017

/remove-sig testing
/unassign @krzyzacy
/assign @bowei
@kubernetes/sig-network-test-failures

Looks like we are leaking SUBNETWORKS, see http://prow.k8s.io/?type=presubmit&job=pull-kubernetes-e2e-gce-etcd3

k8s-ci-robot added the sig/network label and removed the sig/testing label May 31, 2017
k8s-ci-robot assigned bowei and unassigned krzyzacy May 31, 2017
@krzyzacy
Member

I don't think we are leaking. Each network creates 8 subnets (one per region), plus the default network's 8, and we run 12 instances, so the total is 13 * 8 = 104 subnets > 100; the 12th concurrent job will always fail.

We probably just need to bump the quota a bit.

gcloud compute networks subnets list --project=k8s-jkns-pr-gce-etcd3
NAME                    REGION           NETWORK                 RANGE
default                 asia-northeast1  default                 10.146.0.0/20
e2e-gce-agent-pr-102-0  asia-northeast1  e2e-gce-agent-pr-102-0  10.146.0.0/20
e2e-gce-agent-pr-105-0  asia-northeast1  e2e-gce-agent-pr-105-0  10.146.0.0/20
e2e-gce-agent-pr-15-0   asia-northeast1  e2e-gce-agent-pr-15-0   10.146.0.0/20
e2e-gce-agent-pr-2-0    asia-northeast1  e2e-gce-agent-pr-2-0    10.146.0.0/20
e2e-gce-agent-pr-21-0   asia-northeast1  e2e-gce-agent-pr-21-0   10.146.0.0/20
e2e-gce-agent-pr-34-0   asia-northeast1  e2e-gce-agent-pr-34-0   10.146.0.0/20
e2e-gce-agent-pr-36-0   asia-northeast1  e2e-gce-agent-pr-36-0   10.146.0.0/20
e2e-gce-agent-pr-47-0   asia-northeast1  e2e-gce-agent-pr-47-0   10.146.0.0/20
e2e-gce-agent-pr-49-0   asia-northeast1  e2e-gce-agent-pr-49-0   10.146.0.0/20
e2e-gce-agent-pr-57-0   asia-northeast1  e2e-gce-agent-pr-57-0   10.146.0.0/20
e2e-gce-agent-pr-98-0   asia-northeast1  e2e-gce-agent-pr-98-0   10.146.0.0/20
default                 us-west1         default                 10.138.0.0/20
e2e-gce-agent-pr-102-0  us-west1         e2e-gce-agent-pr-102-0  10.138.0.0/20
e2e-gce-agent-pr-105-0  us-west1         e2e-gce-agent-pr-105-0  10.138.0.0/20
e2e-gce-agent-pr-15-0   us-west1         e2e-gce-agent-pr-15-0   10.138.0.0/20
e2e-gce-agent-pr-2-0    us-west1         e2e-gce-agent-pr-2-0    10.138.0.0/20
e2e-gce-agent-pr-21-0   us-west1         e2e-gce-agent-pr-21-0   10.138.0.0/20
e2e-gce-agent-pr-34-0   us-west1         e2e-gce-agent-pr-34-0   10.138.0.0/20
e2e-gce-agent-pr-36-0   us-west1         e2e-gce-agent-pr-36-0   10.138.0.0/20
e2e-gce-agent-pr-47-0   us-west1         e2e-gce-agent-pr-47-0   10.138.0.0/20
e2e-gce-agent-pr-49-0   us-west1         e2e-gce-agent-pr-49-0   10.138.0.0/20
e2e-gce-agent-pr-57-0   us-west1         e2e-gce-agent-pr-57-0   10.138.0.0/20
e2e-gce-agent-pr-98-0   us-west1         e2e-gce-agent-pr-98-0   10.138.0.0/20
default                 asia-east1       default                 10.140.0.0/20
e2e-gce-agent-pr-102-0  asia-east1       e2e-gce-agent-pr-102-0  10.140.0.0/20
e2e-gce-agent-pr-105-0  asia-east1       e2e-gce-agent-pr-105-0  10.140.0.0/20
e2e-gce-agent-pr-15-0   asia-east1       e2e-gce-agent-pr-15-0   10.140.0.0/20
e2e-gce-agent-pr-2-0    asia-east1       e2e-gce-agent-pr-2-0    10.140.0.0/20
e2e-gce-agent-pr-21-0   asia-east1       e2e-gce-agent-pr-21-0   10.140.0.0/20
e2e-gce-agent-pr-34-0   asia-east1       e2e-gce-agent-pr-34-0   10.140.0.0/20
e2e-gce-agent-pr-36-0   asia-east1       e2e-gce-agent-pr-36-0   10.140.0.0/20
e2e-gce-agent-pr-47-0   asia-east1       e2e-gce-agent-pr-47-0   10.140.0.0/20
e2e-gce-agent-pr-49-0   asia-east1       e2e-gce-agent-pr-49-0   10.140.0.0/20
e2e-gce-agent-pr-57-0   asia-east1       e2e-gce-agent-pr-57-0   10.140.0.0/20
e2e-gce-agent-pr-98-0   asia-east1       e2e-gce-agent-pr-98-0   10.140.0.0/20
default                 asia-southeast1  default                 10.148.0.0/20
e2e-gce-agent-pr-102-0  asia-southeast1  e2e-gce-agent-pr-102-0  10.148.0.0/20
e2e-gce-agent-pr-105-0  asia-southeast1  e2e-gce-agent-pr-105-0  10.148.0.0/20
e2e-gce-agent-pr-15-0   asia-southeast1  e2e-gce-agent-pr-15-0   10.148.0.0/20
e2e-gce-agent-pr-2-0    asia-southeast1  e2e-gce-agent-pr-2-0    10.148.0.0/20
e2e-gce-agent-pr-21-0   asia-southeast1  e2e-gce-agent-pr-21-0   10.148.0.0/20
e2e-gce-agent-pr-34-0   asia-southeast1  e2e-gce-agent-pr-34-0   10.148.0.0/20
e2e-gce-agent-pr-36-0   asia-southeast1  e2e-gce-agent-pr-36-0   10.148.0.0/20
e2e-gce-agent-pr-47-0   asia-southeast1  e2e-gce-agent-pr-47-0   10.148.0.0/20
e2e-gce-agent-pr-49-0   asia-southeast1  e2e-gce-agent-pr-49-0   10.148.0.0/20
e2e-gce-agent-pr-57-0   asia-southeast1  e2e-gce-agent-pr-57-0   10.148.0.0/20
e2e-gce-agent-pr-98-0   asia-southeast1  e2e-gce-agent-pr-98-0   10.148.0.0/20
default                 us-east4         default                 10.150.0.0/20
e2e-gce-agent-pr-102-0  us-east4         e2e-gce-agent-pr-102-0  10.150.0.0/20
e2e-gce-agent-pr-105-0  us-east4         e2e-gce-agent-pr-105-0  10.150.0.0/20
e2e-gce-agent-pr-15-0   us-east4         e2e-gce-agent-pr-15-0   10.150.0.0/20
e2e-gce-agent-pr-2-0    us-east4         e2e-gce-agent-pr-2-0    10.150.0.0/20
e2e-gce-agent-pr-21-0   us-east4         e2e-gce-agent-pr-21-0   10.150.0.0/20
e2e-gce-agent-pr-34-0   us-east4         e2e-gce-agent-pr-34-0   10.150.0.0/20
e2e-gce-agent-pr-36-0   us-east4         e2e-gce-agent-pr-36-0   10.150.0.0/20
e2e-gce-agent-pr-47-0   us-east4         e2e-gce-agent-pr-47-0   10.150.0.0/20
e2e-gce-agent-pr-49-0   us-east4         e2e-gce-agent-pr-49-0   10.150.0.0/20
e2e-gce-agent-pr-57-0   us-east4         e2e-gce-agent-pr-57-0   10.150.0.0/20
e2e-gce-agent-pr-98-0   us-east4         e2e-gce-agent-pr-98-0   10.150.0.0/20
default                 europe-west1     default                 10.132.0.0/20
e2e-gce-agent-pr-102-0  europe-west1     e2e-gce-agent-pr-102-0  10.132.0.0/20
e2e-gce-agent-pr-105-0  europe-west1     e2e-gce-agent-pr-105-0  10.132.0.0/20
e2e-gce-agent-pr-15-0   europe-west1     e2e-gce-agent-pr-15-0   10.132.0.0/20
e2e-gce-agent-pr-2-0    europe-west1     e2e-gce-agent-pr-2-0    10.132.0.0/20
e2e-gce-agent-pr-21-0   europe-west1     e2e-gce-agent-pr-21-0   10.132.0.0/20
e2e-gce-agent-pr-34-0   europe-west1     e2e-gce-agent-pr-34-0   10.132.0.0/20
e2e-gce-agent-pr-36-0   europe-west1     e2e-gce-agent-pr-36-0   10.132.0.0/20
e2e-gce-agent-pr-47-0   europe-west1     e2e-gce-agent-pr-47-0   10.132.0.0/20
e2e-gce-agent-pr-49-0   europe-west1     e2e-gce-agent-pr-49-0   10.132.0.0/20
e2e-gce-agent-pr-57-0   europe-west1     e2e-gce-agent-pr-57-0   10.132.0.0/20
e2e-gce-agent-pr-98-0   europe-west1     e2e-gce-agent-pr-98-0   10.132.0.0/20
default                 us-east1         default                 10.142.0.0/20
e2e-gce-agent-pr-102-0  us-east1         e2e-gce-agent-pr-102-0  10.142.0.0/20
e2e-gce-agent-pr-105-0  us-east1         e2e-gce-agent-pr-105-0  10.142.0.0/20
e2e-gce-agent-pr-15-0   us-east1         e2e-gce-agent-pr-15-0   10.142.0.0/20
e2e-gce-agent-pr-2-0    us-east1         e2e-gce-agent-pr-2-0    10.142.0.0/20
e2e-gce-agent-pr-21-0   us-east1         e2e-gce-agent-pr-21-0   10.142.0.0/20
e2e-gce-agent-pr-34-0   us-east1         e2e-gce-agent-pr-34-0   10.142.0.0/20
e2e-gce-agent-pr-36-0   us-east1         e2e-gce-agent-pr-36-0   10.142.0.0/20
e2e-gce-agent-pr-47-0   us-east1         e2e-gce-agent-pr-47-0   10.142.0.0/20
e2e-gce-agent-pr-49-0   us-east1         e2e-gce-agent-pr-49-0   10.142.0.0/20
e2e-gce-agent-pr-57-0   us-east1         e2e-gce-agent-pr-57-0   10.142.0.0/20
e2e-gce-agent-pr-98-0   us-east1         e2e-gce-agent-pr-98-0   10.142.0.0/20
default                 us-central1      default                 10.128.0.0/20
e2e-gce-agent-pr-102-0  us-central1      e2e-gce-agent-pr-102-0  10.128.0.0/20
e2e-gce-agent-pr-105-0  us-central1      e2e-gce-agent-pr-105-0  10.128.0.0/20
e2e-gce-agent-pr-15-0   us-central1      e2e-gce-agent-pr-15-0   10.128.0.0/20
e2e-gce-agent-pr-2-0    us-central1      e2e-gce-agent-pr-2-0    10.128.0.0/20
e2e-gce-agent-pr-21-0   us-central1      e2e-gce-agent-pr-21-0   10.128.0.0/20
e2e-gce-agent-pr-34-0   us-central1      e2e-gce-agent-pr-34-0   10.128.0.0/20
e2e-gce-agent-pr-36-0   us-central1      e2e-gce-agent-pr-36-0   10.128.0.0/20
e2e-gce-agent-pr-47-0   us-central1      e2e-gce-agent-pr-47-0   10.128.0.0/20
e2e-gce-agent-pr-49-0   us-central1      e2e-gce-agent-pr-49-0   10.128.0.0/20
e2e-gce-agent-pr-57-0   us-central1      e2e-gce-agent-pr-57-0   10.128.0.0/20
e2e-gce-agent-pr-98-0   us-central1      e2e-gce-agent-pr-98-0   10.128.0.0/20
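
A quick sanity check of the count against the quota could look like this (a sketch; the --format/grep slicing is just one convenient way to read the output):

# count subnets across all regions
gcloud compute networks subnets list --project=k8s-jkns-pr-gce-etcd3 --format='value(name)' | wc -l
# show the SUBNETWORKS quota limit and current usage
gcloud compute project-info describe --project=k8s-jkns-pr-gce-etcd3 | grep -B1 -A1 SUBNETWORKS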

@krzyzacy
Member

Hmm, I take it back; it should not fail every single PR there.

@bowei
Member

bowei commented May 31, 2017

Is there a project with extra subnets? From the list above, it looks like 1/project. Also, I disabled the only CI job that creates its own subnets directly.

@krzyzacy
Member

it's 1 subnet per PR run per region

@fejta
Contributor

fejta commented May 31, 2017

Why are we creating subnets in all regions?

@krzyzacy
Member

@bowei
when trying to clean up some old subnets, I got:

ERROR: (gcloud.compute.networks.subnets.delete) Some requests did not succeed:
 - Invalid resource usage: 'Cannot delete auto subnetwork from an auto subnet mode network.'.

any idea?
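
For reference, auto-created subnets can only be removed together with their parent auto-mode network, so the cleanup has to delete the network itself. A sketch, using a network name from the listing above:

# deleting the parent network also removes its auto-mode subnets
# (this fails while firewall rules or instances still reference the network)
gcloud compute networks delete e2e-gce-agent-pr-102-0 --project=k8s-jkns-pr-gce-etcd3 --quiet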

@krzyzacy
Member

and the ones not affected by the subnet issue are also failing -
http://prow.k8s.io/log?pod=pull-kubernetes-e2e-gce-etcd3-33199

seems like some node timeout issue?

/assign @pwittrock
as you are the build-cop now, maybe you can help here :-)

@cblecker
Member

cblecker commented May 31, 2017

I'm going to guess that the new us-west1-c zone that went live yesterday may have pushed this over the edge.

@krzyzacy
Member

Requested a quota bump. I'm more worried that the runs not hit by the subnet issue also failed; that seems like a separate issue.

@j3ffml
Contributor

j3ffml commented May 31, 2017

Why are we creating subnets in all regions?

I also would like to know this. Why does this project need so many subnets?

@cblecker
Member

So it looks like it's auto-creating subnetworks in all regions because of the --mode=auto flag here:
https://github.com/kubernetes/kubernetes/blob/master/cluster/gce/util.sh#L717

If we changed this to custom, would that break anything? It looks like there are also functions in the script to create subnetworks manually.

Either way, a quota increase would also fix this, it seems.
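
For comparison, the custom-mode alternative would look roughly like this (a sketch with hypothetical names; current gcloud spells the flag --subnet-mode, while util.sh uses the older --mode spelling):

# custom mode: no subnets are auto-created in every region
gcloud compute networks create my-e2e-net --subnet-mode=custom
# create only the subnet the test cluster actually needs
gcloud compute networks subnets create my-e2e-subnet \
  --network=my-e2e-net --region=us-central1 --range=10.128.0.0/20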

@krzyzacy
Member

quota is bumped, let's see if it fixes things

@cblecker
Member

kicking off a test to try it: #46711

@krzyzacy
Member

/assign @MrHohn

Some network resources are still leaking; I'm manually running the janitor to clean them up. PRs seem to be piling up, though.
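
A rough sketch of the kind of manual cleanup involved (the real janitor lives in kubernetes/test-infra; the filter patterns here are assumptions based on the naming above):

# orphaned firewall rules go first, since they keep networks and subnets in use
for fw in $(gcloud compute firewall-rules list --project=k8s-jkns-pr-gce-etcd3 \
    --filter='network ~ e2e-gce-agent-pr' --format='value(name)'); do
  gcloud compute firewall-rules delete "$fw" --project=k8s-jkns-pr-gce-etcd3 --quiet
done
# then the leaked networks, which take their auto-mode subnets with them
for net in $(gcloud compute networks list --project=k8s-jkns-pr-gce-etcd3 \
    --filter='name ~ ^e2e-gce-agent-pr' --format='value(name)'); do
  gcloud compute networks delete "$net" --project=k8s-jkns-pr-gce-etcd3 --quiet
done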


lavalamp added the priority/critical-urgent and kind/failing-test labels May 31, 2017
@krzyzacy
Member

kubernetes/test-infra#2902 should fix it, but we need to wait for a couple of runs.

@krzyzacy
Member

krzyzacy commented Jun 1, 2017

https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/pr-logs/pull/46648/pull-kubernetes-e2e-gce-etcd3/33358

Things are starting to pass; now we wait for the backlog to drain.

@MrHohn
Member

MrHohn commented Jun 1, 2017

Also, to clarify what was going on:

  1. Gather test suite metrics for e2e-gce-etcd3 test-infra#2874 mistakenly overwrote GINKGO_TEST_ARGS for k8s-jkns-pr-gce-etcd3, which should have been --ginkgo.skip=\[Slow\]|\[Serial\]|\[Disruptive\]|\[Flaky\]|\[Feature:.+\] (see the sketch after this list).
  2. As a consequence, pull-kubernetes-e2e-gce-etcd3 ran all the e2e tests, including Slow, Serial, and Disruptive ones, without skipping, so all the worst cases could happen: forwarding rule, target pool, health check, and firewall resources could be created and orphaned in every PR job.
  3. Because of the orphaned GCE resources, especially the firewall rules, Jenkins failed to delete the corresponding subnets, which were still in use by those orphaned firewall rules. This gradually ate up all the quota in the k8s-jkns-pr-gce-etcd3 project.
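
A minimal sketch of the intended setting, for reference (the skip regex is the one quoted in item 1; exactly where the variable is set lives in the kubernetes/test-infra job config):

# skip Slow/Serial/Disruptive/Flaky/Feature-gated tests in the PR job
GINKGO_TEST_ARGS='--ginkgo.skip=\[Slow\]|\[Serial\]|\[Disruptive\]|\[Flaky\]|\[Feature:.+\]'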

mikedanese reopened this Jun 1, 2017
@krzyzacy
Member

krzyzacy commented Jun 1, 2017

Whoops, I was running the cleanup script from a different branch... Now I'd expect the old subnets to all be gone from the project, and subsequent runs should be fine.

@krzyzacy
Member

krzyzacy commented Jun 1, 2017

seems stable now.
/close

@k8s-ci-robot
Contributor

@krzyzacy: you can't close an issue unless you authored it or you are assigned to it.

In response to this:

seems stable now.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@krzyzacy
Member

krzyzacy commented Jun 1, 2017

/assign

@krzyzacy
Member

krzyzacy commented Jun 1, 2017

/close
