Many services in a single namespace leads to assorted problems. #8498

Closed
@mattmoor

Description

/area API
/area autoscale

What version of Knative?

HEAD

Description of the problem.

I wanted to push on our limits a bit, and so I wrote the very innovative (patent pending 🤣 ) script below. I plotted the latency between creationTimestamp and status.conditions[Ready].lastTransitionTime here.

A few observations in this context:

  1. Revision creation latency creeps up over time.
  2. (anecdotally by curling) the cold start latency of ALL ksvcs creeps up over time (from 2-3s to 10-12s, with all time being spent between "Container create" and "Container start")
  3. After 1198 services were deployed, I started seeing deployments fail with:
     RevisionFailed: Revision "foo-1200-sjwyy-1" failed with message: Container failed with: standard_init_linux.go:211: exec user process caused "argument list too long"
  4. When the above happened, we stopped being able to cold start new services (they crash loop with the same message)!

Here's where it gets interesting... On a whim, I tried picking back up in a second namespace, and things work! Not only do they work, but cold start latency for the new services is back down!
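When comparing namespaces like this, it can help to see how many ksvcs each one holds. The following is a rough sketch, not something from my original run: `count_by_namespace` is a hypothetical helper name, and it only tallies namespace names fed to it on stdin.

```shell
#!/bin/bash
# count_by_namespace is a hypothetical helper (not part of kn or kubectl).
# It reads one namespace name per line on stdin and prints "namespace,count"
# rows sorted by namespace name.
count_by_namespace() {
  awk 'NF { n[$1]++ } END { for (ns in n) print ns "," n[ns] }' | sort
}
```

Assuming kubectl is pointed at the cluster, it could be fed with:

kubectl get ksvc --all-namespaces -ojsonpath='{range .items[*]}{.metadata.namespace}{"\n"}{end}' | count_by_namespace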

Steps to Reproduce the Problem

I needed a GKE cluster with at least 10 nodes (post-master resize) to tolerate the number of services this creates. I was playing with this in the context of mink, but there's no reason that would affect what I'm seeing.

#!/bin/bash -e

# Create 1500 services in the current namespace, one every 10 seconds.
for i in $(seq 1 1500); do
  kn service create "foo-$i" --image=gcr.io/knative-samples/autoscale-go:0.1
  sleep 10
done

I gathered the latencies as CSV with:

kubectl get ksvc -ojsonpath='{range .items[*]}{.metadata.name},{.metadata.creationTimestamp},{.status.conditions[?(@.type=="Ready")].lastTransitionTime}{"\n"}{end}' | pbcopy
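To turn those CSV rows into the plotted number (seconds from creationTimestamp to the Ready lastTransitionTime), something like the sketch below should work. It is an illustration rather than the exact tooling used for the plot: `latency_seconds` and `csv_to_latencies` are hypothetical helper names, and it assumes GNU date (the `-d` flag), which on macOS usually means installing coreutils and substituting `gdate`.

```shell
#!/bin/bash
# Assumption: GNU date is available (date -d). Helper names are hypothetical.

# Print the whole seconds elapsed between two RFC 3339 timestamps.
latency_seconds() {
  local created ready
  created=$(date -u -d "$1" +%s)
  ready=$(date -u -d "$2" +%s)
  echo $(( ready - created ))
}

# Read "name,creationTimestamp,lastTransitionTime" rows on stdin and
# print "name,latency_seconds" rows.
csv_to_latencies() {
  local name created ready
  while IFS=, read -r name created ready; do
    # Skip services that have no Ready transition time recorded.
    [ -n "$ready" ] || continue
    echo "$name,$(latency_seconds "$created" "$ready")"
  done
}
```

With the CSV on the clipboard, `pbpaste | csv_to_latencies` would print one latency per service.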

Labels: area/API (API objects and controllers), kind/bug (Categorizes issue or PR as related to a bug.)