Ansible operator-sdk v1.5.0 with updated kube-rbac-proxy:v0.8.0 fails to run with permission denied #4684
Comments
FWIW, the default Go operator project for v1.5.0 uses kube-rbac-proxy v0.8.0, and all operator types pass CI. Does this happen with a newly-initialized operator, or with an operator that is trying to upgrade? For example, with a pod-level securityContext added:
# config/manager/manager.yaml
spec:
  selector:
    matchLabels:
      control-plane: controller-manager
  replicas: 1
  template:
    metadata:
      labels:
        control-plane: controller-manager
    spec:
+     securityContext:
+       runAsNonRoot: true
      containers:
/triage support |
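For context, that pod-level securityContext sits directly under spec.template.spec, next to the containers list; a minimal sketch of how the resulting section of config/manager/manager.yaml could look (container details trimmed, image name as used elsewhere in this project):
# config/manager/manager.yaml (sketch of where the setting lands)
    spec:
      securityContext:
        runAsNonRoot: true
      containers:
      - name: manager
        image: controller:latest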
Hi @estroz, this is an ansible operator which I'm upgrading from an earlier operator-sdk release. I have tested what you suggested, adding that specific security context, and it didn't work:
Then deploy the changes:
$ make deploy
cd config/manager && /home/slopez/bin/kustomize edit set image controller=quay.io/3scale/prometheus-exporter-operator:v0.3.0-alpha.11
/home/slopez/bin/kustomize build config/default | kubectl apply -f -
namespace/prometheus-exporter-operator-system unchanged
customresourcedefinition.apiextensions.k8s.io/prometheusexporters.monitoring.3scale.net unchanged
serviceaccount/prometheus-exporter-operator-controller-manager unchanged
role.rbac.authorization.k8s.io/prometheus-exporter-operator-leader-election-role unchanged
role.rbac.authorization.k8s.io/prometheus-exporter-operator-manager-role unchanged
clusterrole.rbac.authorization.k8s.io/prometheus-exporter-operator-metrics-reader unchanged
clusterrole.rbac.authorization.k8s.io/prometheus-exporter-operator-proxy-role unchanged
rolebinding.rbac.authorization.k8s.io/prometheus-exporter-operator-leader-election-rolebinding unchanged
rolebinding.rbac.authorization.k8s.io/prometheus-exporter-operator-manager-rolebinding unchanged
clusterrolebinding.rbac.authorization.k8s.io/prometheus-exporter-operator-proxy-rolebinding unchanged
service/prometheus-exporter-operator-controller-manager-metrics-service unchanged
deployment.apps/prometheus-exporter-operator-controller-manager configured
servicemonitor.monitoring.coreos.com/prometheus-exporter-operator-controller-manager-metrics-monitor unchanged
Regarding what I said about … |
@estroz Quick update with more details, we have tested the … |
There seems to be a problem with 0.8.0 in openshift 4.6 and 4.7 where the rbac-proxy fails to start. Reverting to 0.5.0 while this issue is analyzed. Check operator-framework/operator-sdk#4684 for more details. This reverts commit 6c4c763.
It probably will be solved with #4655 (master). |
@camilamacedo86 Outputs from my previous comment #4684 (comment) refer to a … In addition, I have just tested those changes on the ansible operator (using its own serviceaccount), without success, same error:
$ git diff
diff --git a/config/default/manager_auth_proxy_patch.yaml b/config/default/manager_auth_proxy_patch.yaml
index 58dade9..92e80ff 100644
--- a/config/default/manager_auth_proxy_patch.yaml
+++ b/config/default/manager_auth_proxy_patch.yaml
@@ -10,7 +10,7 @@ spec:
spec:
containers:
- name: kube-rbac-proxy
- image: gcr.io/kubebuilder/kube-rbac-proxy:v0.5.0
+ image: gcr.io/kubebuilder/kube-rbac-proxy:v0.8.0
args:
- "--secure-listen-address=0.0.0.0:8443"
- "--upstream=http://127.0.0.1:8080/"
diff --git a/config/manager/manager.yaml b/config/manager/manager.yaml
index fb3d02a..36b82c3 100644
--- a/config/manager/manager.yaml
+++ b/config/manager/manager.yaml
@@ -23,6 +23,8 @@ spec:
control-plane: controller-manager
spec:
serviceAccountName: controller-manager
+ securityContext:
+ runAsNonRoot: true
containers:
- name: manager
args:
@@ -36,6 +38,8 @@ spec:
fieldRef:
fieldPath: metadata.annotations['olm.targetNamespaces']
image: controller:latest
+ securityContext:
+ allowPrivilegeEscalation: false
livenessProbe:
httpGet:
path: /healthz
$ make deploy
cd config/manager && /home/slopez/bin/kustomize edit set image controller=quay.io/3scale/prometheus-exporter-operator:v0.3.0-alpha.11
/home/slopez/bin/kustomize build config/default | kubectl apply -f -
namespace/prometheus-exporter-operator-system unchanged
customresourcedefinition.apiextensions.k8s.io/prometheusexporters.monitoring.3scale.net unchanged
serviceaccount/prometheus-exporter-operator-controller-manager unchanged
role.rbac.authorization.k8s.io/prometheus-exporter-operator-leader-election-role unchanged
role.rbac.authorization.k8s.io/prometheus-exporter-operator-manager-role unchanged
clusterrole.rbac.authorization.k8s.io/prometheus-exporter-operator-metrics-reader unchanged
clusterrole.rbac.authorization.k8s.io/prometheus-exporter-operator-proxy-role unchanged
rolebinding.rbac.authorization.k8s.io/prometheus-exporter-operator-leader-election-rolebinding unchanged
rolebinding.rbac.authorization.k8s.io/prometheus-exporter-operator-manager-rolebinding unchanged
clusterrolebinding.rbac.authorization.k8s.io/prometheus-exporter-operator-proxy-rolebinding unchanged
service/prometheus-exporter-operator-controller-manager-metrics-service unchanged
deployment.apps/prometheus-exporter-operator-controller-manager configured
servicemonitor.monitoring.coreos.com/prometheus-exporter-operator-controller-manager-metrics-monitor unchanged
$ oc get pods -n prometheus-exporter-operator-system
NAME READY STATUS RESTARTS AGE
prometheus-exporter-operator-controller-manager-5d8d8f69bflzl5q 2/2 Running 0 5h6m # the one with v0.5.0
prometheus-exporter-operator-controller-manager-68588876878thxk 1/2 CreateContainerError 0 58s # new one
$ oc describe pod prometheus-exporter-operator-controller-manager-68588876878thxk -n prometheus-exporter-operator-system
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 3m38s default-scheduler Successfully assigned prometheus-exporter-operator-system/prometheus-exporter-operator-controller-manager-68588876878thxk to ip-10-96-11-248.ec2.internal
Normal Started 3m36s kubelet Started container manager
Normal AddedInterface 3m36s multus Add eth0 [10.128.2.31/23]
Warning Failed 3m36s kubelet Error: container create failed: time="2021-03-22T15:12:47Z" level=error msg="container_linux.go:366: starting container process caused: chdir to cwd (\"/home/nonroot\") set in config.json failed: permission denied"
Normal Pulled 3m36s kubelet Container image "quay.io/3scale/prometheus-exporter-operator:v0.3.0-alpha.11" already present on machine
Normal Created 3m36s kubelet Created container manager
Warning Failed 3m35s kubelet Error: container create failed: time="2021-03-22T15:12:48Z" level=error msg="container_linux.go:366: starting container process caused: chdir to cwd (\"/home/nonroot\") set in config.json failed: permission denied"
Warning Failed 3m34s kubelet Error: container create failed: time="2021-03-22T15:12:49Z" level=error msg="container_linux.go:366: starting container process caused: chdir to cwd (\"/home/nonroot\") set in config.json failed: permission denied"
Warning Failed 3m23s kubelet Error: container create failed: time="2021-03-22T15:13:00Z" level=error msg="container_linux.go:366: starting container process caused: chdir to cwd (\"/home/nonroot\") set in config.json failed: permission denied"
Warning Failed 3m9s kubelet Error: container create failed: time="2021-03-22T15:13:14Z" level=error msg="container_linux.go:366: starting container process caused: chdir to cwd (\"/home/nonroot\") set in config.json failed: permission denied"
Warning Failed 2m57s kubelet Error: container create failed: time="2021-03-22T15:13:26Z" level=error msg="container_linux.go:366: starting container process caused: chdir to cwd (\"/home/nonroot\") set in config.json failed: permission denied"
Warning Failed 2m45s kubelet Error: container create failed: time="2021-03-22T15:13:38Z" level=error msg="container_linux.go:366: starting container process caused: chdir to cwd (\"/home/nonroot\") set in config.json failed: permission denied"
Warning Failed 2m31s kubelet Error: container create failed: time="2021-03-22T15:13:52Z" level=error msg="container_linux.go:366: starting container process caused: chdir to cwd (\"/home/nonroot\") set in config.json failed: permission denied"
Warning Failed 2m19s kubelet Error: container create failed: time="2021-03-22T15:14:04Z" level=error msg="container_linux.go:366: starting container process caused: chdir to cwd (\"/home/nonroot\") set in config.json failed: permission denied"
Normal Pulled 111s (x11 over 3m36s) kubelet Container image "gcr.io/kubebuilder/kube-rbac-proxy:v0.8.0" already present on machine
Warning Failed 111s (x2 over 2m6s) kubelet (combined from similar events): Error: container create failed: time="2021-03-22T15:14:32Z" level=error msg="container_linux.go:366: starting container process caused: chdir to cwd (\"/home/nonroot\") set in config.json failed: permission denied" |
@slopezz try setting the user/group in the kube-rbac-proxy container:
# config/default/manager_auth_proxy_patch.yaml
        ports:
        - containerPort: 8443
          name: https
+       securityContext:
+         runAsUser: 65532
+         runAsGroup: 65534
      - name: manager
        args:
        - "--health-probe-bind-address=:8081"
Got this from brancz/kube-rbac-proxy#101 |
I've tried this and see a failure with an "out of range" message on the kube-rbac-proxy container. I'm using CRC for testing, so this may be environment-specific.
|
@andrewazores you may actually need to use the OCP image (registry.redhat.io/openshift4/ose-kube-rbac-proxy) instead of the "upstream" one. Additionally, it looks like that image runs as a different user. |
I have just tried with that alternate image but see the same "out of range" failure I posted above. From what other reading I have done this appears to be due to security constraints applied within my cluster, although I'm unsure if that's CRC-specific or from the general OpenShift version installed. |
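For illustration: on OpenShift, the restricted SCC only admits UIDs from a per-namespace range recorded in annotations like the ones below (the range values here are made up), which is why a hard-coded runAsUser such as 65532 can be rejected as out of range:
# namespace annotations consulted by the restricted SCC (values are hypothetical)
apiVersion: v1
kind: Namespace
metadata:
  name: prometheus-exporter-operator-system
  annotations:
    openshift.io/sa.scc.uid-range: "1000650000/10000"
    openshift.io/sa.scc.supplemental-groups: "1000650000/10000"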
@estroz I have tested the image that you suggested:
$ git diff
diff --git a/config/default/manager_auth_proxy_patch.yaml b/config/default/manager_auth_proxy_patch.yaml
index 58dade9..bc70b8b 100644
--- a/config/default/manager_auth_proxy_patch.yaml
+++ b/config/default/manager_auth_proxy_patch.yaml
@@ -10,7 +10,7 @@ spec:
spec:
containers:
- name: kube-rbac-proxy
- image: gcr.io/kubebuilder/kube-rbac-proxy:v0.5.0
+ image: registry.redhat.io/openshift4/ose-kube-rbac-proxy:v4.7.0
args:
- "--secure-listen-address=0.0.0.0:8443"
- "--upstream=http://127.0.0.1:8080/"
diff --git a/config/manager/manager.yaml b/config/manager/manager.yaml
index fb3d02a..36b82c3 100644
--- a/config/manager/manager.yaml
+++ b/config/manager/manager.yaml
@@ -23,6 +23,8 @@ spec:
control-plane: controller-manager
spec:
serviceAccountName: controller-manager
+ securityContext:
+ runAsNonRoot: true
containers:
- name: manager
args:
@@ -36,6 +38,8 @@ spec:
fieldRef:
fieldPath: metadata.annotations['olm.targetNamespaces']
image: controller:latest
+ securityContext:
+ allowPrivilegeEscalation: false
livenessProbe:
httpGet:
path: /healthz
$ make deploy
cd config/manager && /home/slopez/bin/kustomize edit set image controller=quay.io/3scale/prometheus-exporter-operator:v0.3.0-alpha.11
/home/slopez/bin/kustomize build config/default | kubectl apply -f -
namespace/prometheus-exporter-operator-system created
customresourcedefinition.apiextensions.k8s.io/prometheusexporters.monitoring.3scale.net created
serviceaccount/prometheus-exporter-operator-controller-manager created
role.rbac.authorization.k8s.io/prometheus-exporter-operator-leader-election-role created
role.rbac.authorization.k8s.io/prometheus-exporter-operator-manager-role created
clusterrole.rbac.authorization.k8s.io/prometheus-exporter-operator-metrics-reader created
clusterrole.rbac.authorization.k8s.io/prometheus-exporter-operator-proxy-role created
rolebinding.rbac.authorization.k8s.io/prometheus-exporter-operator-leader-election-rolebinding created
rolebinding.rbac.authorization.k8s.io/prometheus-exporter-operator-manager-rolebinding created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-exporter-operator-proxy-rolebinding created
service/prometheus-exporter-operator-controller-manager-metrics-service created
deployment.apps/prometheus-exporter-operator-controller-manager created
servicemonitor.monitoring.coreos.com/prometheus-exporter-operator-controller-manager-metrics-monitor created
$ oc get pods -n prometheus-exporter-operator-system
NAME READY STATUS RESTARTS AGE
prometheus-exporter-operator-controller-manager-77f956bd4785sgl 2/2 Running 0 91s
The problem is that my operator is intended to run on both vanilla k8s and OpenShift, and this registry (registry.redhat.io) requires authentication to pull images.
|
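One way to handle that split, sketched here as an assumption rather than something proposed in this thread: keep the upstream gcr.io image in the base and swap it only in an OpenShift-specific kustomize overlay using the images transformer. The config/openshift path is hypothetical.
# config/openshift/kustomization.yaml (hypothetical overlay)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../default
# replace only the proxy image; everything else comes from the shared base
images:
- name: gcr.io/kubebuilder/kube-rbac-proxy
  newName: registry.redhat.io/openshift4/ose-kube-rbac-proxy
  newTag: v4.7.0
Running kustomize build config/openshift | kubectl apply -f - (or a dedicated make target) would then deploy the OpenShift variant, while the default make deploy keeps the upstream image.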
This is a workaround for the permission error in issue: operator-framework/operator-sdk#4684 Signed-off-by: Wayne Sun <gsun@redhat.com>
I also hit this error using a helm-based operator. Either of these two images worked; v0.8.0 fails with the nonroot error. Note that the v4.7.0 tag generated an image pull error, while v4.7, which is what the tutorial at [1] specifies, works for me.
The tutorial [1] also says to use … |
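For concreteness, the image reference this comment describes as working would look something like the line below in config/default/manager_auth_proxy_patch.yaml, assuming the same ose-kube-rbac-proxy registry path used earlier in this thread:
# excerpt; tag per the comment above
      containers:
      - name: kube-rbac-proxy
        image: registry.redhat.io/openshift4/ose-kube-rbac-proxy:v4.7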
hey @sqqqrly, I assume you are using SDK … The … |
Thanks for the health probe note. I am hitting that and now I know why. |
Hi @slopezz, the downstream repo is now updated with this tag version. Note that it has mock projects which are tested against OCP; this ensures that it works on OCP as well. The next downstream release will be using its latest version too, so IMHO this can be closed. I will close this one; however, @slopezz, if you face any issue with the next downstream release for 4.8, could you please raise a new issue and add the steps performed so that we are able to reproduce it? Also, it might be a better fit for Bugzilla, since it is a vendor-specific issue and not part of the upstream scope. |
Bug Report
After upgrading an ansible operator to operator-sdk v1.5.0, the operator controller-manager pod never gets running because of an error in the kube-rbac-proxy container (which in operator-sdk v1.5.0 has been upgraded from v0.5.0 to v0.8.0).
I've seen some issues related to this in Go (#4402, kubernetes-sigs/kubebuilder#1978); for example, on Go operators v1.5.0 the scaffolding sets USER 65532:65532 and adds securityContext: allowPrivilegeEscalation: false to both the proxy and manager containers. But on Go operators v1.5.0, where this error does not appear, kube-rbac-proxy:v0.5.0 is still being used.
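As a point of comparison, a Go operator's resulting Deployment carries a container-level securityContext on both containers, roughly like this (a sketch assembled from the pieces quoted above, after kustomize merges config/manager/manager.yaml and config/default/manager_auth_proxy_patch.yaml):
# sketch: container-level securityContext on both containers (Go scaffolding)
      containers:
      - name: kube-rbac-proxy
        image: gcr.io/kubebuilder/kube-rbac-proxy:v0.5.0
        securityContext:
          allowPrivilegeEscalation: false
      - name: manager
        image: controller:latest
        securityContext:
          allowPrivilegeEscalation: false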
What did you do?
Create operator-sdk scaffolding using operator-sdk v1.5.0.
What did you expect to see?
Controller-manager containers running OK.
What did you see instead? Under which circumstances?
Controller-manager kube-rbac-proxy container failing because of this error:
container_linux.go:366: starting container process caused: chdir to cwd ("/home/nonroot") set in config.json failed: permission denied
Environment
Operator type:
/language ansible
Kubernetes cluster type: OpenShift v4.7.0
$ operator-sdk version
$ kubectl version
Possible Solution
I tried to update container securityContext without success, error persisted.
Finally I have solved this error by downgrading kube-rbac-proxy from v0.8.0 to v0.5.0 (actually, the Go operator-sdk v1.5.0 stays with v0.5.0; it seems that only ansible-operator v1.5.0 has upgraded to v0.8.0, introducing the bug).
Additional context
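For reference, the workaround amounts to reverting the image bump in the scaffolded auth-proxy patch; a minimal excerpt of the relevant line (file path as used elsewhere in this issue):
# config/default/manager_auth_proxy_patch.yaml
      containers:
      - name: kube-rbac-proxy
        image: gcr.io/kubebuilder/kube-rbac-proxy:v0.5.0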