[e2e test failure] Simple pod should support exec through kubectl proxy #50466

Closed
ericchiang opened this issue Aug 10, 2017 · 20 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/cli Categorizes an issue or PR as relevant to SIG CLI.

@ericchiang
Contributor

/cc @kubernetes/sig-cli-bugs

This test has started failing in the GKE and GCI-GKE e2e test suites:

https://k8s-testgrid.appspot.com/release-master-blocking#gci-gke
https://k8s-testgrid.appspot.com/release-master-blocking#gke

Example failure

https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-e2e-gke/12920

/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/kubectl/kubectl.go:457
Expected error:
    <exec.CodeExitError>: {
        Err: {
            s: "error running &{/workspace/kubernetes/platforms/linux/amd64/kubectl [kubectl --server=https://35.188.213.188 --kubeconfig=/tmp/gke-kubecfg386178168 --server=http://127.0.0.1:45529 --namespace=e2e-tests-kubectl-nlzj2 exec nginx echo running in container] []  <nil>  W0810 18:25:44.244920    4112 http.go:342] Error reading backend response: unexpected EOF\nerror: error sending request: Post http://127.0.0.1:45529/api/v1/namespaces/e2e-tests-kubectl-nlzj2/pods/nginx/exec?command=echo&command=running&command=in&command=container&container=nginx&container=nginx&stderr=true&stdout=true: unexpected EOF\n [] <nil> 0xc4211d3830 exit status 1 <nil> <nil> true [0xc4203e6828 0xc4203e6840 0xc4203e6858] [0xc4203e6828 0xc4203e6840 0xc4203e6858] [0xc4203e6838 0xc4203e6850] [0x11bf750 0x11bf750] 0xc420f2db60 <nil>}:\nCommand stdout:\n\nstderr:\nW0810 18:25:44.244920    4112 http.go:342] Error reading backend response: unexpected EOF\nerror: error sending request: Post http://127.0.0.1:45529/api/v1/namespaces/e2e-tests-kubectl-nlzj2/pods/nginx/exec?command=echo&command=running&command=in&command=container&container=nginx&container=nginx&stderr=true&stdout=true: unexpected EOF\n\nerror:\nexit status 1\n",
        },
        Code: 1,
    }
    error running &{/workspace/kubernetes/platforms/linux/amd64/kubectl [kubectl --server=https://35.188.213.188 --kubeconfig=/tmp/gke-kubecfg386178168 --server=http://127.0.0.1:45529 --namespace=e2e-tests-kubectl-nlzj2 exec nginx echo running in container] []  <nil>  W0810 18:25:44.244920    4112 http.go:342] Error reading backend response: unexpected EOF
    error: error sending request: Post http://127.0.0.1:45529/api/v1/namespaces/e2e-tests-kubectl-nlzj2/pods/nginx/exec?command=echo&command=running&command=in&command=container&container=nginx&container=nginx&stderr=true&stdout=true: unexpected EOF
     [] <nil> 0xc4211d3830 exit status 1 <nil> <nil> true [0xc4203e6828 0xc4203e6840 0xc4203e6858] [0xc4203e6828 0xc4203e6840 0xc4203e6858] [0xc4203e6838 0xc4203e6850] [0x11bf750 0x11bf750] 0xc420f2db60 <nil>}:
    Command stdout:
    
    stderr:
    W0810 18:25:44.244920    4112 http.go:342] Error reading backend response: unexpected EOF
    error: error sending request: Post http://127.0.0.1:45529/api/v1/namespaces/e2e-tests-kubectl-nlzj2/pods/nginx/exec?command=echo&command=running&command=in&command=container&container=nginx&container=nginx&stderr=true&stdout=true: unexpected EOF
    
    error:
    exit status 1
    
not to have occurred
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/util.go:2066
@k8s-ci-robot k8s-ci-robot added sig/cli Categorizes an issue or PR as relevant to SIG CLI. kind/bug Categorizes issue or PR as related to a bug. labels Aug 10, 2017
@ericchiang ericchiang added this to the v1.8 milestone Aug 10, 2017
@mengqiy
Member

mengqiy commented Aug 11, 2017

Not sure if this is related to the recent change in #49534.

@pwittrock
Member

@bowei Looks like this is only failing on GKE. @liggitt suggested this might be due to the way the master / node network is set up on GKE.

@pwittrock
Member

cc
@kubernetes/sig-network-bugs
@kubernetes/sig-cli-bugs

@k8s-ci-robot k8s-ci-robot added sig/network Categorizes an issue or PR as relevant to SIG Network. kind/bug Categorizes issue or PR as related to a bug. labels Aug 15, 2017
@pwittrock
Member

/priority critical-urgent

@k8s-ci-robot k8s-ci-robot added the priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. label Aug 15, 2017
@jagosan
Contributor

jagosan commented Aug 15, 2017

/assign @bowei

@jagosan
Contributor

jagosan commented Aug 15, 2017

@bowei can you please take a look at @pwittrock's comment and update / reassign if appropriate? Thanks!

@bowei
Member

bowei commented Aug 15, 2017

I'll take a look

@bowei
Member

bowei commented Aug 15, 2017

From a cursory glance: if networking were at fault, it would cause all of the tests that traverse the SSH tunnel path to fail with a timeout, which I'm not seeing here.

@bowei
Member

bowei commented Aug 15, 2017

This test does not seem to have ever passed, at least not in the testgrid history.

@pwittrock
Member

@liggitt

Thanks for your suggestion of looking into the networking on GKE. My understanding is that you didn't think the failure we are seeing was related to kubectl itself.

The kubectl test appears to be the only one breaking. Should I expect to see other test failures if the failure is not related to the request sent by kubectl, but instead how it is processed by the apiserver?

@liggitt
Member

liggitt commented Aug 15, 2017

> Should I expect to see other test failures if the failure is not related to the request sent by kubectl, but instead how it is processed by the apiserver?

I'm not sure we test any other websocket requests in e2e, so maybe not?

@jagosan
Contributor

jagosan commented Aug 15, 2017

/unassign @bowei
/assign @mengqiy
/remove-sig network

Thanks for looking @bowei. @mengqiy - any more insight to add here?

@k8s-ci-robot k8s-ci-robot assigned mengqiy and unassigned bowei Aug 15, 2017
@k8s-ci-robot k8s-ci-robot removed the sig/network Categorizes an issue or PR as relevant to SIG Network. label Aug 15, 2017
@pwittrock
Member

pwittrock commented Aug 15, 2017

@liggitt

Should we be adding those tests, or is it not something that makes sense to do?

@mengqiy
Member

mengqiy commented Aug 16, 2017

/assign @apelisse
/unassign
@apelisse is taking over this issue and has a potential fix.

@k8s-ci-robot k8s-ci-robot assigned apelisse and unassigned mengqiy Aug 16, 2017
@smarterclayton
Contributor

The core bug here is that GKE uses a different transport wrapper than normal GCE installs, and unwrapping the transport is required to upgrade passthrough proxies.

@apelisse
Member

apelisse commented Aug 17, 2017

And I think that's what I've addressed in #50775 (making the GKE transport unwrappable), but as we can see, it fails for another reason. Any idea?

@liggitt
Member

liggitt commented Aug 17, 2017

#50775 is necessary but not sufficient.

copied from #50775 (comment)

the issue is with

```go
// dial dials the backend at req.URL and writes req to it.
func dial(req *http.Request, transport http.RoundTripper) (net.Conn, error) {
	conn, err := DialURL(req.URL, transport)
	if err != nil {
		return nil, fmt.Errorf("error dialing backend: %v", err)
	}
	if err = req.Write(conn); err != nil {
		conn.Close()
		return nil, fmt.Errorf("error sending request: %v", err)
	}
	return conn, err
}
```
that does not actually use the transport to handle the request, only to dial. That means that TLS-authentication methods like client certs work, and header authentication methods like bearer tokens do not.

You can recreate this by running `kubectl proxy` with bearer token authentication instead of cert auth, then trying `kubectl exec ... --server=http://localhost:8001`.

@smarterclayton
Contributor

smarterclayton commented Aug 17, 2017 via email

apelisse pushed a commit to apelisse/kubernetes that referenced this issue Aug 21, 2017
As reported in kubernetes#50466,
this test doesn't work in GKE because the transport layer doesn't work
with dialing.

As the feature that is broken in GKE is new and didn't work before, it
is safe to just ignore the test and consider the feature as "still not
working" in GKE.
@apelisse apelisse added the sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. label Aug 21, 2017
@pwittrock
Member

/assign @caesarxuchao Are you a good person to assign this to, or is there someone better?

apelisse pushed a commit to apelisse/kubernetes that referenced this issue Aug 22, 2017
As reported in kubernetes#50466,
this test doesn't work in GKE because the transport layer doesn't work
with dialing.

As the feature that is broken in GKE is new and didn't work before, it
is safe to just ignore the test and consider the feature as "still not
working" in GKE.
@jagosan
Contributor

jagosan commented Aug 22, 2017

/assign @cheftako

k8s-github-robot pushed a commit that referenced this issue Aug 24, 2017
Automatic merge from submit-queue

Skip "Simple pod should support exec through kubectl proxy" test

As reported in #50466,
this test doesn't work in GKE because it uses a bearer token and the feature only works with client certs.

As the feature that is broken in GKE is new and didn't work before, it
is safe to just ignore the test and consider the feature as "still not
working" in GKE.

**What this PR does / why we need it**: Fixes the broken test in https://k8s-testgrid.appspot.com/release-master-blocking#gke

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: works-around #50466

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```
hh pushed a commit to ii/kubernetes that referenced this issue Aug 30, 2017
…_wrap

Automatic merge from submit-queue (batch tested with PRs 50775, 51397, 51168, 51465, 51536)

Allow bearer requests to be proxied by kubectl proxy

Use a fake transport to capture changes to the request and then surface
them back to the end user.

Fixes kubernetes#50466

@liggitt no tests yet, but works locally

10 participants