Avoid redundant copying of tars during kube-up for gce if the same file already exists #46792

Merged

Conversation

ianchakeres
Contributor

@ianchakeres ianchakeres commented Jun 1, 2017

What this PR does / why we need it:

Whenever I execute cluster/kube-up.sh, it copies my tar files to Google Cloud Storage even if the files haven't changed. This PR checks whether identical files already exist there and skips the upload if they do. These files are large and can take a long time to upload.
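
The skip decision described here can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: `tar_needs_upload` and `report_staging_action` are hypothetical names, and the two md5 values are assumed to have been computed already (for example with `gsutil hash`).

```shell
# Hypothetical helper, not from the PR: succeeds (exit 0) when an upload is needed.
#   $1 = md5 of the local tar
#   $2 = md5 of the staged copy (empty when no staged copy was found)
tar_needs_upload() {
  if [ -n "$2" ] && [ "$1" = "$2" ]; then
    return 1  # identical content is already staged, so skip the copy
  fi
  return 0    # nothing staged yet, or the hashes differ
}

# Hypothetical wrapper: prints what kube-up would do for one tar.
#   $1 = path to the local tar, $2 = local md5, $3 = remote md5
report_staging_action() {
  if tar_needs_upload "$2" "$3"; then
    echo "+++ $(basename "$1") needs to be uploaded"
  else
    echo "+++ $(basename "$1") uploaded earlier, cloud and local file md5 match (md5 = $2)"
  fi
}
```

An empty remote hash is treated as "upload needed", so a missing object or a failed hash lookup falls back to the original behaviour of always copying.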

Which issue this PR fixes: fixes #46791

Special notes for your reviewer:

Here is the new output:

cluster/kube-up.sh
... Starting cluster in us-central1-b using provider gce
... calling verify-prereqs
... calling verify-kube-binaries
... calling kube-up
Project: PROJECT
Zone: us-central1-b
+++ Staging server tars to Google Storage: gs://kubernetes-staging-PROJECT/kubernetes-devel
+++ kubernetes-server-linux-amd64.tar.gz uploaded earlier, cloud and local file md5 match (md5 = 3a095kcf27267a71fe58f91f89fab1bc)

Release note:
cluster/kube-up.sh on gce now avoids redundant copying of kubernetes tars if the local and cloud files' md5 hashes match

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jun 1, 2017
@k8s-ci-robot
Contributor

Hi @ianchakeres. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with @k8s-bot ok to test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jun 1, 2017
@ianchakeres
Contributor Author

@pwittrock - please take a quick peek at this PR and let me know if you recommend any changes.

@k8s-github-robot k8s-github-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. release-note-label-needed labels Jun 1, 2017
@ianchakeres
Contributor Author

/sig cluster-lifecycle

@k8s-ci-robot k8s-ci-robot added the sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. label Jun 3, 2017
@ianchakeres
Contributor Author

@k8s-bot ok to test

@k8s-ci-robot k8s-ci-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jun 5, 2017
@k8s-ci-robot
Contributor

@ianchakeres: you can't request testing unless you are a kubernetes member.

In response to this:

@k8s-bot ok to test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@pwittrock
Member

@k8s-bot ok to test

@pwittrock
Member

@ianchakeres Please add a release note in the form shown in the template:

e.g.
Look at the markdown and copy the 3x-tick tags:

cluster/kube-up.sh now reuses tars it has already uploaded if the hash matches the local copy

if [[ -n ${remote_tar_hash} ]]; then
local -r local_tar_hash=$(gsutil hash -h -m ${tar} 2>/dev/null | grep "Hash (md5):" | awk -F ':' '{print $2}')
if [[ "${remote_tar_hash}" == "${local_tar_hash}" ]]; then
echo "+++ ${basename_tar} uploaded earlier and hash matches"
Member

Maybe output the hash value as well

Contributor Author

Sounds good. Working on an update now.

#if it matches, then don't bother uploading it again
local -r remote_tar_hash=$(gsutil hash -h -m ${staging_path}/${basename_tar} 2>/dev/null | grep "Hash (md5):" | awk -F ':' '{print $2}')
if [[ -n ${remote_tar_hash} ]]; then
local -r local_tar_hash=$(gsutil hash -h -m ${tar} 2>/dev/null | grep "Hash (md5):" | awk -F ':' '{print $2}')
Member

this is fine, but would be a bit cleaner if pulled into a function to return the hash for a given name

Contributor Author

Ok. I'll update the next revision to use a separate function for calling gsutil to get the hash and parse the output.
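
One possible shape for such a helper is sketched below. The names `parse_md5` and `get_tar_md5` are hypothetical and not taken from the PR. The sketch assumes, as the `grep` in the snippet under review does, that `gsutil hash -h -m` prints a line containing `Hash (md5):` followed by the hex digest. Unlike that snippet, it also strips whitespace around the digest with `tr`.

```shell
# Hypothetical helper: read `gsutil hash -h -m` output on stdin and print only
# the md5 digest. Prints nothing when no "Hash (md5):" line is present.
parse_md5() {
  grep "Hash (md5):" | awk -F ':' '{print $2}' | tr -d '[:space:]'
}

# Hypothetical helper: print the md5 of a tar given its location.
#   $1 = local path or gs:// URL of the tar
# gsutil errors (for example, a missing object) are silenced, which leaves the
# output empty so callers can treat "no hash" as "not uploaded yet".
get_tar_md5() {
  gsutil hash -h -m "$1" 2>/dev/null | parse_md5
}
```

With a helper like this, the remote and local lookups become two calls to the same function, for example `get_tar_md5 "${staging_path}/${basename_tar}"` and `get_tar_md5 "${tar}"`.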


#check whether this tar alread exists and has the same hash
#if it matches, then don't bother uploading it again
local -r remote_tar_hash=$(gsutil hash -h -m ${staging_path}/${basename_tar} 2>/dev/null | grep "Hash (md5):" | awk -F ':' '{print $2}')
Member

How does this work against a remote tar? I am not familiar with gsutil, but its description says "The hash command calculates hashes on a local file." Is this working on a remote file or a local file?

Contributor Author

In this particular case the hash is operating on the remote file, and I tested that it works.

It looks like the documentation you referenced is out of date; support for hashing cloud files was added in gsutil v4.21. The related tracking issue is GoogleCloudPlatform/gsutil#369.
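
Because remote hashing depends on the gsutil version, a script could guard the check along these lines. This is only a sketch and is not part of the PR: `version_at_least` and `gsutil_version` are hypothetical helpers, the 4.21 threshold is taken from the comment above, and the parsing assumes that `gsutil version` prints a line of the form `gsutil version: X.Y`.

```shell
# Hypothetical helper: succeed if version $1 (MAJOR.MINOR) is >= version $2.
version_at_least() {
  awk -v have="$1" -v want="$2" 'BEGIN {
    split(have, h, "."); split(want, w, ".")
    # Compare major numbers first, then minor numbers, numerically.
    if (h[1] + 0 != w[1] + 0) exit !(h[1] + 0 > w[1] + 0)
    exit !(h[2] + 0 >= w[2] + 0)
  }'
}

# Hypothetical helper: print the installed gsutil version, e.g. "4.25".
# Assumes a "gsutil version: X.Y" line; prints nothing if gsutil is missing.
gsutil_version() {
  gsutil version 2>/dev/null | awk '/^gsutil version:/ {print $3; exit}'
}
```

A caller might wrap the hash comparison in `if version_at_least "$(gsutil_version)" 4.21; then ... fi` and fall back to always uploading otherwise.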

@pwittrock pwittrock added this to the v1.8 milestone Jun 13, 2017
@pwittrock
Member

/assign

@k8s-github-robot k8s-github-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-label-needed labels Jun 13, 2017
@k8s-github-robot k8s-github-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jun 13, 2017
@ianchakeres
Contributor Author

@k8s-bot pull-kubernetes-e2e-gce-etcd3 test this

@pwittrock
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 14, 2017
@pwittrock
Member

@vishh Would you take a look at this since it looks like you can approve?

@ianchakeres
Contributor Author

@k8s-bot pull-kubernetes-e2e-gce-etcd3 test this

@ianchakeres
Contributor Author

Yesterday I encountered some test flakes in e2e-gce-etcd3 on this PR (#46792). I think they relate to two reported flaky-test issues, #43520 and #47446.

From my investigation the failures mention timeouts.

Here are the failed test runs' outputs:
https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/pr-logs/pull/46792/pull-kubernetes-e2e-gce-etcd3/36002/
https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/pr-logs/pull/46792/pull-kubernetes-e2e-gce-etcd3/35975/
https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/pr-logs/pull/46792/pull-kubernetes-e2e-gce-etcd3/35842/

@ianchakeres
Contributor Author

@vishh can you please take a quick look and approve this PR? The change is about 25 lines, and half of them are clarifying comments.

If you have any questions or comments, just let me know.

@vishh
Contributor

vishh commented Jun 17, 2017

/lgtm
/approve

@k8s-github-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ianchakeres, pwittrock, vishh

Associated issue: 46791

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 17, 2017
# Copy a release tar and its accompanying hash.
function copy-to-staging() {
local -r staging_path=$1
local -r gs_url=$2
local -r tar=$3
local -r hash=$4
local -r basename_tar=$(basename ${tar})

#check whether this tar alread exists and has the same hash
Contributor

typo: already

@ianchakeres
Contributor Author

/test pull-kubernetes-e2e-gce-etcd3

@k8s-github-robot

Automatic merge from submit-queue (batch tested with PRs 47403, 46646, 46906, 46527, 46792)

@k8s-github-robot k8s-github-robot merged commit cdc9770 into kubernetes:master Jun 23, 2017
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

Avoid copying tars from local machine to gce during kube-up if the same tars already exist
7 participants