Copy v3 data dir when performing backup #3860

Merged
merged 4 commits into from
Apr 26, 2017

sdodson
Member

@sdodson sdodson commented Apr 5, 2017

@sdodson sdodson requested a review from ingvagabund April 5, 2017 19:45
@@ -13,6 +12,8 @@
role: etcd
local_facts: {}
when: "'etcd' not in openshift"
- set_fact:
timestamp: "{{ lookup('pipe', 'date +%Y%m%d%H%M%S') }}"
Member Author

@sdodson sdodson Apr 5, 2017

I did this because, without it, the lookup was re-evaluated each time {{ timestamp }} was used below. That meant the backup location it reported usually didn't exist unless the whole set of tasks ran within the same second. On my test VM, at least, that never happened, though maybe that's only because I've added additional backup steps.
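
To illustrate the behaviour described in this comment, here is a minimal sketch (not taken from this PR). It assumes a throwaway play on localhost; the names `lazy_timestamp` and `pinned_timestamp` are hypothetical. A templated variable is rendered again on every reference, while `set_fact` stores the rendered value once.

```yaml
- hosts: localhost
  gather_facts: false
  vars:
    # Lazy: this template is rendered again each time the variable is referenced.
    lazy_timestamp: "{{ lookup('pipe', 'date +%Y%m%d%H%M%S') }}"
  tasks:
    - name: Pin the timestamp once so later tasks see the same value
      set_fact:
        pinned_timestamp: "{{ lookup('pipe', 'date +%Y%m%d%H%M%S') }}"

    - name: Let the clock advance
      pause:
        seconds: 2

    - name: Compare the two values
      debug:
        msg: "lazy={{ lazy_timestamp }} pinned={{ pinned_timestamp }}"
```

In the final task, the lazy value should reflect the time at which the debug task runs, while the pinned value keeps the time at which `set_fact` ran.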

path: "{{ openshift.etcd.etcd_data_dir }}/member/snap/db"
register: v3_db

- name: Copy etcd v3 data store
Member

It is mentioned in [1] as well: "Recovering a cluster first needs a snapshot of the keyspace from an etcd member. A snapshot may either be taken from a live member with the etcdctl snapshot save command or by copying the member/snap/db file from an etcd data directory."

[1] https://github.com/coreos/etcd/blob/master/Documentation/op-guide/recovery.md#snapshotting-the-keyspace
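
For comparison with the file copy used in this PR, the other approach the quoted documentation mentions (a snapshot from a live member) might look roughly like the sketch below. The inventory group `etcd`, the variables `etcd_data_dir` and `etcd_snapshot_file`, and their values are assumptions for illustration. Endpoint and TLS flags are omitted and would likely be needed on a secured cluster.

```yaml
- hosts: etcd            # hypothetical inventory group
  vars:
    etcd_data_dir: /var/lib/etcd                    # assumed data directory
    etcd_snapshot_file: /var/lib/etcd/snapshot.db   # hypothetical destination
  tasks:
    - name: Check whether a v3 data store exists
      stat:
        path: "{{ etcd_data_dir }}/member/snap/db"
      register: v3_db

    - name: Save a snapshot of the keyspace from the live member
      command: etcdctl snapshot save {{ etcd_snapshot_file }}
      environment:
        ETCDCTL_API: "3"   # select the v3 API in etcdctl 3.x
      when: v3_db.stat.exists
```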

@ingvagabund
Member

aos-ci-test

@openshift-bot

success: aos-ci-jenkins/OS_unit_tests for 483e58d (logs)

@openshift-bot

error: aos-ci-jenkins/OS_3.6_containerized for 483e58d (logs)

@openshift-bot

success: "aos-ci-jenkins/OS_3.5_NOT_containerized, aos-ci-jenkins/OS_3.5_NOT_containerized_e2e_tests" for 483e58d (logs)

@openshift-bot

success: "aos-ci-jenkins/OS_3.5_containerized, aos-ci-jenkins/OS_3.5_containerized_e2e_tests" for 483e58d (logs)

@openshift-bot

success: "aos-ci-jenkins/OS_3.6_NOT_containerized, aos-ci-jenkins/OS_3.6_NOT_containerized_e2e_tests" for 483e58d (logs)

Fixes

TASK [Copy etcd v3 data store]
*************************************************
fatal: [host.redhat.com]: FAILED! => {
    "changed": true,
    "cmd": [
        "cp",
        "-a",
        "/var/lib/etcd//member/snap",
        "/var/lib/origin/etcd-backup-pre-upgrade-20170407055413/member/"
    ],
    "delta": "0:00:00.003152",
    "end": "2017-04-07 01:54:17.584685",
    "failed": true,
    "rc": 1,
    "start": "2017-04-07 01:54:17.581533",
    "warnings": []
}

STDERR:

cp: cannot create directory ‘/var/lib/origin/etcd-backup-pre-upgrade-20170407055413/member/’: No such file or directory
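
The failure above comes from `cp -a` being given a destination whose `member/` directory does not exist yet. One way to avoid it, as a sketch only, is to create that directory before copying. The group name `etcd` and the variables `etcd_data_dir` and `etcd_backup_dir` are hypothetical and are not the names used in this PR.

```yaml
- hosts: etcd            # hypothetical inventory group
  vars:
    # Hypothetical values for illustration only.
    etcd_data_dir: /var/lib/etcd
    etcd_backup_dir: /var/lib/origin/etcd-backup-example
  tasks:
    - name: Ensure the member directory exists in the backup location
      file:
        path: "{{ etcd_backup_dir }}/member"
        state: directory   # also creates missing parent directories

    - name: Copy etcd v3 data store
      command: cp -a {{ etcd_data_dir }}/member/snap {{ etcd_backup_dir }}/member/
```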
@sdodson
Member Author

sdodson commented Apr 7, 2017

aos-ci-test

@openshift-bot

success: aos-ci-jenkins/OS_unit_tests for da3c31c (logs)

@openshift-bot

error: aos-ci-jenkins/OS_3.5_containerized for da3c31c (logs)

@openshift-bot

error: aos-ci-jenkins/OS_3.6_containerized for da3c31c (logs)

@openshift-bot

success: "aos-ci-jenkins/OS_3.5_NOT_containerized, aos-ci-jenkins/OS_3.5_NOT_containerized_e2e_tests" for da3c31c (logs)

@openshift-bot

success: "aos-ci-jenkins/OS_3.6_NOT_containerized, aos-ci-jenkins/OS_3.6_NOT_containerized_e2e_tests" for da3c31c (logs)

@sdodson
Member Author

sdodson commented Apr 7, 2017

[merge]

sdodson added 2 commits April 10, 2017 16:18
Because containerized installs don't mount /var/lib/origin and we
switched to running the backup inside the container, we were backing up
the etcd data into a directory inside the container filesystem. Since
no other volume is mounted, we need to back up into /var/lib/etcd.
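
A rough sketch of the idea in this commit message: choose a backup root that is actually visible to the process doing the backup. The variable names `is_containerized`, `etcd_backup_root`, `etcd_backup_dir` and `timestamp_example` are hypothetical, and the conditional is one possible way to express the choice, not necessarily how this PR implements it.

```yaml
- hosts: etcd            # hypothetical inventory group
  vars:
    is_containerized: true            # assumed flag for illustration
    timestamp_example: "20170410"     # placeholder; pin a real timestamp with set_fact
  tasks:
    - name: Choose a backup root that is mounted where the backup runs
      set_fact:
        # Only /var/lib/etcd is mounted into the container per the commit message.
        etcd_backup_root: "{{ '/var/lib/etcd' if (is_containerized | bool) else '/var/lib/origin' }}"

    - name: Derive the backup directory from the chosen root
      set_fact:
        etcd_backup_dir: "{{ etcd_backup_root }}/etcd-backup-{{ timestamp_example }}"

    - name: Show where the backup would be written
      debug:
        var: etcd_backup_dir
```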
@openshift-bot

[test]ing while waiting on the merge queue

@openshift-bot

Evaluated for openshift ansible test up to 974f01c

@openshift-bot

continuous-integration/openshift-jenkins/test FAILURE (https://ci.openshift.redhat.com/jenkins/job/test_pull_request_openshift_ansible/10/) (Base Commit: 5fe1609)

@sdodson
Member Author

sdodson commented Apr 24, 2017

aos-ci-test

@openshift-bot

success: "aos-ci-jenkins/OS_3.5_NOT_containerized, aos-ci-jenkins/OS_3.5_NOT_containerized_e2e_tests" for 974f01c (logs)

@openshift-bot

success: "aos-ci-jenkins/OS_3.6_NOT_containerized, aos-ci-jenkins/OS_3.6_NOT_containerized_e2e_tests" for 974f01c (logs)

@openshift-bot

success: "aos-ci-jenkins/OS_3.6_containerized, aos-ci-jenkins/OS_3.6_containerized_e2e_tests" for 974f01c (logs)

@openshift-bot

success: "aos-ci-jenkins/OS_3.5_containerized, aos-ci-jenkins/OS_3.5_containerized_e2e_tests" for 974f01c (logs)

@sdodson
Member Author

sdodson commented Apr 24, 2017

[merge]

@sdodson
Member Author

sdodson commented Apr 26, 2017

flake on openshift/origin#13067
[merge]

@openshift-bot

Evaluated for openshift ansible merge up to 974f01c

@openshift-bot

openshift-bot commented Apr 26, 2017

continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/merge_pull_request_openshift_ansible/291/) (Base Commit: 4805e68)


3 participants