From da92a37daa0f728b37a70972b8ebed3e115a28ed Mon Sep 17 00:00:00 2001
From: wcai
Date: Mon, 4 Dec 2023 12:10:47 -0800
Subject: [PATCH] add swaphost and host_expansion instruction document

---
 docs/host_expansion.md | 83 ++++++++++++++++++++++++++++++++++++++++
 docs/swaphost.md       | 86 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 169 insertions(+)
 create mode 100644 docs/host_expansion.md
 create mode 100644 docs/swaphost.md

diff --git a/docs/host_expansion.md b/docs/host_expansion.md
new file mode 100644
index 00000000..dcebfa57
--- /dev/null
+++ b/docs/host_expansion.md
@@ -0,0 +1,83 @@
[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
# Juno Host Expansion Instruction
Juno is a highly scalable and highly available distributed key-value store. In the Juno architecture, the whole storage space is partitioned into a fixed number (e.g. 1024) of logical shards, and shards are mapped to storage nodes within the cluster. When a cluster scales out or in (storage nodes are added to or removed from the cluster), some shards must be redistributed to different nodes to reflect the new cluster topology. The following are instructions on how to expand (or shrink) the Juno storageserv nodes.

## Expand or Shrink Storage Host
### Step0 (prerequisite)
Deploy juno storageserv (and/or junoserv) to all new boxes.

junostorageserv on all the original boxes needs to be up.

Pre-insert data into the original cluster so it can be used for later validation (optional but recommended).

Overall, the expansion consists of the steps below:

1. Mark down one zone at a time to stop incoming real-time traffic to that zone.
2. Run the command to update the cluster topology in etcd to the new cluster.
3. Start storageserv on the new box for the relevant zone.
4. Run the command to start redistribution.
   - 4a. If requests fail to forward after several automatic retries, run the resume command to redo the redistribution.
   - 4b. If some requests still fail after 4a, restart the source storageserv so the failed requests can be forwarded again.
5. Run the command to commit if all previous steps are successful.

Loop through zone 0 to zone 4 to finish redistribution for all zones.
Retrieve the pre-inserted data from storageserv for validation (optional).


### Step1
Under junoclustercfg, run
```bash
 ./clustermgr --config config.toml --cmd zonemarkdown --type set -zone 0 (1,2,3,4)
```
Verify that the markdown works by checking the junostorageserv state log to confirm that no new requests are coming in.

### Step2
Under junoclustercfg, run
```bash
 ./clustermgr -new_config config.toml_new -cmd redist -type prepare -zone 0 (1,2,3,4)
```
NOTE: A UI monitoring HTTP link will be generated in redistserv.pid. It can be used for monitoring.

### Step3
Start junostorageserv on the new box for the relevant zone -- zone 0 (1,2,3,4).

### Step4
Under junoclustercfg, run
```bash
 ./clustermgr -new_config config.toml_new -ratelimit 5000 -cmd redist -type start_tgt --zone 0 (1,2,3,4)
 ./clustermgr -new_config config.toml_new -ratelimit 5000 -cmd redist -type start_src --zone 0 (1,2,3,4)
```
NOTE: the ratelimit needs to be tuned for each system. The appropriate rate depends on the old and new cluster sizes; for example, an expansion from 5 to 10 boxes and an expansion from 5 to 15 boxes will use different rates.

#### Step4a (only if request forwarding still fails after several automatic retries)
Under junoclustercfg, run
```bash
 ./clustermgr --new_config config.toml_new --cmd redist --type resume -ratelimit 5000 -zone 0 (1,2,3,4)
```

#### Step4b (only if 4a still does not fix the failure)
Restart the source storageserv and wait for the redistribution to complete.

### Step5
Under junoclustercfg, run
```bash
 ./clustermgr -new_config config.toml_new --cmd redist --type commit -zone 0 (1,2,3,4)
```
Loop through zone 0 to zone 4 to complete all zones' redistribution (see the consolidated per-zone sketch below).
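For reference, the per-zone sequence in Step1 through Step5 can be wrapped in a small script. The following is only a minimal sketch built from the commands above; the config file names, the 5000 rate limit, and the interactive pauses standing in for the manual start/verify steps are assumptions that must be adapted to the actual cluster.

```bash
#!/bin/bash
# Sketch of the per-zone expansion sequence (Step1-Step5), run from the junoclustercfg directory.
# config.toml, config.toml_new and RATELIMIT are placeholders for the actual cluster settings.
set -e
RATELIMIT=5000

for ZONE in 0 1 2 3 4; do
  # Step1: mark down the zone to stop incoming real-time traffic
  ./clustermgr --config config.toml --cmd zonemarkdown --type set -zone ${ZONE}

  # Step2: write the new cluster topology for this zone into etcd
  ./clustermgr -new_config config.toml_new -cmd redist -type prepare -zone ${ZONE}

  # Step3: manual step -- start junostorageserv on the new box for this zone
  read -r -p "Start junostorageserv on the new box for zone ${ZONE}, then press Enter "

  # Step4: start redistribution on the target and source nodes
  ./clustermgr -new_config config.toml_new -ratelimit ${RATELIMIT} -cmd redist -type start_tgt --zone ${ZONE}
  ./clustermgr -new_config config.toml_new -ratelimit ${RATELIMIT} -cmd redist -type start_src --zone ${ZONE}

  # Step5: commit only after redistribution for this zone has completed successfully
  read -r -p "Confirm zone ${ZONE} redistribution completed (see the monitoring link), then press Enter "
  ./clustermgr -new_config config.toml_new --cmd redist --type commit -zone ${ZONE}
done
```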
## Validation (Optional but recommended)

### Steps
Run the juno client tool to get the shard map, which contains the storageserv (ss) ip:port:
```bash
 ./junocli ssgrp -c config.toml_new -hex=false key
```

Run the juno client tool to verify that the key exists on the expected ss. The ip:port is the one obtained from the previous command:
```bash
 ./junocli read -s ip:port key
```

diff --git a/docs/swaphost.md b/docs/swaphost.md
new file mode 100644
index 00000000..6df77992
--- /dev/null
+++ b/docs/swaphost.md
@@ -0,0 +1,86 @@
[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
# Juno Host Swap Instruction
Bad Juno nodes may have to be swapped out on a live cluster. The node that needs to be swapped can be running etcd (junoclusterserv), Juno storage, or both. The following is a step-by-step guide to perform the swap.

## Swapping a Storage Host
### Step0 (optional)
Pre-insert some data via the junocli tool before the host swap. After the host swap, retrieve the data and verify that all of it can still be read.

### Step1
Deploy junostorageserv (and/or junoserv) on New_Host1; the services do not need to be started, just deployed.
Update the junoclustercfg config.toml by changing the old box to the new box, build a package, and deploy it to the new box.

```bash
Old Config                New Config
SSHosts=[                 SSHosts=[
# Zone 0                  # Zone 0
[                         [
  "Host1"                   "New_Host1"
],                        ],
# Zone 1                  # Zone 1
[                         [
  "Host2"                   "Host2"
],                        ],
# Zone 2                  # Zone 2
[                         [
  "Host3"                   "Host3"
],                        ],
# Zone 3                  # Zone 3
[                         [
  "Host4"                   "Host4"
],                        ],
# Zone 4                  # Zone 4
[                         [
  "Host5"                   "Host5"
]                         ]
]                         ]
```
Make sure storageserv is up on all the boxes other than the bad box.

### Step2
If the box to be replaced is a bad box, this step can be skipped. If the box to be replaced is a good box, shut down
junostorageserv on it, then copy rocksdb_junostorageserv from it to the same location on the new box (see the copy sketch at the end of this section).

### Step3
On the new box (the cluster config contains New_Host1), from the junoclustercfg directory, run ./swaphost.sh.
This step bumps up the junocluster config version in etcd, and all the running junoserv and junostorageserv
hosts will update their cluster map accordingly after the script runs.

### Step4
Start up junostorageserv (and/or junoserv) on New_Host1. It will fetch the latest junoclustercfg from etcd.

### Step5 (Optional)
Validation - use junocli to retrieve the pre-inserted data; all of it should be retrievable.

### Step6
Once junoserv on New_Host1 works fine, and if there is an LB in front of junoserv, fix the LB to replace Host1 with New_Host1.

Deploy the updated junoclustercfg package which contains New_Host1 to all the junoclustercfg boxes. All boxes will have the
same version of the junoclustercfg package after that.
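The data copy in Step2 has no dedicated tooling in these instructions; a plain rsync is sufficient. The sketch below is only an illustration: the installation path, the data directory location, and the shutdown script name are assumptions about the deployment layout and must be adjusted.

```bash
# Hypothetical illustration of the Step2 data copy (only when the old box is still healthy).
# Run on the old box (Host1); /path/to is a placeholder for the real install location,
# and the shutdown script name is an assumption about the deployment.

# Stop junostorageserv so the RocksDB files are no longer being written.
cd /path/to/junostorageserv && ./shutdown.sh

# Copy the RocksDB data directory to the same location on the new box.
rsync -a /path/to/rocksdb_junostorageserv/ New_Host1:/path/to/rocksdb_junostorageserv/
```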
## Swapping a Host That Runs the etcd Server
The etcd cluster has three or five hosts, depending on whether it uses a 3-node or 5-node quorum - Host1^Host2^Host3^Host4^Host5.

Identify a new host (New_Host1) for the swap. Make sure the etcd servers are up on all hosts except the bad one.
Host1 is to be swapped with New_Host1.

### Step1
Change etcdsvr.txt under junoclusterserv:
```bash
Old etcdsvr.txt                                      New etcdsvr.txt
[etcdsvr]                                            [etcdsvr]
initial_cluster = "Host1^Host2^Host3^Host4^Host5"    initial_cluster = "New_Host1^Host2^Host3^Host4^Host5"
```
Build the junoclusterserv package and deploy it to the new box (New_Host1).

### Step2
On the old box (Host1), shut down junoclusterserv by running shutdown.sh under junoclusterserv.

On the new box (New_Host1), under junoclusterserv, first run join.sh, then run start.sh to have the new box
join the quorum members.

### Step3
Deploy and start the new junoclusterserv package, one box at a time, on all other junoclusterserv boxes (a rough command sketch follows at the end of this document).

### Step4
Fix the LB of etcd to replace the old Host1 with New_Host1.
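For reference, Step2 and Step3 above can be scripted roughly as follows. This is a minimal sketch, assuming junoclusterserv is installed under /path/to/junoclusterserv on every box and that the boxes are reachable over ssh from a control host; the path, the ssh-based rollout, and the restart of the remaining boxes are assumptions, and the actual package deployment is deployment specific.

```bash
#!/bin/bash
# Rough sketch of the etcd member swap (Step2-Step3); paths and ssh access are assumptions.
set -e

# Step2 - old box (Host1): stop the etcd member that is being replaced.
ssh Host1 "cd /path/to/junoclusterserv && ./shutdown.sh"

# Step2 - new box (New_Host1): join the quorum, then start the server.
ssh New_Host1 "cd /path/to/junoclusterserv && ./join.sh && ./start.sh"

# Step3 - remaining boxes, one at a time: deploy the updated package, then restart.
for HOST in Host2 Host3 Host4 Host5; do
  # (deploy the updated junoclusterserv package to ${HOST} here -- deployment specific)
  ssh "${HOST}" "cd /path/to/junoclusterserv && ./shutdown.sh && ./start.sh"
done
```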