Commit
This PR duplicates the Linux CI cluster. This is the first in a three-PR plan to implement #6400 safely while people are working.

I usually do cluster updates over the weekend because they require shutting down the entire CI system for about two hours. This is unfortunately not practical while people are working, and timezones make it difficult for me to find a time during the week when nobody is working. So instead the plan is as follows:

1. Create a duplicate of our CI cluster (this PR).
2. Wait for the new cluster to be operational (~90-120 minutes in my experience).
3. In the Azure Pipelines config screen, disable all the nodes of the "old" cluster, so all new jobs get assigned to the temp cluster. Wait for all jobs to finish on the old cluster.
4. Update the old cluster. Wait for it to be deployed. (Second PR.)
5. In Azure, disable temp nodes, wait for jobs to drain.
6. Delete temp nodes (third PR).

Reviewing this PR is best done by verifying you can reproduce the following shell session:

```
$ diff vsts_agent_linux.tf vsts_agent_linux_temp.tf
4,7c4,5
< resource "secret_resource" "vsts-token" {}
<
< data "template_file" "vsts-agent-linux-startup" {
< template = "${file("${path.module}/vsts_agent_linux_startup.sh")}"
---
> data "template_file" "vsts-agent-linux-startup-temp" {
> template = "${file("${path.module}/vsts_agent_linux_startup_temp.sh")}"
16c14
< resource "google_compute_region_instance_group_manager" "vsts-agent-linux" {
---
> resource "google_compute_region_instance_group_manager" "vsts-agent-linux-temp" {
18,19c16,17
< name = "vsts-agent-linux"
< base_instance_name = "vsts-agent-linux"
---
> name = "vsts-agent-linux-temp"
> base_instance_name = "vsts-agent-linux-temp"
24,25c22,23
< name = "vsts-agent-linux"
< instance_template = "${google_compute_instance_template.vsts-agent-linux.self_link}"
---
> name = "vsts-agent-linux-temp"
> instance_template = "${google_compute_instance_template.vsts-agent-linux-temp.self_link}"
36,37c34,35
< resource "google_compute_instance_template" "vsts-agent-linux" {
< name_prefix = "vsts-agent-linux-"
---
> resource "google_compute_instance_template" "vsts-agent-linux-temp" {
> name_prefix = "vsts-agent-linux-temp-"
52c50
< startup-script = "${data.template_file.vsts-agent-linux-startup.rendered}"
---
> startup-script = "${data.template_file.vsts-agent-linux-startup-temp.rendered}"
$ diff vsts_agent_linux_startup.sh vsts_agent_linux_startup_temp.sh
149c149
< su --command "sh <(curl https://nixos.org/nix/install) --daemon" --login vsts
---
> su --command "sh <(curl -sSfL https://nixos.org/nix/install) --daemon" --login vsts
$
```

and reviewing that diff, rather than looking at the added files in their entirety. The name changes are benign and needed for Terraform to appropriately keep track of which node belongs to the old vs the temp group. The only change that matters is that the new group's startup script passes the `-sSfL` flags to `curl`, so its nodes will actually boot up. (Hopefully.)

CHANGELOG_BEGIN
CHANGELOG_END
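For reviewers who want to double-check that the renames really are benign, one possible approach is to strip the `-temp` / `_temp` suffixes from the temp files before diffing, so that only the substantive differences remain. The following is a minimal sketch, not part of this PR: the function name `strip_temp_suffix` is hypothetical, and the `sed` patterns assume the suffix only appears in the positions visible in the diff above (before a `"`, `.` or `-`, and in the `_temp.sh` file name).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): print a file with the
# "-temp" / "_temp" rename suffixes removed.
strip_temp_suffix() {
  # 1st expression: "..._startup_temp.sh" -> "..._startup.sh"
  # 2nd expression: drop "-temp" only when followed by '-', '"' or '.',
  #                 so words such as "template" are left untouched.
  sed -e 's/_temp\.sh/.sh/g' \
      -e 's/-temp\([-".]\)/\1/g' "$1"
}

# Usage, from the directory containing the files:
#   diff vsts_agent_linux.tf <(strip_temp_suffix vsts_agent_linux_temp.tf)
#   diff vsts_agent_linux_startup.sh <(strip_temp_suffix vsts_agent_linux_startup_temp.sh)
```

If the renames are the only cosmetic changes, the first command should report just the removed `secret_resource` lines and the second just the `curl` flag change.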