Uninstall DCE #238

Open
@michaelpetersubc

Description

Is your feature request related to a problem? Please describe.
To clean up after a bug I need to recompile the dce software. It appears that to get things right I would have to redeploy (to change scripts running on aws).

As far as I can see, deploying again recreates everything. For example, the api gateway would change.

While I am testing I would just like to remove everything that was installed by the first deploy, then deploy it again with the new software. Even a list in the docs that describes what to remove manually would help. More generally a suggestion about how to go about repairing bugs or upgrading the software would be nice.

However, in production, all users of the service would apparently have to adjust their settings (the gateway url, for example) to cope with an upgrade. So a feature that updates the software without changing basic settings or duplicating scripts at aws would help.

Describe the solution you'd like

Describe alternatives you've considered

Additional context

Activity

nathanagood (Contributor) commented on Jan 23, 2020

Hi, @michaelpetersubc!

"As far as I can see, deploying again recreates everything. For example, the api gateway would change."

Could you please provide the steps you used to deploy DCE? There are ways to make updates in place and even tear down (destroy) what you created, but the exact steps depend on how you deployed DCE.

Best regards, @nathanagood

michaelpetersubc (Author) commented on Jan 23, 2020

Thanks @nathanagood
If I am understanding the question correctly: I followed the quickstart, running dce init followed by dce deploy. After that I copied the gateway url to .dce.yaml, set up the child account, and created a dce account using the child account.

My current plan is to try to manually remove everything at aws (except the child account), recompile dce, then start again. If there is a better way to set it up in order to make updates in place, I'll do that.

nathanagood (Contributor) commented on Jan 23, 2020

Hi, @michaelpetersubc. Un-deployment is not supported in dce CLI v0.3.1 and below. A solution for destroying resources deployed by newer versions of dce is in the process of being released.

In the meantime, we are putting a solution together to un-deploy the DCE resources created by these earlier versions. We will share that with you as soon as we have it.

michaelpetersubc (Author) commented on Jan 23, 2020

Thanks @nathanagood, that answers my question. I'll watch for the un-deploy feature. In the meantime, I'll try to resolve it manually; I might learn something.

joshmarsh (Contributor) commented on Jan 31, 2020

Hi @michaelpetersubc, we’ve put together an example script for deleting resources from older versions of DCE. It will delete any resources tagged AppName=DCE or with identifiers containing the unique namespace that DCE attaches to everything it deploys. Here are the steps for using it:

  • Ensure jq is installed and on your path
  • Ensure your AWS CLI is configured for the AWS Account containing DCE
  • Look up your unique DCE namespace. This is a string appended to resources that DCE deploys into your account. e.g.
    • Lambdas such as accounts-dce-{namespace}
    • IAM roles such as account-reset-codebuild-dce-{namespace}
    • Cloudwatch event rules such as populate-reset-queue-dce-{namespace}
  • Execute the script ./delete-dce.sh [region] [namespace]. It is important to use the correct namespace, as miscellaneous resources containing this substring will be deleted. For example, if you type ./delete-dce.sh us-east-1 inst, then miscellaneous resources containing the substring inst in their arn/name might be deleted.
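
If you are unsure of your namespace, one way to look it up is to list Lambda function names and strip the accounts-dce- prefix mentioned above. This is only a sketch: the helper names extract_namespaces and find_dce_namespaces are made up for illustration and are not part of dce or the delete script, and it assumes your Lambdas follow the accounts-dce-{namespace} pattern.

```shell
#!/usr/bin/env bash

# Read function names on stdin (one per line) and print the unique
# namespaces found in names of the form accounts-dce-{namespace}.
extract_namespaces() {
  sed -n 's/^accounts-dce-\(.*\)$/\1/p' | sort -u
}

# Hypothetical wrapper: find_dce_namespaces <region>
# Lists Lambda function names in the region and extracts candidate namespaces.
find_dce_namespaces() {
  local region="$1"
  aws lambda list-functions --region "$region" \
    --query 'Functions[].FunctionName' --output text \
    | tr '\t' '\n' \
    | extract_namespaces
}
```

Running find_dce_namespaces us-east-1 should print one line per namespace found. If it prints more than one, you likely have several DCE deployments in the account and should pick the one you intend to remove.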

Running bulk delete operations against an AWS account is risky, particularly when you have other things in the account that you don’t want deleted. We recommend reading through this script or using it as a guide if you have concerns about accidentally deleting other resources in your account.
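
Before running any bulk delete, it may help to preview what a given namespace string would match. The sketch below is read-only and is not taken from delete-dce.sh: the function names are hypothetical, and it only sees resources reported by the AWS Resource Groups Tagging API for the AppName=DCE tag, so untagged resources that match by name will not appear.

```shell
#!/usr/bin/env bash

# Fail with a message if a required tool is not on PATH.
require_tool() {
  command -v "$1" >/dev/null 2>&1 || { echo "missing required tool: $1" >&2; return 1; }
}

# Keep only the stdin lines (ARNs or names) that contain the given substring.
# The match is a plain substring match, which is why a short namespace is risky.
filter_by_namespace() {
  grep -F -- "$1" || true
}

# Print the ARNs of resources tagged AppName=DCE in the given region.
list_tagged_dce_resources() {
  local region="$1"
  aws resourcegroupstaggingapi get-resources \
    --region "$region" \
    --tag-filters Key=AppName,Values=DCE \
    --output json \
    | jq -r '.ResourceTagMappingList[].ResourceARN'
}

# Hypothetical entry point: preview_dce_resources <region> <namespace>
# Deletes nothing; it only prints tagged ARNs containing the namespace.
preview_dce_resources() {
  require_tool jq || return 1
  require_tool aws || return 1
  list_tagged_dce_resources "$1" | filter_by_namespace "$2"
}
```

Feeding a few names of your own through filter_by_namespace with a short string such as inst shows how easily an unrelated name like my-instance-role would match, which is the risk described above.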

As @nathanagood mentioned, dce-cli version v0.4.0 supports deleting dce via locally cached terraform state file, binary, and backend configuration. Here’s how it’s done:

# change into the directory containing the terraform binary dce-cli used for deployment
cd ~/.dce/.cache/terraform/0.12.18
# initialize terraform using the cached main.tf
./terraform init ~/.dce/.cache/module
# run terraform destroy using the cached main.tf
./terraform destroy ~/.dce/.cache/module

We recommend deleting the ~/.dce configuration directory and starting over from dce init if you would like to redeploy dce after destroying it via this method.
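
The manual steps above could be wrapped in a small script that first checks the cached files exist. This is a sketch under assumptions: check_dce_cache and destroy_dce are hypothetical names, not dce-cli commands, and the defaults simply mirror the paths shown above (~/.dce and terraform 0.12.18), which may differ on your machine.

```shell
#!/usr/bin/env bash

# Verify that the cached terraform binary and module directory exist.
# Arguments: <dce_home> <terraform_version>
check_dce_cache() {
  local dce_home="$1" tf_version="$2"
  if [ ! -x "$dce_home/.cache/terraform/$tf_version/terraform" ]; then
    echo "no cached terraform binary for version $tf_version" >&2
    return 1
  fi
  if [ ! -d "$dce_home/.cache/module" ]; then
    echo "no cached module directory" >&2
    return 1
  fi
}

# Hypothetical wrapper: destroy_dce [dce_home] [terraform_version]
# Runs the same init and destroy commands as the manual steps.
# terraform destroy asks for confirmation before removing anything.
destroy_dce() {
  local dce_home="${1:-$HOME/.dce}" tf_version="${2:-0.12.18}"
  check_dce_cache "$dce_home" "$tf_version" || return 1
  (
    cd "$dce_home/.cache/terraform/$tf_version" &&
      ./terraform init "$dce_home/.cache/module" &&
      ./terraform destroy "$dce_home/.cache/module"
  )
}
```

The wrapper deliberately leaves the ~/.dce directory in place, so removing it before a fresh dce init remains a separate, explicit step.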

We haven’t created a cli command to make this convenient yet. Our goal is to provide convenient mechanisms for deploying, upgrading, and deleting dce. This is an iteration towards that goal, and feedback such as yours helps tremendously in guiding our design. Please let us know if you need any more help.

changed the title from "uninstall dce" to "Uninstall DCE" on Feb 6, 2020

michaelpetersubc (Author) commented on Feb 11, 2020

@joshmarsh , @nathanagood
Thanks, the script for the older versions works well, seems to have cleaned up everything, including a botched installation that I had partially cleaned up.
Version 0.4.0 works as advertised, now to try it for real. The application is for managing grad students' computations.
Your documentation is now slightly inconsistent with the new version. For example, the .dce.yaml file is no longer used and the deploy script sets the api gateway url automatically.

joshmarsh (Contributor) commented on Feb 12, 2020

Thanks for the feedback @michaelpetersubc. Looks like we missed a few places when we updated the docs last. We'll get on that soon.

michaelpetersubc (Author) commented on Feb 12, 2020

@joshmarsh , @nathanagood
I have to take a step back: it appears the default setting for aws-nuke is dry run, so when a lease ends the account is not cleared.

2020/02/12 17:23:46 INFO: Nuke is set in Dry Run mode and will not remove any resources and cannot set back the state of the DCE child account Please set 'RESET_NUKE_DRY_RUN' to not 'true' to exit Dry Run mode.

which isn't as advertised (the docs say you can reset it to dry run using terraform). The error message gives a sort of sensible fix, but honestly I can't figure out how to implement the fix. I deployed with dce not terraform. Is there a way I can manually reset dry run without a redeploy?

eschwartz (Contributor) commented on Feb 17, 2020

Hi @michaelpetersubc -- just want to let you know that I'm looking into this. I should have some useful info for you later today.

eschwartz (Contributor) commented on Feb 17, 2020

Ok @michaelpetersubc, I think I can help you manually reconfigure your DCE deployment to enable aws-nuke to run in --no-dry-run mode.

  1. Log in to your AWS web console
  2. Navigate to Services > CodeBuild > Build projects
  3. Select the account-reset-<namespace> project, where <namespace> is the namespace you used to deploy DCE (or a random ID)
  4. Select the Edit dropdown, then select Environment
  5. Expand the Additional Configuration section, and scroll down to the Environment Variables subsection
  6. You should see an env var configured as RESET_NUKE_TOGGLE = false. Change this value to true.
  7. Click Update environment

Subsequent account reset jobs should run aws-nuke in --no-dry-run mode.
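
For anyone who prefers the command line, the same change can probably be made with the AWS CLI and jq. This is an untested sketch rather than an official procedure: set_env_var and enable_nuke are hypothetical names, and it assumes the environment block returned by batch-get-projects can be passed back to update-project unchanged apart from the one variable.

```shell
#!/usr/bin/env bash

# Read a CodeBuild environment JSON document on stdin and print it with
# the named environment variable set to a new value.
# Arguments: <variable_name> <new_value>
set_env_var() {
  jq -c --arg name "$1" --arg value "$2" \
    '.environmentVariables |= map(if .name == $name then .value = $value else . end)'
}

# Hypothetical wrapper: enable_nuke <region> <codebuild_project_name>
# Fetches the project's current environment, flips RESET_NUKE_TOGGLE to
# "true", and writes the whole environment block back.
enable_nuke() {
  local region="$1" project="$2" env_json
  env_json="$(aws codebuild batch-get-projects --region "$region" \
    --names "$project" --output json \
    | jq -c '.projects[0].environment' \
    | set_env_var RESET_NUKE_TOGGLE true)" || return 1
  [ -n "$env_json" ] || { echo "could not read environment for $project" >&2; return 1; }
  aws codebuild update-project --region "$region" \
    --name "$project" --environment "$env_json"
}
```

The project name is passed in rather than constructed, so use whatever name the CodeBuild console shows for your reset project. The whole environment block is sent back because update-project replaces the environment as a unit rather than patching a single variable.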

Please let me know if you run into any problems with this, or if it doesn't work as expected.


For added context, DCE v0.24.0 introduced a change to enable aws-nuke to run in --no-dry-run mode by default. The latest version of dce-cli is still tied to DCE v0.23.0, so it does not include this change.

We are working on a new release of dce-cli, to upgrade to the latest version of DCE. We also have plans to support additional deployment options, to make it easier to configure these types of parameters.

michaelpetersubc (Author) commented on Feb 18, 2020

@eschwartz Thanks for the very clear instruction, that worked, and I would never have guessed how to do that. The reset now works and removes everything created while the lease is being used.

However, the reset also removes the admin permission from the trusted role as it did before, so the child account is never returned to the ready state. This is the same problem that started all this - maybe that problem wasn't fixed initially - or at least I haven't managed to install the right version of the updated software. I believe that problem is an issue with dce not with dce-cli which is what I updated.

      Uninstall DCE · Issue #238 · Optum/dce