Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

installation: rewrite the code #52541

Merged
merged 70 commits into from
Aug 15, 2024

Conversation

howardjohn
Copy link
Member

This PR effectively rewrites the operator/ codebase. Over the past few weeks, I have been looking into refactoring the codebase to both simplify it dramatically, given its new reduced requirements, and allow it to be structured in a way that allows us to evolve our installation strategy. Namely, an increasingly converging behavior with Helm, migration paths to helm, etc. I have spent considerable time attempting to refactor the codebase to do so, and what I have ultimately concluded is that the way the codebase is currently structured is (somewhat subtly, at first glance) extremely problematic.

Aside from usual issues around tech debt and an organically evolving codebase, both of which impact the codebase heavily, there is a systemic issues in how data flows through the system. We have a variety of critical objects:

  • IstioOperator with values/meshConfig as a Struct
  • Values translated to a Values, MeshConfig translated to a MeshConfig
  • All of the above translated to map[string]any
  • All of the above translated to structpb.Struct
  • All of the above translated to JSON
  • All of the above translated to YAML

Through a simple install, we transition between these states an incredible amount of times. In one case, we wanted to translate to an IstioOperatorSpec into an IstioOperator; to do so, we converted to yaml, to a map, back to yaml, then to a Go template, then unmarshalled back into an IstioOperator. While that case is clearly an easy fix, there are many other cases where the structure of the codebase requires us to do constantly move back and forth.

My original attempts at cleaning up the codebase were to simply cleanup the discrepancies with values and meshConfig not being typed. I did two different prototypes of this, one keeping IstioOperator as a proto and another one just using standard Go structs. Both of these were not able to be completed. The root of the problem is we are really inherintly dealing with unstructured data. Most of the codebase is doing things like "merge these two maps", "set values.foo.bar=1", etc -- these are extremely hard to do with structured data. We currently do this with a custom-made library, tpath, which is incredibly complicated to handle all the edge cases of the structured data. Many times we avoid this by translating between various type formats (as discussed above), but this introduces even more complexity, as we have all these alternative paths to manipulate data.

This PR instead treats a map[string]any as the core data structure, wrapped in a values.Map type. Not-coincidentally, this is also how the Helm codebase is structured (originally, I had directly used most of Helm's libraries. However, for backwards compatibility we have different --set semantics unfortunately).

Beyond that, we now have clearly devised installation phases:

  • Merge all inputs into a single unified configuration
  • For each component, render the chart. Explicitly apply postprocessing steps (overlays) independently
  • for manifest generate: print and exit.
  • for install: apply to the cluster.

This structure should be more conducive to future feature work. For instance, if we want to perform a helm migration tooling, we can easily insert logic after the 'merge all inputs'.

Overall, the intent of this PR is to have no user facing behavioral changes, with two minor exceptions:

  • istioctl manifest generate --component - filters to one component. Can already be done by just only enabling that component, does not exist on istioctl install
  • istioctl manifest generate -o <dir> - output files to a dir instead of a file. It hasn't worked since at least 1.18, so I am pretty confident there is no usage

These were excluded just to clean up the code mostly

Fixes #52469

@howardjohn howardjohn added the release-notes-none Indicates a PR that does not require release notes. label Aug 6, 2024
@istio-testing
Copy link
Collaborator

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@istio-testing istio-testing added do-not-merge/work-in-progress Block merging of a PR because it isn't ready yet. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Aug 6, 2024
@howardjohn
Copy link
Member Author

/test all

@howardjohn howardjohn force-pushed the operator/2.0-attempt branch from 8be5fa8 to 2d828d6 Compare August 6, 2024 23:05
@howardjohn
Copy link
Member Author

/test all

2 similar comments
@howardjohn
Copy link
Member Author

/test all

@howardjohn
Copy link
Member Author

/test all

@howardjohn
Copy link
Member Author

/test all

2 similar comments
@howardjohn
Copy link
Member Author

/test all

@howardjohn
Copy link
Member Author

/test all

@howardjohn howardjohn force-pushed the operator/2.0-attempt branch from 006fe1d to 4d333fd Compare August 7, 2024 20:24
@howardjohn
Copy link
Member Author

/test all

1 similar comment
@howardjohn
Copy link
Member Author

/test all

@howardjohn howardjohn force-pushed the operator/2.0-attempt branch from 6628f47 to 052565f Compare August 7, 2024 23:00
@howardjohn
Copy link
Member Author

/test all

@howardjohn
Copy link
Member Author

/retest

@howardjohn howardjohn force-pushed the operator/2.0-attempt branch from 052565f to e5d5aa5 Compare August 8, 2024 16:26
@howardjohn
Copy link
Member Author

/test all

@howardjohn howardjohn force-pushed the operator/2.0-attempt branch from e5d5aa5 to e205dc5 Compare August 8, 2024 17:04
@howardjohn
Copy link
Member Author

/test all

@howardjohn howardjohn marked this pull request as ready for review August 8, 2024 17:41
@howardjohn howardjohn requested review from a team and costinm as code owners August 8, 2024 17:41
@howardjohn howardjohn force-pushed the operator/2.0-attempt branch from 8563ffb to 33e0f2e Compare August 15, 2024 14:59
@istio-testing istio-testing removed the needs-rebase Indicates a PR needs to be rebased before being merged label Aug 15, 2024
@istio-testing istio-testing merged commit 1ad41e1 into istio:master Aug 15, 2024
27 checks passed
howardjohn added a commit to howardjohn/istio that referenced this pull request Aug 15, 2024
I have no clue how this happened.

istio#52668 fixed these,
istio#52541 accidentally reverted in a
rebase. But the tests didn't fail... but now it randomly is again.
@howardjohn howardjohn mentioned this pull request Aug 15, 2024
istio-testing pushed a commit that referenced this pull request Aug 15, 2024
I have no clue how this happened.

#52668 fixed these,
#52541 accidentally reverted in a
rebase. But the tests didn't fail... but now it randomly is again.
luksa pushed a commit to luksa/istio that referenced this pull request Oct 14, 2024
* upstream/master:
  installer: fix missing labels on uninstall (istio#52717)
  Update various dependencies (istio#52683)
  Workaround calico issue in integration test (istio#52720)
  Auto detect cilium incompatibility during install (istio#52624)
  Fix lint issues (istio#52721)
  Update Go dependencies to 1.31 (istio#52661)
  Automator: update proxy@master in istio/istio@master (istio#52712)
  Automator: update common-files@master in istio/istio@master (istio#52685)
  installation: rewrite the code (istio#52541)
  Automator: update ztunnel@master in istio/istio@master (istio#52711)
  adding ds support for autoallocated IP in getAllAddressesForProxy (istio#52680)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-notes-none Indicates a PR that does not require release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Untangle "IOP" mess
6 participants