public-api/viz split #5560

Merged (23 commits) on Jan 21, 2021

alpeb
Member

@alpeb alpeb commented Jan 18, 2021

These changes are grouped into 8 commits, described and linked to as follows (I'll eventually follow up with more commits addressing feedback).

1) A couple more protobuf changes:

  • Moved healthcheck.proto back from viz to proto/common because it is still used by the main healthcheck.go library (it was moved to viz by Separate observability API #5510).
  • Extracted from viz.proto the IP-related types and put them in /controller/gen/common/net to be used by both the public and the viz APIs.

2) Chart templates for new viz linkerd-metrics-api pod

3) Spin-off viz healthcheck:

  • Created viz/pkg/healthcheck/healthcheck.go that wraps the original pkg/healthcheck/healthcheck.go while adding the vizNamespace and vizAPIClient fields which were removed from the core healthcheck. That way the core healthcheck doesn't have any dependencies on viz, and viz' healthcheck can now be used to retrieve viz api clients.
  • The core and viz healthcheck libs are now abstracted out via the new healthcheck.Runner interface.
  • Moved the linkerd-data-plane checks to viz because they rely on Prometheus; they're currently commented out pending the wiring up of linkerd viz check --proxy. Refactored the data plane checks so they don't rely on calling ListPods.
  • The core healthcheck unit tests dealing with the viz api have been removed and will be refactored later into a new set of viz healthcheck unit tests.
  • The checks in viz/cmd/check.go have been moved to viz/pkg/healthcheck/healthcheck.go as well, so check.go's sole responsibility is dealing with command business. This command also now retrieves its viz api client through viz' healthcheck.
  • The getNamespaceOfExtensions() function has been refactored into viz.pkg.GetVizNamespace(), as it is consumed from a few other places as well.
  • multicluster/cmd/check.go also now relies on viz' api because it hits the Gateways() endpoint that depends on Prometheus. In a follow-up, we should move that api into some MC component (the gateway itself?) so multicluster can be installed without viz.
  • Ditto for multicluster/cmd/gateways.go
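
The wrapping described in the list above could look roughly like the following. This is only a sketch under stated assumptions: `CoreChecker`, `VizChecker`, `NewVizChecker`, `runAll` and the single-method `Runner` are hypothetical stand-ins, not the actual types in `pkg/healthcheck` or `viz/pkg/healthcheck`, whose method sets may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// Runner is a hypothetical stand-in for the healthcheck.Runner interface;
// the real interface may expose different methods.
type Runner interface {
	RunChecks() bool
}

// CoreChecker stands in for the core HealthChecker. It has no viz fields.
type CoreChecker struct {
	checks []func() error
}

// AppendCheck registers one more check to run.
func (c *CoreChecker) AppendCheck(check func() error) {
	c.checks = append(c.checks, check)
}

// RunChecks runs every registered check and reports whether all passed.
func (c *CoreChecker) RunChecks() bool {
	ok := true
	for _, check := range c.checks {
		if err := check(); err != nil {
			fmt.Println("×", err)
			ok = false
		}
	}
	return ok
}

// VizChecker wraps the core checker by embedding it and adds the
// viz-only fields, so the core type never has to import viz code.
type VizChecker struct {
	*CoreChecker
	vizNamespace string
	vizAPIClient interface{} // placeholder for the viz API client type
}

// NewVizChecker builds a wrapper and registers one viz-specific check.
func NewVizChecker(ns string) *VizChecker {
	v := &VizChecker{CoreChecker: &CoreChecker{}, vizNamespace: ns}
	v.AppendCheck(func() error {
		if v.vizNamespace == "" {
			return errors.New("could not find the linkerd-viz extension")
		}
		return nil
	})
	return v
}

// runAll accepts either checker through the shared interface.
func runAll(r Runner) bool { return r.RunChecks() }

func main() {
	fmt.Println(runAll(NewVizChecker("linkerd-viz")))
}
```

Because `VizChecker` embeds `*CoreChecker`, it satisfies `Runner` through the promoted `RunChecks` method, which is one way callers could treat both checkers uniformly.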

4) Remove linkerd-controller dependency on Prometheus:

5) Move observability gRPC from linkerd-controller to viz:

We continue removing the linkerd-controller dependencies on viz stuff, except for things required for the tap server, which can be tackled separately afterwards.

  • Created a new gRPC server under viz/metrics-api moving prometheus-dependent functions out of the core gRPC server and into it (same thing for the accompanying http server).
  • Did the same for the PublicAPIClient (now called just Client) interface. The VizAPIClient interface disappears as it's enough to just rely on the viz ApiClient protobuf type.
  • Moved the other files implementing the rest of the gRPC functions from controller/api/public to viz/metrics-api (edge.go, stat_summary.go, etc.).
  • Also simplified some type names to avoid stuttering.

6) linkerd-metrics-api bootstrap files:

Here we complete the work started in #5554 ("Chart templates for new viz linkerd-metrics-api pod") by providing the Dockerfile, main.go command and the bin/ scripts for building the new linkerd-metrics-api pod under the linkerd-viz namespace.

At the same time, we strip out of the public-api's main.go file the prometheus parameters and other no longer relevant bits.

Finally, updated tests related to linkerd-metrics-api.

7) linkerd-web updates:

linkerd-web requires connecting with both the public-api and the viz api, so both addresses (and the viz namespace) are now provided as parameters to the container.

We're also updating some dependencies that still were pointing to the public-api and now should point to the viz api.
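
As an illustration of passing both addresses and the viz namespace to the container, a flag-parsing sketch might look like this. The flag names (`-api-addr`, `-viz-api-addr`, `-viz-namespace`), the default values and the `webConfig` type are assumptions made for the example, not the flags linkerd-web actually defines.

```go
package main

import (
	"flag"
	"fmt"
)

// webConfig holds the three values the web container needs after the split.
// All field and flag names here are hypothetical.
type webConfig struct {
	apiAddr      string
	vizAPIAddr   string
	vizNamespace string
}

// parseWebFlags parses args into a webConfig using its own FlagSet,
// which keeps the function easy to call from tests.
func parseWebFlags(args []string) (webConfig, error) {
	var cfg webConfig
	fs := flag.NewFlagSet("web", flag.ContinueOnError)
	fs.StringVar(&cfg.apiAddr, "api-addr", "127.0.0.1:8085", "address of the public-api (assumed default)")
	fs.StringVar(&cfg.vizAPIAddr, "viz-api-addr", "127.0.0.1:8086", "address of the viz api (assumed default)")
	fs.StringVar(&cfg.vizNamespace, "viz-namespace", "linkerd-viz", "namespace where viz is installed")
	err := fs.Parse(args)
	return cfg, err
}

func main() {
	cfg, err := parseWebFlags([]string{"-viz-api-addr", "metrics-api.linkerd-viz:8085"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```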

8) CLI updates and other minor things:

Changes to command files under cli/cmd:

  • Updated endpoints.go according to new API interface name.
  • Updated version.go, dashboard and uninstall.go to pull the viz namespace dynamically.

Changes to command files under viz/cmd:

  • edges.go, routes.go, stat.go and top.go: point to dependencies that were moved from public-api to viz.

Other changes to have tests pass:

  • Added metrics-api to list of docker images to build in actions workflows.
  • In bin/fmt exclude protobuf generated files instead of entire directories because directories could contain both generated and non-generated code (case in point: viz/metrics-api).
  • Update Helm readme files.
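
The per-file exclusion of generated code can be sketched in Go as below. `bin/fmt` itself is a script, so this is not its implementation; it only illustrates the idea of deciding per file, using the standard Go convention that generated files carry a `// Code generated ... DO NOT EDIT.` comment line before the package clause. The `isGenerated` helper is hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// generatedRE matches the conventional marker line of generated Go files.
var generatedRE = regexp.MustCompile(`^// Code generated .* DO NOT EDIT\.$`)

// isGenerated reports whether src carries the generated-code marker
// before its package clause. Scanning stops at the package clause,
// which is a simplification of the full convention.
func isGenerated(src string) bool {
	sc := bufio.NewScanner(strings.NewReader(src))
	for sc.Scan() {
		line := sc.Text()
		if generatedRE.MatchString(line) {
			return true
		}
		if strings.HasPrefix(line, "package ") {
			return false
		}
	}
	return false
}

func main() {
	files := map[string]string{
		"viz.pb.go": "// Code generated by protoc-gen-go. DO NOT EDIT.\npackage viz\n",
		"server.go": "package api\n\nfunc Serve() {}\n",
	}
	for name, src := range files {
		fmt.Println(name, isGenerated(src))
	}
}
```

With a per-file test like this, a directory such as `viz/metrics-api` can hold both kinds of files and only the hand-written ones get formatted.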

Closes #5328

@alpeb alpeb requested a review from a team as a code owner January 18, 2021 20:13
@alpeb alpeb changed the title CLI updates for public-api/viz split public-api/viz split 8/8: CLI updates Jan 18, 2021
@alpeb alpeb force-pushed the alpeb/public-api-viz-cmd-updates branch from 05e07cd to ec908e4 Compare January 18, 2021 21:01
@alpeb alpeb force-pushed the alpeb/linkerd-web-public-viz branch from 7f6cc2f to 3a7a13b Compare January 20, 2021 16:25
@alpeb alpeb force-pushed the alpeb/public-api-viz-cmd-updates branch from af186b1 to 2ac0956 Compare January 20, 2021 16:33
@alpeb alpeb changed the base branch from alpeb/linkerd-web-public-viz to main January 20, 2021 18:44
@alpeb alpeb changed the title public-api/viz split 8/8: CLI updates public-api/viz split Jan 20, 2021
alpeb added 7 commits January 20, 2021 14:28
- Moved `healthcheck.proto` back from viz to `proto/common` as it remains being used by the main `healthcheck.go` library (it was moved to viz by #5510).
- Extracted from `viz.proto` the IP-related types and put them in `/controller/gen/common/net` to be used by both the public and the viz APIs.
- Created `viz/pkg/healthcheck/healthcheck.go` that wraps the original `pkg/healthcheck/healthcheck.go` while adding the `vizNamespace` and `vizAPIClient` fields which were removed from the core `healthcheck`. That way the core healthcheck doesn't have any dependencies on viz, and viz' healthcheck can now be used to retrieve viz api clients.
- The core and viz healthcheck libs are now abstracted out via the new `healthcheck.Runner` interface.
- ~~Moved to viz the `linkerd-data-plane` checks because they rely on Prometheus, but they're currently commented-out pending the wiring up of `linkerd viz check --proxy`.~~ Refactored the data plane checks so they don't rely on calling `ListPods`
- ~~The core healthcheck unit tests dealing with the viz api have been removed and will be refactored later into a new set of viz healthcheck unit tests.~~
- The checks in `viz/cmd/check.go` have been moved to `viz/pkg/healthcheck/healthcheck.go` as well, so `check.go`'s sole responsibility is dealing with command business. This command also now retrieves its viz api client through viz' healthcheck.
- ~~The `getNamespaceOfExtensions()` function has been refactored into `viz.pkg.GetVizNamespace()` as is consumed from a few other places as well.~~
- `multicluster/cmd/check.go` also now relies on viz' api because it hits the `Gateways()` endpoint that depends on Prometheus. **In a followup, we should move that api into some MC component (the gateway itself?) so multicluster can be installed without viz**
- Ditto for `multicluster/cmd/gateways.go`
- Removed the `global.prometheusUrl` config in the core values.yml.
- Removed `linkerd-controller` dependencies on Prometheus.
- Leave the Heartbeat's  `-prometheus` flag hard-coded temporarily. TO-DO: have it automatically discover viz and pull Prometheus' endpoint (#5352).
@alpeb alpeb force-pushed the alpeb/public-api-viz-cmd-updates branch 2 times, most recently from 9ec1ad6 to 8fe5e64 Compare January 20, 2021 21:08
@alpeb alpeb force-pushed the alpeb/public-api-viz-cmd-updates branch from 8fe5e64 to f379c1e Compare January 20, 2021 21:11
@alpeb alpeb force-pushed the alpeb/public-api-viz-cmd-updates branch from d26bd2d to 0e1d6d5 Compare January 20, 2021 22:51
Contributor

@kleimkuhler kleimkuhler left a comment

I still have some more reviewing and testing to do, but I'm leaving a first round of comments.

bin/fmt Outdated Show resolved Hide resolved
cli/cmd/version.go Outdated Show resolved Hide resolved
controller/api/public/client.go Outdated Show resolved Hide resolved
controller/api/public/client.go Show resolved Hide resolved
controller/api/public/grpc_server.go Show resolved Hide resolved
multicluster/cmd/gateways.go Outdated Show resolved Hide resolved
pkg/healthcheck/healthcheck.go Outdated Show resolved Hide resolved
viz/cmd/uninstall.go Outdated Show resolved Hide resolved
viz/metrics-api/client/client.go Outdated Show resolved Hide resolved
viz/metrics-api/client/client.go Show resolved Hide resolved
}

func (s *grpcServer) SelfCheck(ctx context.Context, in *healthcheckPb.SelfCheckRequest) (*healthcheckPb.SelfCheckResponse, error) {
// TODO: Reenable this check just for checking the control plane can
Member

In order to make this work without taking a dependency on linkerd-viz, we'd need to introduce a new API or move SelfCheck to a shared location. To be honest, I'm not sure if it's even worth it. Especially since this process doesn't even need to talk to the k8s API. Perhaps a better check would be to look for errors in the destination or identity controller logs or kubernetes event streams?

Member

But I think we can get rid of this TODO

multicluster/cmd/check.go Outdated Show resolved Hide resolved
@adleong
Member

adleong commented Jan 21, 2021

I'm getting a panic when trying to use viz commands before installing the viz extension:

l viz stat deploy -n linkerd
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x23654f2]

goroutine 1 [running]:
github.com/linkerd/linkerd2/viz/pkg/healthcheck.(*HealthChecker).vizCategory.func1(0x2aacfe0, 0xc0003bfe00, 0x6fc23ac00, 0x2aacfe0)
	/linkerd-build/viz/pkg/healthcheck/healthcheck.go:65 +0x72
github.com/linkerd/linkerd2/pkg/healthcheck.(*HealthChecker).runCheck(0xc0002c6600, 0x27b5a0b, 0xb, 0xc000a878e8, 0x28986d8, 0x0)
	/linkerd-build/pkg/healthcheck/healthcheck.go:1506 +0x14c
github.com/linkerd/linkerd2/pkg/healthcheck.(*HealthChecker).RunChecks(0xc0002c6600, 0x28986d8, 0x2)
	/linkerd-build/pkg/healthcheck/healthcheck.go:1470 +0x298
github.com/linkerd/linkerd2/viz/pkg/healthcheck.(*HealthChecker).RunChecks(...)
	/linkerd-build/viz/pkg/healthcheck/healthcheck.go:54
github.com/linkerd/linkerd2/viz/pkg/api.CheckClientOrRetryOrExit(0x27b05cd, 0x7, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/linkerd-build/viz/pkg/api/api.go:37 +0xff
github.com/linkerd/linkerd2/viz/pkg/api.CheckClientOrExit(...)
	/linkerd-build/viz/pkg/api/api.go:18
github.com/linkerd/linkerd2/viz/cmd.NewCmdStat.func1(0xc0008b1340, 0xc0008e85a0, 0x1, 0x3, 0x0, 0x0)
	/linkerd-build/viz/cmd/stat.go:183 +0x21d
github.com/spf13/cobra.(*Command).execute(0xc0008b1340, 0xc0008e8570, 0x3, 0x3, 0xc0008b1340, 0xc0008e8570)
	/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:842 +0x453
github.com/spf13/cobra.(*Command).ExecuteC(0x38ec200, 0x0, 0x24a9a20, 0xc000182058)
	/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:950 +0x349
github.com/spf13/cobra.(*Command).Execute(...)
	/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:887
main.main()
	/linkerd-build/cli/main.go:10 +0x2d

- Fixed panic when issuing viz commands when the extension is not
  installed.
- Removed unnecessary wrapping functions `NewInternalPublicClient` and
  `NewExternalPublicClient`.
- Created new constant in `viz/cmd/root` holding "linkerd-viz".
- Moved `LinkerdControlPlaneVersionChecks` in `healthcheck.go` to its
  original location in that same file.
- Removed commented out `SelfCheck` in the public-api's `grpc_server.go`
@alpeb
Member Author

alpeb commented Jan 21, 2021

Most of the feedback has been addressed (still need to see if I can get rid of the tap functions in the public-api, and to have linkerd check skip the Gateways() call if viz is absent):

  • Fixed panic when issuing viz commands when the extension is not installed.
  • Removed unnecessary wrapping functions NewInternalPublicClient and NewExternalPublicClient.
  • Created new constant in viz/cmd/root holding "linkerd-viz".
  • Moved LinkerdControlPlaneVersionChecks in healthcheck.go to its original location in that same file.
  • Removed commented out SelfCheck in the public-api's grpc_server.go
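
The stack trace earlier in the thread points at a nil pointer dereference inside a viz check. The actual fix is not shown in this conversation; the sketch below only illustrates the general pattern of guarding a possibly-nil client and returning an error instead of panicking. The `checker` and `vizClient` types and the error text are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// vizClient is a hypothetical stand-in for the viz API client.
type vizClient struct {
	addr string
}

// checker keeps a client pointer that stays nil until the extension
// has been located in the cluster.
type checker struct {
	client *vizClient
}

// apiAddr returns the client's address, or an error when the client was
// never set. Without the nil guard, c.client.addr would panic.
func (c *checker) apiAddr() (string, error) {
	if c.client == nil {
		return "", errors.New("linkerd-viz extension is not installed")
	}
	return c.client.addr, nil
}

func main() {
	c := &checker{}
	if _, err := c.apiAddr(); err != nil {
		fmt.Println("error:", err)
	}
}
```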

alpeb added 2 commits January 20, 2021 20:42
...that have been superseded by the tap APIServer for a while.
Rolled back `multicluster/cmd/check.go` changes so that we keep on using
the public-api client, instead of viz'. If the viz extension is not
detected then the "all gateway mirrors are healthy" check will just
issue a warning as shown below. Otherwise, a viz client is instantiated and
`Gateways()` is called on it.

```
$ linkerdf cli mc check
kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API

linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
√ controller pod is running
√ can initialize the client
√ can query the control plane API

linkerd-multicluster
--------------------
√ Link CRD exists
√ Link resources are valid
        * target
√ remote cluster access credentials are valid
        * target
√ clusters share trust anchors
        * target
√ service mirror controller has required permissions
        * target
√ service mirror controllers are running
        * target
× all gateway mirrors are healthy
        failed to fetch gateway metrics: could not find the linkerd-viz extension
    see https://linkerd.io/checks/#l5d-multicluster-gateways-endpoints for hints
√ all mirror services have endpoints
√ all mirror services are part of a Link

Status check results are ×
```
@alpeb
Member Author

alpeb commented Jan 21, 2021

In the last push I rolled back multicluster/cmd/check.go changes so that we keep on using
the public-api client, instead of viz'. If the viz extension is not
detected then the "all gateway mirrors are healthy" check will just
issue a warning as shown below. Otherwise, a viz client is instantiated and
Gateways() is called on it.

$ linkerdf cli mc check
kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API

linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
√ controller pod is running
√ can initialize the client
√ can query the control plane API

linkerd-multicluster
--------------------
√ Link CRD exists
√ Link resources are valid
        * target
√ remote cluster access credentials are valid
        * target
√ clusters share trust anchors
        * target
√ service mirror controller has required permissions
        * target
√ service mirror controllers are running
        * target
× all gateway mirrors are healthy
        failed to fetch gateway metrics: could not find the linkerd-viz extension
    see https://linkerd.io/checks/#l5d-multicluster-gateways-endpoints for hints
√ all mirror services have endpoints
√ all mirror services are part of a Link

Status check results are ×

Contributor

@kleimkuhler kleimkuhler left a comment

Sorry there are a few merge conflicts after the tap-injector merge!

viz/pkg/healthcheck/healthcheck.go Outdated Show resolved Hide resolved
Contributor

@kleimkuhler kleimkuhler left a comment

Nice! 🏁

Member

@adleong adleong left a comment

I think the list of images in bin/docker-push needs to be updated.

multicluster/cmd/check.go Outdated Show resolved Hide resolved
@alpeb
Member Author

alpeb commented Jan 21, 2021

In the last push, linkerd mc check no longer errors when viz is not available. I'm also setting up the log correctly in multicluster/cmd/root.go so that messages are displayed when --verbose is used:

$ bin/go-run cli mc check 
kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API

linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
√ controller pod is running
√ can initialize the client
√ can query the control plane API

linkerd-multicluster
--------------------
√ Link CRD exists
√ Link resources are valid
        * target
√ remote cluster access credentials are valid
        * target
√ clusters share trust anchors
        * target
√ service mirror controller has required permissions
        * target
√ service mirror controllers are running
        * target
√ all mirror services have endpoints
√ all mirror services are part of a Link

Status check results are √



$ bin/go-run cli mc check --verbose
kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API

linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
√ controller pod is running
DEBU[0000] Starting port forward to https://0.0.0.0:43271/api/v1/namespaces/linkerd/pods/linkerd-controller-85b69dcc46-84pdb/portforward?timeout=30s 34587:8085 
DEBU[0000] Port forward initialised                     
DEBU[0000] Expecting API to be served over [http://localhost:34587/api/v1/] 
√ can initialize the client
DEBU[0000] Making gRPC-over-HTTP call to [http://localhost:34587/api/v1/Version] [] 
DEBU[0000] Response from [http://localhost:34587/api/v1/Version] had headers: map[Content-Length:[56] Content-Type:[application/octet-stream] Date:[Thu, 21 Jan 2021 22:38:10 GMT]] 
DEBU[0000] gRPC-over-HTTP call returned status [200 OK] and content length [56] 
√ can query the control plane API

linkerd-multicluster
--------------------
√ Link CRD exists
√ Link resources are valid
        * target
√ remote cluster access credentials are valid
        * target
√ clusters share trust anchors
        * target
√ service mirror controller has required permissions
        * target
√ service mirror controllers are running
        * target
DEBU[0000] Skipping check: all gateway mirrors are healthy. Reason: failed to fetch gateway metrics 
√ all mirror services have endpoints
√ all mirror services are part of a Link

Status check results are √

Member

@adleong adleong left a comment

🌮🌮🌮

@alpeb alpeb merged commit 8ac5360 into main Jan 21, 2021
@alpeb alpeb deleted the alpeb/public-api-viz-cmd-updates branch January 21, 2021 23:26
jijeesh pushed a commit to jijeesh/linkerd2 that referenced this pull request Mar 23, 2021
…ings into a new viz component 'linkerd-metrics-api' (linkerd#5560)

* Protobuf changes:
- Moved `healthcheck.proto` back from viz to `proto/common` as it remains being used by the main `healthcheck.go` library (it was moved to viz by linkerd#5510).
- Extracted from `viz.proto` the IP-related types and put them in `/controller/gen/common/net` to be used by both the public and the viz APIs.

* Added chart templates for new viz linkerd-metrics-api pod

* Spin-off viz healthcheck:
- Created `viz/pkg/healthcheck/healthcheck.go` that wraps the original `pkg/healthcheck/healthcheck.go` while adding the `vizNamespace` and `vizAPIClient` fields which were removed from the core `healthcheck`. That way the core healthcheck doesn't have any dependencies on viz, and viz' healthcheck can now be used to retrieve viz api clients.
- The core and viz healthcheck libs are now abstracted out via the new `healthcheck.Runner` interface.
- Refactored the data plane checks so they don't rely on calling `ListPods`
- The checks in `viz/cmd/check.go` have been moved to `viz/pkg/healthcheck/healthcheck.go` as well, so `check.go`'s sole responsibility is dealing with command business. This command also now retrieves its viz api client through viz' healthcheck.

* Removed linkerd-controller dependency on Prometheus:
- Removed the `global.prometheusUrl` config in the core values.yml.
- Leave the Heartbeat's `-prometheus` flag hard-coded temporarily. TO-DO: have it automatically discover viz and pull Prometheus' endpoint (linkerd#5352).

* Moved observability gRPC from linkerd-controller to viz:
- Created a new gRPC server under `viz/metrics-api` moving prometheus-dependent functions out of the core gRPC server and into it (same thing for the accompanying http server).
- Did the same for the `PublicAPIClient` (now called just `Client`) interface. The `VizAPIClient` interface disappears as it's enough to just rely on the viz `ApiClient` protobuf type.
- Moved the other files implementing the rest of the gRPC functions from `controller/api/public` to `viz/metrics-api` (`edge.go`, `stat_summary.go`, etc.).
- Also simplified some type names to avoid stuttering.

* Added linkerd-metrics-api bootstrap files. At the same time, we strip out of the public-api's `main.go` file the prometheus parameters and other no longer relevant bits.

* linkerd-web updates: it requires connecting with both the public-api and the viz api, so both addresses (and the viz namespace) are now provided as parameters to the container.

* CLI updates and other minor things:
- Changes to command files under `cli/cmd`:
  - Updated `endpoints.go` according to new API interface name.
  - Updated `version.go`, `dashboard` and `uninstall.go` to pull the viz namespace dynamically.
- Changes to command files under `viz/cmd`:
  - `edges.go`, `routes.go`, `stat.go` and `top.go`: point to dependencies that were moved from public-api to viz.
- Other changes to have tests pass:
  - Added `metrics-api` to list of docker images to build in actions workflows.
  - In `bin/fmt` exclude protobuf generated files instead of entire directories because directories could contain both generated and non-generated code (case in point: `viz/metrics-api`).

* Add retry to 'tap API service is running' check

* mc check shouldn't err when viz is not available. Also properly set the log in multicluster/cmd/root.go so that it properly displays messages when --verbose is used

Signed-off-by: Jijeesh <jijeesh.ka@gmail.com>
jijeesh pushed a commit to jijeesh/linkerd2 that referenced this pull request Apr 21, 2021
Linked issue: update multicluster gatesways command to use linkerd-viz extension