-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding edges endpoint to public API #2793
Conversation
Integration test results for 90dc181: success 🎉 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome! The test instructions and main.go are super useful, thanks!
I hit a snag because the local public-api expects to connect prometheus, which requires a port-forward to linkerd-prometheus in k8s. Instead I just connected the main.go you provided directly to a public-api running in k8s:
$ kubectl -n linkerd port-forward svc/linkerd-controller-api 8085
$ bin/go-run .
github.com/linkerd/linkerd2
ok:<edges:<src:<type:"pod" name:"web-57b7f9db85-nq8hb" > dst:<type:"pod" name:"voting-689f845d98-lbh2x" > client_id:"web" server_id:"voting" > edges:<src:<type:"pod" name:"vote-bot-7466ffc7f7-prf8g" > dst:<type:"pod" name:"web-57b7f9db85-nq8hb" > client_id:"default" server_id:"web" > edges:<src:<type:"pod" name:"web-57b7f9db85-nq8hb" > dst:<type:"pod" name:"emoji-646ddcc5f9-dfv5t" > client_id:"web" server_id:"emoji" > >
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking great! I left a few comments about some unused code but overall this looks good. I am looking forward to one-sided edge stats.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome work! Agree with the comments others have posted.
I had some questions about the use of label_replace
. Just to confirm my understanding is correct, are we using label_replace
so that for example for pod queries, pod
becomes labelled destination_pod
in the inbound query, and dst_pod
becomes destination_pod
in the outbound query? And then in processEdgeMetrics
we're using that destination_pod
key for the matching?
Could one also accomplish this matching by doing the queries without label_replace, and then matching them up by querying the labels directly in processEdgeMetrics
? it seems like in that case label_replace
isn't saving us that much code. or maybe this replacement is doing more that I'm not considering?
@@ -15,8 +14,8 @@ type edgeResult struct {
}
const (
- inboundIdentityQuery = "label_replace(sum(increase(response_total%s[1m])) by (%s,client_id), \"destination_%s\", \"$1\", \"%s\", \"(.+)\")"
- outboundIdentityQuery = "label_replace(sum(increase(response_total%s[1m])) by (%s, dst_%s, server_id, no_tls_reason), \"destination_%s\", \"$1\", \"dst_%s\", \"(.+)\")"
+ inboundIdentityQuery = "sum(increase(response_total%s[1m])) by (%s,client_id)"
+ outboundIdentityQuery = "sum(increase(response_total%s[1m])) by (%s, dst_%s, server_id, no_tls_reason)"
)
func (s *grpcServer) Edges(ctx context.Context, req *pb.StatSummaryRequest) (*pb.EdgesResponse, error) {
@@ -55,8 +54,8 @@ func (s *grpcServer) getEdges(ctx context.Context, req *pb.StatSummaryRequest, t
labelsOutbound := labels.Merge(promDirectionLabels("outbound"))
labelsInbound := labels.Merge(promDirectionLabels("inbound"))
- inboundQuery := fmt.Sprintf(inboundIdentityQuery, labelsInbound, resourceType, resourceType, resourceType)
- outboundQuery := fmt.Sprintf(outboundIdentityQuery, labelsOutbound, resourceType, resourceType, resourceType, resourceType)
+ inboundQuery := fmt.Sprintf(inboundIdentityQuery, labelsInbound, resourceType)
+ outboundQuery := fmt.Sprintf(outboundIdentityQuery, labelsOutbound, resourceType, resourceType)
inboundResult, err := s.queryProm(ctx, inboundQuery)
if err != nil {
@@ -77,7 +76,8 @@ func processEdgeMetrics(req *pb.StatSummaryRequest, inbound, outbound model.Vect
dstIndex := map[model.LabelValue]model.Metric{}
srcIndex := map[model.LabelValue][]model.Metric{}
keys := map[model.LabelValue]struct{}{}
- resourceLabelReplacement := "destination_" + resourceType
+ resourceReplacementInbound := resourceType
+ resourceReplacementOutbound := "dst_" + resourceType
formatMsg := map[string]string{
"disabled": "Disabled",
@@ -93,7 +93,7 @@ func processEdgeMetrics(req *pb.StatSummaryRequest, inbound, outbound model.Vect
// the communication was one-sided (i.e. a probe or another instance where the src/dst are not both known)
// in future the edges command will support one-sided edges
if _, ok := sample.Metric[model.LabelName("client_id")]; ok {
- key := sample.Metric[model.LabelName(resourceLabelReplacement)]
+ key := sample.Metric[model.LabelName(resourceReplacementInbound)]
dstIndex[key] = sample.Metric
keys[key] = struct{}{}
}
@@ -102,7 +102,7 @@ func processEdgeMetrics(req *pb.StatSummaryRequest, inbound, outbound model.Vect
for _, sample := range outbound {
// skip any outbound results that do not have a server_id for same reason as above section
if _, ok := sample.Metric[model.LabelName("server_id")]; ok {
- key := sample.Metric[model.LabelName(resourceLabelReplacement)]
+ key := sample.Metric[model.LabelName(resourceReplacementOutbound)]
if _, ok := srcIndex[key]; !ok {
srcIndex[key] = []model.Metric{}
}
Integration test results for 9aaf2ce: success 🎉 |
Thanks @siggy @dadjeibaah @rmars for awesome feedback! I have incorporated your changes except for the outstanding questions (I left those threads unresolved on this page) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks great. a couple more comments. 👍
also, works!
$ bin/go-run .
ok:<edges:<src:<type:"pod" name:"linkerd-web-85877d47f7-wb2ts" > dst:<type:"pod" name:"linkerd-controller-64dd44c9d5-fgxk8" > client_id:"linkerd-web" server_id:"linkerd-controller" > edges:<src:<type:"pod" name:"linkerd-controller-64dd44c9d5-fgxk8" > dst:<type:"pod" name:"linkerd-prometheus-58b9c8f7bd-4wlwh" > client_id:"linkerd-controller" server_id:"linkerd-prometheus" > >
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work! I like this implementation a lot, and only had a few minor quibbles. It would be great to add an edges_test.go
file eventually, but doesn't have to happen as part of this PR.
Integration test results for 6dda24c: success 🎉 |
Integration test results for 6b73c05: fail 😕 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⭐️ Great, thanks for updating! Looks good to me, but would be great to have @siggy / @rmars @dadjeibaah give it another look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀 🐑 niceeee!
@scottcarol what errors are you getting, also, currently reviewing your PR. |
@dadjeibaah the integration tests are failing when they are run by l5d-bot but not when they are run on my local machine. The last line in the gist from l5d-bot is |
Ahh, that seems unexpected. I would rerun the tests I think the build had a minor hiccup or something. |
6b73c05
to
6b2b576
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great! Thanks for updating! ⭐️ 📦
Integration test results for 6b2b576: success 🎉 |
Using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
awesome work! agree with @klingerf re: adding edges_test.go
at some point. 👍 🚢
return edge, nil | ||
} | ||
|
||
func processEdgeMetrics(inbound, outbound model.Vector, resourceType string) []*pb.Edge { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this code is super nice, i wonder if we could leverage it for stats down the road. 👍
This PR is the API portion of #2708.
It adds an endpoint to the public API to allow us to query Prometheus for edge data, in order to display identity information for connections between Linkerd proxies.
This PR ONLY includes changes to the controller and protobuf.
The corresponding CLI code for a
linkerd edges
query is written, but I want feedback on this portion before making the second PR.Notes
pods
anddeployments
.linkerd stat
(for example,authority
) it's unclear if the current results are complete/useful, I need feedback on this.Prometheus querying
For each edge, we are returning:
src name
- resource namedst name
- resource nameclient_id
- service account name for sourceserver_id
- service account name for destinationno_tls_reason
- reason for no identity, if none knownTo get this info from Prometheus we are combining information from two different queries, an inbound and outbound query.
We use
label_replace
in Prometheus to create a key that we can matchinbound
tooutbound
queries with, so we can populate the wholeedge
result.Remember that we are currently filtering out edges that are one-sided (i.e. health probes, Prometheus, tap) and also edges where we cannot definitively match the inbound/outbound queries together. That will be the next step.
To Test
If you make Protobuf changes, run:
You should see
starting HTTP server on :8085
Make a temporary
main.go
file that you can test API calls with, in the/linkerd2
directory (suggested code at bottom of this PR)In a separate terminal window, run
bin/go-run .
See what data is returned.
pods
inbooksapp
, you can compare to the result from:(Note, some of those results may be one-sided (i.e. a probe) so not all will appear)
=============
Suggested
main.go
file contents, you can switch out differentNamespace
andType
values: