-
Notifications
You must be signed in to change notification settings - Fork 7.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
do not apply rate limits for sds server #41213
Conversation
Signed-off-by: Rama Chavali <rama.rao@salesforce.com>
🤔 🐛 You appear to be fixing a bug in Go code, yet your PR doesn't include updates to any test files. Did you forget to add a test? Courtesy of your friendly test nag. |
envoyproxy/envoy#21600 - This might improve things but still rate limiting is not needed for sds |
pilot/pkg/xds/discovery.go
Outdated
@@ -154,6 +154,9 @@ type DiscoveryServer struct { | |||
// ClusterAliases are aliase names for cluster. When a proxy connects with a cluster ID | |||
// and if it has a different alias we should use that a cluster ID for proxy. | |||
ClusterAliases map[cluster.ID]cluster.ID | |||
|
|||
// Indicates this is running as an Sds Server. | |||
SdsServer bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i donot think this is a good mark. I am thinking not add any filed here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason why you think we should add any fields here?
Do you have any alternatives to identify otherwise? we can drive this by req.typeURL if you think that is better - I can change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry we can not do by typeURL at the place where we apply rate limit (new stream init)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, can we just make the rate limit configurable and make it MaxInt for SDS?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
configurable you mean in DiscoveryServer? I can make that change but some log lines are also confusing that for each SDS stream - connection id keep incrementing. Should we leave them as is?
new connection for node:istio-ingressgateway-processing-us-west-2c-6c74f945bc-6cpvz.sfproxy**-13**
Signed-off-by: Rama Chavali <rama.rao@salesforce.com>
@howardjohn made changes. PTAL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rate limit part lgtm, not sure about the logs
pilot/pkg/xds/ads.go
Outdated
@@ -147,11 +147,19 @@ func (s *DiscoveryServer) receive(con *Connection, identities []string) { | |||
req, err := con.stream.Recv() | |||
if err != nil { | |||
if istiogrpc.IsExpectedGRPCError(err) { | |||
log.Infof("ADS: %q %s terminated", con.peerAddr, con.conID) | |||
if req != nil && req.TypeUrl == v3.SecretType { | |||
log.Infof("ADS: connection terminated for secrets:%v", strings.Join(req.ResourceNames, ",")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Secret is used even for Istiod, and may have 10000s of secrets, not sure we should log this
@@ -58,6 +60,8 @@ var _ model.XdsResourceGenerator = &sdsservice{} | |||
|
|||
func NewXdsServer(stop chan struct{}, gen model.XdsResourceGenerator) *xds.DiscoveryServer { | |||
s := xds.NewXDS(stop) | |||
// No ratelimit for SDS calls in agent. | |||
s.DiscoveryServer.RequestRateLimit = rate.NewLimiter(rate.Limit(math.MaxInt), 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: rate.Inf
exists, but its basically the same
@howardjohn fixed logs |
@howardjohn can you PTAL when you get chance? |
@@ -58,6 +59,8 @@ var _ model.XdsResourceGenerator = &sdsservice{} | |||
|
|||
func NewXdsServer(stop chan struct{}, gen model.XdsResourceGenerator) *xds.DiscoveryServer { | |||
s := xds.NewXDS(stop) | |||
// No ratelimit for SDS calls in agent. | |||
s.DiscoveryServer.RequestRateLimit = rate.NewLimiter(rate.Inf, 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about not setting limit at all
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be good. I see we already handle zero limit
istio/pilot/pkg/xds/discovery.go
Line 662 in 247e396
if s.requestRateLimit.Limit() == 0 { |
pilot/pkg/xds/ads.go
Outdated
@@ -147,11 +147,15 @@ func (s *DiscoveryServer) receive(con *Connection, identities []string) { | |||
req, err := con.stream.Recv() | |||
if err != nil { | |||
if istiogrpc.IsExpectedGRPCError(err) { | |||
log.Infof("ADS: %q %s terminated", con.peerAddr, con.conID) | |||
if req != nil && req.TypeUrl != v3.SecretType { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you check not secret type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because for secret types we do not want to log this messages as it can be 1000s as it logs for every secret.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
np, without ratelimit, will error happen
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually important for new connection with out this check
2022-09-29T03:33:23.515201Z info ads ADS: new connection for node:istio-ingressgateway-processing-us-west-2c-6c74f945bc-6cpvz.sfproxy-13
lots of these mis leading logs with come. So I protected at both places
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can happen only at L180, how about changing to debug
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that log at info level is useful
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like we are hiding a useful error message. This seems to be because we are mixing "SecretType" to mean "istio-agent SDS server", when its not the case...
Signed-off-by: Rama Chavali <rama.rao@salesforce.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't feel comfortable hiding all the secret errors. IMO if we want to modify logging (which is a big if, it seems over complex), it should be done as DiscoveryServer params, not based on Type
pilot/pkg/xds/ads.go
Outdated
@@ -147,11 +147,15 @@ func (s *DiscoveryServer) receive(con *Connection, identities []string) { | |||
req, err := con.stream.Recv() | |||
if err != nil { | |||
if istiogrpc.IsExpectedGRPCError(err) { | |||
log.Infof("ADS: %q %s terminated", con.peerAddr, con.conID) | |||
if req != nil && req.TypeUrl != v3.SecretType { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like we are hiding a useful error message. This seems to be because we are mixing "SecretType" to mean "istio-agent SDS server", when its not the case...
Signed-off-by: Rama Chavali <rama.rao@salesforce.com>
Since there is no consensus on logging changes, I have reverted them for now. I certainly think 2022-09-29T03:33:23.515201Z info ads ADS: new connection for node:istio-ingressgateway-processing-us-west-2c-6c74f945bc-6cpvz.sfproxy-13 is pretty confusing log line in agent but I will think about it a bit more to get proper solution. |
When DiscoveryServer is instantiated as a SDS Service in agent, it applies default rate limit of 1 which prints log lines like this in agent
Also the log lines like
for every secret request is confusing.
This PR fixes both