Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Acknowledge different types of error from NotifyOnDNSMsg #36990

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

vipul-21
Copy link
Contributor

NotifyOnDNSMsg can send three types of error for each call from DNS proxy, i.e. ErrTimeout, ErrNoEndpoint and ErrDNSMsgDetails. This commit introduced the acknowledgment of these error while processing the req/response in DNS prxoy and send refused if there is an error. This may not be true always. For example, during the timeout error in case of request, it be returned without forwarding the req further. The PR enhances the ability to either refuse the processing of msg further based on the type of dns msg and the response from the callback. Types of error:

  • ErrTimeout: Proxy context timeout
  • ErrNoEndpoint: Endpoint is nil
  • ErrDNSMsgDetails: Error processing DNS Msg
Type Of Msg ErrTimeout ErrNoEndpoint ErrDNSMsgDetails
Request Return with warning log send refused send refused
Response Forward the response with warning log send refused send refused

Note: Planning to clean up the code as well by using the NotifyOnDNSMsgContext for encompassing all the other parameters sent to the callback. But will create a separate PR for that.

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jan 15, 2025
@github-actions github-actions bot added the sig/policy Impacts whether traffic is allowed or denied based on user-defined policies. label Jan 15, 2025
@vipul-21 vipul-21 marked this pull request as ready for review January 15, 2025 16:07
@vipul-21 vipul-21 requested a review from a team as a code owner January 15, 2025 16:07
@vipul-21 vipul-21 requested a review from doniacld January 15, 2025 16:07
NotifyOnDNSMsg can send three types of error for each call.
The DNS proxy needs to forward/deny the processing further
based on these errors.

Signed-off-by: Vipul Singh <singhvipul@microsoft.com>
@vipul-21 vipul-21 force-pushed the singhvipul/errorhandling branch from 9d00c48 to 17365e5 Compare January 15, 2025 16:19
@tklauser tklauser requested a review from bimmlerd January 16, 2025 10:31
@tklauser
Copy link
Member

This commit introduced the acknowledgment of these error while processing the req/response in DNS prxoy and send refused if there is an error.

cc @bimmlerd as the author of that commit

Copy link
Member

@bimmlerd bimmlerd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution!

I think I need a bit of context to understand what you're trying to solve. Is there an issue which describes what the problem is?

}
switch notifyOnDNSMsgContext.ErrorType {
case ErrTimeout:
scopedLog.Warn("Timeout waiting for processing of DNS query")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you motivate a bit more why it's better to have the client time out than being sent a REFUSED? Is this about retries?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes correct, if it is refused there won't be retry from the client.

DNSProxyResponse DNSProxyType = "response"
)

// There are three type of Error returned by the NotifyOnDNSMsgFunc
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// There are three type of Error returned by the NotifyOnDNSMsgFunc
// There are three types of errors returned by the NotifyOnDNSMsgFunc

}
switch notifyOnDNSMsgContext.ErrorType {
case ErrTimeout:
scopedLog.Warn("Timeout waiting for processing of DNS response")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This branch doesn't return after the log, which means we write the response to the client even if notifyOnDNSMsg may have failed. I think this is the intention?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is correct. This is intentional. Proxy request timeout can occur after we get the response and in that case we just log the warning and forward the response.

@vipul-21
Copy link
Contributor Author

Thanks @bimmlerd for the review ! :)

I think I need a bit of context to understand what you're trying to solve. Is there an issue which describes what the problem is?

The notifyOnDNSMsg sends an error if there is a DNS message processing issue. We typically don't ack errors in the ServeDNS method and when we do, we always send "refused," which isn't always appropriate. The goal of adding these error types to notifyOnDNSMsg is to ensure DNS requests/responses are denied or forwarded based on processing errors. (Future updates will use NotifyOnDNSMsgContext to send all parameters for better readability).
There is no active issue for this.

@vipul-21 vipul-21 requested a review from bimmlerd January 16, 2025 18:02
Copy link
Member

@bimmlerd bimmlerd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the PR steers the behaviour in the right direction, but it's rather difficult to reason through. As a proposal, do you think it'd make the code clearer if we had different callbacks? I was thinking maybe one for an error encountered (with an error kind similar to what this proposes) and one for request/response each?
I feel like some of the difficulty stems from the fact that this notifyOnDNSMsg is called both in the error and the normal case.

@vipul-21
Copy link
Contributor Author

I agree with making the error callback and request/response(each) separated. I looked at the callbacks from DNS proxy and divided them into these categories(there are total 12 callbacks):

  • Error on DNS Proxy end
    This can happen at different processing times while serving the DNS req.
    • In case of error before the endpoint is not retrieved: We only update the stat(proxy context) and return.
    • In case of error after endpoint is retrieved: We update the stat and ep proxy stat given the dns message(request) is parsed properly. There is a response callback too where we need to update the dns data as well.
  • Allow/Deny a DNS request
    • We update the stat and ep proxy stat given the dns message(request) is parsed properly.
  • Response from upstream dns
    • We update the stat and ep proxy stat as well dns data in endpoint.

Does it make sense to separate out the stat/ep proxy state updates i.e. UpdateProxyStats ?
Based on that, there will be 3 callbacks :

  • NotifyOnDNSProxyError(): Will be called if there is an error on DNS proxy side to update stat, ep proxy stat. (Calls UpdateProxyStats)
  • NotifyOnRequest(): Will be called when we want to forward an allowed DNS request. It will call the UpdateProxyStats first and if there is an error will return a Boolean along with the error if the request can be forwarded or not. (Similar to what is proposed in the PR)
  • NotifyOnResponse(): Will be called for response from upstream to update the dns data. This will internally call UpdateProxyStats first and then update the DNS data in endpoint. In case of error, handles the error if the response can be forwarded or not.

There is also recording of the log which happens at the end that needs to be handled for each callback. Let me know your thoughts.

@bimmlerd
Copy link
Member

12 callbacks

Yeah it's pretty tangled at the moment😞 There's probably no way around trying to write down the full control flow for each of these and then try to find a clearer pattern which fits.

The other approach would be to reason from first principles - i.e. what could the proxy do itself and for what does it really need to call out to other things.

From a brief look through the code, it seems like the proxy currently wants to report errors to the daemon. I'm unsure whether this is truly necessary, or whether the tracing and metrics stuff could be handled internally to the proxy. If so, the query handling would probably also not need a callback. (Unless we factor out the DNS policy (not toFQDN) part from ServerDNS too - that could also be a separate component being called into by the proxy.)

Response handling, on the other hand, must be forwarded to the policy subsystem (i.e. the core of notifyOnDNSMessage gated by being a response) - that part should certainly stay.

(BTW, let's not make the next iteration use function pointers, but have the DNS proxy struct take an interface which other components can implement. That makes testing and reuse easier too, for the purposes of the SDP)

Copy link

This pull request has been automatically marked as stale because it
has not had recent activity. It will be closed if no further activity
occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. label Feb 27, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. sig/policy Impacts whether traffic is allowed or denied based on user-defined policies. stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants