Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

resolver/dns: support custom dns authority #2265

Merged
merged 1 commit into from
Sep 25, 2018

Conversation

elliots
Copy link
Contributor

@elliots elliots commented Aug 15, 2018

As described in https://github.com/grpc/grpc/blob/master/doc/naming.md, allows the use of a custom authority (dns server) with the dns resolver on go 1.9+. Gives an error in < go 1.9 if an authority is set.

e.g.

resolver.Target{Endpoint: "mydomain.com", Authority: "8.8.8.8"}

I'm not sure how to go about testing it though. Just check that the authority is passed through but continue as before in the tests?

See: #2173

@elliots elliots force-pushed the customdnsauthority branch 5 times, most recently from 3ce2cd6 to 211bde6 Compare August 15, 2018 04:25
@elliots
Copy link
Contributor Author

elliots commented Aug 15, 2018

I ran into some issues with different versions of go, and this got pretty ugly pretty quickly.

@kriskowal
Copy link

Looks like CI fails a vet check, and cursory inspection suggests unrelated issue with generated code mismatching expectations.

@menghanl
Copy link
Contributor

The vet failure should be fixed by #2263. Please rebase and try again.

@@ -1,4 +1,4 @@
// +build go1.8
// +build go1.6, !go1.9

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filename becomes a bit of a misnomer. Idea: go18down, go19up.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, pending vet fix.

@elliots
Copy link
Contributor Author

elliots commented Aug 17, 2018

Rebased on top of #2263 and renamed version specific files as suggested.

Is there anything I can add to the tests for this functionality?

@kriskowal
Copy link

Hello, @lyuxuan. We are very interested in this feature at Uber. Please let us know what we can do to support its progress to release. Thank you @elliots, I was assigned this issue the day before you submitted your change!

@elliots elliots force-pushed the customdnsauthority branch from e5ff36a to fa6a7d9 Compare August 18, 2018 02:20
@elliots
Copy link
Contributor Author

elliots commented Aug 18, 2018

Cleaned up the default resolver stuff a little. I think it's looking ok now. Still no tests, but not sure how to do that.

@elliots elliots changed the title resolver/dns: support custom dns authority WIP resolver/dns: support custom dns authority Aug 18, 2018
PreferGo: true,
Dial: func(ctx context.Context, network, address string) (net.Conn, error) {
dialer := net.Dialer{}
return dialer.DialContext(ctx, "udp", authority)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we pass network through instead of punching in "udp"? I believe DNS as a protocol at least has the option of falling back to TCP if messages exceed the UDP datagram size limit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. Just pass network to DialContext, and let the golang net package handle the fallback.

@@ -121,6 +127,12 @@ func (b *dnsBuilder) Scheme() string {
return "dns"
}

type netResolver interface {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Creating a fake or mock for this interface might be the key to building up some coverage. We could also subvert a global function pointer for customAuthorityResolver for the duration of tests.

Alternately we can stand up a UDP socket and fake the DNS protocol to verify that the custom authority catches the traffic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm using github.com/miekg/dns to implement my dns server, would it be ok to bring in that dependency for a test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's preferred not to pull in extra dependency.

I agree with @kriskowal suggestion about having a global function pointer for usage inside customAuthorityResolver. We don't really need to have a round trip of DNS request/response, we just need to verify the resolver try to dial to the authority specified in the target string.


return &go18upResolver{
resolver: &net.Resolver{
PreferGo: true,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason for making PreferGo true?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we have to set it to true in order to use a custom dialler
( according to golang/go#19268 (comment) )

)

func customAuthorityResolver(authority string) (netResolver, error) {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: delete blank line

@@ -0,0 +1,29 @@
// +build go1.6, !go1.9
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change file name to pre_go19.go

lookupHost = net.DefaultResolver.LookupHost
lookupSRV = net.DefaultResolver.LookupSRV
lookupTXT = net.DefaultResolver.LookupTXT
import (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change file name to go18_test.go

@@ -0,0 +1,44 @@
// +build go1.9
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change file name to go19.go


/*
*
* Copyright 2017 gRPC authors.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2018


/*
*
* Copyright 2017 gRPC authors.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2018


/*
*
* Copyright 2017 gRPC authors.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2018


/*
*
* Copyright 2017 gRPC authors.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2018

@@ -0,0 +1,44 @@
// +build go1.6, !go1.8
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change file name to pre_go18.go

@kriskowal
Copy link

@lyuxuan do you have recommendations for improving test coverage? Asking for a friend ;)

@elliots elliots force-pushed the customdnsauthority branch from fa6a7d9 to fe90869 Compare August 22, 2018 23:47
@lyuxuan
Copy link
Contributor

lyuxuan commented Aug 24, 2018

Hi @kriskowal, I think having a test verifying DNS contacting the custom DNS authority when target string containing authority info is enough for this change. You are very welcome to bring up more test cases :). Thanks!

@lyuxuan
Copy link
Contributor

lyuxuan commented Aug 30, 2018

@elliots, friendly ping. Let us know if you have any question/problem. Thanks!

@elliots elliots force-pushed the customdnsauthority branch from fe90869 to 2242818 Compare August 31, 2018 22:48
@elliots
Copy link
Contributor Author

elliots commented Aug 31, 2018

All the tests in testDNSResolver are now run twice, once with no custom authority (as before) and once with one. I check to see the value has been passed in, then resolve using the test resolver.

@elliots
Copy link
Contributor Author

elliots commented Aug 31, 2018

I'm not sure how to go about the dial test...

@lyuxuan
Copy link
Contributor

lyuxuan commented Sep 4, 2018

Thanks for the update, @elliots. For testing, I think it's not necessary to include the dial test right now. And

All the tests in testDNSResolver are now run twice, once with no custom authority (as before) and once with one.

This will run the exactly same logic twice. Can you make a separate test for verifying the right authority address is passed to the custom DNS dialer? Thanks!

@elliots
Copy link
Contributor Author

elliots commented Sep 7, 2018

Added a new test that only checks that the resolver receives the custom authority.

@lyuxuan
Copy link
Contributor

lyuxuan commented Sep 8, 2018

Hi @elliots, thanks for the quick update. Appreciate it.

I may not have been very clear about what I meant by "dial test". I was suggesting it's not necessary to have a dns server set up and dial to it in the test. But I expect the Dial here to still be invoked. Instead of assigning the inline function to Dial, we can define a global variable to be same as the inline function, and assign the global variable to Dial. In the test, we set the global variable to be a made up dial function which does not really dial, but just verifies the dial target is the authority specified (plus the port :53 if the authority does not have a port).

)

type go18upResolver struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually an issue with our testing script (not using the most recent sdk for appengine test). I've fixed it through #2311.
Also, let's just support custom authority dialer from go1.9 (where type alias is supported, and thus the two context types are regarded as the same). Please change the build tags accordingly.

}

func ensureAuthorityPort(target string) (targetWithPort string, err error) {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: delete empty line.

customAuthorityDialler = oldCustomAuthorityDialler
}()
for _, a := range tests {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line 101-112 are unnecessary. ensureAuthorityPort will be called inside b.Build so we can just check the error returned from b.Build.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Errors from the dialler aren't returned from b.Build

customAuthorityDialler = func(authority string) func(ctx context.Context, network, address string) (net.Conn, error) {
	return func(ctx context.Context, network, address string) (net.Conn, error) {
		return nil, errors.New("NOPE")
	}
}

b := NewBuilder()
cc := &testClientConn{target: "foo.bar.com"}
r, err := b.Build(resolver.Target{Endpoint: "foo.bar.com", Authority: a.authority}, cc, resolver.BuildOption{})
if r != nil {
	r.Close()
}

if err == nil {
	t.Errorf("Should not be nil")
}

Am I missing something, or is that an issue with the builder?

Copy link
Contributor

@lyuxuan lyuxuan Sep 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't care the error from the customAuthorityDialler, but we do want to check the error from customAuthorityResolver (which is essentially the error returned by ensureAuthorityPort if not nil). I think the error from customAuthorityResolver is returned by b.Build.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, that's what I did. Just checking if it should be returning that error. Removed the todos and recommitted.

}
return func(ctx context.Context, network, address string) (net.Conn, error) {
var dialer net.Dialer
return dialer.DialContext(ctx, network, authority)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't need to dial in the test, just return something like nil, errors.New("no need to dial").


b := NewBuilder()
cc := &testClientConn{target: "foo.bar.com"}
r, _ := b.Build(resolver.Target{Endpoint: "foo.bar.com", Authority: a.authority}, cc, resolver.BuildOption{})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check the error returned by b.Build, essentially the logic from Line 101-108.

}

// target host is ipv4 or host-name
return fmt.Sprintf("%s:%s", host, port), nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems ensureAuthorityPort share substantial logic with parseTarget. Could you please merge these two functions into one. So parseTarget takes an extra argument like defaultPort. And here for ensureAuthorityPort, you calls parseTarget with the new arg to be "53", and then call the net.JoinHostPort on the returned host and port.

@kriskowal
Copy link

@elliots Please let me know if you would like me to take a round of addressing feedback. I can send a commit from a fork if need be.

@elliots
Copy link
Contributor Author

elliots commented Sep 20, 2018

Thanks @kriskowal, I'll try to get the fixes in a bit quicker so that shouldn't be necessary. But if I don't respond for a day, you should feel free to take it and get it across the line :)

@elliots elliots force-pushed the customdnsauthority branch 4 times, most recently from 2ff27eb to 15db8e2 Compare September 21, 2018 01:21
}()

for _, a := range tests {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: delete empty line.

for _, a := range tests {

customAuthorityDialler = func(authority string) func(ctx context.Context, network, address string) (net.Conn, error) {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: delete empty line

return func(ctx context.Context, network, address string) (net.Conn, error) {
if authority != a.authorityWant {
t.Errorf("wrong custom authority passed to resolver. input: %s expected: %s actual: %s", a.authority, a.authorityWant, authority)
t.FailNow()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why t.FailNow() is needed?

if !a.expectError && err != nil {
t.Errorf("unexpected error using custom authority %s: %s", a.authority, err)
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: delete empty line

}

}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: delete empty line

}

var customAuthorityResolver = func(authority string) (netResolver, error) {
host, port, err := parseTarget(authority, "53")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

define a constant(e.g. defaultDNSSvrPort) for "53", and replace it here and in tests.


return func(ctx context.Context, network, address string) (net.Conn, error) {
if authority != a.authorityWant {
t.Errorf("wrong custom authority passed to resolver. input: %s expected: %s actual: %s", a.authority, a.authorityWant, authority)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the the dialler will be called in a separate goroutine(i.e. the watcher goroutine), the Errorf here may not be called before the main test goroutine exit.

The better solution here is to create a channel of error type. And inside the custom dial function, push an error(or nil for no error) to this channel. And in the main goroutine, wait until an error is pushed to the channel, and check whether it is nil or not.

for _, a := range tests {
	errChan := make(chan error, 1)
	customAuthorityDialler = func(...) {
		...
    	return func(ctx context.Context, network, address string) (net.Conn, error) {
	  		if authority != a.authorityWant {
        		errChan <- fmt.Errorf(...)
			} else {
				errChan <- nil
			}
		...
	}
    ...
	// wait until the check in the dial function completes
	err := <- errChan
	// verify whether there's an error or not.
	if err != nil {
		t.Errorf(...)
	}
}              

Copy link
Contributor

@lyuxuan lyuxuan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @elliots for your continuous effort and responsiveness! Just some nits that need a quick fix. Otherwise LGTM!


func (*testResolver) LookupHost(ctx context.Context, host string) ([]string, error) {
return hostLookup(host)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a blank line between functions.

}
func (*testResolver) LookupSRV(ctx context.Context, service, proto, name string) (string, []*net.SRV, error) {
return srvLookup(service, proto, name)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a blank line between functions.


func (*preGo19Resolver) LookupHost(ctx context.Context, host string) ([]string, error) {
return net.LookupHost(host)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a blank line between functions.

}
func (*preGo19Resolver) LookupSRV(ctx context.Context, service, proto, name string) (string, []*net.SRV, error) {
return net.LookupSRV(service, proto, name)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a blank line between functions.

},
{
"4.3.2.1",
"4.3.2.1:53",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's better to change this line to "4.3.2.1:"+defaultDNSSvrPort,, so in the future if we need to change the default port for DNS server, we don't need to also change the tests.

},
{
"::1",
"[::1]:53",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, s/53/defaultDNSSvrPort

},
{
"[::1]",
"[::1]:53",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, s/53/defaultDNSSvrPort

},
{
"dnsserver.com",
"dnsserver.com:53",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, s/53/defaultDNSSvrPort

wip add custom resolver test
@elliots
Copy link
Contributor Author

elliots commented Sep 25, 2018

@lyuxuan I think much more time was spent reviewing my changes than the changes took, thank you :)

@lyuxuan lyuxuan added this to the 1.16 Release milestone Sep 25, 2018
@lyuxuan lyuxuan added the Type: Feature New features or improvements in behavior label Sep 25, 2018
@lyuxuan lyuxuan merged commit ebea9b5 into grpc:master Sep 25, 2018
@kriskowal
Copy link

Yes! 🙏

@tywkeene
Copy link

tywkeene commented Sep 25, 2018

Looks like this merge is broken

# google.golang.org/grpc/resolver/dns
../../google.golang.org/grpc/resolver/dns/go19.go:30:18: undefined: netResolver
../../google.golang.org/grpc/resolver/dns/go19.go:42:55: undefined: netResolver
../../google.golang.org/grpc/resolver/dns/go19.go:43:32: too many arguments in call to parseTarget
	have (string, string)
	want (string)

Building in golang:1.10 on docker 17.12.1-ce

@lyuxuan
Copy link
Contributor

lyuxuan commented Sep 25, 2018

@tywkeene, can you provide more info about how this merge breaks you?

@tywkeene
Copy link

@tywkeene, can you provide more info about how this merge breaks you?

It's just a pretty cut and dry typo/build error that I posted above.

Not sure what else I could post that would help. Could you be more specific?

@lyuxuan
Copy link
Contributor

lyuxuan commented Sep 25, 2018

I am wondering how you updated your code base and got this failure(details like what's the command you use to update the code base and build, etc. will be helpful). In this merge, it added the netResolver interface and extended parseTarget to take two arguments (in dns_resolver.go). From the error message, it looks like you are getting the new go19.go file, but not getting the updated dns_resolver.go file. I am not sure what's happening here.

@tywkeene
Copy link

I am wondering how you updated your code base and got this failure(details like what's the command you use to update the code base and build, etc. will be helpful).

RUN go get google.golang.org/grpc

In my Dockerfile from the golang:1.10 image. I'm curious how this made it through the CI

I've resolved this issue by pinning to the previous commit using dep

@lyuxuan
Copy link
Contributor

lyuxuan commented Sep 25, 2018

Could you please check whether your code base has the latest dns_resolver.go file (which includes the netResolver definition and updated parseTarget)? Thanks!

@nomis52
Copy link

nomis52 commented Oct 10, 2018

Are there any estimates on when this will be available in a release ? We're eager to have it.

@lyuxuan
Copy link
Contributor

lyuxuan commented Oct 10, 2018

Our next major release is Oct 23rd. Does it work for you?

@nomis52
Copy link

nomis52 commented Oct 11, 2018

That works. We can hold until then.

@lock lock bot locked as resolved and limited conversation to collaborators Apr 9, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Type: Feature New features or improvements in behavior
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants