tailscale sometimes stops announcing PUBLIC_IP:LISTENING_PORT as endpoint #14494

Open
dionorgua opened this issue Dec 29, 2024 · 2 comments
Labels: bug, connectivity, needs-triage, waiting-for-info

Comments

@dionorgua

What is the issue?

Hi,

I've found that tailscale usually announces DISCOVERED_PUBLIC_IP:LISTENING_PORT as a possible endpoint. I can forward ports on my router, so I changed LISTENING_PORT on all important clients to a unique value and configured the router to forward each of those ports.

So I've set up the following:

  • pihole listens on port 41646, and the router forwards this port to the LAN address of that machine
  • storr listens on port 41648, and the router forwards this port to the LAN address of storr

It works initially, but for some unknown reason the pihole machine stops announcing that endpoint. A reboot usually fixes this for a while. Meanwhile, LAN_IP:41646 is still announced.

I've verified that port forwarding still works once this happens: I can send something via nc -u to PUBLIC_IP:41646 and capture it on pihole with tcpdump.
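
For anyone who wants to repeat this check without nc, here is a minimal Python sketch of the same UDP probe. The function names and the payload are made up for illustration. Note that the listener half cannot bind to 41646 while tailscaled holds that port, so it is only useful on a spare forwarded test port or with tailscaled stopped; otherwise keep capturing with tcpdump as described above.

```python
import socket


def send_probe(host: str, port: int, payload: bytes = b"forward-test") -> int:
    """Send a single UDP datagram to host:port; return the number of bytes sent."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        return s.sendto(payload, (host, port))


def wait_for_probe(port: int, timeout: float = 10.0, bind_addr: str = "0.0.0.0"):
    """Wait for one UDP datagram on `port`.

    Returns (payload, (sender_ip, sender_port)), or None if nothing
    arrives within `timeout` seconds.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.bind((bind_addr, port))
        s.settimeout(timeout)
        try:
            return s.recvfrom(2048)
        except socket.timeout:
            return None
```

Run `wait_for_probe(PORT)` on the machine behind the router and `send_probe(PUBLIC_IP, PORT)` from a host outside the LAN; a non-None result means the forward delivered the datagram.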

Both machines are LXC containers with a bridged network to the router.

Is there any way to always announce PUBLIC_IP:LISTENING_PORT?

pihole bug: BUG-433c4ffd2ed14573d8a2a6132f3fc38dc16f367593a5100792552d2828e461eb-20241229140805Z-d81e4d57c4615eed
storr: BUG-c98477da564c543ea82b6c5ad88a82d852479d2d3df1e5f801b82904d6187301-20241229141014Z-91bec7b9c8aefd56

Steps to reproduce

No response

Are there any recent changes that introduced the issue?

No response

OS

Linux

OS version

Debian 12 LXC container

Tailscale version

1.78.1

Other software

No response

Bug report

BUG-433c4ffd2ed14573d8a2a6132f3fc38dc16f367593a5100792552d2828e461eb-20241229140805Z-d81e4d57c4615eed

@bradfitz
Member

I see... your node is bouncing back and forth between easy NAT and hard NAT. And Tailscale only advertises your locally configured listening $PORT environment variable (blended with your STUN-discovered IP) as a sort of last ditch desperate move when you're in hard NAT mode. When you're found to be in easy NAT mode, that endpoint (type EndpointSTUN4LocalPort) is no longer announced.

I suppose we could send that more aggressively. Perhaps:

  • if a port is set at all, which is rare.
  • if we've ever seen hard NAT for this node
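
The behaviour described above and the two suggested conditions can be sketched roughly as follows. This is only an illustration of the reasoning in this comment, not Tailscale's actual code: `NodeState`, `current_rule` and `proposed_rule` are hypothetical names, and the two bullets are read here as alternatives (OR), which the comment does not actually specify.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class NodeState:
    stun_public_ip: Optional[str]  # IPv4 address discovered via STUN, if any
    local_port: int                # UDP port the node listens on locally
    port_configured: bool          # True if the user set the port explicitly
    hard_nat_now: bool             # node currently classified as hard NAT
    hard_nat_ever: bool            # node was classified as hard NAT at some point


def current_rule(n: NodeState) -> bool:
    """Behaviour as described: advertise only while in hard NAT mode."""
    return n.stun_public_ip is not None and n.hard_nat_now


def proposed_rule(n: NodeState) -> bool:
    """Suggested relaxation (assumed OR): explicit port, or hard NAT ever seen."""
    return n.stun_public_ip is not None and (n.port_configured or n.hard_nat_ever)


def stun4_local_port_endpoint(
    n: NodeState, rule: Callable[[NodeState], bool]
) -> Optional[str]:
    """Return 'STUN_IP:LOCAL_PORT' if the rule allows advertising it, else None."""
    if not rule(n):
        return None
    return f"{n.stun_public_ip}:{n.local_port}"
```

Under this sketch, a node that flips back to easy NAT loses the endpoint with `current_rule` but keeps it with `proposed_rule`, which matches the symptom reported for pihole.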

But does this break Tailscale connectivity? You didn't mention if this is just something you noticed, or whether it's a problem.

Out of curiosity, what's your upstream router? A "SoftAtHome" it looks like? Which ISP is that?

@bradfitz added the connectivity and waiting-for-info labels on Dec 30, 2024
@dionorgua
Author

Thanks for analyzing this! It's a shitty Orange Funbox 6 router (Poland). I'm sure the hardware is by SoftAtHome.

I've reported a UPnP issue with it here: #14344

Unfortunately I can't easily replace it because it's somehow integrated into Orange's infrastructure. Some router settings are available via the usual admin interface (using the router's LAN IP), but others are changed from the ISP's web interface.
The WAN port is connected to a fiber converter and runs PPPoE. In any case, I have very limited access to settings. I know that UPnP mostly works, but tailscale fails sometimes. Syncthing works reliably.

Also, it looks like port forwarding works (or mostly works). Could tailscale be confused by periodic downtime? I get a normal, globally reachable dynamic IPv4 address, and I think they need to break the PPPoE session for some time to change it.

Another thing I have access to is the "DMZ" setting. I think I could give up on this router and put another one right behind it. But I haven't checked that this works, and I'm not sure whether UPnP will work with it. I think the new router should also be the DHCP server and default gateway, because otherwise I'll hit #5502.

That node is 'pihole', and my tailnet is configured to use it as the DNS override. So this issue breaks direct connectivity to my phone over LTE. DERP mostly works, but it feels much laggier. I've even tried hosting my own DERP server with much lower latency than your Poland DERP servers, but that's also tricky due to DNS issues. Periodically the whole tailnet gets stuck somehow and the logs fill with DNS query errors for the access control servers. I've found that it works much better if the DERP server has both an IP and a domain name in the ACL, and if the DERP server is configured so that /bootstrap-dns provides DNS info for the control servers. But for some reason it still happens sometimes.

PS. If you're talking about PORT= in /etc/default/tailscaled, I think it's always set and passed as tailscaled --port, so I guess your idea is to announce only if it's not the default.
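
To make the "only if it's not the default" idea concrete, here is a small Python sketch that reads a PORT= line from the text of a defaults-style file and checks whether it differs from the default. The helper names are hypothetical, and 41641 as the default port is an assumption on my part, not something stated in this thread.

```python
import shlex
from typing import Optional

DEFAULT_PORT = 41641  # assumed packaged default; not confirmed in this thread


def read_port(defaults_text: str) -> Optional[int]:
    """Return the value of the first PORT= assignment, or None if absent/empty."""
    for line in defaults_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep and key.strip() == "PORT":
            parts = shlex.split(value)  # strips shell-style quotes
            return int(parts[0]) if parts else None
    return None


def port_is_non_default(defaults_text: str) -> bool:
    """True if a port is set and differs from the assumed default."""
    port = read_port(defaults_text)
    return port is not None and port != DEFAULT_PORT
```

For pihole's setup, passing the file contents with `PORT="41646"` to `port_is_non_default` would return True under these assumptions.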
