-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Write salt/iv, addr together with content #160
Write salt/iv, addr together with content #160
Conversation
Sending salt, address and content seperately would generate three packets at the start of a connection. Since the size of salt is almost always 8, 16 or 32, it leaves a strong feature. Additionally, with TCP fastopen turned on, only the first write would be carried by SYN packet. Writing multiple times makes it pointless.
This decouples two sides and allow easier third party integration.
088e459
to
eb0109a
Compare
Salt and target address can be combined in a single packet because we know both pieces of information immediately when a client connects. Actual content is a bit challenging because for a generic TCP connection the client might not send any data but expect the server to send information (e.g. hello banner) first, though popular protocols like HTTP do not have this problem. A workaround could be to wait a couple millisecond for the client to send actual payload to combine with salt and target address in the first packet. Go still lacks proper support for TCP Fast Open. However if local side is on Linux kernel 4.11 and above, it could be hacked on right now. See golang/go#4842 for detail. |
As for the connection without initial data problem, shadowsocks-libev behaves the same so I guess it's good enough for most implementation? For the Fastopen, yes I am intended to implement using this method but we first need to send all data in one row in order to drop TTFB. |
1aa0e64
to
23b392e
Compare
23b392e
to
310a1b2
Compare
I just gave it a timeout. @riobard What do you think? |
To be honest I'm hesitant to add artificial delay. TLS 1.3 and TFO spends all effort to cut handshake time and we just spend that saving right away… |
I agree. But if it maintain the current behavior, TFO will have no effect. This change will save 1 RTT for normal connections if TFO is enabled and add 5ms for connections that don't send data on the initiator side. Maybe we can have an option to disable the behavior and allow setting the delay manually? |
Wait, I thought we do not currently have TFO yet? |
Yes. I mean even if you add TFO support, only the first packet will be sent in the handshake process. So TFO will have no effect if we send multiple packets for the request. I will add TFO support in another PR if this one is merged. |
@riobard Any update? |
@Anonymous-Someneese Sorry I've been busy recently and haven't got time to think about the issue. Maybe @lixin9311 could check the PR and see if it fits. |
I think it could be a more elegant way, if we pass target address to the |
I would be happy to send another PR addressing the same issue, but this patch currently doesn't look good to me. |
@lixin9311 Thank you for your work. But what is your solution to this problem? |
see |
Today I've got some time to think about this issue. There are three separate but related problems here:
Given the above understanding of the problem landscape I tend to think we just do the minimal work of fixing problem 1. Please share your thoughts and we could discuss. |
Yes, I agree. If we only need to send iv and address, we can manually merge those 2 packets, without any system calls. |
I think a code change should be implemented instead of changing socket label since many application uses this project as library. We should at least concatenat iv and addr packets to eliminate the protocol feature. |
I strongly recommend coalescing salt+address+initial data. It makes the size of the initial packet hard to predict, which is not the case for salt+address. We were able to implement this in outline-ss-server: Jigsaw-Code/outline-ss-server#73 In practice there's no delay most of the time, since clients will usually send data immediately. For server-speaks-first protocols, there's a minimal 10ms delay, but I don't think it matters. We are in the process of releasing this change to Outline Clients during this week and the next. We have previously made a change to coalesce the salt and initial server data, which is a lot simpler: That has been running on production servers for a few weeks now. |
Coalescing salt+address+initial data also removes the possibility of some replay attacks that observe where the boundaries of the encryption blocks are, since it becomes one single block with unpredictable length. |
@fortuna I think (haven't actually tried yet) that by turning on Nagle's algorithm for the first couple of packets we can get the same coalescing effect without explicit packing bytes? |
I'm curious to hear whether that works. I also haven't tried it. It would certainly be an easy way to implement the packet coalescing. However, from what I read about the Nagle's algorithm, it only buffers if there's unconfirmed data in the pipe, which I interpret as immediately sending the first write as its own packet, which wouldn't accomplish what we need. |
@fortuna My bad. After further thinking about it, what you said is correct, and Nagle's algo won't help in this case. |
OK, I've found the correct mechanism to implement this without breaking code layering and API compatibility. I was confused and thinking about TCP_NODELAY (Nagle's algo), but actually it's TCP_CORK that'll do wonders in the case. A naive solution is like this:
Caveats:
|
A proof of concept for client-side use 50676a7 |
The TCP_CORK solution seems convenient, but unfortunate that it's for Linux only. We opted to implement the in user space, which we have much more control and is cross-platform. Maybe you implement TCP_CORK in the interim before implementing in the user space? Feel free to copy what we did by the way. We effectively have a LazyWrite call in the ShadowsocksWriter that appends to the internal buffer, and a Flush call that writes to the TCP connection. Regular Write will also flush. We were able to do it without requiring an extra buffer. You just need to make sure there's space for the AEAD tag after the lazy data, in case you need to flush before the regular ReadFrom finishes the first read. |
BSD has TCP_NOPUSH, so the last commit works on both Linux and BSD. It is also trivial to implement corking (or traffic shaping in general) in user-space by wrapping |
Turns out a generic user-space TCP_CORK is very simple, see this feature branch https://github.com/shadowsocks/go-shadowsocks2/tree/feature-generic-cork |
Generic TCP corking has been merged in #186. Closing this now. |
Sending salt, address and content seperately would generate three packets during the start of a connection. Since the size of salt is almost always 8, 16 or 32 and the address are pretty much always 20-100 bytes, it leaves a strong feature.
Additionally, with TCP fastopen turned on, only the first write would be carried by
SYN packet. The other writes will have to wait until the handshake complete, making fastopen pointless.