Cached responses have a non-decrementing TTL in version 2.0.45 #1624
Description
Who is the bug affecting?
Users of dnscrypt-proxy.
What is affected by this bug?
The DNS cache and TTL do not behave as they did in prior versions. The TTL does not decrement on the second query for the same domain name, and later queries show erratic TTL behavior.
When does this occur?
It is most easily seen between the first and second queries for the same domain name with dnscrypt-proxy cache enabled.
Where does it happen?
Version 2.0.45: macOS, linux-arm, linux-arm64
How do we replicate the issue?
- install dnscrypt-proxy version 2.0.45 from GitHub releases. (reproduced on macOS, linux-arm, and linux-arm64)
- query a domain that will have its TTL set according to the min-TTL defined in the .toml file.
- see in the response that its TTL is set to that min-TTL value (e.g. 2399).
- make a 2nd query for the same domain name after a few seconds (or minutes) have passed.
- see in the response for the second query that its TTL still shows as 2399 rather than the expected decremented TTL, even though the query log file shows it was answered from cache.
- immediately make a 3rd query and see that its TTL "jumps" down to a decremented value that matches the time elapsed since the 1st query.
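The steps above could be scripted roughly as follows. This is only a sketch, under several assumptions that are not part of the report: `dig` is installed, dnscrypt-proxy is listening on 127.0.0.1 port 53, and the helper names (`parse_ttl`, `query_ttl`, `ttl_series`) are made up for illustration.

```python
import subprocess
import time


def parse_ttl(answer_line: str) -> int:
    """Extract the TTL from one line of `dig +noall +answer` output.

    Such a line has the form: name, TTL, class, type, data,
    e.g. "example.com.  2399  IN  A  192.0.2.1".
    """
    fields = answer_line.split()
    return int(fields[1])


def query_ttl(name: str, server: str = "127.0.0.1", port: int = 53) -> int:
    """Ask the resolver at server:port for an A record and return its TTL.

    The default address and port are assumptions; adjust them to match
    the listen address in your own configuration.
    """
    out = subprocess.run(
        ["dig", f"@{server}", "-p", str(port), name, "A", "+noall", "+answer"],
        capture_output=True, text=True, check=True,
    ).stdout
    lines = [ln for ln in out.splitlines() if ln.strip() and not ln.startswith(";")]
    if not lines:
        raise RuntimeError(f"no answer records returned for {name}")
    return parse_ttl(lines[0])


def ttl_series(name: str, waits=(90, 0), **kwargs):
    """Query once, then once more after each wait (in seconds).

    Returns a list of (seconds_since_first_query, ttl) pairs.
    """
    start = time.monotonic()
    results = [(0.0, query_ttl(name, **kwargs))]
    for wait in waits:
        time.sleep(wait)
        results.append((time.monotonic() - start, query_ttl(name, **kwargs)))
    return results
```

Calling `ttl_series("example.com")` would make the 1st query, a 2nd one 90 seconds later, and a 3rd immediately after that. The elapsed times and TTLs it returns can then be compared by hand.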
Expected behavior (i.e. solution)
- The 1st query for a domain name would be cached for its TTL or the min-TTL defined in the .toml file (e.g. 2399 seconds).
- After waiting 90 seconds, a 2nd query for the same domain name would then show a decremented TTL of original TTL - 90 (e.g. 2309 seconds in this example).
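The expected arithmetic can be written down as a small check. This is a minimal sketch: the function names and the 2-second tolerance are assumptions chosen for illustration and are not taken from dnscrypt-proxy itself.

```python
def expected_ttl(initial_ttl: int, elapsed_seconds: float) -> int:
    """TTL a cache would be expected to report after elapsed_seconds (never below 0)."""
    return max(initial_ttl - int(elapsed_seconds), 0)


def ttl_matches(initial_ttl: int, elapsed_seconds: float,
                observed_ttl: int, tolerance: int = 2) -> bool:
    """True if observed_ttl is within `tolerance` seconds of the expected value.

    The tolerance (an assumed value) allows for rounding and query latency.
    """
    return abs(observed_ttl - expected_ttl(initial_ttl, elapsed_seconds)) <= tolerance


# Figures from this report: first TTL 2399, second query 90 seconds later.
print(expected_ttl(2399, 90))        # 2309
print(ttl_matches(2399, 90, 2309))   # True: the TTL decremented as expected
print(ttl_matches(2399, 90, 2399))   # False: the unchanged TTL reported on 2.0.45
```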
Other Comments
Since updating to version 2.0.45 I have observed a change in cache behavior compared with prior versions. I am not sure if this is known or intended, but I thought I would ask. When a DNS query is made, I would expect the response to be cached for the TTL and the TTL to then decrement in real time. On version 2.0.45 the first query shows a TTL of 2399, in accordance with the default .toml file. The second query also shows a TTL of 2399, no matter how much time has passed between the first and second queries. If I waited 100 seconds between queries, for example, I would expect the second query to show a TTL of 2299. You can tell that it is responding from cache, however, because the response time is much lower than for the initial query and the query log indicates a cached response. The third query then "jumps" to the expected TTL relative to when the first query was made. As more queries for the same domain name are made, the response TTL continues to behave oddly. As seen in one of my screenshots, I made two queries about 103 seconds apart, yet the TTL decreased by only 3.
I have re-run my tests using version 2.0.44 and it does not exhibit this behavior. The TTL decreases from the first to second query and also reliably decreases in accordance with the amount of time that has passed between queries.
I don't believe this is a "just me" issue, as I have reproduced this behavior with the release binaries for macOS, linux-arm (Raspberry Pi), and linux-arm64 (router). I have also compared past versions, and this behavior is new as of 2.0.45. Version 2.0.44 behaves as expected.
Screenshots: