Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net_processing: Retry notfounds with more urgency #18238

Closed
wants to merge 1 commit into from

Conversation

ajtowns
Copy link
Contributor

@ajtowns ajtowns commented Mar 2, 2020

Anytime we see a NOTFOUND in response to a request for a tx, look through
each of our peers for anyone else who announced the tx, find one who
doesn't already have its inflight tx count maxed out, and of those,
make the one who'd look at it first, look at it asap.

Anytime we see a NOTFOUND in response to a request for a tx, look through
each of our peers for anyone else who announced the tx, find one who
doesn't already have its inflight tx count maxed out, and of those,
make the one who'd look at it first, look at it asap.
@fanquake fanquake added the P2P label Mar 2, 2020
@ajtowns
Copy link
Contributor Author

ajtowns commented Mar 2, 2020

This is an alternative approach to #15505 that's minimally invasive on the data structures. Doing something about this was suggested as a pre-req for #17303 in #17303 (comment)

@sdaftuar expressed some concerns about how much of a CPU hit it could be in worst case scenarios, but it doesn't look too bad in testing:

diff --git a/src/net_processing.cpp b/src/net_processing.cpp
index 3373f7f544..1b9e66456a 100644
--- a/src/net_processing.cpp
+++ b/src/net_processing.cpp
@@ -735,8 +735,13 @@ std::chrono::microseconds CalculateTxGetDataTime(const uint256& txid, std::chron
     return process_time;
 }
 
+static uint64_t xxx_time_spent GUARDED_BY(cs_main) = 0;
+static uint64_t xxx_invocations GUARDED_BY(cs_main) = 0;
+
 static void RetryProcessTx(CConnman& connman, const uint256& txid, const std::chrono::microseconds current_time) EXCLUSIVE_LOCKS_REQUIRED(cs_main)
 {
+    uint64_t time_start = GetTimeMicros();
+
     CNodeState::TxDownloadState* best_d = nullptr;
     std::chrono::microseconds best;
 
@@ -764,6 +769,12 @@ static void RetryProcessTx(CConnman& connman, const uint256& txid, const std::ch
             }
         }
     }
+
+    xxx_time_spent += GetTimeMicros() - time_start;
+    ++xxx_invocations;
+    if (xxx_invocations % 1000 == 0) {
+        LogPrintf("Time spent in RetryProcessTx %d.%03ds, %d us per call (%d calls)\n", xxx_time_spent / 1000000, (xxx_time_spent / 1000) % 1000, xxx_time_spent/xxx_invocations, xxx_invocations);
+    }
 }
 
 void RequestTx(CNodeState* state, const uint256& txid, std::chrono::microseconds current_time) EXCLUSIVE_LOCKS_REQUIRED(cs_main)
diff --git a/test/functional/p2p_notfound_perf.py b/test/functional/p2p_notfound_perf.py
new file mode 100755
index 0000000000..970452696c
--- /dev/null
+++ b/test/functional/p2p_notfound_perf.py
@@ -0,0 +1,63 @@
+#!/usr/bin/env python3
+# Copyright (c) 2017-2018 The Bitcoin Core developers
+# Distributed under the MIT software license, see the accompanying
+# file COPYING or http://www.opensource.org/licenses/mit-license.php.
+"""Test that we don't leak txs to inbound peers that we haven't yet announced to"""
+
+import time
+from test_framework.messages import msg_notfound, msg_inv, CInv
+from test_framework.mininode import P2PDataStore
+from test_framework.test_framework import BitcoinTestFramework
+from test_framework.util import (
+    assert_equal,
+)
+
+
+class P2PNode(P2PDataStore):
+    def on_inv(self, msg):
+        pass
+
+    def on_getdata(self, msg):
+        t = time.time()
+        self.notfound_queue.extend(msg.inv)
+        for inv in msg.inv:
+            self.getdata[inv] = t
+        while len(self.notfound_queue) >= 100:
+            self.send_message(msg_notfound(vec=self.notfound_queue[:100]))
+            self.notfound_queue = self.notfound_queue[100:]
+
+    def summary(self):
+        return len(self.getdata), len(self.notfound_queue)
+
+class P2PNotFoundPerf(BitcoinTestFramework):
+    def set_test_params(self):
+        self.num_nodes = 1
+
+    def run_test(self):
+        PEERS = 11
+        TRANSACTIONS = 99000
+
+        gen_node = self.nodes[0]  # The block and tx generating node
+        gen_node.generate(1)
+
+        inbound_peers = [ self.nodes[0].add_p2p_connection(P2PNode()) for _ in range(PEERS) ]
+        for inbound in inbound_peers:
+            inbound.getdata = {}
+            inbound.notfound_queue = []
+
+        for txbatch in range(TRANSACTIONS//100):
+            self.log.info("Doing batch %d" % (txbatch+1))
+            ann = [CInv(t=1, h=(txbatch*1000+i)) for i in range(100)]
+            for inbound in inbound_peers:
+                inbound.send_message(msg_inv(inv=ann))
+
+        #gen_node.logging(exclude=['net'])
+
+
+        for i in range(60):
+            self.log.info("State: " + " ".join("%d:%d" % inbound.summary() for inbound in inbound_peers))
+            time.sleep(15)
+
+
+if __name__ == '__main__':
+    P2PNotFoundPerf().main()

Gives: 2020-03-02T06:58:54.778347Z [msghand] Time spent in RetryProcessTx 18.754s, 17 us per call (1089000 calls) by the end for me, which doesn't seem too bad. It took about 12 minutes for all 99k transactions to get requested/notfound by all 11 peers, so 19 seconds total doesn't seem too bad. (The check for MAX_PEER_TX_IN_FLIGHT significantly cuts down on whether this actually does anything -- if there's lots of tx's announced by each peer, then most of them will always have most tx's in flight)

@DrahtBot
Copy link
Contributor

DrahtBot commented Mar 2, 2020

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@DrahtBot DrahtBot mentioned this pull request Mar 2, 2020
2 tasks
@naumenkogs
Copy link
Member

naumenkogs commented Mar 2, 2020

Concept ACK. Code looks good to me.
Can you provide more context to understand the test? Are you basically trying to show that DoS-vector your PR opens is probably less than those which already exist?

@ajtowns
Copy link
Contributor Author

ajtowns commented Mar 4, 2020

@naumenkogs Yeah; in a worst case scenario you could get 100 txids in a NOTFOUND message, have 125 peers to cycle through, and each peer could have announced 100k txs, so you end up with something like O( 12,500 * log(100000) ) operations as a result of a single message (albeit with a fair chunk of setup). The test indicates that 11 peers on my hardware gives a time of about 20us per txid, so per NOTFOUND with 110 peers (so a factor of 100*10) that might result in 20ms processing time on a single message. That seemed low enough to be worth a PR to me.

If that's not good enough, could make it a bit lower in the worst case fairly easily by not retrying NOTFOUNDs as aggressively (eg, only do it for the first 10 txids in a NOTFOUND message, or ban peers if they often report a txid as NOTFOUND despite having INVed it recently, or change the retry delay from 0s-2s to 2s-4s, etc); but it'd need more rework of the data structures to get it to work efficiently, and then you have to ensure the new data structures don't introduce different DoS possibilities.

@naumenkogs
Copy link
Member

No, I think this sounds reasonable. I'd rather not introduce new data structures. As I said, we probably already have easier ways to DoS a node.

luke-jr pushed a commit to bitcoinknots/bitcoin that referenced this pull request Mar 5, 2020
Anytime we see a NOTFOUND in response to a request for a tx, look through
each of our peers for anyone else who announced the tx, find one who
doesn't already have its inflight tx count maxed out, and of those,
make the one who'd look at it first, look at it asap.

Github-Pull: bitcoin#18238
Rebased-From: a204d15
@mzumsande
Copy link
Contributor

Concept ACK.

One difference to #15505 is that the retry-request there would be from outbound peers only, whereas this PR also includes inbound peers.
If I understand it correctly, this does not create any additional attack surface (e.g. for InvBlocks), because we keep the exact order of retries from the case where the peer does not answer at all (not even with NOTFOUND), just ask our next peer in line earlier than we would have otherwise.

@naumenkogs
Copy link
Member

naumenkogs commented Mar 22, 2020

@mzumsande there's a little threat that, unlike with prior behavior, we won't have a 1 minute window between first peer announced a tx and followed up with NOTFOUND, and we execute requests for the next peers, during which more honest outbound peers can come with announcements and get prioritized against inbound dishonest peers.

But the only ways I think this can be exploited require dropping a transaction from an honest announcer's mempool, which is probably a more critical vulnerability anyway, so this doesn't reduce the overall security.

}
}

std::chrono::microseconds process_time = current_time + GetRandMicros(MAX_NOTFOUND_RETRY_RANDOM_DELAY);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is 2-seconds random delay useful here? It can accidentally prioritize inbound nodes over outbound :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It already tests that process_time < best so the 2s delay will only bring a process time forward, not push it back past an inbound peer's process time.

I think the scenario is likely to be:

  • bunch of peers announce a new tx, outbound connections get polled asap, inbound connections get polled after 2s
  • you pick one of those, likely an outbound, and send a GETDATA
  • other peers announce, get a 60s-64s delay
  • the early announcing peers get their process time rescheduled with an additional 60-64s delay
  • a NOTFOUND comes back, and you pick whoever's delay was shortest; if it came back in under 2s, you'll next ask an inbound connection who announced early; if it came back later, you'll probably ask an outbound connection who announced after you'd sent the first GETDATA

Which is to say, I don't think we're consistently prioritising outbounds after the first request anyway.

I don't have a good rationale for the 2s random delay; instantly going from a NOTFOUND from one peer to a GETDATA to another peer seems risky to me though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was a bit confused, now what's on your mind makes sense to me.
I'm probably fine with leaving your 2s random delay.

Which is to say, I don't think we're consistently prioritising outbounds after the first request anyway.

I think we do? That's probably off-topic here, although i'd be curious why you think so.

std::chrono::microseconds best;

for (auto& el : mapNodeState) {
CNodeState::TxDownloadState* d = &el.second.m_tx_download;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we come up with a better variable name? :(
Maybe "tx_communication" or something. Yeah it sucks, but probably nothing is worse than 1-letter name for something meaningful like this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also best->best_time

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or I renamed best to lower_process_time while reviewing, but I agree names can be more meaningful.

@@ -4039,6 +4076,7 @@ bool PeerLogicValidation::SendMessages(CNode* pto)
// Erase this entry from tx_process_time (it may be added back for
// processing at a later time, see below)
tx_process_time.erase(tx_process_time.begin());
state.m_tx_download.m_tx_announced[txid] = std::chrono::microseconds::zero();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this line handles a tricky corner case. Can we add comment to reduce the cognitive burned on code reviewer?
What I think happens is (correct me if I'm wrong):

  • request from X1, not found
  • request from X2, ignores us for 1 minute
  • request from X3, notfound
  • this line prevents us from querying X2 again (we would if this line is deleted)

Now that I wrote this, one could simply check the presence of m_tx_in_flight for that peer, and if it is in flight, not consider for re-query?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the idea was to stop you from doing INV x; NOTFOUND x; INV x repeatedly and making it harder for you to query other peers. But I think this introduces a bug -- the zero entry never gets cleared from m_tx_announced, because that's only cleared based on what's in in_flight and process_time and the txid isn't in either of those anymore.

Copy link
Contributor

@ariard ariard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe commit message can be clearer and points to issue solved :

"Removing mapRelay would be a direct privacy improvement, but it may turn small-mempool peers as Dosers. If announced transactions are dropped from mempools and aren't available anymore in mapRelay, requesters will keep sending GETDATA until download expiration. By retrying requesting NOTFOUND transactions with different peers we avoid this issue. Note a dishonest peer can still withhold NOTFOUND to trigger same behavior from requester."?

std::chrono::microseconds best;

for (auto& el : mapNodeState) {
CNodeState::TxDownloadState* d = &el.second.m_tx_download;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or I renamed best to lower_process_time while reviewing, but I agree names can be more meaningful.

CNodeState::TxDownloadState* best_d = nullptr;
std::chrono::microseconds best;

for (auto& el : mapNodeState) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you consider to filter by nPreferredDownload to favor outbound peers and lower DoS risk? Only querying outbound peers should be good enough to achieve goal of finding NOTFOUND transactions but even if we don't success due to bad-connectivity of our outbounds, worst-case scenario we hit the 1-min window (and a transaction not being announced by our outbounds is less likely to be a honest announcement?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be even better to change the 'best' definition to prefer outbound peers where we can, but fall back to inbound peers if there are no outbound peers that have announced the tx.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already prefer outbound peers when we bump the process time (in CalculateTxGetDataTime()), and this chooses the lowest process time which will reflect that preference. I don't think it makes sense to complicate this further unless someone wants to do some real world testing on how well/badly the process time preferencing works in practice when the first thing we try results in a notfound.

Copy link
Contributor

@jnewbery jnewbery left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK, but lots of questions.

My main concern is that TxDownloadState is becoming more complex, with data duplicated between member variables and only certain combinations being valid states. To reduce cognitive overload and risk of bugs, perhaps it should turned into a class with a well-defined interface for callers.

static void RetryProcessTx(CConnman& connman, const uint256& txid, const std::chrono::microseconds current_time) EXCLUSIVE_LOCKS_REQUIRED(cs_main)
{
CNodeState::TxDownloadState* best_d = nullptr;
std::chrono::microseconds best;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get a compiler warning that best might be used before initialization:

net_processing.cpp: In function ‘bool ProcessMessage(CNode*, const string&, CDataStream&, int64_t, const CChainParams&, CConnman*, BanMan*, const std::atomic<bool>&)’:
net_processing.cpp:756:27: warning: ‘best’ may be used uninitialized in this function [-Wmaybe-uninitialized]
     if (best_d != nullptr && process_time < best) {
         ~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~
net_processing.cpp:741:31: note: ‘best’ was declared here
     std::chrono::microseconds best;
                               ^~~~

The logic below means that this can't be read before set, but I don't see any harm in initializing to 0 or std::chrono::microseconds::max().

/* Store all the transactions a peer has recently announced,
* along with their process time
*/
std::map<uint256, std::chrono::microseconds> m_tx_announced;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be an unordered map for average constant-time lookup:

diff --git a/src/net_processing.cpp b/src/net_processing.cpp
index 3373f7f54..2e097d44e 100644
--- a/src/net_processing.cpp
+++ b/src/net_processing.cpp
@@ -30,6 +30,7 @@
 
 #include <memory>
 #include <typeinfo>
+#include <unordered_map>
 
 #if defined(NDEBUG)
 # error "Bitcoin cannot be compiled without assertions."
@@ -204,6 +205,11 @@ namespace {
     static std::vector<std::pair<uint256, CTransactionRef>> vExtraTxnForCompact GUARDED_BY(g_cs_orphans);
 } // namespace
 
+struct TxHasher
+{
+    size_t operator()(const uint256& hash) const { return ReadLE64(hash.begin()); }
+};
+
 namespace {
 /**
  * Maintain validation-specific state about nodes, protected by cs_main, instead
@@ -349,7 +355,7 @@ struct CNodeState {
         /* Store all the transactions a peer has recently announced,
          * along with their process time
          */
-        std::map<uint256, std::chrono::microseconds> m_tx_announced;
+        std::unordered_map<uint256, std::chrono::microseconds, TxHasher> m_tx_announced;
 
         //! Store transactions which were requested by us, with timestamp

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you'd need that to remain constant/log time in worst-case/attack scenarios because the announced hash is completely under the control of an attacker.

//! Store all the transactions a peer has recently announced
std::set<uint256> m_tx_announced;
/* Store all the transactions a peer has recently announced,
* along with their process time
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment could be slightly improved. The time is set to:

  • std::chrono::microseconds::zero() if the transaction has been requested from this peer (ie exists in m_tx_in_flight)
  • the process time if the transaction has not yet been requested from this peer (ie exists in m_tx_process_time)

@@ -3222,6 +3257,7 @@ bool static ProcessMessage(CNode* pfrom, const std::string& strCommand, CDataStr
std::vector<CInv> vInv;
vRecv >> vInv;
if (vInv.size() <= MAX_PEER_TX_IN_FLIGHT + MAX_BLOCKS_IN_TRANSIT_PER_PEER) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you consider moving all of the NOTFOUND processing to its own function ProcessNotFound()? Splitting the logic between here and RetryProcessTx() makes it more difficult to follow than if it was all in one place.

auto end = best_d->m_tx_process_time.end();
for (auto it = best_d->m_tx_process_time.lower_bound(best); it != end && it->first == best; ++it) {
if (it->second == txid) {
best_d->m_tx_process_time.erase(it);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I'm misunderstanding the logic here, but this will bring the NodeState's m_tx_process_time forward (to now + ~1second), but does nothing to the global g_already_asked_for time, which means that next time we go round SendMessages(), we won't actually rerequest the transaction, since last_request_time > current_time - GETDATA_TX_INTERVAL. Am I missing something?

If I'm right, I think we need to do something like reset g_already_asked_for to one minute ago.

@@ -3234,6 +3270,7 @@ bool static ProcessMessage(CNode* pfrom, const std::string& strCommand, CDataStr
}
state->m_tx_download.m_tx_in_flight.erase(in_flight_it);
state->m_tx_download.m_tx_announced.erase(inv.hash);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we're potentially rerequesting txs from different peers after receiving a NOTFOUND, I think we shouldn't erase them from a peer's m_tx_in_flight and m_tx_announced structures in receipt of a NOTFOUND, and keep them there until the TX_EXPIRY_INTERVAL in SendMessages(), otherwise there might be an attack where an adversary has multiple connections to you, and then juggles you between them by sending an INV for a transaction, holding on to the GETDATA, and then sending a NOTFOUND immediately followed by an INV reannouncing the same tx.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that means a peer that sends a NOTFOUND is treated exactly the same as one that just ignores the request; so NOTFOUND is just a "fyi, I'm never going to respond, so feel free to try someone else". That... seems like it makes a lot of sense? If we were to ever start punishing peers for (frequently) not responding to tx requests, maybe it would make sense to punish them for (equally frequently) sending notfounds too?

@fjahr
Copy link
Contributor

fjahr commented Mar 25, 2020

Concept ACK but echoing jnewbery's feedback I also still have questions and agree with most of his code comments. I also think that code is currently not tested, so adding an explicit test would be very valuable. Additionally, comments in the RetryProcessTx code would be helpful for understanding.

nit on the commit message: s/find one who doesn't already have its inflight/find all who don't already have their inflight/

Copy link
Contributor

@jkczyz jkczyz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

if (d->m_tx_in_flight.size() >= MAX_PEER_TX_IN_FLIGHT) continue;
auto it = d->m_tx_announced.find(txid);
if (it != d->m_tx_announced.end()) {
if (best_d == nullptr || (it->second != std::chrono::microseconds::zero() && it->second < best)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This condition could probably be simplified considerably if (1) best is initialized appropriately and (2) the entry for the txid in m_tx_announced is removed on line 4079 rather than zeroed (like what is done on line 3272). Was zeroing chosen on line 4079 for a specific reason?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My guess for the reason is here :)

CNodeState::TxDownloadState* best_d = nullptr;
std::chrono::microseconds best;

for (auto& el : mapNodeState) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be even better to change the 'best' definition to prefer outbound peers where we can, but fall back to inbound peers if there are no outbound peers that have announced the tx.

@@ -344,8 +346,10 @@ struct CNodeState {
*/
std::multimap<std::chrono::microseconds, uint256> m_tx_process_time;

//! Store all the transactions a peer has recently announced
std::set<uint256> m_tx_announced;
/* Store all the transactions a peer has recently announced,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed this in PR Review Club (https://bitcoincore.reviews/18238.html#l-115 and https://bitcoincore.reviews/18238.html#l-180), and there was some agreement that even for a small struct like this, the dependencies between the fields make it difficult to reason about and potentially bug-prone.

I think from a high level, in the future we might want a TxDownloadState class which contains:

  • {txid, timestamp, state} objects where
    • state can be 'announced' or 'requested'
    • timestamp refers to 'when to request' if state is announced, and 'when requested' if state is requested
  • a way to lookup the object by txid
  • a way to iterate through the 'announced' objects sorted by timestamp
  • a way to iterate through the 'requested' objects (doesn't necessarily need to be sorted by timestamp since there can only be MAX_PEER_TX_IN_FLIGHT = 100 of them)
  • public functions to add, remove, refresh timestamps, etc

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm +1 on this, but maybe it'd be better to get wtxid relay #18044 in first?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've had a first attempt at this: https://github.com/jnewbery/bitcoin/tree/2020-03-tx-download-class

No tests or comments yet, and it could be tidied up, but do you think this is heading in the right direction?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was having a go at it too when I saw your comment, https://github.com/ajtowns/bitcoin/commits/202002-bump-notfound-wip -- separates out the current code into the class first, before changing it, which I think works better. Might be good to have some comments/style nits on the refactor?

@@ -75,6 +75,8 @@ static constexpr std::chrono::microseconds INBOUND_PEER_TX_DELAY{std::chrono::se
static constexpr std::chrono::microseconds GETDATA_TX_INTERVAL{std::chrono::seconds{60}};
/** Maximum delay (in microseconds) for transaction requests to avoid biasing some peers over others. */
static constexpr std::chrono::microseconds MAX_GETDATA_RANDOM_DELAY{std::chrono::seconds{2}};
/** Delay between receiving a NOTFOUND and trying the next peer. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why have any delay?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have some discussion here

@DrahtBot
Copy link
Contributor

DrahtBot commented Jun 4, 2020

🐙 This pull request conflicts with the target branch and needs rebase.

@ajtowns
Copy link
Contributor Author

ajtowns commented Jun 15, 2020

Obsoleted by #19184

@ajtowns ajtowns closed this Jun 15, 2020
fanquake added a commit that referenced this pull request Jul 14, 2020
f32c408 Make sure unconfirmed parents are requestable (Pieter Wuille)
c4626bc Drop setInventoryTxToSend based filtering (Pieter Wuille)
43f02cc Only respond to requests for recently announced transactions (Pieter Wuille)
b24a17f Introduce constant for mempool-based relay separate from mapRelay caching (Pieter Wuille)
a9bc563 Swap relay pool and mempool lookup (Pieter Wuille)

Pull request description:

  This implements the follow-up suggested here: #18861 (comment) . Instead of checking `setInventoryTxToSend`, maintain an explicit bloom filter with the 3500 most recently announced invs, and permit fetching any of these as long as they're in the relay pool or the mempool. In addition, permit relay from the mempool after just 2 minutes instead of 15.

  This:

  * Fixes the brief opportunity an attacker has to request unannounced invs just after the connection is established (pointed out by naumenkogs, see #18861 (comment)).
  * Guarantees that locally resubmitted invs after `filterInventoryKnown` rolls over can still be requested (pointed out by luke-jr, see #18861 (comment)).

  It adds 37 KiB of filter per peer.

  This is also a step towards dropping the relay pool entirely and always relaying from the mempool directly (see #17303), but that is still blocked by dealing properly with NOTFOUNDs (see #18238).

ACKs for top commit:
  jnewbery:
    reACK f32c408
  jonatack:
    re-ACK f32c408 per `git range-diff f7c19e8 2da7ee3 f32c408` and redid the following: code review, thought about motivation, DoS and privacy aspects, debug build to check for warnings after updating Clang from 6 to 11 since last review.
  ajtowns:
    re-ACK f32c408

Tree-SHA512: aa05b9fd01bad59581c4ec91836a52d7415dc933fa49d4c4adced79aa25aaad51e11166357e8c8b29fbf6021a7401b98c21b850b5d8e8ad773fdb5d6608e1e85
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Jul 14, 2020
f32c408 Make sure unconfirmed parents are requestable (Pieter Wuille)
c4626bc Drop setInventoryTxToSend based filtering (Pieter Wuille)
43f02cc Only respond to requests for recently announced transactions (Pieter Wuille)
b24a17f Introduce constant for mempool-based relay separate from mapRelay caching (Pieter Wuille)
a9bc563 Swap relay pool and mempool lookup (Pieter Wuille)

Pull request description:

  This implements the follow-up suggested here: bitcoin#18861 (comment) . Instead of checking `setInventoryTxToSend`, maintain an explicit bloom filter with the 3500 most recently announced invs, and permit fetching any of these as long as they're in the relay pool or the mempool. In addition, permit relay from the mempool after just 2 minutes instead of 15.

  This:

  * Fixes the brief opportunity an attacker has to request unannounced invs just after the connection is established (pointed out by naumenkogs, see bitcoin#18861 (comment)).
  * Guarantees that locally resubmitted invs after `filterInventoryKnown` rolls over can still be requested (pointed out by luke-jr, see bitcoin#18861 (comment)).

  It adds 37 KiB of filter per peer.

  This is also a step towards dropping the relay pool entirely and always relaying from the mempool directly (see bitcoin#17303), but that is still blocked by dealing properly with NOTFOUNDs (see bitcoin#18238).

ACKs for top commit:
  jnewbery:
    reACK f32c408
  jonatack:
    re-ACK f32c408 per `git range-diff f7c19e8 2da7ee3 f32c408` and redid the following: code review, thought about motivation, DoS and privacy aspects, debug build to check for warnings after updating Clang from 6 to 11 since last review.
  ajtowns:
    re-ACK f32c408

Tree-SHA512: aa05b9fd01bad59581c4ec91836a52d7415dc933fa49d4c4adced79aa25aaad51e11166357e8c8b29fbf6021a7401b98c21b850b5d8e8ad773fdb5d6608e1e85
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Feb 15, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants