-
Notifications
You must be signed in to change notification settings - Fork 36.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Split orphan handling from net_processing into txorphanage #21148
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
76f62a4
to
70353b4
Compare
Rebased for conflict with #21062 |
Concept ACK. Will review once this is rebased and out of draft status. |
70353b4
to
a875ef7
Compare
a875ef7
to
7990854
Compare
Rebased, tweaked the orphan code a little, and cleaned up the results some. Should be ready for review. |
concept ACK, starting to review. two thoughts for now-
|
@amitiuttarwar 1) The first commit just sets up the new files and moves whole functions / globals into it. Later commits are pulling code out into their own functions, restructuring functions into classes, etc. 2) hmm, I was ignoring chrono stuff since #21015 exists, but I guess it didn't switch COrphanTx to chrono. Not sure it makes sense to add more things to this PR versus adding to the #20758 backlog. [EDIT: ie, feel free to add suggestions to 20758, but maybe this PR already has enough stuff in scope?] |
7990854
to
649cad4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK 649cad4
Mostly style comments, with a couple of requests for minor changes to the interface that I think would improve the encapsulation.
unsigned int nMaxOrphanTx = (unsigned int)std::max((int64_t)0, gArgs.GetArg("-maxorphantx", DEFAULT_MAX_ORPHAN_TRANSACTIONS)); | ||
unsigned int nEvicted = LimitOrphanTxSize(nMaxOrphanTx); | ||
unsigned int nEvicted = m_orphanage.LimitOrphans(nMaxOrphanTx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't have to be in this PR, but perhaps a future PR could make nMaxOrphanTx
a ctor argument, and then just call LimitOrphans()
without an argument. This is the only place in the non-test code that LimitOrphans
is called, and it's always with the same argument.
In fact, once const nMaxOrphanTx
is a member of the orphanage, we don't need a LimitOrphans()
call at all. We can just limit the size within the TxOrphanage::AddTx
function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about the const
part -- it might make sense for the limit to change depending on how much pressure there is in the mempool at large. Agree otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 came here to make a very similar suggestion :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
code review ACK eeeafb3
the orphanage is v. beautiful! my least favorite thing about it is the two-line if statements without parenthesis (they were copied over in move-only commits), but I'll survive :)
src/net_processing.cpp
Outdated
@@ -1003,7 +1003,7 @@ void PeerManagerImpl::FinalizeNode(const CNode& node, bool& fUpdateConnectionTim | |||
for (const QueuedBlock& entry : state->vBlocksInFlight) { | |||
mapBlocksInFlight.erase(entry.hash); | |||
} | |||
EraseOrphansFor(nodeid); | |||
WITH_LOCK(g_cs_orphans, EraseOrphansFor(nodeid)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit 38a11c3, “txorphanage: Add lock annotations”
Is the main advantage of moving where we grab g_cs_orphans
from EraseOrphansFor
into the call site in FinalizeNode
to allow for the clang safety annotations / dynamic AssertLockHeld
check? Since my understanding is that FinalizeNode
is sometimes called with g_cs_orphans
already held, so this doesn't seem to reduce the recursive locking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
denialofservice_tests locks g_cs_orphans
then calls EraseOrphansFor
, so it would have required more changes to have EraseOrphansFor
do the locking.
I don't think FinalizeNode
is called from anywhere that holds g_cs_orphans
? Am I missing somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here:
Lines 213 to 214 in b9f41df
LOCK2(::cs_main, ::g_cs_orphans); | |
node.connman->StopNodes(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's it!
StopNodes() calls DeleteNode() calls FinalizeNode() calls EraseForPeer(); with EraseForPeer needing g_cs_orphans
. StopNodes locks cs_vNodes
before calling DeleteNode, so g_cs_orphans
has to be locked before calling CConnman::StopNodes but only if it's linked to a PeerManagerImpl.
StopNodes is also called from ~CConnman
so that path will still get a lock order violation error, I think.
DeleteNode is also called from DisconnectNodes which is called from ThreadSocketHandler; but there it's called after cs_vNodes
is released, so lock ordering is not an issue, but g_cs_orphans
still has to get locked prior to doing the work in EraseForPeer, so responsibility can't be moved all the way back up the chain.
So I don't think there's a simple way to make it non-recursive for this PR; for a future PR, I think the aim is to make g_cs_orphans
become private: TxOrphanage::m_mutex
and not have it be the last thing locked / first thing released so that the ordering constraint goes away, and then it should trivially be non-recursive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for a future PR, I think the aim is to make g_cs_orphans become private: TxOrphanage::m_mutex and not have it be the last thing locked / first thing released so that the ordering constraint goes away, and then it should trivially be non-recursive.
Concept ACK :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/ajtowns/bitcoin/commits/202102-orphanworkset has work-in-progress that turns g_cs_orphans a protected m_mutex, doesn't quite turn orphans back into the responsibility of who sent them rather than who sent their final parent though.
src/net_processing.cpp
Outdated
@@ -2104,10 +2104,9 @@ void PeerManagerImpl::ProcessOrphanTx(std::set<uint256>& orphan_work_set) | |||
const uint256 orphanHash = *orphan_work_set.begin(); | |||
orphan_work_set.erase(orphan_work_set.begin()); | |||
|
|||
auto orphan_it = mapOrphanTransactions.find(orphanHash); | |||
if (orphan_it == mapOrphanTransactions.end()) continue; | |||
const auto [porphanTx, from_peer] = GetOrphanTx(orphanHash); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aw yeah, that structure binding ✨
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Woah! I forgot we got that with C++17.
|
||
/** Index from wtxid into the m_orphans to lookup orphan | ||
* transactions using their witness ids. */ | ||
std::map<uint256, OrphanMap::iterator> m_wtxid_to_orphan_it GUARDED_BY(g_cs_orphans); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, it seems like a tradeoff between memory & runtime complexity. the HaveTx
lookup is currently logarithmic, but using a find_if
would be linear. I'm guessing opting for the additional data structure was chosen to minimize the runtime impact on the program since we call this function so frequently (at least.. for every txid in inv messages, before we send a getdata, when processing a tx)
unsigned int nMaxOrphanTx = (unsigned int)std::max((int64_t)0, gArgs.GetArg("-maxorphantx", DEFAULT_MAX_ORPHAN_TRANSACTIONS)); | ||
unsigned int nEvicted = LimitOrphanTxSize(nMaxOrphanTx); | ||
unsigned int nEvicted = m_orphanage.LimitOrphans(nMaxOrphanTx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 came here to make a very similar suggestion :)
c295e8d
to
5e50e2d
Compare
Added a commit to update comments with suggestions. @jnewbery yeah, other commit could have been dropped -- it was only there to move |
utACK 5e50e2d |
utACK 5e50e2d |
re ACK 5e50e2d, comment updates |
Yes I like how this cleanly encapsulates the transaction orphan handling, exposing a single documented interface in a small header file. It is clearly an improvement. The only thing I don't completely understand is why the Code review ACK 5e50e2d |
|
@jnewbery Good to know, thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Post-Merge Code Review ACK 5e50e2d
} | ||
} | ||
|
||
bool TxOrphanage::HaveTx(const GenTxid& gtxid) const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's say an attacker sends an invalid orphan Z to Alice. Such orphan will be stored
in m_orphans
until expiration time, random eviction or parent reception. Another honest
peer Bob announce Z's parent and Z through txid-relay to Alice. Announcement for Z from Bob
won't be scheduled for request by AddTxAnnouncement()
as it'll be bounced off before
due to a positive AlreadyHaveTx()
.
If Bob sends a valid Z's parent, it should be accepted but Alice's version of Z will be
rejected as invalid. Assuming Bob is initial broadcaster of Z and same trick is repeated
against all his tx-relay peers, Z won't propagate until next Bob rebroadcast.
Is the following description correct ? If yes, I don't think that's concerning, assuming Bob
rebroadcast frequency is pretty high (which is the case for time-sensitive Bitcoin
applications).
Though not introduced by this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what you're saying is: Bob has tx P
(parent) and a child of that tx C
. Mallory knows both and malleates the witness of C
so that it creates a new invalid tx X
that has the same txid as C
but obviously a different wtxid. Mallory relays X
to Alice before Alice hears about P
or C
. Alice adds X
to the orphan pool and requests P
from Mallory by wtxid.
I think the expected behaviour is then: Carol announces both P
and C
to Alice; if wtxid relay is active, both are announced via wtxid so don't conflict with X
, and both are requested. If the request for P
from Mallory is still active, C
will be requested first, and when we receive it, we'll see there's already an entry for that txid in the orphan pool and ignore it. I'm not sure if we'll re-request from other honest peers at that point in normal usage; though I think this is a case where #21061 would eventually save the day.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if we'll re-request from other honest peers at that point in normal usage
Once the transaction is rejected from orphanage in reason of an already present orphan, we forget its txid/wtxid from tx-requester. So effectively, we shouldn't re-request it from other onest peers at that stage. Ultimately, parent P
withhold by Mallory should expire, Alice will fetch a valid parent P
from another peer. When Bob rebroadcasts C
, Mallory isn't able to interfere anymore as malleated X
will be rejected directly, without being logged in orphan pool.
Yes I agree #21061 should make things better.
EraseTx(m_orphan_list[randompos]->first); | ||
++nEvicted; | ||
} | ||
return nEvicted; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't the "orphanage overflow" log L3131 in net_processing be moved here instead of returning nEvicted
? And change the message to dissociate clearly expiration-from-eviction.
return {it->second.tx, it->second.fromPeer}; | ||
} | ||
|
||
void TxOrphanage::EraseForBlock(const CBlock& block) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we had a std::map<uint256, OrphanMap::iterator> m_parentid_to_orphan_it
index we could update each peer's orphan_work_set
in function of the parent received through this block.
Otherwise, I think you might have stalling valid orphans never re-processed. Any ulterior parent announcement should be bounce off by m_recent_confirmed_transactions
. Though rare enough to not be worthy of the complexity...
/** Check if we already have an orphan transaction (by txid or wtxid) */ | ||
bool HaveTx(const GenTxid& gtxid) const EXCLUSIVE_LOCKS_REQUIRED(!g_cs_orphans); | ||
|
||
/** Get an orphan transaction and its orginating peer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: originating
pm ACK |
b8f17fb [tests] Move TxOrphange tests to orphange_tests.cpp (John Newbery) Pull request description: PR #21148 moved the orphan transaction handling functionality from net_processing into its own translation unit txorphanage.cpp. The unit tests for that code should be in its own file rather than mixed with the net_processing unit tests in denialofservive_tests.cpp. ACKs for top commit: vincenzopalazzo: ACK b8f17fb Tree-SHA512: 32da89b3792abcbdcf897d66276225731c8976e1e0cd902c4b5ad8aff02104719c3aee2990cc2fcbe3eddede8a59472266e0ad1ce2ac11d66fe52c8cbe705161
Summary: This module captures orphan tracking code for tx relay. Can be reviewed with --color-moved=dimmed-zebra This is a backport of [[bitcoin/bitcoin#21148 | core#21148]] [1/14] bitcoin/bitcoin@9d5313d Note: I had to change a couple of `uint256` type to `TxId` in a test. Test Plan: `ninja all check-all` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Differential Revision: https://reviews.bitcoinabc.org/D11481
Summary: This is a backport of [[bitcoin/bitcoin#21148 | core#21148]] [2/14] bitcoin/bitcoin@81dd57e Depends on D11481 Test Plan: `ninja all check-all` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Differential Revision: https://reviews.bitcoinabc.org/D11482
Summary: EraseOrphansFor was called both with and without g_cs_orphans held, correct that so that it's always called with it already held. LimitOrphanTxSize was always called with g_cs_orphans held, so add annotations and don't lock it a second time. This is a backport of [[bitcoin/bitcoin#21148 | core#21148]] [3/14] bitcoin/bitcoin@38a11c3 Depends on D11482 Test Plan: With clang and debug: `ninja all check-all` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Differential Revision: https://reviews.bitcoinabc.org/D11483
Summary: Extract some common code into AddChildrenToWorkSet function. (It's a hard knock life) This is a backport of [[bitcoin/bitcoin#21148 | core#21148]] [4/14] bitcoin/bitcoin@ee135c8 Depends on D11483 Test Plan: `ninja all check-all` Reviewers: #bitcoin_abc, sdulfari Reviewed By: #bitcoin_abc, sdulfari Subscribers: sdulfari Differential Revision: https://reviews.bitcoinabc.org/D11484
Summary: Extract some common code into HaveOrphanTx function. This is a backport of [[bitcoin/bitcoin#21148 | core#21148]] [5/14] bitcoin/bitcoin@83679ff Depends on D11484 Test Plan: With clang and debug: `ninja all check-all` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Subscribers: Fabien Differential Revision: https://reviews.bitcoinabc.org/D11485
Summary: Extract orphan lookup code into GetOrphanTx function. This is a backport of [[bitcoin/bitcoin#21148 | core#21148]] [6/14] bitcoin/bitcoin@f294da7 Depends on D11485 Test Plan: `ninja all check-all` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Subscribers: Fabien Differential Revision: https://reviews.bitcoinabc.org/D11486
Summary: > txorphanage: Extract OrphanageAddTx > > Extract code from `AddOrphanTx` into `OrphanageAddTx`. bitcoin/bitcoin@1041616 > denialofservices_tests: check txorphanage's AddTx > > Rather than checking net_processing's internal implementation of > AddOrphanTx, test txorphanage's exported AddTx interface. Note that > this means AddToCompactExtraTransactions is no longer tested here. bitcoin/bitcoin@26d1a6c > net_processing: drop AddOrphanTx > > All the interesting functionality of AddOrphanTx is already in other > functions, so call those functions directly in the one place that > AddOrphanTx was used. bitcoin/bitcoin@3c4c3c2 This is a backport of [[bitcoin/bitcoin#21148 | core#21148]] [7, 8 & 9/14] Depends on D11486 Note: the way Core used the lock in `denialofservice_tests` depends on `RandomOrphan` not taking the locks, which Core does in [[bitcoin/bitcoin@6bd4963 | a subsequent commit of the same PR]]. Test Plan: With clang and debug: `ninja all check-all` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Differential Revision: https://reviews.bitcoinabc.org/D11487
Summary: Extract code that erases orphans when a new block is found into EraseOrphansForBlock. This is a backport of [[bitcoin/bitcoin#21148 | core#21148]] [10/14] bitcoin/bitcoin@03257b8 Depends on D11487 Test Plan: `ninja all check-all` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Differential Revision: https://reviews.bitcoinabc.org/D11489
Summary: Collects all the orphan handling globals into a single member var in net_processing, and ensures access is encapuslated into the interface functions. Also adds doxygen comments for methods. This is a backport of [[bitcoin/bitcoin#21148 | core#21148]] [11/14] bitcoin/bitcoin@6bd4963 Depends on D11489 Test Plan: With clang and debug: `ninja all check-all` Reviewers: #bitcoin_abc, Fabien Reviewed By: #bitcoin_abc, Fabien Subscribers: Fabien Differential Revision: https://reviews.bitcoinabc.org/D11490
Summary: ``` -BEGIN VERIFY SCRIPT- sed -i 's/mapOrphanTransactionsByPrev/m_outpoint_to_orphan_it/g' src/txorphanage.h src/txorphanage.cpp sed -i 's/mapOrphanTransactions/m_orphans/g' src/txorphanage.h src/txorphanage.cpp src/net_processing.cpp src/test/denialofservice_tests.cpp sed -i 's/g_orphan_list/m_orphan_list/g' src/txorphanage.h src/txorphanage.cpp sed -i 's/nMaxOrphans/max_orphans/g' src/txorphanage.h src/txorphanage.cpp sed -i 's/COrphanTx/OrphanTx/g' src/txorphanage.h src/txorphanage.cpp src/test/denialofservice_tests.cpp arc lint -END VERIFY SCRIPT- ``` This is a backport of [[bitcoin/bitcoin#21148 | core#21148]] [12/14] bitcoin/bitcoin@f8c0688 Depends on D11489 Test Plan: `ninja all check-all` Reviewers: #bitcoin_abc, sdulfari Reviewed By: #bitcoin_abc, sdulfari Differential Revision: https://reviews.bitcoinabc.org/D11491
Summary: Allows making vExtraTxnForCompact and vExtraTxnForCompactIt member vars instead of globals. This is a backport of [[bitcoin/bitcoin#21148 | core#21148]] [13/14] bitcoin/bitcoin@eeeafb3 Depends on D11491 Test Plan: With clang and debug: `ninja all check-all` Reviewers: #bitcoin_abc, sdulfari Reviewed By: #bitcoin_abc, sdulfari Differential Revision: https://reviews.bitcoinabc.org/D11492
Summary: This is a backport of [[bitcoin/bitcoin#21148 | core#21148]] [14/14] bitcoin/bitcoin@5e50e2d Depends on D11492 Test Plan: NA (only comments were changed) Reviewers: #bitcoin_abc, sdulfari Reviewed By: #bitcoin_abc, sdulfari Differential Revision: https://reviews.bitcoinabc.org/D11493
Splits orphan handling into its own module and reduces global usage.