-
Notifications
You must be signed in to change notification settings - Fork 36.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
net-processing refactoring -- lose globals, move implementation details from .h to .cpp #20758
Conversation
Marked as WIP; definitely needs some review/testing. The I think it might make sense to have CNode have an opaque Peer pointer member that's provided by net_processing when the connection is first made and passed back to net_processing via the PeerManager interface -- that would obsolete a lot of the "state for this peer" lookups, and might simplify locking substantially. Bit complicated to think about, so I stopped at this point, but that was what I was originally looking to explore here. |
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
src/net_processing.cpp
Outdated
@@ -1319,7 +1319,7 @@ void PeerManager::UpdatedBlockTip(const CBlockIndex *pindexNew, const CBlockInde | |||
LOCK(m_peer_mutex); | |||
for (auto& it : m_peer_map) { | |||
Peer& peer = *it.second; | |||
LOCK(peer.m_block_inv_mutex); | |||
LOCK(peer.m_block_inv_mutex); // TODO: violates "Mutexes inside this struct must not be held when locking m_peer_mutex." rule?? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're the second person who's been confused by this comment (#19829 (comment)), which tells me that I didn't make it clear enough.
It's fine to lock m_peer_mutex and then lock one of the inner mutexes, or even to lock m_peer_mutex, take a shared_ptr to the Peer, release m_peer_mutex and then lock an inner mutex. The only rule is that we don't want to lock an inner mutex and then lock m_peer_mutex.
If you have better wording for the comment, I'm very happy to take it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, in that case I would describe it as a lock ordering constraint -- "if you need to acquire internal locks from multiple peers, acquire m_peer_mutex first, then acquire the peers' locks and do your work, then release the internal locks, then release m_peer_mutex. If you are already holding an internal lock, you must release it before acquiring m_peer_mutex (eg by calling GetPeerRef)".
(It's probably already clear enough on its own; it's just that the above would go unwritten everywhere else in the codebase, so that it's written down here make it sound like there's some stricter requirement going on...)
Strong concept ACK. I really like the idea of compilation firewalls and not exposing the innards of net_processing through the header file. I don't know where this fits in the sequence of PRs for #19398. Doing it first makes some sense because it means we won't be putting stuff into the header file only to remove it later, but I think this PR will need to be split up into more digestible pieces to be merged. |
src/net_processing.cpp
Outdated
}; | ||
|
||
namespace { | ||
class PeerManagerImpl final : public PeerManager { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't what I was expecting to see. I was assuming that you'd use a pimpl idiom where the PeerManager
would hold a unique_ptr to the PeerManagerImpl
rather than have PeerManagerImpl
inherit from PeerManager
.
I'm not saying there's a problem with this way, but I'm curious what made you choose it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PeerManager does multiple inheritance already so the virtual overhead's already there, and this avoids two lots of unique pointers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a few more things that can be deleted from the header file or moved into the cpp file:
→ git --no-pager d
diff --git a/src/net_processing.cpp b/src/net_processing.cpp
index ca016ec52b..1c7638a774 100644
--- a/src/net_processing.cpp
+++ b/src/net_processing.cpp
@@ -26,6 +26,7 @@
#include <streams.h>
#include <tinyformat.h>
#include <txmempool.h>
+#include <txrequest.h>
#include <util/check.h> // For NDEBUG compile time check
#include <util/strencodings.h>
#include <util/system.h>
@@ -290,6 +291,20 @@ struct CNodeState {
};
} // namespace
+/**
+ * Data structure for an individual peer. This struct is not protected by
+ * cs_main since it does not contain validation-critical data.
+ *
+ * Memory is owned by shared pointers and this object is destructed when
+ * the refcount drops to zero.
+ *
+ * Mutexes inside this struct must not be held when locking m_peer_mutex.
+ *
+ * Details are all local to net_processing.cpp
+ *
+ * TODO: move most members from CNodeState to this structure.
+ * TODO: move remaining application-layer data members from CNode to this structure.
+ */
struct Peer {
/** Same id as the CNode object for this peer */
const NodeId m_id{0};
@@ -334,6 +349,8 @@ struct Peer {
explicit Peer(NodeId id, CAddress addr, bool is_inbound) : m_id(id), nodestate(addr, is_inbound) {}
};
+using PeerRef = std::shared_ptr<Peer>;
+
namespace {
class PeerManagerImpl final : public PeerManager {
public:
diff --git a/src/net_processing.h b/src/net_processing.h
index ee34a38178..74eabd6376 100644
--- a/src/net_processing.h
+++ b/src/net_processing.h
@@ -6,19 +6,12 @@
#ifndef BITCOIN_NET_PROCESSING_H
#define BITCOIN_NET_PROCESSING_H
-#include <consensus/params.h>
#include <net.h>
-#include <sync.h>
-#include <txrequest.h>
#include <validationinterface.h>
-class BlockTransactionsRequest;
-class BlockValidationState;
-class CBlockHeader;
class CChainParams;
class CTxMemPool;
class ChainstateManager;
-class TxValidationState;
extern RecursiveMutex cs_main;
extern RecursiveMutex g_cs_orphans;
@@ -40,24 +33,6 @@ struct CNodeStateStats {
std::vector<int> vHeightInFlight;
};
-/**
- * Data structure for an individual peer. This struct is not protected by
- * cs_main since it does not contain validation-critical data.
- *
- * Memory is owned by shared pointers and this object is destructed when
- * the refcount drops to zero.
- *
- * Mutexes inside this struct must not be held when locking m_peer_mutex.
- *
- * Details are all local to net_processing.cpp
- *
- * TODO: move most members from CNodeState to this structure.
- * TODO: move remaining application-layer data members from CNode to this structure.
- */
-struct Peer; // body is defined in net_processing.cpp
-
-using PeerRef = std::shared_ptr<Peer>;
-
class PeerManager : public CValidationInterface, public NetEventsInterface {
public:
/** Evict extra outbound peers. If we think our tip may be stale, connect to an extra outbound */
src/net_processing.h
Outdated
struct COrphanTx { | ||
CTransactionRef tx; | ||
NodeId fromPeer; | ||
int64_t nTimeExpire; | ||
size_t list_pos; | ||
}; | ||
|
||
virtual bool AddOrphanTx(const CTransactionRef& tx, NodeId peer) EXCLUSIVE_LOCKS_REQUIRED(g_cs_orphans) = 0; | ||
virtual void EraseOrphansFor(NodeId peer) EXCLUSIVE_LOCKS_REQUIRED(g_cs_orphans) = 0; | ||
virtual unsigned int LimitOrphanTxSize(unsigned int nMaxOrphans) = 0; | ||
virtual void UpdateLastBlockAnnounceTime(NodeId node, int64_t time_in_seconds) = 0; | ||
|
||
/** Map from txid to orphan transaction record. Limited by | ||
* -maxorphantx/DEFAULT_MAX_ORPHAN_TRANSACTIONS */ | ||
std::map<uint256, COrphanTx> mapOrphanTransactions GUARDED_BY(g_cs_orphans); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be great to eventually move all of this fully inside PeerManagerImpl
, or break it out into an OrphansManager
class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did spend a few moments trying to break it out, but my first approach didn't work immediately, so I'm leaving it for later
I've made a branch at https://github.com/jnewbery/bitcoin/tree/pr20758.1 that implements something I've wanted to try for a while. It moves PeerManagerImpl into its own translation unit (net_processing_impl.cpp), with the declaration of PeerManagerImpl in net_processing_impl.h. That header file is only included by net_processing.cpp (so the std::unique_ptr can be constructed) and the test files. The tests are updated to use a PeerManagerImpl instead of a PeerManager. The result is that PeerManager and net_processing.h don't need to include any functions that are for test only. That header file and PeerManager's public interface are about as minimal as possible (the only exposed methods are RelayTransaction, GetNodeStateStats and IgnoresIncomingTxs). Meanwhile, everything in PeerManagerImpl can be made public (since it's behind a compilation firewall so effectively everything is hidden from other translation units), and the tests can access any method/data in the class. What do you think? Is this something we should consider? |
I think a better approach would be to have I think maybe it'd be even better to focus on making the classes smaller so they can be tested purely via their public interfaces though, and not worry too much about exposing a few extra methods in the meantime? |
ACK. This seems even better!
Also ACK making classes smaller. I think the tx_request module is a great example of how we can make the code clearer and better tested by breaking up the functionality. Perhaps block downloading, orphan handling, addr relay, etc could get similar treatment in future. Aside: I think 'tested purely via their public interfaces' is an OOP mantra that sounds good in theory, but white-box testing is too useful to throw out entirely, especially when dealing with legacy code that has existing large modules/classes. This can all be left for future work though. I think just implementing the PeerManagerImpl is a great first step. I'd love to help get this reviewed and merged. Perhaps everything up to move PeerManagerImpl into cpp file could be split into a PR for review first? |
I still think that's white-box testing -- it's just that calling public functions is equivalent to entering at function boundaries, while calling private functions is more like a GOTO into the middle of a function. But I don't mind having test-only sanity check functions (like in txrequest.h)... |
fc83bf0
to
2e57317
Compare
4f9931d
to
3f90a18
Compare
🐙 This pull request conflicts with the target branch and needs rebase. Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a "draft". |
There hasn't been much activity lately and the patch still needs rebase. What is the status here?
|
Allows nPreferredDownload to become a member var instead of global.
This allows making mapNodeState not a global. Requires also moving ProcessBlockAvailability, UpdateBlockAvailability, RelayTransaction, ProcessGetBlockData, GetFetchFlags, and UpdateLastBlockAnnounceTime into PeerManagerImpl, as well as removing the const annotation from ReattemptInitialBroadcast (as they invoke State()). Also requires moving RelayTransaction and UpdateLastBlockAnnounceTime into PeerManager.
These function-local variables are effectively globals; annotate them so it's easier to refactor them later.
Also obsoletes mapNodeState.
Fixes lock order violations introduced by merging CNodeState into Peer
Fixes lock order violations caused by merging CNodeState into Peer
3f90a18
to
915da49
Compare
Closing this. If you'd like it resurrected please help #25174 make progress. |
Don't hesitate to ping me to review net and lock related changes like this. |
Done in #20811 :
Done in #20942:
In #21148:
Additional things in this PR:
Not done yet: