Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net-processing refactoring -- lose globals, move implementation details from .h to .cpp #20758

Closed
wants to merge 15 commits into from

Conversation

ajtowns
Copy link
Contributor

@ajtowns ajtowns commented Dec 23, 2020

Done in #20811 :

  • Moves implementation details of PeerManager into PeerManagerImpl; moves PeerManagerImpl into .cpp
  • Moves struct Peer definition into .cpp

Done in #20942:

  • Moves net_processing globals into PeerManagerImpl.
  • Moves a lot of net_processing functions into PeerManagerImpl and simplifies some of their arguments as a result.

In #21148:

  • Split orphan management into its own module
  • More more globals/functions into PeerManagerImpl

Additional things in this PR:

  • Moves more globals/functions into PeerManagerImpl; simplify their arguments.
  • Makes test/denialofservice_tests use the PeerManager api instead of directly accessing net_processing implementation details
  • Introduces State(CNode&) and State(Peer&) alternatives to calling State(node.GetId()) all the time.
  • Moves CNodeState into Peer as a step towards consolidating state information
  • Make NetEventsInterface take a CNode& instead of a CNode*
  • Pass ArgsManager into the constructor and collect parameters once rather than calling GetArgs regularly at runtime

Not done yet:

  • Split parts of PeerManager into their own modules a la txrequest (block downloads?)

@ajtowns
Copy link
Contributor Author

ajtowns commented Dec 23, 2020

Marked as WIP; definitely needs some review/testing.

The move CNodeState into Peer commit might not be the greatest idea -- the locking issues are a little subtle (assumign I got them right). Not sure the the locking guidelines for Peer make sense (see "annotate lock guideline violation?" commit) though (and generally not convinced we need the multiple per-peer state mutexes that Peer has).

I think it might make sense to have CNode have an opaque Peer pointer member that's provided by net_processing when the connection is first made and passed back to net_processing via the PeerManager interface -- that would obsolete a lot of the "state for this peer" lookups, and might simplify locking substantially. Bit complicated to think about, so I stopped at this point, but that was what I was originally looking to explore here.

@DrahtBot
Copy link
Contributor

DrahtBot commented Dec 23, 2020

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@@ -1319,7 +1319,7 @@ void PeerManager::UpdatedBlockTip(const CBlockIndex *pindexNew, const CBlockInde
LOCK(m_peer_mutex);
for (auto& it : m_peer_map) {
Peer& peer = *it.second;
LOCK(peer.m_block_inv_mutex);
LOCK(peer.m_block_inv_mutex); // TODO: violates "Mutexes inside this struct must not be held when locking m_peer_mutex." rule??
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're the second person who's been confused by this comment (#19829 (comment)), which tells me that I didn't make it clear enough.

It's fine to lock m_peer_mutex and then lock one of the inner mutexes, or even to lock m_peer_mutex, take a shared_ptr to the Peer, release m_peer_mutex and then lock an inner mutex. The only rule is that we don't want to lock an inner mutex and then lock m_peer_mutex.

If you have better wording for the comment, I'm very happy to take it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, in that case I would describe it as a lock ordering constraint -- "if you need to acquire internal locks from multiple peers, acquire m_peer_mutex first, then acquire the peers' locks and do your work, then release the internal locks, then release m_peer_mutex. If you are already holding an internal lock, you must release it before acquiring m_peer_mutex (eg by calling GetPeerRef)".

(It's probably already clear enough on its own; it's just that the above would go unwritten everywhere else in the codebase, so that it's written down here make it sound like there's some stricter requirement going on...)

@DrahtBot DrahtBot mentioned this pull request Dec 23, 2020
@jnewbery
Copy link
Contributor

Strong concept ACK. I really like the idea of compilation firewalls and not exposing the innards of net_processing through the header file.

I don't know where this fits in the sequence of PRs for #19398. Doing it first makes some sense because it means we won't be putting stuff into the header file only to remove it later, but I think this PR will need to be split up into more digestible pieces to be merged.

};

namespace {
class PeerManagerImpl final : public PeerManager {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't what I was expecting to see. I was assuming that you'd use a pimpl idiom where the PeerManager would hold a unique_ptr to the PeerManagerImpl rather than have PeerManagerImpl inherit from PeerManager.

I'm not saying there's a problem with this way, but I'm curious what made you choose it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PeerManager does multiple inheritance already so the virtual overhead's already there, and this avoids two lots of unique pointers?

Copy link
Contributor

@jnewbery jnewbery left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few more things that can be deleted from the header file or moved into the cpp file:

→ git --no-pager d
diff --git a/src/net_processing.cpp b/src/net_processing.cpp
index ca016ec52b..1c7638a774 100644
--- a/src/net_processing.cpp
+++ b/src/net_processing.cpp
@@ -26,6 +26,7 @@
 #include <streams.h>
 #include <tinyformat.h>
 #include <txmempool.h>
+#include <txrequest.h>
 #include <util/check.h> // For NDEBUG compile time check
 #include <util/strencodings.h>
 #include <util/system.h>
@@ -290,6 +291,20 @@ struct CNodeState {
 };
 } // namespace
 
+/**
+ * Data structure for an individual peer. This struct is not protected by
+ * cs_main since it does not contain validation-critical data.
+ *
+ * Memory is owned by shared pointers and this object is destructed when
+ * the refcount drops to zero.
+ *
+ * Mutexes inside this struct must not be held when locking m_peer_mutex.
+ *
+ * Details are all local to net_processing.cpp
+ *
+ * TODO: move most members from CNodeState to this structure.
+ * TODO: move remaining application-layer data members from CNode to this structure.
+ */
 struct Peer {
     /** Same id as the CNode object for this peer */
     const NodeId m_id{0};
@@ -334,6 +349,8 @@ struct Peer {
     explicit Peer(NodeId id, CAddress addr, bool is_inbound) : m_id(id),  nodestate(addr, is_inbound) {}
 };
 
+using PeerRef = std::shared_ptr<Peer>;
+
 namespace {
 class PeerManagerImpl final : public PeerManager {
 public:
diff --git a/src/net_processing.h b/src/net_processing.h
index ee34a38178..74eabd6376 100644
--- a/src/net_processing.h
+++ b/src/net_processing.h
@@ -6,19 +6,12 @@
 #ifndef BITCOIN_NET_PROCESSING_H
 #define BITCOIN_NET_PROCESSING_H
 
-#include <consensus/params.h>
 #include <net.h>
-#include <sync.h>
-#include <txrequest.h>
 #include <validationinterface.h>
 
-class BlockTransactionsRequest;
-class BlockValidationState;
-class CBlockHeader;
 class CChainParams;
 class CTxMemPool;
 class ChainstateManager;
-class TxValidationState;
 
 extern RecursiveMutex cs_main;
 extern RecursiveMutex g_cs_orphans;
@@ -40,24 +33,6 @@ struct CNodeStateStats {
     std::vector<int> vHeightInFlight;
 };
 
-/**
- * Data structure for an individual peer. This struct is not protected by
- * cs_main since it does not contain validation-critical data.
- *
- * Memory is owned by shared pointers and this object is destructed when
- * the refcount drops to zero.
- *
- * Mutexes inside this struct must not be held when locking m_peer_mutex.
- *
- * Details are all local to net_processing.cpp
- *
- * TODO: move most members from CNodeState to this structure.
- * TODO: move remaining application-layer data members from CNode to this structure.
- */
-struct Peer; // body is defined in net_processing.cpp
-
-using PeerRef = std::shared_ptr<Peer>;
-
 class PeerManager : public CValidationInterface, public NetEventsInterface {
 public:
     /** Evict extra outbound peers. If we think our tip may be stale, connect to an extra outbound */

src/net_processing.h Outdated Show resolved Hide resolved
src/net_processing.h Outdated Show resolved Hide resolved
src/net_processing.h Outdated Show resolved Hide resolved
src/net_processing.h Outdated Show resolved Hide resolved
Comment on lines 88 to 87
struct COrphanTx {
CTransactionRef tx;
NodeId fromPeer;
int64_t nTimeExpire;
size_t list_pos;
};

virtual bool AddOrphanTx(const CTransactionRef& tx, NodeId peer) EXCLUSIVE_LOCKS_REQUIRED(g_cs_orphans) = 0;
virtual void EraseOrphansFor(NodeId peer) EXCLUSIVE_LOCKS_REQUIRED(g_cs_orphans) = 0;
virtual unsigned int LimitOrphanTxSize(unsigned int nMaxOrphans) = 0;
virtual void UpdateLastBlockAnnounceTime(NodeId node, int64_t time_in_seconds) = 0;

/** Map from txid to orphan transaction record. Limited by
* -maxorphantx/DEFAULT_MAX_ORPHAN_TRANSACTIONS */
std::map<uint256, COrphanTx> mapOrphanTransactions GUARDED_BY(g_cs_orphans);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be great to eventually move all of this fully inside PeerManagerImpl, or break it out into an OrphansManager class.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did spend a few moments trying to break it out, but my first approach didn't work immediately, so I'm leaving it for later

@jnewbery
Copy link
Contributor

I've made a branch at https://github.com/jnewbery/bitcoin/tree/pr20758.1 that implements something I've wanted to try for a while. It moves PeerManagerImpl into its own translation unit (net_processing_impl.cpp), with the declaration of PeerManagerImpl in net_processing_impl.h. That header file is only included by net_processing.cpp (so the std::unique_ptr can be constructed) and the test files. The tests are updated to use a PeerManagerImpl instead of a PeerManager.

The result is that PeerManager and net_processing.h don't need to include any functions that are for test only. That header file and PeerManager's public interface are about as minimal as possible (the only exposed methods are RelayTransaction, GetNodeStateStats and IgnoresIncomingTxs). Meanwhile, everything in PeerManagerImpl can be made public (since it's behind a compilation firewall so effectively everything is hidden from other translation units), and the tests can access any method/data in the class.

What do you think? Is this something we should consider?

@DrahtBot DrahtBot mentioned this pull request Dec 27, 2020
@ajtowns
Copy link
Contributor Author

ajtowns commented Dec 28, 2020

It moves PeerManagerImpl into its own translation unit

I think a better approach would be to have module.h as the public interface, module_impl.h as the white-box testing interface (ie the class definition of PeerManagerImpl), and module.cpp as the implementation -- no need to have two cpp files for what's really only a single module, and it also avoids moving the code around.

I think maybe it'd be even better to focus on making the classes smaller so they can be tested purely via their public interfaces though, and not worry too much about exposing a few extra methods in the meantime?

@jnewbery
Copy link
Contributor

I think a better approach would be to have module.h as the public interface, module_impl.h as the white-box testing interface (ie the class definition of PeerManagerImpl), and module.cpp as the implementation -- no need to have two cpp files for what's really only a single module, and it also avoids moving the code around.

ACK. This seems even better!

I think maybe it'd be even better to focus on making the classes smaller so they can be tested purely via their public interfaces though, and not worry too much about exposing a few extra methods in the meantime?

Also ACK making classes smaller. I think the tx_request module is a great example of how we can make the code clearer and better tested by breaking up the functionality. Perhaps block downloading, orphan handling, addr relay, etc could get similar treatment in future.

Aside: I think 'tested purely via their public interfaces' is an OOP mantra that sounds good in theory, but white-box testing is too useful to throw out entirely, especially when dealing with legacy code that has existing large modules/classes.

This can all be left for future work though. I think just implementing the PeerManagerImpl is a great first step. I'd love to help get this reviewed and merged. Perhaps everything up to move PeerManagerImpl into cpp file could be split into a PR for review first?

@ajtowns
Copy link
Contributor Author

ajtowns commented Dec 30, 2020

Aside: I think 'tested purely via their public interfaces' is an OOP mantra that sounds good in theory, but white-box testing is too useful to throw out entirely, especially when dealing with legacy code that has existing large modules/classes.

I still think that's white-box testing -- it's just that calling public functions is equivalent to entering at function boundaries, while calling private functions is more like a GOTO into the middle of a function. But I don't mind having test-only sanity check functions (like in txrequest.h)...

@DrahtBot
Copy link
Contributor

🐙 This pull request conflicts with the target branch and needs rebase.

Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a "draft".

@DrahtBot
Copy link
Contributor

There hasn't been much activity lately and the patch still needs rebase. What is the status here?
  • Is it still relevant? ➡️ Please solve the conflicts to make it ready for review and to ensure the CI passes.
  • Is it no longer relevant? ➡️ Please close.
  • Did the author lose interest or time to work on this? ➡️ Please close it and mark it 'Up for grabs' with the label, so that it can be picked up in the future.

ajtowns added 15 commits March 21, 2022 23:52
Allows nPreferredDownload to become a member var instead of global.
This allows making mapNodeState not a global. Requires also moving
ProcessBlockAvailability, UpdateBlockAvailability, RelayTransaction,
ProcessGetBlockData, GetFetchFlags, and UpdateLastBlockAnnounceTime
into PeerManagerImpl, as well as removing the const annotation from
ReattemptInitialBroadcast (as they invoke State()).

Also requires moving RelayTransaction and UpdateLastBlockAnnounceTime
into PeerManager.
These function-local variables are effectively globals; annotate them
so it's easier to refactor them later.
Also obsoletes mapNodeState.
Fixes lock order violations introduced by merging CNodeState into Peer
Fixes lock order violations caused by merging CNodeState into Peer
@ajtowns ajtowns force-pushed the 202012-netproc-refactor branch from 3f90a18 to 915da49 Compare March 21, 2022 14:18
@ajtowns
Copy link
Contributor Author

ajtowns commented Mar 21, 2022

Hmm, in my head, the next set of changes for this PR are in #24474 and #21527, though I guess they're technically orthogonal. Rebased on top of #21148 merge so there aren't redundant commits at least.

@ajtowns
Copy link
Contributor Author

ajtowns commented Aug 29, 2022

Closing this. If you'd like it resurrected please help #25174 make progress.

@ajtowns ajtowns closed this Aug 29, 2022
@jonatack
Copy link
Member

Don't hesitate to ping me to review net and lock related changes like this.

@bitcoin bitcoin locked and limited conversation to collaborators Aug 29, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants