Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add hash_type MUHASH for gettxoutsetinfo #19145

Merged
merged 5 commits into from
Feb 12, 2021
Merged

Conversation

fjahr
Copy link
Contributor

@fjahr fjahr commented Jun 2, 2020

This is another Pr in the series PRs for Coinstatsindex (see overview in #18000). This PR adds the hash_type option muhash to gettxoutsetinfo through which the user can calculate the serialized muhash of the utxo set. This PR does not use the index yet.

@fjahr fjahr changed the title [WIP] Allow hash_type options for gettxoutsetinfo [WIP] Add hash_type options for gettxoutsetinfo Jun 2, 2020
@DrahtBot
Copy link
Contributor

DrahtBot commented Jun 3, 2020

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@fjahr fjahr changed the title [WIP] Add hash_type options for gettxoutsetinfo Add hash_type options for gettxoutsetinfo Jun 4, 2020
@fjahr
Copy link
Contributor Author

fjahr commented Jun 4, 2020

Added another more extensive test, now ready for review.

@Sjors
Copy link
Member

Sjors commented Jun 9, 2020

Concept ACK. In the followup PR that adds a MuHash index, the RPC documentation should point out that this index can be used to dramatically speed up the RPC call when used with muhash. And that muhash and none will be more responsive than the default (or just change the default, and note that as a breaking change to anyone who uses the legacy hash).

@fjahr fjahr force-pushed the csi-3-muhash-rpc branch from 6013f04 to 7841f05 Compare June 11, 2020 23:02
@fjahr
Copy link
Contributor Author

fjahr commented Jun 11, 2020

@MarcoFalke I would like to hear your thoughts on 2109165. I had it only in the test at first but put it into the framework because I think it could be useful for other tests as well and since it's just a method it does not affect the use of P2PDataStore in other tests. But the data it returns on each UTXO is kind of specific to my test.

@maflcko
Copy link
Member

maflcko commented Jun 11, 2020

Seems fine to put it wherever you want, as long as reviewers are happy. I guess the decision mostly depends on whether this will be used by other tests.

@fjahr
Copy link
Contributor Author

fjahr commented Jun 29, 2020

rebased

@jonatack
Copy link
Member

re-ACK 3506d90 per git diff 2e5617f 3506d90

Thanks for updating.

@jonatack
Copy link
Member

This might need a release note (even if the note will be updated in planned follow-ups to this).

@Sjors
Copy link
Member

Sjors commented Jan 28, 2021

Muhash at height 668,047: c263d20036cc983510e9b347cb88e947168dd5665ed00b0e6118d07a7704f810

Copy link
Member

@Sjors Sjors left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3506d90, almost happy...

Muhash at height 668,212: d0192e6604c3dc9a78f47bfc5ae84b8c0f6baf960d7ba9305664bbb9f65ef092.

hash_serialized_2 is 63fa057a656168ed832d8abbec8008948fc17aecde9f346d69043871a746941f on master, but 15d3629db2fcbed27a1fd9d35c15b5ac6f0a70f5723ebf9cf87e3df0b0feb061 here (efe73b1f89e82d65100c73c06d9a01ec19cc7e029d93caaa8bfded4d1723d484 if I revert GetHash())

You may want to hardcode a hash_serialized_2 example in the test suite.

src/node/coinstats.cpp Show resolved Hide resolved
src/node/coinstats.cpp Show resolved Hide resolved
src/rpc/blockchain.cpp Outdated Show resolved Hide resolved
src/node/coinstats.cpp Outdated Show resolved Hide resolved
@fjahr
Copy link
Contributor Author

fjahr commented Jan 31, 2021

Reviewed 3506d90, almost happy...

Muhash at height 668,212: d0192e6604c3dc9a78f47bfc5ae84b8c0f6baf960d7ba9305664bbb9f65ef092.

hash_serialized_2 is 63fa057a656168ed832d8abbec8008948fc17aecde9f346d69043871a746941f on master, but 15d3629db2fcbed27a1fd9d35c15b5ac6f0a70f5723ebf9cf87e3df0b0feb061 here (efe73b1f89e82d65100c73c06d9a01ec19cc7e029d93caaa8bfded4d1723d484 if I revert GetHash())

You may want to hardcode a hash_serialized_2 example in the test suite.

Thanks a lot for testing @Sjors ! I should have addressed all your comments.

The difference in that hash was caused by using outputs.end() where I should have used std::prev(outputs.end()) in ApplyHash(). I was actually sure we had some deterministic tests in place for that hash already but obviously, we didn't. So I added a new commit with a small test calculating the hash based on the cached chain to add such coverage.

Copy link
Member

@jonatack jonatack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch and nice improvements.

Tested re-ACK e987ae5 per git diff 3506d90 e987ae5, reviewed diff, debug built, ran gettxoutsetinfo -signet and help on this branch vs master, at height 23127 both returned hash_serialized_2 of 2b72d65f3b6efb2311f58374ea2b939abf49684d44f4bafda45faa3b5452a454 and this branch returned muhash of c9f1ff12d345ccf9939c6bbf087e6f7399b6115adee1569287e9c5c43dbb475c

Only if you have to retouch and if you agree, one nit below (and maybe use single/double quotes consistently in the functional tests).

@@ -1063,7 +1076,8 @@ static RPCHelpMan gettxoutsetinfo()
{RPCResult::Type::NUM, "transactions", "The number of transactions with unspent outputs"},
{RPCResult::Type::NUM, "txouts", "The number of unspent transaction outputs"},
{RPCResult::Type::NUM, "bogosize", "A meaningless metric for UTXO set size"},
{RPCResult::Type::STR_HEX, "hash_serialized_2", "The serialized hash (only present if 'hash_serialized_2' hash_type is chosen)"},
{RPCResult::Type::STR_HEX, "hash_serialized_2", /* optional */ true, "The serialized hash (only present if 'hash_serialized_2' hash_type is chosen)"},
{RPCResult::Type::STR_HEX, "muhash", /* optional */ true, "The serialized hash (only present if 'muhash' hash_type is chosen)"},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, in these two lines, could replace the single quotes with double ones for consistency, e.g. s/'/\"/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, will do if I have to retouch! Thanks for re-reviewing!

@Sjors
Copy link
Member

Sjors commented Feb 3, 2021

tACK e987ae5

I get the same MuHash as before for block 668,212 and hash_serialized_2 now matches what I found on master.

@achow101
Copy link
Member

achow101 commented Feb 8, 2021

ACK e987ae5

It would be nice if the test used MiniWallet or otherwise didn't rely on the wallet, but it's fine for now.

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK e987ae5. Looks very good. I left one suggestion to simplify code, but feel free to ignore it here and maybe consider it for later since PR has already had a lot of review.

uint256 hashed_in = (CHashWriter(SER_DISK, 0) << in).GetSHA256();
unsigned char tmp[BYTE_SIZE];
ChaCha20(hashed_in.data(), hashed_in.size()).Keystream(tmp, BYTE_SIZE);
Num3072::Num3072(const unsigned char (&data)[BYTE_SIZE]) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "refactor: Improve encapsulation between MuHash3072 and Num3072" (a1fccea)

Note in case it helps other reviewers: I found this commit easier to review rearranging the method order to ToNum3072 Num3072 ToBytes instead of Num3072 ToBytes ToNum3072 so diff was smaller with code moving around less.

@@ -24,31 +24,35 @@ static uint64_t GetBogoSize(const CScript& scriptPubKey)
scriptPubKey.size() /* scriptPubKey */;
}

static void ApplyStats(CCoinsStats& stats, CHashWriter& ss, const uint256& hash, const std::map<uint32_t, Coin>& outputs)
static void ApplyHash(CCoinsStats& stats, CHashWriter& ss, const uint256& hash, const std::map<uint32_t, Coin>& outputs, std::map<uint32_t, Coin>::const_iterator it)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "refactor: Separate hash and stats calculation in coinstats" (2474645)

stats arguments in unused in all three overloads and could be dropped.

Also, passing iterators around like this seems unnecessarily elaborate. It makes sense to separate stats and hash computations, but is there a reason both computations need to share the a single for loop over the output map? I would think you could make all the functions separate like:

static void ApplyHash(CHashWriter& ss, const uint256& hash, const std::map<uint32_t, Coin>& outputs);
static void ApplyHash(MuHash3072& muhash, const uint256& hash, const std::map<uint32_t, Coin>& outputs);
static void ApplyHash(std::nullptr_t, const uint256& hash, const std::map<uint32_t, Coin>& outputs);
static void ApplyStats(CCoinsStats& stats, const uint256& hash, const std::map<uint32_t, Coin>& outputs);

@fjahr
Copy link
Contributor Author

fjahr commented Feb 10, 2021

Thanks a lot for the reviews @achow101 and @ryanofsky ! I will add your suggestions if I have to push again, otherwise I will put them into the next PR in this series.

@laanwj laanwj merged commit 8d82edd into bitcoin:master Feb 12, 2021
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Feb 12, 2021
e987ae5 test: Add test for deterministic UTXO set hash results (Fabian Jahr)
6ccc8fc test: Add test for gettxoutsetinfo RPC with MuHash (Fabian Jahr)
0d3b2f6 rpc: Add hash_type MUHASH to gettxoutsetinfo (Fabian Jahr)
2474645 refactor: Separate hash and stats calculation in coinstats (Fabian Jahr)
a1fccea refactor: Improve encapsulation between MuHash3072 and Num3072 (Fabian Jahr)

Pull request description:

  This is another Pr in the series PRs for Coinstatsindex (see overview in bitcoin#18000). This PR adds the `hash_type` option `muhash` to `gettxoutsetinfo` through which the user can calculate the serialized muhash of the utxo set. This PR does not use the index yet.

ACKs for top commit:
  Sjors:
    tACK e987ae5
  achow101:
    ACK e987ae5
  jonatack:
    Tested re-ACK e987ae5 per `git diff 3506d90 e987ae5`, reviewed diff, debug built, ran gettxoutsetinfo -signet and help on this branch vs master, at height 23127 both returned `hash_serialized_2` of `2b72d65f3b6efb2311f58374ea2b939abf49684d44f4bafda45faa3b5452a454` and this branch returned `muhash` of `c9f1ff12d345ccf9939c6bbf087e6f7399b6115adee1569287e9c5c43dbb475c`
  ryanofsky:
    Code review ACK e987ae5. Looks very good. I left one suggestion to simplify code, but feel free to ignore it here and maybe consider it for later since PR has already had a lot of review.

Tree-SHA512: 9a739ce375e73749fa69a467262b60d3e5314ef384e2d7150b3bbc8e4125cd9fd1db95306623bb9a632fcbaf5d9d2bf2f5cc43bf717d4ff5e2c9c4b52dd9296c
def test_deterministic_hash_results(self):
self.log.info("Test deterministic UTXO set hash results")

# These depend on the setup_clean_chain option, the chain loaded from the cache
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is no loaded chain and the utxo set is empty. Which is also useful to check, but doesn't check if the hash is deterministic?

Did you forget to load the chain from ./test/functional/data/rpc_getblockstats.json?

maflcko pushed a commit to maflcko/bitcoin-core that referenced this pull request Mar 26, 2021
4f2653a test: Use deterministic chain in utxo set hash test (Fabian Jahr)
4973c51 test: Remove wallet dependency of utxo set hash test (Fabian Jahr)
1a27af1 rpc: Improve gettxoutsetinfo help (Fabian Jahr)

Pull request description:

  Follow-ups to bitcoin#19145:
  - Small improvement on the help text of RPC gettxoutsetinfo
  - Using deterministic blockchain in the test `functional/feature_utxo_set_hash.py`
  - Removing wallet dependency in the test `functional/feature_utxo_set_hash.py`

  Split out of bitcoin#19521.

ACKs for top commit:
  MarcoFalke:
    review ACK 4f2653a 👲

Tree-SHA512: 92927b3aa22b6324eb4fc9d346755313dec44d973aa69a0ebf80a8569b5f3a7cf3539721ebdba183737534b9e29b3e33f412515890f0d0b819878032a3bba8f9
kwvg added a commit to kwvg/dash that referenced this pull request Feb 26, 2022
kwvg added a commit to kwvg/dash that referenced this pull request Feb 26, 2022
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Mar 14, 2022
Summary:
This is a backport of [[bitcoin/bitcoin#19145 | core#19145]] [1/3]
bitcoin/bitcoin@a1fccea

Depends on D11150

Test Plan: `ninja check`

Reviewers: #bitcoin_abc, Fabien

Reviewed By: #bitcoin_abc, Fabien

Differential Revision: https://reviews.bitcoinabc.org/D11155
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Mar 14, 2022
Summary:
This is a backport of [[bitcoin/bitcoin#19145 | core#19145]] [2/3]
bitcoin/bitcoin@2474645

Depends on D11155

Test Plan: `ninja check`

Reviewers: #bitcoin_abc, Fabien

Reviewed By: #bitcoin_abc, Fabien

Subscribers: Fabien

Differential Revision: https://reviews.bitcoinabc.org/D11156
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Mar 14, 2022
Summary:
This concludes backport of [[bitcoin/bitcoin#19145 | core#19145]] [3/3]

bitcoin/bitcoin@0d3b2f6
bitcoin/bitcoin@6ccc8fc
bitcoin/bitcoin@e987ae5

Depends on D11162

Test Plan: `ninja all check-all`

Reviewers: #bitcoin_abc, Fabien

Reviewed By: #bitcoin_abc, Fabien

Subscribers: Fabien

Differential Revision: https://reviews.bitcoinabc.org/D11157
kwvg added a commit to kwvg/dash that referenced this pull request Apr 3, 2022
kwvg added a commit to kwvg/dash that referenced this pull request Apr 20, 2022
kwvg added a commit to kwvg/dash that referenced this pull request Apr 24, 2022
kwvg added a commit to kwvg/dash that referenced this pull request Apr 27, 2022
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Aug 16, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.