Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mempool file over 32 MB cannot be deserialized #30

Closed
lontivero opened this issue Feb 24, 2021 · 20 comments
Closed

Mempool file over 32 MB cannot be deserialized #30

lontivero opened this issue Feb 24, 2021 · 20 comments
Labels

Comments

@lontivero
Copy link

Expected behavior

Mempool should be deserialized without error.

Actual behavior

2021-02-24T15:18:39Z Failed to deserialize mempool data on disk: ReadCompactSize(): size too large: iostream error. Continuing anyway.

To reproduce

$ bitcoin-cli stop
... (wait)
$ pgrep bitcoind
$ bitcoind

It seems to happen always with no exception (100% reproducible in our environment)

Note

We didn't experienced this with Bitcoin Core.

System information

Bitcoin Knots version v0.21.0.0-gd4a64f61c1346b15ccb91eafbae16ffc8f52e577 (release build)

But this same problem was present in the previous released version v0.20.0

Ubuntu 18.04.4 LTS (no GUI)

Logs
2021-02-24T15:15:27Z ThreadRPCServer method=stop user=bitcoinuser
2021-02-24T15:15:27Z Interrupting HTTP RPC server
2021-02-24T15:15:27Z Interrupting RPC
2021-02-24T15:15:27Z tor: Thread interrupt
2021-02-24T15:15:27Z Shutdown: In progress...
2021-02-24T15:15:27Z Stopping HTTP RPC server
2021-02-24T15:15:27Z addcon thread exit
2021-02-24T15:15:27Z opencon thread exit
2021-02-24T15:15:27Z RemoveLocal(7v4ytzubiaguuwvw.onion:8333)
2021-02-24T15:15:27Z Stopping RPC
2021-02-24T15:15:27Z torcontrol thread exit
2021-02-24T15:15:27Z RPC stopped.
2021-02-24T15:15:27Z net thread exit
2021-02-24T15:15:27Z msghand thread exit
2021-02-24T15:15:28Z scheduler thread exit
2021-02-24T15:15:30Z Dumped mempool: 0.19573s to copy, 2.15715s to dump
2021-02-24T15:15:30Z FlushStateToDisk: write coins cache to disk (959903 coins, 138092kB) started
2021-02-24T15:15:32Z FlushStateToDisk: write coins cache to disk (959903 coins, 138092kB) completed (2.33s)
2021-02-24T15:15:32Z FlushStateToDisk: write coins cache to disk (0 coins, 14660kB) started
2021-02-24T15:15:32Z FlushStateToDisk: write coins cache to disk (0 coins, 14660kB) completed (0.03s)
2021-02-24T15:15:33Z [default wallet] Releasing wallet
2021-02-24T15:15:33Z Shutdown: done





2021-02-24T15:18:32Z Bitcoin Knots version v0.21.0.0-gd4a64f61c1346b15ccb91eafbae16ffc8f52e577 (release build)
2021-02-24T15:18:32Z InitParameterInteraction: parameter interaction: -whitebind set -> setting -listen=1
2021-02-24T15:18:32Z Assuming ancestors of block 00000000000000000007888a9d01313d69d6335df46ea33e875ee6832670c596 have valid signatures.
2021-02-24T15:18:32Z Setting nMinimumChainWork=0000000000000000000000000000000000000000189ccba50031c45c16a96544
2021-02-24T15:18:32Z ParameterInteraction: parameter interaction: -walletimplicitsegwit=0 -> setting -addresstype=legacy
2021-02-24T15:18:32Z Using the 'sse4(1way),sse41(4way),avx2(8way)' SHA256 implementation
2021-02-24T15:18:32Z Using RdSeed as additional entropy source
2021-02-24T15:18:32Z Using RdRand as an additional entropy source
2021-02-24T15:18:32Z Default data directory /home/user/.bitcoin
2021-02-24T15:18:32Z Using data directory /home/user/.bitcoin
2021-02-24T15:18:32Z Config file: /home/user/.bitcoin/bitcoin.conf
2021-02-24T15:18:32Z R/W Config file: /home/user/.bitcoin/bitcoin_rw.conf (not found, skipping)
2021-02-24T15:18:32Z Config file arg: daemon="1"
2021-02-24T15:18:32Z Config file arg: debug="rpc"
2021-02-24T15:18:32Z Config file arg: mempoolreplacement="fee,optin"
2021-02-24T15:18:32Z Config file arg: rpcpassword=****
2021-02-24T15:18:32Z Config file arg: rpcthreads="8"
2021-02-24T15:18:32Z Config file arg: rpcuser=****
2021-02-24T15:18:32Z Config file arg: rpcworkqueue="256"
2021-02-24T15:18:32Z Config file arg: server="1"
2021-02-24T15:18:32Z Config file arg: testnet="0"
2021-02-24T15:18:32Z Config file arg: txindex="1"
2021-02-24T15:18:32Z Config file arg: whitebind="127.0.0.1:8333"
2021-02-24T15:18:32Z Using at most 125 automatic connections (16384 file descriptors available)
2021-02-24T15:18:32Z Using 16 MiB out of 32/2 requested for signature cache, able to store 524288 elements
2021-02-24T15:18:32Z Using 16 MiB out of 32/2 requested for script execution cache, able to store 524288 elements
2021-02-24T15:18:32Z Script verification uses 15 additional threads
2021-02-24T15:18:32Z scheduler thread start
2021-02-24T15:18:32Z HTTP: creating work queue of depth 256
2021-02-24T15:18:32Z Starting RPC
2021-02-24T15:18:32Z Starting HTTP RPC server
2021-02-24T15:18:32Z Config options rpcuser and rpcpassword will soon be deprecated. Locally-run instances may remove rpcuser to use cookie-based auth, or may be replaced with rpcauth. Please see share/rpcauth for rpcauth auth generation.
2021-02-24T15:18:32Z HTTP: starting 8 worker threads
2021-02-24T15:18:32Z Using wallet directory /home/user/.bitcoin
2021-02-24T15:18:32Z init message: Verifying wallet(s)...
2021-02-24T15:18:32Z Using BerkeleyDB version Berkeley DB 4.8.30: (April  9, 2010)
2021-02-24T15:18:32Z Using wallet /home/user/.bitcoin/wallet.dat
2021-02-24T15:18:32Z BerkeleyEnvironment::Open: LogDir=/home/user/.bitcoin/database ErrorFile=/home/user/.bitcoin/db.log
2021-02-24T15:18:33Z init message: Loading banlist...
2021-02-24T15:18:33Z SetNetworkActive: true
2021-02-24T15:18:33Z Using /16 prefix for IP bucketing
2021-02-24T15:18:33Z Cache configuration:
2021-02-24T15:18:33Z * Using 2.0 MiB for block index database
2021-02-24T15:18:33Z * Using 56.0 MiB for transaction index database
2021-02-24T15:18:33Z * Using 8.0 MiB for chain state database
2021-02-24T15:18:33Z * Using 384.0 MiB for in-memory UTXO set (plus up to 286.1 MiB of unused mempool space)
2021-02-24T15:18:33Z * Flushing caches if available system memory drops below 10 MiB
2021-02-24T15:18:33Z init message: Loading block index...
2021-02-24T15:18:33Z Switching active chainstate to Chainstate [ibd] @ height -1 (null)
2021-02-24T15:18:33Z Opening LevelDB in /home/user/.bitcoin/blocks/index
2021-02-24T15:18:33Z Opened LevelDB successfully
2021-02-24T15:18:33Z Using obfuscation key for /home/user/.bitcoin/blocks/index: 0000000000000000
2021-02-24T15:18:37Z LoadBlockIndexDB: last block file = 2465
2021-02-24T15:18:37Z LoadBlockIndexDB: last block file info: CBlockFileInfo(blocks=99, size=129829546, heights=671887...671985, time=2021-02-24...2021-02-24)
2021-02-24T15:18:37Z Checking all blk files are present...
2021-02-24T15:18:37Z Opening LevelDB in /home/user/.bitcoin/chainstate
2021-02-24T15:18:38Z Opened LevelDB successfully
2021-02-24T15:18:38Z Using obfuscation key for /home/user/.bitcoin/chainstate: c405746c298350d2
2021-02-24T15:18:38Z Loaded best chain: hashBestChain=00000000000000000004ca94ef1c162008391a09bc438ce4c6c702a040f28a84 height=671985 date=2021-02-24T15:07:50Z progress=0.999996
2021-02-24T15:18:38Z init message: Rewinding blocks...
2021-02-24T15:18:38Z FlushStateToDisk: write coins cache to disk (0 coins, 0kB) started
2021-02-24T15:18:38Z FlushStateToDisk: write coins cache to disk (0 coins, 0kB) completed (0.00s)
2021-02-24T15:18:38Z init message: Verifying blocks...
2021-02-24T15:18:38Z Verifying last 6 blocks at level 3
2021-02-24T15:18:38Z [0%]...[16%]...[33%]...[50%]...[66%]...[83%]...[99%]...[DONE].
2021-02-24T15:18:39Z No coin database inconsistencies in last 6 blocks (14986 transactions)
2021-02-24T15:18:39Z  block index            6548ms
2021-02-24T15:18:39Z Opening LevelDB in /home/user/.bitcoin/indexes/txindex
2021-02-24T15:18:39Z Opened LevelDB successfully
2021-02-24T15:18:39Z Using obfuscation key for /home/user/.bitcoin/indexes/txindex: 0000000000000000
2021-02-24T15:18:39Z txindex thread start
2021-02-24T15:18:39Z txindex is enabled at height 671985
2021-02-24T15:18:39Z txindex thread exit
2021-02-24T15:18:39Z init message: Loading wallet...
2021-02-24T15:18:39Z BerkeleyEnvironment::Open: LogDir=/home/user/.bitcoin/database ErrorFile=/home/user/.bitcoin/db.log
2021-02-24T15:18:39Z [default wallet] Wallet File Version = 159900
2021-02-24T15:18:39Z [default wallet] Keys: 2001 plaintext, 0 encrypted, 2001 w/ metadata, 2001 total. Unknown wallet records: 0
2021-02-24T15:18:39Z [default wallet] Wallet completed loading in              40ms
2021-02-24T15:18:39Z [default wallet] setKeyPool.size() = 2000
2021-02-24T15:18:39Z [default wallet] mapWallet.size() = 0
2021-02-24T15:18:39Z [default wallet] m_address_book.size() = 0
2021-02-24T15:18:39Z block tree size = 672043
2021-02-24T15:18:39Z nBestHeight = 671985
2021-02-24T15:18:39Z loadblk thread start
2021-02-24T15:18:39Z AddLocal(178.62.244.102:8333,1)
2021-02-24T15:18:39Z Discover: IPv4 eth0: 178.62.244.102
2021-02-24T15:18:39Z Failed to deserialize mempool data on disk: ReadCompactSize(): size too large: iostream error. Continuing anyway.

@nopara73
Copy link

Reproduced it on Windows.

@nopara73
Copy link

Bitcoin Knots version v0.20.1 was also doing this, but we thought it's just a fluke.

@molnard
Copy link

molnard commented Feb 24, 2021

How to reproduce

Override mempool.dat here C:\Users\user\AppData\Roaming\Bitcoin
https://drive.google.com/file/d/13PFtUE-ITsSBLq0rrb5uF6jQKqka3Xb7/view?usp=sharing

See an error in debug.log

2021-02-24T16:25:25Z Failed to deserialize mempool data on disk: ReadCompactSize(): size too large: iostream error. Continuing anyway.

After this mempool lost and the node starting build up the mempool from scratch.

Our backend is using Ubuntu, I reproduced it on Windows, both had the bug so it is platform-independent.

@lontivero
Copy link
Author

lontivero commented Feb 24, 2021

FYI: test/functional/mempool_persist.py pass

2021-02-24T18:20:13.531856Z [httpworker.2] [default wallet] Submitting wtx 60179ab15ab27be9c3a42ca6dcaf53cf62466acce89ddf3980edf7cd831ed48c to mempool for relay
2021-02-24T18:20:13.579655Z [httpworker.0] [default wallet] Submitting wtx fc45c3df4ee7c5bb3c1a8b9be11bf9f7a5313c0fbe777745d1954a6ff2d82b0e to mempool for relay
2021-02-24T18:20:13.629336Z [httpworker.2] [default wallet] Submitting wtx 5dee1effb2e7b0b967a85629a225428df2dd897312f740a01daac8e8b02ca192 to mempool for relay
2021-02-24T18:20:13.679815Z [httpworker.0] [default wallet] Submitting wtx 2807b5bb86cc67de3038f2b69fa0d5bda3da7a1a3b3e870226d586a001c91cb3 to mempool for relay
2021-02-24T18:20:13.686547Z [httpworker.1] ThreadRPCServer method=getrawmempool user=__cookie__
2021-02-24T18:20:13.689265Z [httpworker.3] ThreadRPCServer method=getrawmempool user=__cookie__
2021-02-24T18:23:40.625955Z [shutoff] Dumped mempool: 7e-06s to copy, 0.001928s to dump
2021-02-24T18:23:40.625990Z [shutoff] Recorded 5 unconfirmed txs from mempool in 6e-06s
2021-02-24T18:24:08.402190Z [init] * Using 440.0 MiB for in-memory UTXO set (plus up to 286.1 MiB of unused mempool space)
2021-02-24T18:24:08.453175Z [init] [default wallet] Submitting wtx 934ef4ffe1aded1d28f57cc8c0130e91c0dcf0362df973087aa016746830aa8d to mempool for relay
2021-02-24T18:24:08.453507Z [init] [default wallet] Submitting wtx 60179ab15ab27be9c3a42ca6dcaf53cf62466acce89ddf3980edf7cd831ed48c to mempool for relay
2021-02-24T18:24:08.454012Z [init] [default wallet] Submitting wtx fc45c3df4ee7c5bb3c1a8b9be11bf9f7a5313c0fbe777745d1954a6ff2d82b0e to mempool for relay
2021-02-24T18:24:08.454530Z [loadblk] Imported mempool transactions from disk: 5 succeeded, 0 failed, 0 expired, 0 already there, 5 waiting for initial broadcast
2021-02-24T18:24:08.454617Z [init] [default wallet] Submitting wtx 5dee1effb2e7b0b967a85629a225428df2dd897312f740a01daac8e8b02ca192 to mempool for relay
2021-02-24T18:24:08.454710Z [init] [default wallet] Submitting wtx 2807b5bb86cc67de3038f2b69fa0d5bda3da7a1a3b3e870226d586a001c91cb3 to mempool for relay
2021-02-24T18:24:08.466766Z [httpworker.1] ThreadRPCServer method=getmempoolinfo user=__cookie__

Using the @molnard mempool.dat file it seems to fail while deserializing the "txs" entry.

@lontivero
Copy link
Author

lontivero commented Feb 25, 2021

It seems the mempool.dat file is broken but I don't see how is that possible. contains transactions that sum a size bigger than the allowed.

 ~$ cat ~/.bitcoin/regtest/mempool.dat  | xxd | head -100
00000000: 0200 0000 0000 0000 0306 6465 6c74 6173  ..........deltas
00000010: 0100 066d 696e 6665 6511 7680 d2c5 c53b  ...minfee.v....;
00000020: bf40 0188 6d36 6000 0000 0003 7478 73fe  .@..m6`.....txs.
00000030: 79ee 9506 fece 3801 0002 00fd e501 0200  y.....8.........

@molnard
Copy link

molnard commented Feb 25, 2021

Just mentioning, I saved the file on the fly - I did not stop bitcoind service, I was afraid of losing the mempool again.

@nopara73
Copy link

Here's a mempool file created by Knots: if you load it with Bitcoin Core 21, then it's successful, if you do it with Bitcoin Knots 21 then it fails.

https://filebin.net/bp477ybe98z95bxy

@nopara73
Copy link

FTR @luke-jr the reason why we're investigating it so fiercely because we cannot restart Knots on the server without the mempool corrupting.

@MaxHillebrand
Copy link

we cannot restart Knots on the server without the mempool corrupting.

And this takes away our most powerful skill of trouble shooting - trying to turn it off and on again... 😢

@lontivero
Copy link
Author

It seems it cannot load txs verctors larger than MAX_SIZE (33_554_432) in the our case that vector is 110_ 489_ 209 long.

@luke-jr
Copy link
Member

luke-jr commented Feb 25, 2021

A workaround would probably be -maxmempool=32 I suppose.

I had planned to drop the custom mempool.dat format in 22.0 already (0.21 added compatibility code for Core's format), but sounds like it might be good to do it sooner.

@luke-jr luke-jr added the bug label Feb 25, 2021
@luke-jr luke-jr changed the title Mempool file cannot be deserialized when bitcoind is started Mempool file over 32 MB cannot be deserialized Feb 25, 2021
@molnard
Copy link

molnard commented Feb 26, 2021

A workaround would probably be -maxmempool=32 I suppose.

I had planned to drop the custom mempool.dat format in 22.0 already (0.21 added compatibility code for Core's format), but sounds like it might be good to do it sooner.

It is a priority issue for us. 32Mb mempool is a no-go, on the contrary, it should be increased more to avoid wallet state inconsistencies. All the clients (without the Full node), maintain their mempool according to the backend's mempool. Every time I need to restart the backend - for any reason mempool lost. It is like sitting on a bomb.

Is it possible to fix this ASAP in a hotfix release, under any conditions?

@luke-jr
Copy link
Member

luke-jr commented Feb 26, 2021

@molnard If you can apply a patch and build yourself, this one should write a Core mempool.dat instead (which is supported by Knots 0.21) https://dpaste.com/ELMEQ8ZUG

Does this hotfix work for you, or you need more?

@lontivero
Copy link
Author

lontivero commented Feb 27, 2021

@luke-jr I am a bit blocked with this hotfix. I applied the patch and compiled it but I cannot make the test/functional/mempool_compatibility.py pass because it fails with:

FileNotFoundError: [Errno 2] No such file or directory: '/mnt/data/GitHub/bitcoin-knots/releases/v0.19.1/bin/bitcoind'
/tmp/bitcoin_func_test_seni8rds
2021-02-27T18:24:07.926000Z TestFramework (DEBUG): PRNG seed is: 6820928144085222329
2021-02-27T18:24:07.926000Z TestFramework (DEBUG): Setting up network thread
2021-02-27T18:24:07.926000Z TestFramework (INFO): Initializing test directory /tmp/bitcoin_func_test_seni8rds
2021-02-27T18:24:07.926000Z TestFramework (DEBUG): Copy cache directory /mnt/data/GitHub/bitcoin-knots/test/cache/node0 to node 0
2021-02-27T18:24:07.940000Z TestFramework (DEBUG): Copy cache directory /mnt/data/GitHub/bitcoin-knots/test/cache/node0 to node 1
2021-02-27T18:24:07.955000Z TestFramework.node0 (DEBUG): bitcoind started, waiting for RPC to come up
2021-02-27T18:24:07.958000Z TestFramework.node1 (DEBUG): bitcoind started, waiting for RPC to come up
2021-02-27T18:24:08.210000Z TestFramework.node0 (DEBUG): Stopping node
2021-02-27T18:24:08.210000Z TestFramework (ERROR): Assertion failed
Traceback (most recent call last):
  File "/mnt/data/GitHub/bitcoin-knots/test/functional/test_framework/test_framework.py", line 507, in start_nodes
    node.wait_for_rpc_connection()
  File "/mnt/data/GitHub/bitcoin-knots/test/functional/test_framework/test_node.py", line 227, in wait_for_rpc_connection
    raise FailedToStartError(self._node_msg(
test_framework.test_node.FailedToStartError: [node 0] bitcoind exited with status 1 during initialization

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/mnt/data/GitHub/bitcoin-knots/test/functional/test_framework/test_framework.py", line 125, in main
    self.setup()
  File "/mnt/data/GitHub/bitcoin-knots/test/functional/test_framework/test_framework.py", line 262, in setup
    self.setup_network()
  File "./test/functional/mempool_compatibility.py", line 35, in setup_network
    self.start_nodes()
  File "/mnt/data/GitHub/bitcoin-knots/test/functional/test_framework/test_framework.py", line 510, in start_nodes
    self.stop_nodes()
  File "/mnt/data/GitHub/bitcoin-knots/test/functional/test_framework/test_framework.py", line 526, in stop_nodes
    node.stop_node(wait=wait)
  File "/mnt/data/GitHub/bitcoin-knots/test/functional/test_framework/test_node.py", line 326, in stop_node
    self.stop(wait=wait)
  File "/mnt/data/GitHub/bitcoin-knots/test/functional/test_framework/test_node.py", line 186, in __getattr__
    assert self.rpc_connected and self.rpc is not None, self._node_msg("Error: no RPC connection")
AssertionError: [node 0] Error: no RPC connection
2021-02-27T18:24:08.213000Z TestFramework (DEBUG): Closing down network thread
2021-02-27T18:24:08.265000Z TestFramework (INFO): Stopping nodes
2021-02-27T18:24:08.265000Z TestFramework.node0 (DEBUG): Stopping node

This is because knots-v0.19.1 says: This software is expired

/m/d/G/bitcoin-knots (0.21.x-knots) [1]> ./releases/v0.19.1/bin/bitcoind --version
Bitcoin Knots version v0.19.1.knots20200304
Copyright (C) 2009-2019 The Bitcoin Core developers

Please contribute if you find Bitcoin Knots useful. Visit
<http://bitcoinknots.org/> for further information about the software.
The source code is available from <https://github.com/bitcoinknots/bitcoin>.

This is experimental software.
Distributed under the MIT software license, see the accompanying file COPYING
or <https://opensource.org/licenses/MIT>

This product includes software developed by the OpenSSL Project for use in the
OpenSSL Toolkit <https://www.openssl.org> and cryptographic software written by
Eric Young and UPnP software written by Thomas Bernard.

/m/d/G/bitcoin-knots (0.21.x-knots)> ./releases/v0.19.1/bin/bitcoind
This software is expired, and may be out of consensus. You must choose to upgrade or override this expiration.

What am i doing wrong? How can I run the test/functional/mempool_compatibility.py?

Thanks for the hotfix.

@luke-jr
Copy link
Member

luke-jr commented Feb 27, 2021

It is expecting Core releases for the compatibility tests. (I do have a Knots compatibility variant locally, but it's a hack and I didn't adapt it to this hotfix)

@lontivero
Copy link
Author

Tested. It is compatible with Core mempool.dat files in the two ways.

@molnard
Copy link

molnard commented Mar 1, 2021

@molnard
Copy link

molnard commented Jun 28, 2021

With this mempool file, you will end up the same as here.

2021-06-28T17:00:43Z Failed to deserialize mempool data on disk: ReadCompactSize(): size too large: iostream error. Continuing anyway.

https://github.com/molnard/bitcoin/releases/download/v0.21.0.0-14a6a1dd2/mempool.dat

@luke-jr
Copy link
Member

luke-jr commented Jul 4, 2021

This should be fixed in 0.21.1.knots20210629. Please close the issue once you've confirmed.

(Additional gitian signatures would be appreciated BTW)

@lontivero
Copy link
Author

Fixed and deployed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

6 participants
@lontivero @luke-jr @nopara73 @molnard @MaxHillebrand and others