-
Notifications
You must be signed in to change notification settings - Fork 36.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
util: Allow std::byte and char Span serialization #27927
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsNo conflicts as of last run. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK
Concept ACK. It makes sense to allow I also think it is nice that the implementation uses a Looking at surrounding code, I also think there are some followups that could be done to make it cleaner / safer / nicer (but would probably save these for a followup PR):
|
I thought about this, but concluded it isn't worth it.
This may be a bit too much hand-holding, with a small benefit. Also, it would make code more bloaty, as there are quite a few cases in the current code base where a Also, Line 651 in f1b4975
It may be good to treat all byte-like types the same, which can be achieved with Span<B> , or alternatively also delete the other B -types that are not special-cased, or alternatively delete the special-case code to treat all vectors the same.
I don't think this is possible either. It is required to speak with libsqlite3, libbdb, and internally in serialize.h. I think we should instead think about nuking the |
Rebased on #27973 for now, please review that first. |
This removes bloat that is not needed.
Did a rebase on clean master to drop the unneeded dependency. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code review ACK fa38d86. This looks great. The second commit really removes a lot of boilerplate and shows why the first commit is useful.
Can drop this line from PR description (#27927 (comment)) |
re: #27927 (comment) Thanks for the responses. You convinced me there probably little benefit to preventing serialization of char spans, and it would just add boilerplate and not make things safer. I do think we can drop the AsBytePtr function, though, so I tried to do that in #27978. I also still think in the longer run it would be good to convert more internal code to use |
utACK fa38d86 I do wonder: is there any reason why we wouldn't want to support serializing |
I think that should be fine to add, once there is a use case. Or is there already one? |
ACK fa38d86 |
Yeah, I am planning to do a more thorough cleanup with C++20, see my comment directly above and the other one earlier in the thread:
|
7c85361 refactor: Drop unsafe AsBytePtr function (Ryan Ofsky) Pull request description: Replace calls to `AsBytePtr` with calls to `AsBytes` or `reinterpret_cast`. `AsBytePtr` is just a wrapper around `reinterpret_cast`. It accepts any type of pointer as an argument and uses `reinterpret_cast` to cast the argument to a `std::byte` pointer. Despite taking any type of pointer as an argument, it is not useful to call `AsBytePtr` on most types of pointers, because byte representations of most types will be platform specific or undefined. Also, because it is named similarly to the `AsBytes` function, `AsBytePtr` looks safer than it actually is. Both `AsBytes` and `AsBytePtr` call reinterpret_cast internally and may be unsafe to use with certain types, but AsBytes at least has some type checking and can only be called on `Span` objects, while `AsBytePtr` can be called on any pointer argument. The change was motivated by discussion on #27973 and #27927 and is compatible with those PRs ACKs for top commit: jonatack: re-ACK 7c85361 sipa: utACK 7c85361 achow101: ACK 7c85361 Tree-SHA512: 200d858b1d4d579f081a7f9a14d488a99713b4918b4564ac3dd5c18578d927dbd6426e62e02f49f04a3fa73ca02ff7109c495cb0b92bec43c27d9b74e2f95757
7c85361 refactor: Drop unsafe AsBytePtr function (Ryan Ofsky) Pull request description: Replace calls to `AsBytePtr` with calls to `AsBytes` or `reinterpret_cast`. `AsBytePtr` is just a wrapper around `reinterpret_cast`. It accepts any type of pointer as an argument and uses `reinterpret_cast` to cast the argument to a `std::byte` pointer. Despite taking any type of pointer as an argument, it is not useful to call `AsBytePtr` on most types of pointers, because byte representations of most types will be platform specific or undefined. Also, because it is named similarly to the `AsBytes` function, `AsBytePtr` looks safer than it actually is. Both `AsBytes` and `AsBytePtr` call reinterpret_cast internally and may be unsafe to use with certain types, but AsBytes at least has some type checking and can only be called on `Span` objects, while `AsBytePtr` can be called on any pointer argument. The change was motivated by discussion on bitcoin#27973 and bitcoin#27927 and is compatible with those PRs ACKs for top commit: jonatack: re-ACK 7c85361 sipa: utACK 7c85361 achow101: ACK 7c85361 Tree-SHA512: 200d858b1d4d579f081a7f9a14d488a99713b4918b4564ac3dd5c18578d927dbd6426e62e02f49f04a3fa73ca02ff7109c495cb0b92bec43c27d9b74e2f95757
Github-Pull: bitcoin#27927 Rebased-From: fa257bc
This removes bloat that is not needed. Github-Pull: bitcoin#27927 Rebased-From: fa38d86
this is required to prevent the tests introduced in bitcoin#27927 from failing to compile because boost::has_left_shift<std::ostream,T>::value returns false for the std::byte specialization, resulting in a static assertion failure in Boost.Test this change was introduced in 90fc8b0 in bitcoin#23497
this is required to prevent the tests introduced in bitcoin#27927 from failing to compile because boost::has_left_shift<std::ostream,T>::value returns false for the std::byte specialization, resulting in a static assertion failure in Boost.Test this change was introduced in 90fc8b0 in bitcoin#23497
this is required to prevent the tests introduced in bitcoin#27927 from failing to compile because boost::has_left_shift<std::ostream,T>::value returns false for the std::byte specialization, resulting in a static assertion failure in Boost.Test this change was introduced in 90fc8b0 in bitcoin#23497
this is required to prevent the tests introduced in bitcoin#27927 from failing to compile because boost::has_left_shift<std::ostream,T>::value returns false for the std::byte specialization, resulting in a static assertion failure in Boost.Test this change was introduced in 90fc8b0 in bitcoin#23497
this is required to prevent the tests introduced in bitcoin#27927 from failing to compile because boost::has_left_shift<std::ostream,T>::value returns false for the std::byte specialization, resulting in a static assertion failure in Boost.Test this change was introduced in f7086fd in bitcoin#23497
this is required to prevent the tests introduced in bitcoin#27927 from failing to compile because boost::has_left_shift<std::ostream,T>::value returns false for the std::byte specialization, resulting in a static assertion failure in Boost.Test this change was introduced in f7086fd in bitcoin#23497
9f13dc1 doc: Update release notes for 25.2rc1 (Ava Chow) a27662b doc: update manpages for 25.2rc1 (Ava Chow) 65c6171 build: Bump to 25.2rc1 (Ava Chow) cf0f43e wallet: Fix use-after-free in WalletBatch::EraseRecords (MarcoFalke) 6acfc43 Use only Span{} constructor for byte-like types where possible (MarcoFalke) b40d107 util: Allow std::byte and char Span serialization (MarcoFalke) Pull request description: Backport: * #29176 * #27927 #29176 does not cleanly backport, and it also requires 27927 to work. Both are still fairly simple backports. Also does the rest of the version bump tasks for 25.2rc1. ACKs for top commit: fanquake: ACK 9f13dc1 Tree-SHA512: 9d9dbf415f8559410eba9a431b61a8fc94216898d2d1fd8398e1f7a22a04790faade810e65324c7a797456b33396c3a58f991e81319aaaa63d3ab441e5e20dbc
this is required to prevent the tests introduced in bitcoin#27927 from failing to compile because boost::has_left_shift<std::ostream,T>::value returns false for the std::byte specialization, resulting in a static assertion failure in Boost.Test this change was introduced in f7086fd in bitcoin#23497
this is required to prevent the tests introduced in bitcoin#27927 from failing to compile because boost::has_left_shift<std::ostream,T>::value returns false for the std::byte specialization, resulting in a static assertion failure in Boost.Test this change was introduced in f7086fd in bitcoin#23497
, bitcoin#21969, bitcoin#23653, bitcoin#23438, bitcoin#24190, bitcoin#24253, bitcoin#24231, bitcoin#26258, bitcoin#28012, partial bitcoin#25001, bitcoin#25296, bitcoin#23595, bitcoin#27927 (serialization updates) 2b26a87 merge bitcoin#28012: Allow FastRandomContext::randbytes for std::byte, Allow std::byte serialization (Kittywhiskers Van Gogh) 4eeafa2 partial bitcoin#27927: Allow std::byte and char Span serialization (Kittywhiskers Van Gogh) cb2fa83 test: place the std::ostream operator<< definition in namespace std (Kittywhiskers Van Gogh) cf4522f partial bitcoin#23595: Add ParseHex<std::byte>() helper (Kittywhiskers Van Gogh) e4091aa partial bitcoin#25296: Add DataStream without ser-type and ser-version (Kittywhiskers Van Gogh) eab031a merge bitcoin#26258: Remove unused CDataStream::rdbuf method (Kittywhiskers Van Gogh) 95b5850 partial bitcoin#25001: Modernize util/strencodings and util/string: string_view and optional (Kittywhiskers Van Gogh) 5fe72bb merge bitcoin#24231: Fix read-past-the-end and integer overflows (Kittywhiskers Van Gogh) 24af372 merge bitcoin#24253: Remove broken and unused CDataStream methods (Kittywhiskers Van Gogh) baf8dd6 merge bitcoin#24190: Fix sanitizer suppresions in streams_tests (Kittywhiskers Van Gogh) e933d78 merge bitcoin#23438: Use spans of std::byte in serialize (Kittywhiskers Van Gogh) d3b2822 merge bitcoin#23653: Generalize/simplify VectorReader into SpanReader (Kittywhiskers Van Gogh) 2c32a09 merge bitcoin#21969: Switch serialize to uint8_t (Kittywhiskers Van Gogh) 0a08dbf merge bitcoin#21824: Replace deprecated char with uint8_t in serialization (Kittywhiskers Van Gogh) d0b4e56 merge bitcoin#21966: Remove double serialization; use software encoder for fee estimation (Kittywhiskers Van Gogh) 1d6aafe merge bitcoin#21817: Replace &foo[0] with foo.data() (Kittywhiskers Van Gogh) d9a8ce2 trivial: move GetSerializeSize away from Stream (Un)serialize functions (Kittywhiskers Van Gogh) Pull request description: ## Additional Information * Dependency for #5902 * [bitcoin#24231](bitcoin#24231) is merged after [bitcoin#24253](bitcoin#24253) due to a MinGW bug ([comment](bitcoin#24231 (comment))) * [bitcoin#25001](bitcoin#25001) is listed as unmerged despite being committed upstream as bitcoin@132d5f8 * [bitcoin#25296](bitcoin#25296) is listed as unmerged despite being committed upstream as bitcoin@79e007d * [bitcoin#21966](bitcoin#21966) was partially backported in [dash#4197](#4197) as f946c68, including only 2be4cd9. The excluded commits have been backported, marking the pull request as fully merged. * [bitcoin#23438](bitcoin#23438) was partially backported in [dash#5574](#5574) as de54b87, including only fa65bbf. The excluded commits have been backported, marking the pull request as fully merged. * [bitcoin#27927](bitcoin#27927) opened a fresh can of hell thanks to being (possibly?) the first pull request to include `std::byte` `BOOST_CHECK`'s to the unit test suite. For reasons still unbeknownst to me, it refused to compile, despite being perfectly happy when checked-out as a commit directly and built as-is from upstream. The compile error was like this (edited for brevity): ``` CXX test/test_dash-serialize_tests.o In file included from test/serialize_tests.cpp:13: In file included from /src/dash/depends/x86_64-pc-linux-gnu/include/boost/test/unit_test.hpp:18: In file included from /src/dash/depends/x86_64-pc-linux-gnu/include/boost/test/test_tools.hpp:46: In file included from /src/dash/depends/x86_64-pc-linux-gnu/include/boost/test/tools/old/impl.hpp:24: /src/dash/depends/x86_64-pc-linux-gnu/include/boost/test/tools/detail/print_helper.hpp:53:13: error: static assertion failed due to requirement 'boost::has_left_shift<std::basic_ostream<char, std::char_traits<char>>, std::byte, boost::binary_op_detail::dont_care>::value': Type has to implement operator<< to be printable BOOST_STATIC_ASSERT_MSG( (boost::has_left_shift<std::ostream,T>::value), ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ [...] <scratch space>:206:1: note: expanded from here BOOST_PP_REPEAT_1 ^ test/serialize_tests.cpp:347:9: note: in instantiation of function template specialization 'boost::test_tools::tt_detail::check_frwd<boost::test_tools::tt_detail::equal_impl_frwd, std::byte, std::byte>' requested here BOOST_CHECK_EQUAL(out.at(0), std::byte{'a'}); ^ [...] In file included from test/serialize_tests.cpp:13: In file included from /src/dash/depends/x86_64-pc-linux-gnu/include/boost/test/unit_test.hpp:18: In file included from /src/dash/depends/x86_64-pc-linux-gnu/include/boost/test/test_tools.hpp:46: In file included from /src/dash/depends/x86_64-pc-linux-gnu/include/boost/test/tools/old/impl.hpp:24: /src/dash/depends/x86_64-pc-linux-gnu/include/boost/test/tools/detail/print_helper.hpp:55:18: error: invalid operands to binary expression ('std::ostream' (aka 'basic_ostream<char>') and 'const std::byte') ostr << t; ~~~~ ^ ~ /usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/cstddef:130:5: note: candidate function template not viable: no known conversion from 'std::ostream' (aka 'basic_ostream<char>') to 'byte' for 1st argument operator<<(byte __b, _IntegerType __shift) noexcept ^ [...] 5 warnings and 2 errors generated. make[2]: *** [Makefile:17842: test/test_dash-serialize_tests.o] Error 1 make[2]: *** Waiting for unfinished jobs.... make[2]: Leaving directory '/src/dash/src' make[1]: *** [Makefile:18525: all-recursive] Error 1 make[1]: Leaving directory '/src/dash/src' make: *** [Makefile:802: all-recursive] Error 1 ``` * No such error was present on Bitcoin. It is true, that no `std::ostream& operator<<(std::ostream& a, const std::byte& b)` is present by default but attempting to `grep` for any specializations didn't show anything _relevant_ that Dash didn't have. Searching on GitHub didn't help either. * Then, assuming that perhaps Boost's assertion logic may have changed, upgraded the version of Boost to match the pull request at the time, Boost 1.81. That also did not do anything (and actually, caused a trailing slashes unit test to fail but doesn't cause any problem in Bitcoin because they got rid of their `boost::filesystem` usage by then). * If that isn't it, then let's try building Bitcoin with Dash's depends. It built successfully, ran successfully. The problem isn't in the dependency, it's in the codebase. * Since it seemed to be `std::byte` related, pull requests that are related to `std::byte` serialization were backported and `std::byte` serialization related changes needed for [dash#5902](#5902) were cherry-picked. That's why this pull request came to be. But it didn't help this particular issue (though it did smooth out the cherry-picks). * Running out of ideas, `gdb` is used to step through `serialize_tests`'s `class_methods` and understand why `BOOST_CHECK_EQUAL(out.at(0), std::byte{'a'});` is valid in Bitcoin but not Dash by finding the elusive `operator<<`. This is where things go from bad to worse. Turns out, when you build with `clang`, `gdb` loses the ability to do breakpoints by file. So, an attempt is made to use `lldb` (which btw, is called `lldb-16`, running `lldb` with yield an error if you're using the develop container) and it refuses to work, erroring `personality set failed: Operation not permitted`. Turns out, the [`docker-compose.yml`](https://github.com/dashpay/dash/blob/37f43e4e56de0d510110a7f790df34ea77748dc9/contrib/containers/develop/docker-compose.yml) needs the following additions: ``` cap_add: - SYS_PTRACE security_opt: - "seccomp:unconfined" ``` After making these changes, `lldb` works and then we resume trying to find `operator<<`. After too many hours and nimbly alternating between `next` and `step`, tried making a return to `gdb` (compiling with `gcc` this time with the appropriate `CXXFLAGS`) hoping for different results and a while later, realized that it cannot step through Boost's headers (it doesn't recognize the filenames) and then recompile it with `clang` and return to `lldb`. This was a wild goose chase. * After a lot of futile efforts to find the operator by stepping through `BOOST_CHECK_EQUAL`, a basic example addressing the static assertion (that a left shift operator must exist of `<type>` (here `std::byte`) for `std::ostream`) was added in ```c++ std::ostream& ostr = std::cout; ostr << std::byte{'a'}; ``` ...and it compiled in both codebases. So the left shift that Boost is asserting doesn't exist does exist but it isn't being detected for some reason. Upon hovering the `<<`, VSCode highlighted the source of the definition as [`setup_common.h`](https://github.com/dashpay/dash/blob/96ac317c2714afcac1ff4e2df309f17149f42776/src/test/util/setup_common.h#L27-L32) thanks to the comment above it. Diffing between Bitcoin and Dash revealed the secret, the `operator<<` definition was placed under namespace `std` by [bitcoin#23497](bitcoin#23497) in bitcoin@f7086fd (see [change](https://github.com/bitcoin/bitcoin/blame/f7086fd8ff084ab0dd656d75b7485e59263bdfd8/src/test/util/setup_common.h#L29-L35)). That change has now been made in a separate commit. ## Breaking Changes Changes in serialization APIs will make backports predating [bitcoin#23438](bitcoin#23438) annoying but will _not_ change how data is stored on disk. ## Checklist: _Go over all the following points, and put an `x` in all the boxes that apply._ - [x] I have performed a self-review of my own code - [x] I have commented my code, particularly in hard-to-understand areas **(note: N/A)** - [x] I have added or updated relevant unit/integration/functional/e2e tests **(note: N/A)** - [x] I have made corresponding changes to the documentation **(note: N/A)** - [x] I have assigned this pull request to a milestone _(for repository code-owners and collaborators only)_ Top commit has no ACKs. Tree-SHA512: 1d7c67f5fe282f78e24cb720828e5f1f48b6926006b903c28399938532cc5c470c175b00c8b80e9662c4467c1201e09ae6d1cd9b95e8b20ace5e4410c72c472e
Github-Pull: bitcoin/bitcoin#27927 Rebased-From: fa257bc8312b91c2d281f48ca2500d9cba353cc5
This removes bloat that is not needed. Github-Pull: bitcoin/bitcoin#27927 Rebased-From: fa38d862358b87219b12bf31236c52f28d9fc5d6
Seems odd to require developers to cast all byte-like spans passed to serialization to
unsigned char
-spans. Fix that by passing and accepting byte-like spans as-is. Finally, add tests and update the code to use justSpan
where possible.