Error symbolizing Go binaries #581
Comments
For reproduction any binary should do. Here is a simple hello world.
Then build the example:
Symbolizing the resulting binary should then fail to resolve the associated symbols.
Thanks for the report. I may only get to look at this issue later this week.
I am curious as to why the ELF resolver would have trouble with Go binaries. Feel free to reach out if I can help with repro / other cases.
This could be DWARF-related: passing --no-debug-syms allows us to get the symbol.
I think the root cause of the failure is the presence of compressed debug sections in the binary.
Interesting. Yes, we do not currently support compressed debug sections and Go seems to have enabled them by default from what I understand.
So I didn't have much time to look at it, but I suspect we need two things. First, support for dealing with compressed ELF headers. I think that should be reasonably straightforward to add. Then, presumably, we'd decompress the data on the fly while reading. That should also not be much work, though it could be a bit problematic that we currently rely on memory mapping everywhere, which could translate to large allocations. So ideally we'd rework those bits eventually to support more gradual/on-demand reads. What I haven't gotten to is what, if anything, needs to happen on the DWARF side of things. I've certainly come across special section names being used, but there could be more. Will have more time to look into it next week.
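For reference, the "compressed ELF header" step can be sketched as follows. In the ELF gABI, the data of a section flagged `SHF_COMPRESSED` starts with a compression header (`Elf64_Chdr` for 64-bit objects) that names the algorithm and the uncompressed size. The sketch below is not blazesym's code: the function and type names (`parse_chdr64_le`, `CompressionHeader`, `compression_name`) are made up for illustration, and it assumes a 64-bit little-endian object. Only the constants and the field layout come from the ELF specification.

```rust
/// Section flag marking a section whose contents are compressed (ELF gABI).
pub const SHF_COMPRESSED: u64 = 0x800;
/// `ch_type` value for zlib-compressed data.
pub const ELFCOMPRESS_ZLIB: u32 = 1;
/// `ch_type` value for zstd-compressed data.
pub const ELFCOMPRESS_ZSTD: u32 = 2;

/// The fields of an `Elf64_Chdr` that a reader needs (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub struct CompressionHeader {
    pub ch_type: u32,
    pub ch_size: u64,
    pub ch_addralign: u64,
}

fn read_u32_le(data: &[u8], off: usize) -> u32 {
    let mut buf = [0u8; 4];
    buf.copy_from_slice(&data[off..off + 4]);
    u32::from_le_bytes(buf)
}

fn read_u64_le(data: &[u8], off: usize) -> u64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&data[off..off + 8]);
    u64::from_le_bytes(buf)
}

/// Split the raw data of a `SHF_COMPRESSED` section into its header and the
/// compressed payload. Assumes a 64-bit little-endian object; returns `None`
/// if the data is too short to hold the 24-byte header.
pub fn parse_chdr64_le(data: &[u8]) -> Option<(CompressionHeader, &[u8])> {
    const CHDR_SIZE: usize = 24;
    if data.len() < CHDR_SIZE {
        return None;
    }
    let header = CompressionHeader {
        ch_type: read_u32_le(data, 0),
        // Bytes 4..8 hold `ch_reserved` and are skipped.
        ch_size: read_u64_le(data, 8),
        ch_addralign: read_u64_le(data, 16),
    };
    Some((header, &data[CHDR_SIZE..]))
}

/// Human-readable name for a `ch_type` value.
pub fn compression_name(ch_type: u32) -> &'static str {
    match ch_type {
        ELFCOMPRESS_ZLIB => "zlib",
        ELFCOMPRESS_ZSTD => "zstd",
        _ => "unknown",
    }
}
```

A reader would allocate `ch_size` bytes and hand the payload to a decompressor chosen by `ch_type`. That up-front allocation is where the concern about large allocations comes from.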
I would find the first approach of just decompressing on the fly very valuable. I think users can expect some overhead when they ask for debug information. On ideas to support more gradual reads, one thing I noticed was that when only asking for location information (without inlining), we seem to be paying the cost of more than just the debug_line section. I think the gimli DWARF loader just opens all of the DWARF it can find. I think I was seeing an 80 MB difference when symbolizing an Envoy binary with debug information (and requesting code locations, no inlining info). I think the gradual reads would be nice. Though the information for inlined functions is scattered around the debug info section (afaik), so we would still pay a significant cost when parsing it. That seems like a generally hard thing to optimize.
Thanks for the feedback! After looking into this some more, I believe that in the case of compression, there really is no (reasonably straightforward) way to have incremental reads. The reason is simple: you (generally) don't have random access on compressed data (that's perhaps a bit of a broad statement, as I understand there may be compression formats that allow for that to some extent or trickery based on implementation details (edit: potentially better reference) that effectively allows for doing so, but in the Rust realm most decompressors only implement
This is interesting information, thanks for sharing. I haven't looked that deeply below the hood of |
All that being said, I opened #590 which should add the necessary support. Feel free to give it a try. Right now it only supports zlib compression. Let me know if you need zstd as well and I should be able to add it quickly.
Add support for working with compressed ELF sections ([0] [1]). To the best of my knowledge, these are currently only used in conjunction with compressed DWARF information. This change only adds support for zlib-compressed sections. zstd is another algorithm supported by ELF and should be easy to add as a follow-up.

Decompression currently happens entirely at the ELF layer. This makes it a trivial addition. However, it means that we have to decompress the entire section data in advance. Doing so may be less controversial than one might think, though: generally there is no random access into compressed data, and while it may be possible to build something resembling random access for certain formats (e.g., [2]), none of the zlib Rust crates evaluated seemed to sport random-access support in any shape or form. In general, if we wanted to implement a more gradual approach, where we decompress only the blocks containing data of interest, we would need to make far-reaching changes to the DWARF parsing code (because right now the parsing code works on a raw slice of bytes, and once that is no longer the case data ownership will get tricky quickly), in addition to requiring the aforementioned lower-level support for random access at the compression layer.

Additional notes:
- we do not support the legacy "GNU" zlib compression format (in whatever way it is different; documentation seems to be even more sparse than for other compression bits)
- we do not support dealing with renamed sections (e.g., .zdebug); as per my understanding section renaming has been superseded by usage of the SHF_COMPRESSED flag and no contemporary toolchain should do that anymore (check for instance [3])
- we currently use the miniz_oxide crate for zlib support; it is also used by the slightly higher level flate2 crate, which is maintained by the Rust team
- we could consider switching to something else if need be, e.g., zip (which effectively is one of our dev-dependencies at this point) also seems to support zlib

[0] https://man.freebsd.org/cgi/man.cgi?elf(5)
[1] https://maskray.me/blog/2022-01-23-compressed-debug-sections
[2] https://github.com/madler/zlib/blob/51b7f2abdade71cd9bb0e7a373ef2610ec6f9daf/examples/zran.h
[3] golang/go#50796

Closes: #581
Signed-off-by: Daniel Müller <deso@posteo.net>
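The distinction drawn in the notes above, between sections marked with the SHF_COMPRESSED flag and legacy renamed `.zdebug` sections, could be checked along these lines. This is only an illustrative sketch, not the code from the change: `SectionEncoding` and `classify_section` are hypothetical names, and the only facts assumed are the `SHF_COMPRESSED` flag value (0x800) and the `.zdebug` name prefix of the legacy scheme.

```rust
/// Section flag marking a section whose contents are compressed (ELF gABI).
pub const SHF_COMPRESSED: u64 = 0x800;

/// How a section's contents are stored (hypothetical type for this sketch).
#[derive(Debug, PartialEq, Eq)]
pub enum SectionEncoding {
    /// Plain data that can be used directly.
    Uncompressed,
    /// Data preceded by a compression header, signalled via `SHF_COMPRESSED`.
    ShfCompressed,
    /// Legacy GNU scheme, signalled by a `.zdebug` section name prefix.
    LegacyGnuZdebug,
}

/// Classify a section from its name and its `sh_flags` value.
pub fn classify_section(name: &str, sh_flags: u64) -> SectionEncoding {
    if sh_flags & SHF_COMPRESSED != 0 {
        SectionEncoding::ShfCompressed
    } else if name.starts_with(".zdebug") {
        SectionEncoding::LegacyGnuZdebug
    } else {
        SectionEncoding::Uncompressed
    }
}
```

Under the limitations listed in the notes, only the `ShfCompressed` case would be decompressed. A reader could report the `LegacyGnuZdebug` case as unsupported instead of misparsing the data.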
Just wanted to say thank you for opening the issue and @d-e-s-o for very quickly implementing the feature! Been using blazesym in a new project and so far things have been super smooth. Go symbolization was the only thing I found missing!
Description
I have hit some issues while symbolizing Go binaries. Here is an example I could reproduce with several Go binaries:
You can build a quick hello world program to reproduce the issue.