[llvm][dwarf][rfc][donotcommit] Enable print of ranges addresses from .debug_info.dwo #65516
… .debug_info.dwo

Summary:
For split DWARF, some of the sections remain in the main binary. For DWARF4 it's .debug_ranges and .debug_addr. For DWARF5 it's .debug_addr. When using llvm-dwarfdump on .dwo/.dwp files, this results in not being able to see which ranges and addresses for DW_AT_low_pc are used in DIEs, and in the output having "Error: " in it.

I added a new option, --main-binary=<binary>, that creates a link in DWARFContext between the DWO context and the main binary. This allows the tool to display addresses for DW_AT_ranges and DW_AT_low_pc.

Example (DWARF5):
DW_TAG_inlined_subroutine
  DW_AT_abstract_origin (0x00000b21 "flush_RL")
  DW_AT_ranges (indexed (0x0) rangelist = 0x00000054
     [0x0000000000403fe2, 0x0000000000403ff3)
     [0x0000000000403ff6, 0x0000000000403ffe))
DW_TAG_subprogram
  DW_AT_low_pc (0x0000000000403940)
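For readers unfamiliar with why the .dwo file alone cannot show these addresses: in split DWARF, an attribute such as DW_AT_low_pc stores only an index, and the address itself sits in the main binary's .debug_addr section. The following is a minimal sketch of that lookup, not the patch's implementation. `resolveAddrIndex` and its parameters are hypothetical names, the section is assumed to be little-endian, and `AddrBase` is assumed to be the value of DW_AT_addr_base (DW_AT_GNU_addr_base in DWARF4 split DWARF) taken from the skeleton CU.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper, not an LLVM API: return the Index-th address of the
// .debug_addr contribution that starts at AddrBase. Entries are assumed to be
// little-endian and AddrSize bytes wide.
std::optional<uint64_t> resolveAddrIndex(const std::vector<uint8_t> &DebugAddr,
                                         uint64_t AddrBase, uint64_t Index,
                                         uint8_t AddrSize = 8) {
  assert(AddrSize >= 1 && AddrSize <= 8 && "unsupported address size");
  const uint64_t Size = DebugAddr.size();
  if (AddrBase > Size)
    return std::nullopt; // base points past the end of the section
  if (Index >= (Size - AddrBase) / AddrSize)
    return std::nullopt; // index is out of bounds for this contribution
  const uint64_t Offset = AddrBase + Index * AddrSize;
  uint64_t Value = 0;
  for (unsigned I = 0; I < AddrSize; ++I)
    Value |= uint64_t(DebugAddr[Offset + I]) << (8 * I);
  return Value;
}
```

Without the main binary there is no .debug_addr to index into, which is why dumping a .dwo file on its own can only report the index or an error.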
If I have to specify the dwo and the .o simultaneously, why would I do that instead of just running dwarfdump on the .o (which will follow links to the .dwo)?
This is also missing a test.
@@ -125,6 +125,12 @@ class DWARFContext : public DIContext {
DWARFUnitVector &getDWOUnits(bool Lazy = false);

std::unique_ptr<const DWARFObject> DObj;
/// Can be optionally set by tools that work with .dwo/.dwp files to reference
/// main binary debug information. Usefull for accessing .debug_ranges and
Useful
@@ -1004,21 +1006,47 @@ void DWARFContext::dump(
DObj->getAbbrevDWOSection()))
getDebugAbbrevDWO()->dump(OS);

std::unordered_map<uint64_t, DWARFUnit *> DwoIDToDUMap;
We usually use llvm::DenseMap.
https://llvm.org/docs/CodingStandards.html#c-standard-library
Not sure what you mean. llvm-dwarfdump won't follow DW_AT_comp_dir + DW_AT_dwo_name to the .dwo file if you use it on the main ELF binary. The idea behind the patch is to be able to dump .dwo/.dwp files directly, with all the references to .debug_ranges (DWARF4) and .debug_addr resolved.
Forgot to copy a comment from the original phab review: I was planning on adding tests once it's clear this is not dead on arrival. :)
(It would be good to have a link to the original phab review, to make it clear what context is being carried over from there. I think I commented on the phab version of this.)
Yeah, currently if you dump just the binary, it doesn't dump the associated dwo/dwp files. But we could make that happen, and then it wouldn't need another input file argument (though it might still need another flag to say whether to do this deep dumping versus the current shallow behavior). That could also use the existing filter flags that dump only part of the input file, to avoid dumping, say, the whole dwp file or all the dwo files.
I don't feel /too/ strongly, but I do rather like the idea of relying on the existing paths/references in the format, rather than adding a new one.
Maybe I am missing something, but one downside of going from the main binary and using existing references (getNonSkeletonCU?) is that we will need to parse through all the CUs in the main binary to match the DWO ID and figure out for which CU we need to get the non-skeleton portion. In DWARF4 it's not part of the header either :( For pure split DWARF it's not that big of a deal, but for more complex builds where there are hundreds of megabytes of monolithic DWARF this can get annoying fast for the user.
Do you think I need to add anyone else to this review?
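The matching step described above can be sketched as a single pass that indexes the main binary's units by DWO ID, so that each .dwo unit is later paired with its skeleton in one lookup. This is only an illustration under stated assumptions: `SkeletonCU` and `indexByDwoId` are hypothetical names, not LLVM types, and std::unordered_map is used only to keep the sketch self-contained (the review above points to llvm::DenseMap for in-tree code). The cost the comment worries about is the pass itself, since in DWARF4 the DWO ID is an attribute on the unit DIE rather than a header field, so every unit has to be parsed at least that far.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified view of one unit in the main binary.
struct SkeletonCU {
  uint64_t Offset;               // offset of the unit in .debug_info
  std::optional<uint64_t> DwoId; // empty for ordinary (non-split) units
  uint64_t AddrBase;             // DW_AT_addr_base / DW_AT_GNU_addr_base
};

// Visit every unit once and index the skeleton units by DWO ID.
// The returned pointers refer into Units, which must outlive the map.
std::unordered_map<uint64_t, const SkeletonCU *>
indexByDwoId(const std::vector<SkeletonCU> &Units) {
  std::unordered_map<uint64_t, const SkeletonCU *> Map;
  Map.reserve(Units.size());
  for (const SkeletonCU &CU : Units)
    if (CU.DwoId)
      Map.emplace(*CU.DwoId, &CU); // emplace keeps the first unit per ID
  assert(Map.size() <= Units.size());
  return Map;
}
```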
Yeah, I was suggesting/thinking that the user could rely on What's the use case you're interested in, I guess? If you didn't have Split DWARF, what would you be doing with the DWARF/trying to find out? (might be helpful to build features that aren't Split DWARF-specific, but work with Split DWARF too)
Don't think so?
This is 100% split DWARF specific. The main usage model is to be able to dump .debug_info.dwo, either in full or for specific DIEs (with the usual optional parent/children flags), and see addresses and ranges. So, to bring the functionality available for monolithic DWARF to the split DWARF space.
I'm trying to ask about the higher level goal - what's the problem you were trying to address by dumping this information? Presumably you were looking to see if some piece of the DWARF encoded some data correctly, etc?
And what I'm asking is, if the binary hadn't been built with Split DWARF, what would be the right tools to help dump just the data you're interested in? (eg: if you were looking at a specific function and wanted to see how the address ranges of the function were emitted - the ability to search the DWARF by function name, using the index, or scoping it to a specific file could be useful - and that could be suitable with or without Split DWARF)
And if we had /that/ tool, and made it work with Split DWARF too, then we'd have a more unified mechanism for answering that sort of question.
Without split DWARF the correct tool would still have been llvm-dwarfdump. The goal is to be able to look at "random" DIEs, or all DIEs, and see the full debug information for each DIE. One concrete example is verifying that the debug information BOLT outputs is correct. This came up during internal verification of llvm-gsymutil. It was reporting an error that an address range was not in its parent.
At a high level: right now, when you look at the output of llvm-dwarfdump for the monolithic case, you can query a CU for all of its DIEs, or for a random DIE, and see full information that includes addresses and ranges. This functionality is missing when we enable split DWARF and try to output the contents of the .debug_info.dwo section. It requires the manual effort of figuring out the correct offset into .debug_ranges/.debug_addr from the CU, adding the correct index, etc.
What I would like is functionality in the existing tool, llvm-dwarfdump, that displays the same information for the monolithic case and for split DWARF. Preferably as fast as possible.
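To illustrate the kind of check behind the "address range was not in its parent" error, here is a minimal sketch of a containment test over resolved ranges. It is not llvm-gsymutil's implementation. `AddrRange` and `containedInParent` are hypothetical names, and the check is deliberately simplified: a child range must fit inside a single parent range, so a child spanning two adjacent parent ranges would be rejected.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Half-open address range [Low, High), as printed by llvm-dwarfdump.
struct AddrRange {
  uint64_t Low;
  uint64_t High;
};

// Hypothetical helper: true if every non-empty child range lies entirely
// inside one of the parent ranges.
bool containedInParent(const std::vector<AddrRange> &Child,
                       const std::vector<AddrRange> &Parent) {
  for (const AddrRange &C : Child) {
    assert(C.Low <= C.High && "malformed range");
    if (C.Low == C.High)
      continue; // an empty range covers no addresses
    bool Found = false;
    for (const AddrRange &P : Parent) {
      if (P.Low <= C.Low && C.High <= P.High) {
        Found = true;
        break;
      }
    }
    if (!Found)
      return false;
  }
  return true;
}
```

A check like this can only be done by hand on a .dwo dump once the indexed addresses and range lists have been resolved, which is the manual effort described above.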
I posted the PR for doing it the other way where we specify the main binary:
@dwblaikie Should I close this, or WDYT?
Yeah - sorry about this. I know the usability tradeoff (dwo->exe, exe->dwo) isn't super smooth either way, but yeah - let's continue over on the other review.
Original phab review: https://reviews.llvm.org/D159374