Skip to content

[BOLT][DWARF] Add support for .debug_names #81062

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 17 commits into from
Feb 26, 2024
Merged

Conversation

ayermolo
Copy link
Contributor

@ayermolo ayermolo commented Feb 8, 2024

DWARF5 spec supports the .debug_names acceleration table. This is the formalized version of combination of gdb-index/pubnames/types. Added implementation of it to BOLT. It supports both monolothic and split dwarf, with and without Type Units. It does not include parent indices. This will be in followup PR. Unlike LLVM output this will put all the CUs and TUs into one Module.

Copy link
Member

@dcci dcci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial run of comments.

@@ -378,11 +377,13 @@ getUnitForOffset(DIEBuilder &Builder, DWARFContext &DWCtx,
return nullptr;
}

uint32_t DIEBuilder::computeDIEOffset(const DWARFUnit &CU, DIE &Die,
uint32_t DIEBuilder::computeDIEOffset(DWARFUnit &CU, DIE &Die,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the method computes DIE offset, it's better to keep the CU parameter a constant ref, and do the modifications outside of this method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed const because DWARF5AcceleratorTable::addUnit invokes StrOffsetsWriter.initialize(Unit); which invokes DebugStrOffsetsWriter::initialize, which invokes Unit.getStringOffsetsTableContribution() that is non const. Because it invokes extractDIEsIfNeeded.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm strongly in favor of keeping CU a const ref since semantically computeDIEOffset should not change the input CU (or DIE). Please make modifications to the CU outside of this method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not making changes to CU. I am invoking none const APIs from it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you then add a const API so you don't have to drop const qualifier? As a separate diff.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue is

 const std::optional<StrOffsetsContributionDescriptor> &
  getStringOffsetsTableContribution() {
    extractDIEsIfNeeded(true /*CUDIeOnly*/);
    return StringOffsetsTableContribution;
  }

It parses debug information and sets StringOffsetsTableContribution first time it's called.

I can add

  const std::optional<StrOffsetsContributionDescriptor> &
  getStringOffsetsTableContribution() const {
    assert(StringOffsetsTableContribution);
    return StringOffsetsTableContribution;
  } 

But then implementation will rely on extractDIEsIfNeeded being invoked for this cu at some point. Probably will, but this seems flaky to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried with change on an internal service of reasonable size. No issues, but having basically the same API with quite different behavior based on const seems like just asking for trouble.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope you understand why I'm raising this issue – that if the method is called computeDIEOffset, it's reasonable that CU is passed as a const ref.

Dropping the const qualifier is a symptom of a larger issue: that a parameter that's not supposed to be modified starts to be modified. I understand that it's not possible/practical to add a const-qualified method, that's ok. I suggested moving the modifications to CU/calling non-const stuff outside of this method – what do you think about that?
As a final resort, you can rename the method to computeDIEOffsetAndAddAccelTableEntry and drop const.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point. The reason I put adding entries here to avoid traversing all these CUs/DIEs again and wasting resources.
I can get StringOffsetsTableContribution before the call, but I don't think it fundamentally addresses your concern. Why are we passing StrOffsetsContributionDescriptor into computeDIEOffset?

So I think the option you mentioned of renaming to something else would be the way to go. Maybe something more generic processAndFinalizeDIEs? In case down the road we add more stuff here. For example once ability to delete dies is added I think computing GDB Index will need to move out of main updatedebuginfo.

Copy link
Member

@dcci dcci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM if Amir's happy with it.

ayermolo added a commit to ayermolo/llvm-project that referenced this pull request Feb 21, 2024
Changed isDWO to a function that checks Skeleton CU that is passed in. This is
for preparation for llvm#81062.
ayermolo added a commit to ayermolo/llvm-project that referenced this pull request Feb 21, 2024
Refactored clearning data structures in DebugStrOffsetsWriter into clear()
function and made initialize() public. This is for
llvm#81062.
ayermolo added a commit that referenced this pull request Feb 22, 2024
Changed isDWO to a function that checks Skeleton CU that is passed in.
This is for preparation for
#81062.
ayermolo added a commit to ayermolo/llvm-project that referenced this pull request Feb 22, 2024
Refactored clearning data structures in DebugStrOffsetsWriter into clear()
function and made initialize() public. This is for
llvm#81062.
ayermolo added a commit that referenced this pull request Feb 22, 2024
Refactored cod that clears data-structures in DebugStrOffsetsWriter into
clear() function and made initialize() public. This is for
#81062.
DWARF5 spec supports the .debug_names acceleration table. This is the formalized
version of combination of gdb-index/pubnames/types. Added implementation of it
to BOLT. It supports both monolothic and split dwarf, with and without Type
Units. It does not include parent indices. This will be in followup PR. Unlike
LLVM output this will put all the CUs and TUs into one Module.
@ayermolo
Copy link
Contributor Author

ayermolo commented Feb 23, 2024

Stats collected on internal binary, monolithic dwarf, with descent size .debug_names section (bolt generated)
[70] .debug_names PROGBITS 0000000000000000 1b03a2b06 877d092 00 0 0 1

base:
9:12.18 real, 642.61 user, 186.98 sys, 0 amem, 78477316 mmem
.debug_names before string offset fix:
11:02.32 real, 742.95 user, 188.14 sys, 0 amem, 82591356 mmem
.debug_names after string offset refactor (no longer using DebugStrOffsetsWriter)
9:42.25 real, 668.48 user, 188.72 sys, 0 amem, 82923700 mmem

Copy link
Contributor

@aaupov aaupov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for working on debug_names support. LGTM but I'd wait for Maksim's comments if he has any.

Copy link
Contributor

@maksfb maksfb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason for creating DebugNames under Core and not under Rewrite?

@ayermolo
Copy link
Contributor Author

Any reason for creating DebugNames under Core and not under Rewrite?

Any reason for creating DebugNames under Core and not under Rewrite?

Well it's not a re-writer it's a supporting data structure. I viewed it similar to DIEBuilder, DebugData

@maksfb
Copy link
Contributor

maksfb commented Feb 26, 2024

Any reason for creating DebugNames under Core and not under Rewrite?

Well it's not a re-writer it's a supporting data structure. I viewed it similar to DIEBuilder, DebugData

We've discussed the file placement offline. We'll need a separate PR to move DebugNames out of \Core as Alexander ran into a circular dependency issue while trying to do so.

@ayermolo ayermolo merged commit 6de5fcc into llvm:main Feb 26, 2024
@ayermolo ayermolo deleted the debugNamesSupport branch February 26, 2024 22:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants