Skip to content

Fix cas-friendly line table emission in llvm #6718

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

rastogishubham
Copy link

When using the option -mllvm -cas-friendly-debug-info, we expect the line table to be emitted in such a way, that it is split-able in the CAS by every function's line table contribution. This was achieved by emitting a DW_LNE_end_sequence at the end of the function's contribution to the line table. This however, had some issues.

Namely, there was an extra DW_LNE_end_sequence being emitted at the end of the line table.

There was also the case where because of DW_LNE_end_sequence denoting a reset of the state machine, unnecessary DW_LNS_set_file directives were being emitted which was preventing deduplication.

These issues have been addressed along with changes to associated tests.

@rastogishubham
Copy link
Author

@swift-ci please test


int64_t LineDelta = static_cast<int64_t>(LineEntry.getLine()) - LastLine;

if (FileNum != LineEntry.getFileNum() || LineEntry.IsEndOfFunction) {
if (FileNum != LineEntry.getFileNum()) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like there could be some NFC refactoring done upstream first? Such as moving the LineEntry.IsEndOfFunction higher up? Might make maintaining the patch easier.

@rastogishubham rastogishubham force-pushed the LineTableFix branch 2 times, most recently from 0162d6a to 7a9f612 Compare April 26, 2023 17:29
@rastogishubham
Copy link
Author

@swift-ci please test

@rastogishubham
Copy link
Author

Unfortunately there is a problem with the CAS creation of the debug_line section when this PR is applied.

Before this PR was applied, the size of the debug_line blocks were

mc:debug_line 196,232 2.35% 8,211,642 1.84% 0 0.00% 30,164,670 0.24% 36,444,094 0.22%
mc:debug_line_distinct_data 5,938 0.07% 11,119 0.00% 0 0.00% 137,299,052 1.09% 137,489,068 0.84%
mc:debug_line_section 11,119 0.13% 17,990 0.00% 8,233,880 1.85% 147,747,649 1.18% 213,974,497 1.30%

After this patch was applied the numbers are:

mc:debug_line 214,869 2.57% 8,832,044 1.98% 0 0.00% 31,359,842 0.25% 38,235,650 0.23%
mc:debug_line_distinct_data 5,938 0.07% 11,128 0.00% 0 0.00% 132,081,462 1.05% 132,271,478 0.81%
mc:debug_line_section 11,128 0.13% 17,990 0.00% 8,854,300 1.99% 147,427,970 1.17% 218,618,466 1.33%

The first step was to determine what exactly was wrong, this patch has two changes, one change removes the extra end_sequences at the end of the line table and the other one doesn’t reset the FileNum when the line table is being emitted so that there are no extra DW_LNS_set_line opcodes emitted for functions after the first function for the same header file.

To quickly diagnose what was wrong, I put up two draft PRs:

#6734 which re-adds the extra end_sequence in the line table whose build results look like

mc:debug_line 214,870 2.57% 8,842,479 1.98% 0 0.00% 31,359,849 0.25% 38,235,689 0.23%
mc:debug_line_distinct_data 5,938 0.07% 11,128 0.00% 0 0.00% 132,296,958 1.05% 132,486,974 0.81%
mc:debug_line_section 11,128 0.13% 17,990 0.00% 8,864,735 1.99% 147,876,502 1.18% 219,150,478 1.34%

and

#6735 which does not restore the FileNum when generating the line table, thereby ensuring that if a function belongs in a file number that isn’t 1, a DW_LNS_set_file will be emitted. Whose build results look like

mc:debug_line 196,367 2.35% 8,203,528 1.84% 0 0.00% 30,689,270 0.24% 36,973,014 0.23%
mc:debug_line_distinct_data 5,938 0.07% 11,127 0.00% 0 0.00% 137,101,370 1.09% 137,291,386 0.84%
mc:debug_line_section 11,127 0.13% 17,990 0.00% 8,225,782 1.85% 147,388,204 1.17% 213,550,524 1.30%

We can see that re-adding the extra end_sequence doesn’t do any improvement, but re-adding the extra DW_LNS_set_file makes the numbers look fairly close to the base case without the PR #6718

@rastogishubham
Copy link
Author

@swift-ci please test

@rastogishubham
Copy link
Author

@swift-ci please test

@rastogishubham
Copy link
Author

mc:debug_line 194,623 2.33% 8,204,512 1.84% 0 0.00% 30,547,505 0.24% 36,775,441 0.22%
mc:debug_line_distinct_data 5,938 0.07% 11,128 0.00% 0 0.00% 137,107,333 1.09% 137,297,349 0.84%
mc:debug_line_section 11,128 0.13% 17,990 0.00% 8,226,768 1.85% 147,403,166 1.17% 213,573,406 1.30%

We can see that with the latest change that forces the DW_LNS_set_file to be emitted for every line table contribution, deduplication is the highest, but the overall size is marginally higher, this is because we are emitting an extra byte for the opcode DW_LNS_set_file, unfortunately there is no way to ensure a correct line table is emitted, that is debuggable and also maximises deduplication


int64_t LineDelta = static_cast<int64_t>(LineEntry.getLine()) - LastLine;

if (FileNum != LineEntry.getFileNum() || LineEntry.IsEndOfFunction) {
Copy link

@felipepiovezan felipepiovezan May 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused about the previous code: was this || LineEntry.IsEndOfFunction ever relevant? Because right above we continue for this state 🤔

But I see now that most removals here are actually restoring the code to its upstream state

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that part was not needed, and I have removed it. I am sorry for the confusion on that.

Copy link

@felipepiovezan felipepiovezan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but please address the small comment about the empty line!

When using the option -mllvm -cas-friendly-debug-info, we expect the
line table to be emitted in such a way, that it is split-able in the CAS
by every function's line table contribution. This was achieved by
emitting a DW_LNE_end_sequence at the end of the function's contribution
to the line table. This however, had some issues.

Namely, there was an extra DW_LNE_end_sequence being emitted at the end
of the line table.

There was also the case where because of DW_LNE_end_sequence denoting
a reset of the state machine. If the file number of the next line table
is 1, then a new DW_LNS_set_file was not emitted, to fix this and
maximize deduplication, a DW_LNS_set_file is always emitted after a
DW_LNE_end_sequence.

These issues have been addressed along with changes to associated tests.
@rastogishubham
Copy link
Author

@swift-ci please test

@rastogishubham rastogishubham merged commit 30f24ba into swiftlang:experimental/cas/main May 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants