Skip to content

Turn off instruction flow control annotations by default (#84607) #8372

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

jasonmolenda
Copy link

Walter Erquinigo added optional instruction annotations for x86 instructions in 2022 for the thread trace dump instruction command, and code to DisassemblerLLVMC to add annotations for instructions that change flow control, v. https://reviews.llvm.org/D128477

This was added as an option to disassemble, and the trace dump command enables it by default, but several other instruction dumpers were changed to display them by default as well. These are only implemented for Intel instructions, so our disassembly on other targets ends up looking like

(lldb) x/5i 0x1000086e4
0x1000086e4: 0xa9be6ffc   unknown     stp    x28, x27, [sp, #-0x20]!
0x1000086e8: 0xa9017bfd   unknown     stp    x29, x30, [sp, #0x10]
0x1000086ec: 0x910043fd   unknown     add    x29, sp, #0x10
0x1000086f0: 0xd11843ff   unknown     sub    sp, sp, #0x610
0x1000086f4: 0x910c63e8   unknown     add    x8, sp, #0x318

instead of disassemble's output style of

lldb`main:
lldb[0x1000086e4] <+0>:  stp    x28, x27, [sp, #-0x20]!
lldb[0x1000086e8] <+4>:  stp    x29, x30, [sp, #0x10]
lldb[0x1000086ec] <+8>:  add    x29, sp, #0x10
lldb[0x1000086f0] <+12>: sub    sp, sp, #0x610
lldb[0x1000086f4] <+16>: add    x8, sp, #0x318

Adding symbolic annotations for assembly instructions is something I'm interested in too, because we may have users investigating a crash or apparent-incorrect behavior who must debug optimized assembly and they may not be familiar with the ISA they're using, so short of flipping through a many-thousand-page PDF to understand each instruction, they're lost. They don't write assembly or work at that level, but to understand a bug, they have to understand what the instructions are actually doing.

But the annotations that exist today don't move us forward much on that front - I'd argue that the flow control instructions on Intel are not hard to understand from their names, but that might just be my personal bias. Much trickier instructions exist in any event.

Displaying this information by default for all targets when we only have one class of instructions on one target is not a good default.

Also, in 2011 when Greg implemented the memory read -f i (aka x/i) command

commit 5009f9d5010a7e34ae15f962dac8505ea11a8716
Author: Greg Clayton <[email protected]>
Date:   Thu Oct 27 17:55:14 2011 +0000
[...]
    eFormatInstruction will print out disassembly with bytes and it will use the
    current target's architecture. The format character for this is "i" (which
    used to be being used for the integer format, but the integer format also has
    "d", so we gave the "i" format to disassembly), the long format is
    "instruction".

he had DumpDataExtractor's DumpInstructions print the bytes of the instruction -- that's the first field we see above for the x/5i after the address -- and this is only useful for people who are debugging the disassembler itself, I would argue. I don't want this displayed by default either.

tl;dr this patch removes both fields from memory read -f -i and I think this is the right call today. While I'm really interested in instruction annotation, I don't think x/i is the right place to have it enabled by default unless it's really compelling on at least some of our major targets.

(cherry picked from commit bdbad0d)

Walter Erquinigo added optional instruction annotations for x86
instructions in 2022 for the `thread trace dump instruction` command,
and code to DisassemblerLLVMC to add annotations for instructions that
change flow control, v. https://reviews.llvm.org/D128477

This was added as an option to `disassemble`, and the trace dump command
enables it by default, but several other instruction dumpers were
changed to display them by default as well. These are only implemented
for Intel instructions, so our disassembly on other targets ends up
looking like

```
(lldb) x/5i 0x1000086e4
0x1000086e4: 0xa9be6ffc   unknown     stp    x28, x27, [sp, #-0x20]!
0x1000086e8: 0xa9017bfd   unknown     stp    x29, x30, [sp, #0x10]
0x1000086ec: 0x910043fd   unknown     add    x29, sp, #0x10
0x1000086f0: 0xd11843ff   unknown     sub    sp, sp, #0x610
0x1000086f4: 0x910c63e8   unknown     add    x8, sp, #0x318
```

instead of `disassemble`'s output style of

```
lldb`main:
lldb[0x1000086e4] <+0>:  stp    x28, x27, [sp, #-0x20]!
lldb[0x1000086e8] <+4>:  stp    x29, x30, [sp, #0x10]
lldb[0x1000086ec] <+8>:  add    x29, sp, #0x10
lldb[0x1000086f0] <+12>: sub    sp, sp, #0x610
lldb[0x1000086f4] <+16>: add    x8, sp, #0x318
```

Adding symbolic annotations for assembly instructions is something I'm
interested in too, because we may have users investigating a crash or
apparent-incorrect behavior who must debug optimized assembly and they
may not be familiar with the ISA they're using, so short of flipping
through a many-thousand-page PDF to understand each instruction, they're
lost. They don't write assembly or work at that level, but to understand
a bug, they have to understand what the instructions are actually doing.

But the annotations that exist today don't move us forward much on that
front - I'd argue that the flow control instructions on Intel are not
hard to understand from their names, but that might just be my personal
bias. Much trickier instructions exist in any event.

Displaying this information by default for all targets when we only have
one class of instructions on one target is not a good default.

Also, in 2011 when Greg implemented the `memory read -f i` (aka `x/i`)
command
```
commit 5009f9d
Author: Greg Clayton <[email protected]>
Date:   Thu Oct 27 17:55:14 2011 +0000
[...]
    eFormatInstruction will print out disassembly with bytes and it will use the
    current target's architecture. The format character for this is "i" (which
    used to be being used for the integer format, but the integer format also has
    "d", so we gave the "i" format to disassembly), the long format is
    "instruction".
```

he had DumpDataExtractor's DumpInstructions print the bytes of the
instruction -- that's the first field we see above for the `x/5i` after
the address -- and this is only useful for people who are debugging the
disassembler itself, I would argue. I don't want this displayed by
default either.

tl;dr this patch removes both fields from `memory read -f -i` and I
think this is the right call today. While I'm really interested in
instruction annotation, I don't think `x/i` is the right place to have
it enabled by default unless it's really compelling on at least some of
our major targets.

(cherry picked from commit bdbad0d)
@jasonmolenda
Copy link
Author

@swift-ci test

@jasonmolenda
Copy link
Author

@swift-ci test linux

1 similar comment
@jasonmolenda
Copy link
Author

@swift-ci test linux

@jasonmolenda
Copy link
Author

@swift-ci test macos

@jasonmolenda
Copy link
Author

@swift-ci test windows

@jasonmolenda jasonmolenda merged commit 36cbdac into swiftlang:stable/20230725 Mar 13, 2024
@jasonmolenda jasonmolenda deleted the cp/omit-flow-control-field-on-disassembly branch March 13, 2024 00:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant