Skip to content

[XRay][X86] Handle conditional calls when lowering patchable tail calls #89364

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 28, 2024

Conversation

rickyz
Copy link
Member

@rickyz rickyz commented Apr 19, 2024

xray instruments tail call function exits by inserting a nop sled before
the tail call. When tracing is enabled, the nop sled is replaced with a
call to __xray_FunctionTailExit(). This currently does not work for
conditional tail calls, as the instrumentation assumes that the tail
call will be unconditional. This causes two issues:

  • __xray_FunctionTailExit() is inappropately called even when the
    tail call is not taken.
  • __xray_FunctionTailExit()'s prologue/epilogue adjusts the stack
    pointer with add/sub instructions. This clobbers condition flags,
    which can flip the condition used for the tail call, leading to
    incorrect program behavior.

Fix this by rewriting conditional calls when lowering patchable tail
calls.

With this change, a conditional patchable tail call like:

  je target

Will be lowered to:

  jne .fallthrough
  .p2align 1, ..
.Lxray_sled_N:
  SLED_CODE
  jmp target
.fallthrough:

@llvmbot
Copy link
Member

llvmbot commented Apr 19, 2024

@llvm/pr-subscribers-xray

@llvm/pr-subscribers-backend-x86

Author: Ricky Zhou (rickyz)

Changes

xray instruments tail call function exits by inserting a nop sled before
the tail call. When tracing is enabled, the nop sled is replaced with a
call to __xray_FunctionTailExit(). This currently does not work for
conditional tail calls, as the instrumentation assumes that the tail
call will be unconditional. This causes two issues:

  • __xray_FunctionTailExit() is inappropately called even when the
    tail call is not taken.
  • __xray_FunctionTailExit()'s prologue/epilogue adjusts the stack
    pointer with add/sub instructions. This clobbers condition flags,
    which can flip the condition used for the tail call, leading to
    incorrect program behavior.

Avoid this misbehavior by disabling instrumentation of conditional tail
calls. A more comprehensive fix could be to either rewrite conditional
tail calls to non-conditional ones, or to create specialized function
tail exit instrumentation functions for each of the possible types of
conditional tail calls.


Full diff: https://github.com/llvm/llvm-project/pull/89364.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/XRayInstrumentation.cpp (+2-2)
  • (added) llvm/test/CodeGen/X86/xray-conditional-tail-call.ll (+28)
diff --git a/llvm/lib/CodeGen/XRayInstrumentation.cpp b/llvm/lib/CodeGen/XRayInstrumentation.cpp
index d40725838c943c..e5791ec12d3e5f 100644
--- a/llvm/lib/CodeGen/XRayInstrumentation.cpp
+++ b/llvm/lib/CodeGen/XRayInstrumentation.cpp
@@ -100,7 +100,7 @@ void XRayInstrumentation::replaceRetWithPatchableRet(
         //   PATCHABLE_RET <Opcode>, <Operand>...
         Opc = TargetOpcode::PATCHABLE_RET;
       }
-      if (TII->isTailCall(T) && op.HandleTailcall) {
+      if (TII->isTailCall(T) && T.isBarrier() && op.HandleTailcall) {
         // Treat the tail call as a return instruction, which has a
         // different-looking sled than the normal return case.
         Opc = TargetOpcode::PATCHABLE_TAIL_CALL;
@@ -131,7 +131,7 @@ void XRayInstrumentation::prependRetWithPatchableExit(
           (op.HandleAllReturns || T.getOpcode() == TII->getReturnOpcode())) {
         Opc = TargetOpcode::PATCHABLE_FUNCTION_EXIT;
       }
-      if (TII->isTailCall(T) && op.HandleTailcall) {
+      if (TII->isTailCall(T) && T.isBarrier() && op.HandleTailcall) {
         Opc = TargetOpcode::PATCHABLE_TAIL_CALL;
       }
       if (Opc != 0) {
diff --git a/llvm/test/CodeGen/X86/xray-conditional-tail-call.ll b/llvm/test/CodeGen/X86/xray-conditional-tail-call.ll
new file mode 100644
index 00000000000000..270e4da3ce50bd
--- /dev/null
+++ b/llvm/test/CodeGen/X86/xray-conditional-tail-call.ll
@@ -0,0 +1,28 @@
+; RUN: llc -mtriple=x86_64 < %s | FileCheck %s
+
+declare void @tail_call_target()
+
+define void @conditional_tail_call(i32 %cond) "function-instrument"="xray-always" nounwind {
+  ; CHECK-LABEL: conditional_tail_call:
+  ; CHECK-NEXT:  .Lfunc_begin0:
+  ; CHECK-NEXT:  # %bb.0:
+  ; CHECK-NEXT:    .p2align 1, 0x90
+  ; CHECK-NEXT:  .Lxray_sled_0:
+  ; CHECK-NEXT:    .ascii "\353\t"
+  ; CHECK-NEXT:    nopw 512(%rax,%rax)
+  ; CHECK-NEXT:    testl %edi, %edi
+  ; CHECK-NEXT:    jne tail_call_target@PLT # TAILCALL
+  ; CHECK-NEXT:  # %bb.1:
+  ; CHECK-NEXT:    .p2align 1, 0x90
+  ; CHECK-NEXT:  .Lxray_sled_1:
+  ; CHECK-NEXT:    retq
+  ; CHECK-NEXT:    nopw %cs:512(%rax,%rax)
+  ; CHECK-NEXT:  .Lfunc_end0:
+  %cmp = icmp ne i32 %cond, 0
+  br i1 %cmp, label %docall, label %ret
+docall:
+  tail call void @tail_call_target()
+  ret void
+ret:
+  ret void
+}

@rickyz rickyz changed the title xray: Do not emit tail exit hooks for conditional tail calls [xray] Do not emit tail exit hooks for conditional tail calls Apr 19, 2024
@rickyz rickyz force-pushed the xray_conditional_tail_call branch from 6586fcf to 0e2970b Compare April 19, 2024 10:32
@rickyz
Copy link
Member Author

rickyz commented Apr 19, 2024

@llvm/pr-subscribers-xray since the paths at https://github.com/orgs/llvm/teams/pr-subscribers-xray don't seem to match this PR.

rickyz added a commit to rickyz/llvm-project that referenced this pull request Apr 19, 2024
Previously, some xray trampolines would modify condition codes (before
saving/after restoring flags) due to stack alignment instructions, which
use add/sub.

I am not aware of issues that this causes in practice (outside of the
situation described in llvm#89364,
which is only problematic due to a different bug). Nevertheless, it
seems nicer and less error-prone for xray instrumentation to be as
unobstrusive/preserve as much state as possible.
@toddlipcon
Copy link

If the tail call has no instrumentation at all, doesn't the xray-tracked stack end up messed up? ie we think the tail call is nested inside the main call instead of being a sibling, and then potentially never trace an exit for the main call?

@rickyz rickyz force-pushed the xray_conditional_tail_call branch from 0e2970b to 64111ef Compare April 20, 2024 00:14
@rickyz
Copy link
Member Author

rickyz commented Apr 20, 2024

Ah, I was hoping that the optionality of

meant that it was OK to omit trace records, but it does seem like it would lead to inaccurate traces.

Thinking about this more, it should actually be straightforward to rewrite conditional tail calls when lowering them - I've updated this change to do that.

@rickyz
Copy link
Member Author

rickyz commented Apr 20, 2024

By the way, I'm not sure whether this bug happens on other architectures with xray support. The most straightforward reference I found to emitting conditional tail calls was

if (TII->canMakeTailCallConditional(PredCond, TailCall)) {
and canMakeTailCallConditional() is only true on x86. I'm not sure if I am missing other places that might generate these though.

@rickyz rickyz changed the title [xray] Do not emit tail exit hooks for conditional tail calls [xray] Handle conditional calls when lowering patchable tail calls. Apr 20, 2024
@rickyz rickyz force-pushed the xray_conditional_tail_call branch from 64111ef to 7d4be9f Compare April 20, 2024 00:48
xray instruments tail call function exits by inserting a nop sled before
the tail call. When tracing is enabled, the nop sled is replaced with a
call to `__xray_FunctionTailExit()`. This currently does not work for
conditional tail calls, as the instrumentation assumes that the tail
call will be unconditional. This causes two issues:
 - `__xray_FunctionTailExit()` is inappropately called even when the
   tail call is not taken.
 - `__xray_FunctionTailExit()`'s prologue/epilogue adjusts the stack
   pointer with add/sub instructions. This clobbers condition flags,
   which can flip the condition used for the tail call, leading to
   incorrect program behavior.

Fix this by rewriting conditional calls when lowering patchable tail
calls.

With this change, a conditional patchable tail call like:
```
  je target
```

Will be lowered to:
```
  jne .fallthrough
  .p2align 1, ..
.Lxray_sled_N:
  SLED_CODE
  jmp target
.fallthrough:
```
@rickyz rickyz force-pushed the xray_conditional_tail_call branch from 7d4be9f to 1bbcedb Compare April 20, 2024 01:04
@rickyz rickyz changed the title [xray] Handle conditional calls when lowering patchable tail calls. [XRay][X86] Handle conditional calls when lowering patchable tail calls. May 10, 2024
MaskRay pushed a commit that referenced this pull request May 27, 2024
…9452)

Previously, some xray trampolines would modify condition codes (before
saving/after restoring flags) due to stack alignment instructions, which
use add/sub.

I am not aware of issues that this causes in practice (outside of the
situation described in #89364,
which is only problematic due to a different bug). Nevertheless, it
seems nicer and less error-prone for xray instrumentation to be as
unobstrusive/preserve as much state as possible.
@MaskRay MaskRay added the xray label May 28, 2024
@MaskRay
Copy link
Member

MaskRay commented May 28, 2024

By the way, I'm not sure whether this bug happens on other architectures with xray support. The most straightforward reference I found to emitting conditional tail calls was

if (TII->canMakeTailCallConditional(PredCond, TailCall)) {

and canMakeTailCallConditional() is only true on x86. I'm not sure if I am missing other places that might generate these though.

Yes. It seems that only x86 performs conditional tail call.

Most RISC architectures cannot perform conditional tail call (except very specific assembly files) because conditional branches usually have a shorter range than unconditional branches and might not reach the destination at link time. RISC architectures typically do not implement range extension thunks for conditional branches.

@deanberris may have an idea why xray handles tail calls specially.

@MaskRay MaskRay changed the title [XRay][X86] Handle conditional calls when lowering patchable tail calls. [XRay][X86] Handle conditional calls when lowering patchable tail calls May 28, 2024
…alls.

Merge xray-conditional-tail-call.ll into xray-tail-call-sled.ll.
@rickyz
Copy link
Member Author

rickyz commented May 28, 2024

Thanks!

@deanberris may have an idea why xray handles tail calls specially.

fwiw, I suspect xray needs to instrument tail calls specially so that the correct caller/callee is relationship is maintained in traces - if a tail call were modeled as a function exit followed by a function entry call, then a caller and its tail callee would appear to be siblings rather than parent/child in the call trace info.

@MaskRay MaskRay merged commit fcffea0 into llvm:main May 28, 2024
5 of 7 checks passed
@deanberris
Copy link
Member

Thanks!

@deanberris may have an idea why xray handles tail calls specially.

fwiw, I suspect xray needs to instrument tail calls specially so that the correct caller/callee is relationship is maintained in traces - if a tail call were modeled as a function exit followed by a function entry call, then a caller and its tail callee would appear to be siblings rather than parent/child in the call trace info.

I'm only a few months late to this, but this is precisely why XRay does this. 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants