[NVPTX] instcombine known pointer AS checks. #112964


Merged: 6 commits, Oct 30, 2024

Conversation

Artem-B
Member

@Artem-B Artem-B commented Oct 18, 2024

This avoids crashing on impossible address space casts guarded by __isGlobal/__isShared.

Partially fixes #112760

It's still possible to trigger the issue by using explicit AS casts w/o AS checks, but LLVM should no longer crash on valid code.
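The fold added by this patch can be summarized as a small decision rule: resolve the operand's address space (looking through an addrspacecast into generic, whether an instruction or a constant expression), give up if the origin is still generic or is the param space, and otherwise fold the intrinsic to a constant. A minimal standalone sketch of that rule (this is a model for illustration, not LLVM code; the function name and signature are hypothetical):

```cpp
#include <cassert>
#include <optional>

// Address-space numbering from the patch's NVPTXAddrSpace.h.
enum AddressSpace : unsigned {
  Generic = 0, Global = 1, Shared = 3, Const = 4, Local = 5, Param = 101,
};

// Model of the fold: QueryAS is the space an isspacep intrinsic tests for,
// PtrAS is the operand's address space, and CastSrcAS is the source space
// if the operand is an addrspacecast into generic. Returns nullopt when the
// check cannot be folded and must happen at run time.
std::optional<bool> simplifyIsSpaceP(AddressSpace QueryAS, AddressSpace PtrAS,
                                     std::optional<AddressSpace> CastSrcAS = {}) {
  AddressSpace AS = PtrAS;
  if (AS == Generic && CastSrcAS)
    AS = *CastSrcAS; // look through the cast to the known source space
  if (AS == Generic || AS == Param)
    return std::nullopt; // unknown origin: keep the run-time check
  return AS == QueryAS; // fold to a constant true/false
}
```

For example, `__isGlobal` on a pointer cast from addrspace(1) folds to true, while the same check on a pointer cast from addrspace(3) folds to false, and a plain generic pointer is left alone.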

@Artem-B Artem-B requested a review from nikic as a code owner October 18, 2024 19:49
@llvmbot llvmbot added backend:NVPTX llvm:support llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms labels Oct 18, 2024
@llvmbot
Member

llvmbot commented Oct 18, 2024

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-llvm-support

Author: Artem Belevich (Artem-B)

Changes

This avoids crashing on impossible address space casts guarded by __isGlobal/__isShared.

Partially fixes #112760

It's still possible to trigger the issue by using explicit AS casts w/o AS checks, but LLVM should no longer crash on valid code.


Full diff: https://github.com/llvm/llvm-project/pull/112964.diff

4 Files Affected:

  • (added) llvm/include/llvm/Support/NVPTXAddrSpace.h (+33)
  • (modified) llvm/lib/Analysis/InstructionSimplify.cpp (+30)
  • (modified) llvm/lib/Target/NVPTX/MCTargetDesc/NVPTXBaseInfo.h (+2-10)
  • (added) llvm/test/Transforms/InstCombine/NVPTX/isspacep.ll (+261)
diff --git a/llvm/include/llvm/Support/NVPTXAddrSpace.h b/llvm/include/llvm/Support/NVPTXAddrSpace.h
new file mode 100644
index 00000000000000..063d2aaffdc57d
--- /dev/null
+++ b/llvm/include/llvm/Support/NVPTXAddrSpace.h
@@ -0,0 +1,33 @@
+//===---------------- NVPTXAddrSpace.h --------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+/// \file
+/// NVPTX address space definition
+///
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_SUPPORT_NVPTXADDRSPACE_H
+#define LLVM_SUPPORT_NVPTXADDRSPACE_H
+
+namespace llvm {
+namespace NVPTXAS {
+enum AddressSpace : unsigned {
+  ADDRESS_SPACE_GENERIC = 0,
+  ADDRESS_SPACE_GLOBAL = 1,
+  ADDRESS_SPACE_SHARED = 3,
+  ADDRESS_SPACE_CONST = 4,
+  ADDRESS_SPACE_LOCAL = 5,
+
+  ADDRESS_SPACE_PARAM = 101,
+};
+} // end namespace NVPTXAS
+
+} // end namespace llvm
+
+#endif // LLVM_SUPPORT_NVPTXADDRSPACE_H
diff --git a/llvm/lib/Analysis/InstructionSimplify.cpp b/llvm/lib/Analysis/InstructionSimplify.cpp
index d08be1e55c853e..b525bc27d72b8b 100644
--- a/llvm/lib/Analysis/InstructionSimplify.cpp
+++ b/llvm/lib/Analysis/InstructionSimplify.cpp
@@ -38,10 +38,12 @@
 #include "llvm/IR/Dominators.h"
 #include "llvm/IR/InstrTypes.h"
 #include "llvm/IR/Instructions.h"
+#include "llvm/IR/IntrinsicsNVPTX.h"
 #include "llvm/IR/Operator.h"
 #include "llvm/IR/PatternMatch.h"
 #include "llvm/IR/Statepoint.h"
 #include "llvm/Support/KnownBits.h"
+#include "llvm/Support/NVPTXAddrSpace.h"
 #include <algorithm>
 #include <optional>
 using namespace llvm;
@@ -6365,6 +6367,34 @@ static Value *simplifyUnaryIntrinsic(Function *F, Value *Op0,
 
     break;
   }
+  case Intrinsic::nvvm_isspacep_global:
+  case Intrinsic::nvvm_isspacep_local:
+  case Intrinsic::nvvm_isspacep_shared:
+  case Intrinsic::nvvm_isspacep_const: {
+    auto *Ty = F->getReturnType();
+    unsigned AS = Op0->getType()->getPointerAddressSpace();
+    if (AS == NVPTXAS::ADDRESS_SPACE_GENERIC) {
+      if (auto *ASC = dyn_cast<AddrSpaceCastInst>(Op0))
+        AS = ASC->getSrcAddressSpace();
+      else if (auto *CE = dyn_cast<ConstantExpr>(Op0)) {
+        if (CE->getOpcode() == Instruction::AddrSpaceCast)
+          AS = CE->getOperand(0)->getType()->getPointerAddressSpace();
+      }
+    }
+    if (AS == NVPTXAS::ADDRESS_SPACE_GENERIC ||
+        AS == NVPTXAS::ADDRESS_SPACE_PARAM)
+      return nullptr; // Got to check at run-time.
+    bool ASMatches = (AS == NVPTXAS::ADDRESS_SPACE_GLOBAL &&
+                      IID == Intrinsic::nvvm_isspacep_global) ||
+                     (AS == NVPTXAS::ADDRESS_SPACE_LOCAL &&
+                      IID == Intrinsic::nvvm_isspacep_local) ||
+                     (AS == NVPTXAS::ADDRESS_SPACE_SHARED &&
+                      IID == Intrinsic::nvvm_isspacep_shared) ||
+                     (AS == NVPTXAS::ADDRESS_SPACE_CONST &&
+                      IID == Intrinsic::nvvm_isspacep_const);
+    return ConstantInt::get(Ty, ASMatches);
+    break;
+  }
   default:
     break;
   }
diff --git a/llvm/lib/Target/NVPTX/MCTargetDesc/NVPTXBaseInfo.h b/llvm/lib/Target/NVPTX/MCTargetDesc/NVPTXBaseInfo.h
index 815b600fe93a9f..d06e2c00ec3f96 100644
--- a/llvm/lib/Target/NVPTX/MCTargetDesc/NVPTXBaseInfo.h
+++ b/llvm/lib/Target/NVPTX/MCTargetDesc/NVPTXBaseInfo.h
@@ -16,18 +16,10 @@
 #ifndef LLVM_LIB_TARGET_NVPTX_MCTARGETDESC_NVPTXBASEINFO_H
 #define LLVM_LIB_TARGET_NVPTX_MCTARGETDESC_NVPTXBASEINFO_H
 
+#include "llvm/Support/NVPTXAddrSpace.h"
 namespace llvm {
 
-enum AddressSpace {
-  ADDRESS_SPACE_GENERIC = 0,
-  ADDRESS_SPACE_GLOBAL = 1,
-  ADDRESS_SPACE_SHARED = 3,
-  ADDRESS_SPACE_CONST = 4,
-  ADDRESS_SPACE_LOCAL = 5,
-
-  // NVVM Internal
-  ADDRESS_SPACE_PARAM = 101
-};
+using namespace NVPTXAS;
 
 namespace NVPTXII {
 enum {
diff --git a/llvm/test/Transforms/InstCombine/NVPTX/isspacep.ll b/llvm/test/Transforms/InstCombine/NVPTX/isspacep.ll
new file mode 100644
index 00000000000000..f53ec0120cfb3e
--- /dev/null
+++ b/llvm/test/Transforms/InstCombine/NVPTX/isspacep.ll
@@ -0,0 +1,261 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt < %s -passes=instcombine -mtriple=nvptx64-nvidia-cuda -S | FileCheck %s
+target datalayout = "e-i64:64-i128:128-v16:16-v32:32-n16:32:64"
+target triple = "nvptx64-nvidia-cuda"
+
+; Source data in different AS.
+@shared_data = dso_local addrspace(3) global i32 undef, align 4
+@global_data = dso_local addrspace(1) externally_initialized global i32 0, align 4
+@const_data = dso_local addrspace(4) externally_initialized constant i32 3, align 4
+
+; Results get stored here.
+@gen = dso_local addrspace(1) externally_initialized global i8 0, align 1
+@g1 = dso_local addrspace(1) externally_initialized global i8 0, align 1
+@g2 = dso_local addrspace(1) externally_initialized global i8 0, align 1
+@s1 = dso_local addrspace(1) externally_initialized global i8 0, align 1
+@s2 = dso_local addrspace(1) externally_initialized global i8 0, align 1
+@c1 = dso_local addrspace(1) externally_initialized global i8 0, align 1
+@c2 = dso_local addrspace(1) externally_initialized global i8 0, align 1
+@l = dso_local addrspace(1) externally_initialized global i8 0, align 1
+
+declare i1 @llvm.nvvm.isspacep.global(ptr nocapture)
+declare i1 @llvm.nvvm.isspacep.shared(ptr nocapture)
+declare i1 @llvm.nvvm.isspacep.const(ptr nocapture)
+declare i1 @llvm.nvvm.isspacep.local(ptr nocapture)
+
+define dso_local void @check_global(ptr nocapture noundef readnone %out, ptr nocapture noundef readnone %generic_data, ptr addrspace(5) %local_data) local_unnamed_addr {
+; CHECK-LABEL: define dso_local void @check_global(
+; CHECK-SAME: ptr nocapture noundef readnone [[OUT:%.*]], ptr nocapture noundef readnone [[GENERIC_DATA:%.*]], ptr addrspace(5) [[LOCAL_DATA:%.*]]) local_unnamed_addr {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[GEN0:%.*]] = tail call i1 @llvm.nvvm.isspacep.global(ptr [[GENERIC_DATA]])
+; CHECK-NEXT:    [[STOREDV:%.*]] = zext i1 [[GEN0]] to i8
+; CHECK-NEXT:    store i8 [[STOREDV]], ptr addrspacecast (ptr addrspace(1) @gen to ptr), align 1
+; CHECK-NEXT:    store i8 1, ptr addrspacecast (ptr addrspace(1) @g1 to ptr), align 1
+; CHECK-NEXT:    store i8 1, ptr addrspacecast (ptr addrspace(1) @g2 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @s1 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @s2 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @c1 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @c2 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @l to ptr), align 1
+; CHECK-NEXT:    ret void
+;
+entry:
+  ; No constant folding for generic pointers of unknown origin.
+  %gen0 = tail call i1 @llvm.nvvm.isspacep.global(ptr %generic_data)
+  %storedv = zext i1 %gen0 to i8
+  store i8 %storedv, ptr addrspacecast (ptr addrspace(1) @gen to ptr), align 1
+
+  %isg1 = tail call i1 @llvm.nvvm.isspacep.global(ptr addrspacecast (ptr addrspace(1) @global_data to ptr))
+  %isg18 = zext i1 %isg1 to i8
+  store i8 %isg18, ptr addrspacecast (ptr addrspace(1) @g1 to ptr), align 1
+
+  %global_data_asc = addrspacecast ptr addrspace(1) @global_data to ptr
+  %isg2 = tail call i1 @llvm.nvvm.isspacep.global(ptr %global_data_asc)
+  %isg28 = zext i1 %isg2 to i8
+  store i8 %isg28, ptr addrspacecast (ptr addrspace(1) @g2 to ptr), align 1
+
+  %iss1 = tail call i1 @llvm.nvvm.isspacep.global(ptr addrspacecast (ptr addrspace(3) @shared_data to ptr))
+  %iss18 = zext i1 %iss1 to i8
+  store i8 %iss18, ptr addrspacecast (ptr addrspace(1) @s1 to ptr), align 1
+
+  %shared_data_asc = addrspacecast ptr addrspace(3) @shared_data to ptr
+  %iss2 = tail call i1 @llvm.nvvm.isspacep.global(ptr %shared_data_asc)
+  %iss28 = zext i1 %iss2 to i8
+  store i8 %iss28, ptr addrspacecast (ptr addrspace(1) @s2 to ptr), align 1
+
+  %isc1 = tail call i1 @llvm.nvvm.isspacep.global(ptr addrspacecast (ptr addrspace(4) @const_data to ptr))
+  %isc18 = zext i1 %isc1 to i8
+  store i8 %isc18, ptr addrspacecast (ptr addrspace(1) @c1 to ptr), align 1
+
+  %const_data_asc = addrspacecast ptr addrspace(4) @const_data to ptr
+  %isc2 = tail call i1 @llvm.nvvm.isspacep.global(ptr %const_data_asc)
+  %isc28 = zext i1 %isc2 to i8
+  store i8 %isc28, ptr addrspacecast (ptr addrspace(1) @c2 to ptr), align 1
+
+  ; Local data can't have a constant address, so we can't have a constant ASC expression
+  ; We can only use an ASC instruction.
+  %local_data_asc = addrspacecast ptr addrspace(5) %local_data to ptr
+  %isl = call i1 @llvm.nvvm.isspacep.global(ptr nonnull %local_data_asc)
+  %isl8 = zext i1 %isl to i8
+  store i8 %isl8, ptr addrspacecast (ptr addrspace(1) @l to ptr), align 1
+
+  ret void
+}
+
+define dso_local void @check_shared(ptr nocapture noundef readnone %out, ptr nocapture noundef readnone %generic_data, ptr addrspace(5) %local_data) local_unnamed_addr {
+; CHECK-LABEL: define dso_local void @check_shared(
+; CHECK-SAME: ptr nocapture noundef readnone [[OUT:%.*]], ptr nocapture noundef readnone [[GENERIC_DATA:%.*]], ptr addrspace(5) [[LOCAL_DATA:%.*]]) local_unnamed_addr {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[GEN0:%.*]] = tail call i1 @llvm.nvvm.isspacep.shared(ptr [[GENERIC_DATA]])
+; CHECK-NEXT:    [[STOREDV:%.*]] = zext i1 [[GEN0]] to i8
+; CHECK-NEXT:    store i8 [[STOREDV]], ptr addrspacecast (ptr addrspace(1) @gen to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @g1 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @g2 to ptr), align 1
+; CHECK-NEXT:    store i8 1, ptr addrspacecast (ptr addrspace(1) @s1 to ptr), align 1
+; CHECK-NEXT:    store i8 1, ptr addrspacecast (ptr addrspace(1) @s2 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @c1 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @c2 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @l to ptr), align 1
+; CHECK-NEXT:    ret void
+;
+entry:
+  ; No constant folding for generic pointers of unknown origin.
+  %gen0 = tail call i1 @llvm.nvvm.isspacep.shared(ptr %generic_data)
+  %storedv = zext i1 %gen0 to i8
+  store i8 %storedv, ptr addrspacecast (ptr addrspace(1) @gen to ptr), align 1
+
+  %isg1 = tail call i1 @llvm.nvvm.isspacep.shared(ptr addrspacecast (ptr addrspace(1) @global_data to ptr))
+  %isg18 = zext i1 %isg1 to i8
+  store i8 %isg18, ptr addrspacecast (ptr addrspace(1) @g1 to ptr), align 1
+
+  %global_data_asc = addrspacecast ptr addrspace(1) @global_data to ptr
+  %isg2 = tail call i1 @llvm.nvvm.isspacep.shared(ptr %global_data_asc)
+  %isg28 = zext i1 %isg2 to i8
+  store i8 %isg28, ptr addrspacecast (ptr addrspace(1) @g2 to ptr), align 1
+
+  %iss1 = tail call i1 @llvm.nvvm.isspacep.shared(ptr addrspacecast (ptr addrspace(3) @shared_data to ptr))
+  %iss18 = zext i1 %iss1 to i8
+  store i8 %iss18, ptr addrspacecast (ptr addrspace(1) @s1 to ptr), align 1
+
+  %shared_data_asc = addrspacecast ptr addrspace(3) @shared_data to ptr
+  %iss2 = tail call i1 @llvm.nvvm.isspacep.shared(ptr %shared_data_asc)
+  %iss28 = zext i1 %iss2 to i8
+  store i8 %iss28, ptr addrspacecast (ptr addrspace(1) @s2 to ptr), align 1
+
+  %isc1 = tail call i1 @llvm.nvvm.isspacep.shared(ptr addrspacecast (ptr addrspace(4) @const_data to ptr))
+  %isc18 = zext i1 %isc1 to i8
+  store i8 %isc18, ptr addrspacecast (ptr addrspace(1) @c1 to ptr), align 1
+
+  %const_data_asc = addrspacecast ptr addrspace(4) @const_data to ptr
+  %isc2 = tail call i1 @llvm.nvvm.isspacep.shared(ptr %const_data_asc)
+  %isc28 = zext i1 %isc2 to i8
+  store i8 %isc28, ptr addrspacecast (ptr addrspace(1) @c2 to ptr), align 1
+
+  ; Local data can't have a constant address, so we can't have a constant ASC expression
+  ; We can only use an ASC instruction.
+  %local_data_asc = addrspacecast ptr addrspace(5) %local_data to ptr
+  %isl = call i1 @llvm.nvvm.isspacep.shared(ptr nonnull %local_data_asc)
+  %isl8 = zext i1 %isl to i8
+  store i8 %isl8, ptr addrspacecast (ptr addrspace(1) @l to ptr), align 1
+
+  ret void
+}
+
+define dso_local void @check_const(ptr nocapture noundef readnone %out, ptr nocapture noundef readnone %generic_data, ptr addrspace(5) %local_data) local_unnamed_addr {
+; CHECK-LABEL: define dso_local void @check_const(
+; CHECK-SAME: ptr nocapture noundef readnone [[OUT:%.*]], ptr nocapture noundef readnone [[GENERIC_DATA:%.*]], ptr addrspace(5) [[LOCAL_DATA:%.*]]) local_unnamed_addr {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[GEN0:%.*]] = tail call i1 @llvm.nvvm.isspacep.const(ptr [[GENERIC_DATA]])
+; CHECK-NEXT:    [[STOREDV:%.*]] = zext i1 [[GEN0]] to i8
+; CHECK-NEXT:    store i8 [[STOREDV]], ptr addrspacecast (ptr addrspace(1) @gen to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @g1 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @g2 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @s1 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @s2 to ptr), align 1
+; CHECK-NEXT:    store i8 1, ptr addrspacecast (ptr addrspace(1) @c1 to ptr), align 1
+; CHECK-NEXT:    store i8 1, ptr addrspacecast (ptr addrspace(1) @c2 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @l to ptr), align 1
+; CHECK-NEXT:    ret void
+;
+entry:
+  ; No constant folding for generic pointers of unknown origin.
+  %gen0 = tail call i1 @llvm.nvvm.isspacep.const(ptr %generic_data)
+  %storedv = zext i1 %gen0 to i8
+  store i8 %storedv, ptr addrspacecast (ptr addrspace(1) @gen to ptr), align 1
+
+  %isg1 = tail call i1 @llvm.nvvm.isspacep.const(ptr addrspacecast (ptr addrspace(1) @global_data to ptr))
+  %isg18 = zext i1 %isg1 to i8
+  store i8 %isg18, ptr addrspacecast (ptr addrspace(1) @g1 to ptr), align 1
+
+  %global_data_asc = addrspacecast ptr addrspace(1) @global_data to ptr
+  %isg2 = tail call i1 @llvm.nvvm.isspacep.const(ptr %global_data_asc)
+  %isg28 = zext i1 %isg2 to i8
+  store i8 %isg28, ptr addrspacecast (ptr addrspace(1) @g2 to ptr), align 1
+
+  %iss1 = tail call i1 @llvm.nvvm.isspacep.const(ptr addrspacecast (ptr addrspace(3) @shared_data to ptr))
+  %iss18 = zext i1 %iss1 to i8
+  store i8 %iss18, ptr addrspacecast (ptr addrspace(1) @s1 to ptr), align 1
+
+  %shared_data_asc = addrspacecast ptr addrspace(3) @shared_data to ptr
+  %iss2 = tail call i1 @llvm.nvvm.isspacep.const(ptr %shared_data_asc)
+  %iss28 = zext i1 %iss2 to i8
+  store i8 %iss28, ptr addrspacecast (ptr addrspace(1) @s2 to ptr), align 1
+
+  %isc1 = tail call i1 @llvm.nvvm.isspacep.const(ptr addrspacecast (ptr addrspace(4) @const_data to ptr))
+  %isc18 = zext i1 %isc1 to i8
+  store i8 %isc18, ptr addrspacecast (ptr addrspace(1) @c1 to ptr), align 1
+
+  %const_data_asc = addrspacecast ptr addrspace(4) @const_data to ptr
+  %isc2 = tail call i1 @llvm.nvvm.isspacep.const(ptr %const_data_asc)
+  %isc28 = zext i1 %isc2 to i8
+  store i8 %isc28, ptr addrspacecast (ptr addrspace(1) @c2 to ptr), align 1
+
+  ; Local data can't have a constant address, so we can't have a constant ASC expression
+  ; We can only use an ASC instruction.
+  %local_data_asc = addrspacecast ptr addrspace(5) %local_data to ptr
+  %isl = call i1 @llvm.nvvm.isspacep.const(ptr nonnull %local_data_asc)
+  %isl8 = zext i1 %isl to i8
+  store i8 %isl8, ptr addrspacecast (ptr addrspace(1) @l to ptr), align 1
+
+  ret void
+}
+
+define dso_local void @check_local(ptr nocapture noundef readnone %out, ptr nocapture noundef readnone %generic_data, ptr addrspace(5) %local_data) local_unnamed_addr {
+; CHECK-LABEL: define dso_local void @check_local(
+; CHECK-SAME: ptr nocapture noundef readnone [[OUT:%.*]], ptr nocapture noundef readnone [[GENERIC_DATA:%.*]], ptr addrspace(5) [[LOCAL_DATA:%.*]]) local_unnamed_addr {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[GEN0:%.*]] = tail call i1 @llvm.nvvm.isspacep.local(ptr [[GENERIC_DATA]])
+; CHECK-NEXT:    [[STOREDV:%.*]] = zext i1 [[GEN0]] to i8
+; CHECK-NEXT:    store i8 [[STOREDV]], ptr addrspacecast (ptr addrspace(1) @gen to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @g1 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @g2 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @s1 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @s2 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @c1 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @c2 to ptr), align 1
+; CHECK-NEXT:    store i8 1, ptr addrspacecast (ptr addrspace(1) @l to ptr), align 1
+; CHECK-NEXT:    ret void
+;
+entry:
+  ; No constant folding for generic pointers of unknown origin.
+  %gen0 = tail call i1 @llvm.nvvm.isspacep.local(ptr %generic_data)
+  %storedv = zext i1 %gen0 to i8
+  store i8 %storedv, ptr addrspacecast (ptr addrspace(1) @gen to ptr), align 1
+
+  %isg1 = tail call i1 @llvm.nvvm.isspacep.local(ptr addrspacecast (ptr addrspace(1) @global_data to ptr))
+  %isg18 = zext i1 %isg1 to i8
+  store i8 %isg18, ptr addrspacecast (ptr addrspace(1) @g1 to ptr), align 1
+
+  %global_data_asc = addrspacecast ptr addrspace(1) @global_data to ptr
+  %isg2 = tail call i1 @llvm.nvvm.isspacep.local(ptr %global_data_asc)
+  %isg28 = zext i1 %isg2 to i8
+  store i8 %isg28, ptr addrspacecast (ptr addrspace(1) @g2 to ptr), align 1
+
+  %iss1 = tail call i1 @llvm.nvvm.isspacep.local(ptr addrspacecast (ptr addrspace(3) @shared_data to ptr))
+  %iss18 = zext i1 %iss1 to i8
+  store i8 %iss18, ptr addrspacecast (ptr addrspace(1) @s1 to ptr), align 1
+
+  %shared_data_asc = addrspacecast ptr addrspace(3) @shared_data to ptr
+  %iss2 = tail call i1 @llvm.nvvm.isspacep.local(ptr %shared_data_asc)
+  %iss28 = zext i1 %iss2 to i8
+  store i8 %iss28, ptr addrspacecast (ptr addrspace(1) @s2 to ptr), align 1
+
+  %isc1 = tail call i1 @llvm.nvvm.isspacep.local(ptr addrspacecast (ptr addrspace(4) @const_data to ptr))
+  %isc18 = zext i1 %isc1 to i8
+  store i8 %isc18, ptr addrspacecast (ptr addrspace(1) @c1 to ptr), align 1
+
+  %const_data_asc = addrspacecast ptr addrspace(4) @const_data to ptr
+  %isc2 = tail call i1 @llvm.nvvm.isspacep.local(ptr %const_data_asc)
+  %isc28 = zext i1 %isc2 to i8
+  store i8 %isc28, ptr addrspacecast (ptr addrspace(1) @c2 to ptr), align 1
+
+  ; Local data can't have a constant address, so we can't have a constant ASC expression
+  ; We can only use an ASC instruction.
+  %local_data_asc = addrspacecast ptr addrspace(5) %local_data to ptr
+  %isl = call i1 @llvm.nvvm.isspacep.local(ptr nonnull %local_data_asc)
+  %isl8 = zext i1 %isl to i8
+  store i8 %isl8, ptr addrspacecast (ptr addrspace(1) @l to ptr), align 1
+
+  ret void
+}
+

@llvmbot
Member

llvmbot commented Oct 18, 2024

@llvm/pr-subscribers-llvm-analysis

+  %local_data_asc = addrspacecast ptr addrspace(5) %local_data to ptr
+  %isl = call i1 @llvm.nvvm.isspacep.shared(ptr nonnull %local_data_asc)
+  %isl8 = zext i1 %isl to i8
+  store i8 %isl8, ptr addrspacecast (ptr addrspace(1) @l to ptr), align 1
+
+  ret void
+}
+
+define dso_local void @check_const(ptr nocapture noundef readnone %out, ptr nocapture noundef readnone %generic_data, ptr addrspace(5) %local_data) local_unnamed_addr {
+; CHECK-LABEL: define dso_local void @check_const(
+; CHECK-SAME: ptr nocapture noundef readnone [[OUT:%.*]], ptr nocapture noundef readnone [[GENERIC_DATA:%.*]], ptr addrspace(5) [[LOCAL_DATA:%.*]]) local_unnamed_addr {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[GEN0:%.*]] = tail call i1 @llvm.nvvm.isspacep.const(ptr [[GENERIC_DATA]])
+; CHECK-NEXT:    [[STOREDV:%.*]] = zext i1 [[GEN0]] to i8
+; CHECK-NEXT:    store i8 [[STOREDV]], ptr addrspacecast (ptr addrspace(1) @gen to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @g1 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @g2 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @s1 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @s2 to ptr), align 1
+; CHECK-NEXT:    store i8 1, ptr addrspacecast (ptr addrspace(1) @c1 to ptr), align 1
+; CHECK-NEXT:    store i8 1, ptr addrspacecast (ptr addrspace(1) @c2 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @l to ptr), align 1
+; CHECK-NEXT:    ret void
+;
+entry:
+  ; No constant folding for generic pointers of unknown origin.
+  %gen0 = tail call i1 @llvm.nvvm.isspacep.const(ptr %generic_data)
+  %storedv = zext i1 %gen0 to i8
+  store i8 %storedv, ptr addrspacecast (ptr addrspace(1) @gen to ptr), align 1
+
+  %isg1 = tail call i1 @llvm.nvvm.isspacep.const(ptr addrspacecast (ptr addrspace(1) @global_data to ptr))
+  %isg18 = zext i1 %isg1 to i8
+  store i8 %isg18, ptr addrspacecast (ptr addrspace(1) @g1 to ptr), align 1
+
+  %global_data_asc = addrspacecast ptr addrspace(1) @global_data to ptr
+  %isg2 = tail call i1 @llvm.nvvm.isspacep.const(ptr %global_data_asc)
+  %isg28 = zext i1 %isg2 to i8
+  store i8 %isg28, ptr addrspacecast (ptr addrspace(1) @g2 to ptr), align 1
+
+  %iss1 = tail call i1 @llvm.nvvm.isspacep.const(ptr addrspacecast (ptr addrspace(3) @shared_data to ptr))
+  %iss18 = zext i1 %iss1 to i8
+  store i8 %iss18, ptr addrspacecast (ptr addrspace(1) @s1 to ptr), align 1
+
+  %shared_data_asc = addrspacecast ptr addrspace(3) @shared_data to ptr
+  %iss2 = tail call i1 @llvm.nvvm.isspacep.const(ptr %shared_data_asc)
+  %iss28 = zext i1 %iss2 to i8
+  store i8 %iss28, ptr addrspacecast (ptr addrspace(1) @s2 to ptr), align 1
+
+  %isc1 = tail call i1 @llvm.nvvm.isspacep.const(ptr addrspacecast (ptr addrspace(4) @const_data to ptr))
+  %isc18 = zext i1 %isc1 to i8
+  store i8 %isc18, ptr addrspacecast (ptr addrspace(1) @c1 to ptr), align 1
+
+  %const_data_asc = addrspacecast ptr addrspace(4) @const_data to ptr
+  %isc2 = tail call i1 @llvm.nvvm.isspacep.const(ptr %const_data_asc)
+  %isc28 = zext i1 %isc2 to i8
+  store i8 %isc28, ptr addrspacecast (ptr addrspace(1) @c2 to ptr), align 1
+
+  ; Local data can't have a constant address, so we can't have a constant ASC expression
+  ; We can only use an ASC instruction.
+  %local_data_asc = addrspacecast ptr addrspace(5) %local_data to ptr
+  %isl = call i1 @llvm.nvvm.isspacep.const(ptr nonnull %local_data_asc)
+  %isl8 = zext i1 %isl to i8
+  store i8 %isl8, ptr addrspacecast (ptr addrspace(1) @l to ptr), align 1
+
+  ret void
+}
+
+define dso_local void @check_local(ptr nocapture noundef readnone %out, ptr nocapture noundef readnone %generic_data, ptr addrspace(5) %local_data) local_unnamed_addr {
+; CHECK-LABEL: define dso_local void @check_local(
+; CHECK-SAME: ptr nocapture noundef readnone [[OUT:%.*]], ptr nocapture noundef readnone [[GENERIC_DATA:%.*]], ptr addrspace(5) [[LOCAL_DATA:%.*]]) local_unnamed_addr {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[GEN0:%.*]] = tail call i1 @llvm.nvvm.isspacep.local(ptr [[GENERIC_DATA]])
+; CHECK-NEXT:    [[STOREDV:%.*]] = zext i1 [[GEN0]] to i8
+; CHECK-NEXT:    store i8 [[STOREDV]], ptr addrspacecast (ptr addrspace(1) @gen to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @g1 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @g2 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @s1 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @s2 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @c1 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @c2 to ptr), align 1
+; CHECK-NEXT:    store i8 1, ptr addrspacecast (ptr addrspace(1) @l to ptr), align 1
+; CHECK-NEXT:    ret void
+;
+entry:
+  ; No constant folding for generic pointers of unknown origin.
+  %gen0 = tail call i1 @llvm.nvvm.isspacep.local(ptr %generic_data)
+  %storedv = zext i1 %gen0 to i8
+  store i8 %storedv, ptr addrspacecast (ptr addrspace(1) @gen to ptr), align 1
+
+  %isg1 = tail call i1 @llvm.nvvm.isspacep.local(ptr addrspacecast (ptr addrspace(1) @global_data to ptr))
+  %isg18 = zext i1 %isg1 to i8
+  store i8 %isg18, ptr addrspacecast (ptr addrspace(1) @g1 to ptr), align 1
+
+  %global_data_asc = addrspacecast ptr addrspace(1) @global_data to ptr
+  %isg2 = tail call i1 @llvm.nvvm.isspacep.local(ptr %global_data_asc)
+  %isg28 = zext i1 %isg2 to i8
+  store i8 %isg28, ptr addrspacecast (ptr addrspace(1) @g2 to ptr), align 1
+
+  %iss1 = tail call i1 @llvm.nvvm.isspacep.local(ptr addrspacecast (ptr addrspace(3) @shared_data to ptr))
+  %iss18 = zext i1 %iss1 to i8
+  store i8 %iss18, ptr addrspacecast (ptr addrspace(1) @s1 to ptr), align 1
+
+  %shared_data_asc = addrspacecast ptr addrspace(3) @shared_data to ptr
+  %iss2 = tail call i1 @llvm.nvvm.isspacep.local(ptr %shared_data_asc)
+  %iss28 = zext i1 %iss2 to i8
+  store i8 %iss28, ptr addrspacecast (ptr addrspace(1) @s2 to ptr), align 1
+
+  %isc1 = tail call i1 @llvm.nvvm.isspacep.local(ptr addrspacecast (ptr addrspace(4) @const_data to ptr))
+  %isc18 = zext i1 %isc1 to i8
+  store i8 %isc18, ptr addrspacecast (ptr addrspace(1) @c1 to ptr), align 1
+
+  %const_data_asc = addrspacecast ptr addrspace(4) @const_data to ptr
+  %isc2 = tail call i1 @llvm.nvvm.isspacep.local(ptr %const_data_asc)
+  %isc28 = zext i1 %isc2 to i8
+  store i8 %isc28, ptr addrspacecast (ptr addrspace(1) @c2 to ptr), align 1
+
+  ; Local data can't have a constant address, so we can't have a constant ASC expression
+  ; We can only use an ASC instruction.
+  %local_data_asc = addrspacecast ptr addrspace(5) %local_data to ptr
+  %isl = call i1 @llvm.nvvm.isspacep.local(ptr nonnull %local_data_asc)
+  %isl8 = zext i1 %isl to i8
+  store i8 %isl8, ptr addrspacecast (ptr addrspace(1) @l to ptr), align 1
+
+  ret void
+}
+

@llvmbot
Member

llvmbot commented Oct 18, 2024

@llvm/pr-subscribers-backend-nvptx

Author: Artem Belevich (Artem-B)

Changes

This avoids crashing on impossible address space casts guarded by __isGlobal/__isShared.

Partially fixes #112760

It's still possible to trigger the issue by using explicit AS casts w/o AS checks, but LLVM should no longer crash on valid code.


Full diff: https://github.com/llvm/llvm-project/pull/112964.diff

4 Files Affected:

  • (added) llvm/include/llvm/Support/NVPTXAddrSpace.h (+33)
  • (modified) llvm/lib/Analysis/InstructionSimplify.cpp (+30)
  • (modified) llvm/lib/Target/NVPTX/MCTargetDesc/NVPTXBaseInfo.h (+2-10)
  • (added) llvm/test/Transforms/InstCombine/NVPTX/isspacep.ll (+261)
diff --git a/llvm/include/llvm/Support/NVPTXAddrSpace.h b/llvm/include/llvm/Support/NVPTXAddrSpace.h
new file mode 100644
index 00000000000000..063d2aaffdc57d
--- /dev/null
+++ b/llvm/include/llvm/Support/NVPTXAddrSpace.h
@@ -0,0 +1,33 @@
+//===---------------- NVPTXAddrSpace.h --------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+/// \file
+/// NVPTX address space definition
+///
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_SUPPORT_NVPTXADDRSPACE_H
+#define LLVM_SUPPORT_NVPTXADDRSPACE_H
+
+namespace llvm {
+namespace NVPTXAS {
+enum AddressSpace : unsigned {
+  ADDRESS_SPACE_GENERIC = 0,
+  ADDRESS_SPACE_GLOBAL = 1,
+  ADDRESS_SPACE_SHARED = 3,
+  ADDRESS_SPACE_CONST = 4,
+  ADDRESS_SPACE_LOCAL = 5,
+
+  ADDRESS_SPACE_PARAM = 101,
+};
+} // end namespace NVPTXAS
+
+} // end namespace llvm
+
+#endif // LLVM_SUPPORT_NVPTXADDRSPACE_H
diff --git a/llvm/lib/Analysis/InstructionSimplify.cpp b/llvm/lib/Analysis/InstructionSimplify.cpp
index d08be1e55c853e..b525bc27d72b8b 100644
--- a/llvm/lib/Analysis/InstructionSimplify.cpp
+++ b/llvm/lib/Analysis/InstructionSimplify.cpp
@@ -38,10 +38,12 @@
 #include "llvm/IR/Dominators.h"
 #include "llvm/IR/InstrTypes.h"
 #include "llvm/IR/Instructions.h"
+#include "llvm/IR/IntrinsicsNVPTX.h"
 #include "llvm/IR/Operator.h"
 #include "llvm/IR/PatternMatch.h"
 #include "llvm/IR/Statepoint.h"
 #include "llvm/Support/KnownBits.h"
+#include "llvm/Support/NVPTXAddrSpace.h"
 #include <algorithm>
 #include <optional>
 using namespace llvm;
@@ -6365,6 +6367,34 @@ static Value *simplifyUnaryIntrinsic(Function *F, Value *Op0,
 
     break;
   }
+  case Intrinsic::nvvm_isspacep_global:
+  case Intrinsic::nvvm_isspacep_local:
+  case Intrinsic::nvvm_isspacep_shared:
+  case Intrinsic::nvvm_isspacep_const: {
+    auto *Ty = F->getReturnType();
+    unsigned AS = Op0->getType()->getPointerAddressSpace();
+    if (AS == NVPTXAS::ADDRESS_SPACE_GENERIC) {
+      if (auto *ASC = dyn_cast<AddrSpaceCastInst>(Op0))
+        AS = ASC->getSrcAddressSpace();
+      else if (auto *CE = dyn_cast<ConstantExpr>(Op0)) {
+        if (CE->getOpcode() == Instruction::AddrSpaceCast)
+          AS = CE->getOperand(0)->getType()->getPointerAddressSpace();
+      }
+    }
+    if (AS == NVPTXAS::ADDRESS_SPACE_GENERIC ||
+        AS == NVPTXAS::ADDRESS_SPACE_PARAM)
+      return nullptr; // Got to check at run-time.
+    bool ASMatches = (AS == NVPTXAS::ADDRESS_SPACE_GLOBAL &&
+                      IID == Intrinsic::nvvm_isspacep_global) ||
+                     (AS == NVPTXAS::ADDRESS_SPACE_LOCAL &&
+                      IID == Intrinsic::nvvm_isspacep_local) ||
+                     (AS == NVPTXAS::ADDRESS_SPACE_SHARED &&
+                      IID == Intrinsic::nvvm_isspacep_shared) ||
+                     (AS == NVPTXAS::ADDRESS_SPACE_CONST &&
+                      IID == Intrinsic::nvvm_isspacep_const);
+    return ConstantInt::get(Ty, ASMatches);
+    break;
+  }
   default:
     break;
   }
diff --git a/llvm/lib/Target/NVPTX/MCTargetDesc/NVPTXBaseInfo.h b/llvm/lib/Target/NVPTX/MCTargetDesc/NVPTXBaseInfo.h
index 815b600fe93a9f..d06e2c00ec3f96 100644
--- a/llvm/lib/Target/NVPTX/MCTargetDesc/NVPTXBaseInfo.h
+++ b/llvm/lib/Target/NVPTX/MCTargetDesc/NVPTXBaseInfo.h
@@ -16,18 +16,10 @@
 #ifndef LLVM_LIB_TARGET_NVPTX_MCTARGETDESC_NVPTXBASEINFO_H
 #define LLVM_LIB_TARGET_NVPTX_MCTARGETDESC_NVPTXBASEINFO_H
 
+#include "llvm/Support/NVPTXAddrSpace.h"
 namespace llvm {
 
-enum AddressSpace {
-  ADDRESS_SPACE_GENERIC = 0,
-  ADDRESS_SPACE_GLOBAL = 1,
-  ADDRESS_SPACE_SHARED = 3,
-  ADDRESS_SPACE_CONST = 4,
-  ADDRESS_SPACE_LOCAL = 5,
-
-  // NVVM Internal
-  ADDRESS_SPACE_PARAM = 101
-};
+using namespace NVPTXAS;
 
 namespace NVPTXII {
 enum {
diff --git a/llvm/test/Transforms/InstCombine/NVPTX/isspacep.ll b/llvm/test/Transforms/InstCombine/NVPTX/isspacep.ll
new file mode 100644
index 00000000000000..f53ec0120cfb3e
--- /dev/null
+++ b/llvm/test/Transforms/InstCombine/NVPTX/isspacep.ll
@@ -0,0 +1,261 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt < %s -passes=instcombine -mtriple=nvptx64-nvidia-cuda -S | FileCheck %s
+target datalayout = "e-i64:64-i128:128-v16:16-v32:32-n16:32:64"
+target triple = "nvptx64-nvidia-cuda"
+
+; Source data in different AS.
+@shared_data = dso_local addrspace(3) global i32 undef, align 4
+@global_data = dso_local addrspace(1) externally_initialized global i32 0, align 4
+@const_data = dso_local addrspace(4) externally_initialized constant i32 3, align 4
+
+; Results get stored here.
+@gen = dso_local addrspace(1) externally_initialized global i8 0, align 1
+@g1 = dso_local addrspace(1) externally_initialized global i8 0, align 1
+@g2 = dso_local addrspace(1) externally_initialized global i8 0, align 1
+@s1 = dso_local addrspace(1) externally_initialized global i8 0, align 1
+@s2 = dso_local addrspace(1) externally_initialized global i8 0, align 1
+@c1 = dso_local addrspace(1) externally_initialized global i8 0, align 1
+@c2 = dso_local addrspace(1) externally_initialized global i8 0, align 1
+@l = dso_local addrspace(1) externally_initialized global i8 0, align 1
+
+declare i1 @llvm.nvvm.isspacep.global(ptr nocapture)
+declare i1 @llvm.nvvm.isspacep.shared(ptr nocapture)
+declare i1 @llvm.nvvm.isspacep.const(ptr nocapture)
+declare i1 @llvm.nvvm.isspacep.local(ptr nocapture)
+
+define dso_local void @check_global(ptr nocapture noundef readnone %out, ptr nocapture noundef readnone %generic_data, ptr addrspace(5) %local_data) local_unnamed_addr {
+; CHECK-LABEL: define dso_local void @check_global(
+; CHECK-SAME: ptr nocapture noundef readnone [[OUT:%.*]], ptr nocapture noundef readnone [[GENERIC_DATA:%.*]], ptr addrspace(5) [[LOCAL_DATA:%.*]]) local_unnamed_addr {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[GEN0:%.*]] = tail call i1 @llvm.nvvm.isspacep.global(ptr [[GENERIC_DATA]])
+; CHECK-NEXT:    [[STOREDV:%.*]] = zext i1 [[GEN0]] to i8
+; CHECK-NEXT:    store i8 [[STOREDV]], ptr addrspacecast (ptr addrspace(1) @gen to ptr), align 1
+; CHECK-NEXT:    store i8 1, ptr addrspacecast (ptr addrspace(1) @g1 to ptr), align 1
+; CHECK-NEXT:    store i8 1, ptr addrspacecast (ptr addrspace(1) @g2 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @s1 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @s2 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @c1 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @c2 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @l to ptr), align 1
+; CHECK-NEXT:    ret void
+;
+entry:
+  ; No constant folding for generic pointers of unknown origin.
+  %gen0 = tail call i1 @llvm.nvvm.isspacep.global(ptr %generic_data)
+  %storedv = zext i1 %gen0 to i8
+  store i8 %storedv, ptr addrspacecast (ptr addrspace(1) @gen to ptr), align 1
+
+  %isg1 = tail call i1 @llvm.nvvm.isspacep.global(ptr addrspacecast (ptr addrspace(1) @global_data to ptr))
+  %isg18 = zext i1 %isg1 to i8
+  store i8 %isg18, ptr addrspacecast (ptr addrspace(1) @g1 to ptr), align 1
+
+  %global_data_asc = addrspacecast ptr addrspace(1) @global_data to ptr
+  %isg2 = tail call i1 @llvm.nvvm.isspacep.global(ptr %global_data_asc)
+  %isg28 = zext i1 %isg2 to i8
+  store i8 %isg28, ptr addrspacecast (ptr addrspace(1) @g2 to ptr), align 1
+
+  %iss1 = tail call i1 @llvm.nvvm.isspacep.global(ptr addrspacecast (ptr addrspace(3) @shared_data to ptr))
+  %iss18 = zext i1 %iss1 to i8
+  store i8 %iss18, ptr addrspacecast (ptr addrspace(1) @s1 to ptr), align 1
+
+  %shared_data_asc = addrspacecast ptr addrspace(3) @shared_data to ptr
+  %iss2 = tail call i1 @llvm.nvvm.isspacep.global(ptr %shared_data_asc)
+  %iss28 = zext i1 %iss2 to i8
+  store i8 %iss28, ptr addrspacecast (ptr addrspace(1) @s2 to ptr), align 1
+
+  %isc1 = tail call i1 @llvm.nvvm.isspacep.global(ptr addrspacecast (ptr addrspace(4) @const_data to ptr))
+  %isc18 = zext i1 %isc1 to i8
+  store i8 %isc18, ptr addrspacecast (ptr addrspace(1) @c1 to ptr), align 1
+
+  %const_data_asc = addrspacecast ptr addrspace(4) @const_data to ptr
+  %isc2 = tail call i1 @llvm.nvvm.isspacep.global(ptr %const_data_asc)
+  %isc28 = zext i1 %isc2 to i8
+  store i8 %isc28, ptr addrspacecast (ptr addrspace(1) @c2 to ptr), align 1
+
+  ; Local data can't have a constant address, so we can't have a constant ASC expression
+  ; We can only use an ASC instruction.
+  %local_data_asc = addrspacecast ptr addrspace(5) %local_data to ptr
+  %isl = call i1 @llvm.nvvm.isspacep.global(ptr nonnull %local_data_asc)
+  %isl8 = zext i1 %isl to i8
+  store i8 %isl8, ptr addrspacecast (ptr addrspace(1) @l to ptr), align 1
+
+  ret void
+}
+
+define dso_local void @check_shared(ptr nocapture noundef readnone %out, ptr nocapture noundef readnone %generic_data, ptr addrspace(5) %local_data) local_unnamed_addr {
+; CHECK-LABEL: define dso_local void @check_shared(
+; CHECK-SAME: ptr nocapture noundef readnone [[OUT:%.*]], ptr nocapture noundef readnone [[GENERIC_DATA:%.*]], ptr addrspace(5) [[LOCAL_DATA:%.*]]) local_unnamed_addr {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[GEN0:%.*]] = tail call i1 @llvm.nvvm.isspacep.shared(ptr [[GENERIC_DATA]])
+; CHECK-NEXT:    [[STOREDV:%.*]] = zext i1 [[GEN0]] to i8
+; CHECK-NEXT:    store i8 [[STOREDV]], ptr addrspacecast (ptr addrspace(1) @gen to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @g1 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @g2 to ptr), align 1
+; CHECK-NEXT:    store i8 1, ptr addrspacecast (ptr addrspace(1) @s1 to ptr), align 1
+; CHECK-NEXT:    store i8 1, ptr addrspacecast (ptr addrspace(1) @s2 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @c1 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @c2 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @l to ptr), align 1
+; CHECK-NEXT:    ret void
+;
+entry:
+  ; No constant folding for generic pointers of unknown origin.
+  %gen0 = tail call i1 @llvm.nvvm.isspacep.shared(ptr %generic_data)
+  %storedv = zext i1 %gen0 to i8
+  store i8 %storedv, ptr addrspacecast (ptr addrspace(1) @gen to ptr), align 1
+
+  %isg1 = tail call i1 @llvm.nvvm.isspacep.shared(ptr addrspacecast (ptr addrspace(1) @global_data to ptr))
+  %isg18 = zext i1 %isg1 to i8
+  store i8 %isg18, ptr addrspacecast (ptr addrspace(1) @g1 to ptr), align 1
+
+  %global_data_asc = addrspacecast ptr addrspace(1) @global_data to ptr
+  %isg2 = tail call i1 @llvm.nvvm.isspacep.shared(ptr %global_data_asc)
+  %isg28 = zext i1 %isg2 to i8
+  store i8 %isg28, ptr addrspacecast (ptr addrspace(1) @g2 to ptr), align 1
+
+  %iss1 = tail call i1 @llvm.nvvm.isspacep.shared(ptr addrspacecast (ptr addrspace(3) @shared_data to ptr))
+  %iss18 = zext i1 %iss1 to i8
+  store i8 %iss18, ptr addrspacecast (ptr addrspace(1) @s1 to ptr), align 1
+
+  %shared_data_asc = addrspacecast ptr addrspace(3) @shared_data to ptr
+  %iss2 = tail call i1 @llvm.nvvm.isspacep.shared(ptr %shared_data_asc)
+  %iss28 = zext i1 %iss2 to i8
+  store i8 %iss28, ptr addrspacecast (ptr addrspace(1) @s2 to ptr), align 1
+
+  %isc1 = tail call i1 @llvm.nvvm.isspacep.shared(ptr addrspacecast (ptr addrspace(4) @const_data to ptr))
+  %isc18 = zext i1 %isc1 to i8
+  store i8 %isc18, ptr addrspacecast (ptr addrspace(1) @c1 to ptr), align 1
+
+  %const_data_asc = addrspacecast ptr addrspace(4) @const_data to ptr
+  %isc2 = tail call i1 @llvm.nvvm.isspacep.shared(ptr %const_data_asc)
+  %isc28 = zext i1 %isc2 to i8
+  store i8 %isc28, ptr addrspacecast (ptr addrspace(1) @c2 to ptr), align 1
+
+  ; Local data can't have a constant address, so we can't have a constant ASC expression
+  ; We can only use an ASC instruction.
+  %local_data_asc = addrspacecast ptr addrspace(5) %local_data to ptr
+  %isl = call i1 @llvm.nvvm.isspacep.shared(ptr nonnull %local_data_asc)
+  %isl8 = zext i1 %isl to i8
+  store i8 %isl8, ptr addrspacecast (ptr addrspace(1) @l to ptr), align 1
+
+  ret void
+}
+
+define dso_local void @check_const(ptr nocapture noundef readnone %out, ptr nocapture noundef readnone %generic_data, ptr addrspace(5) %local_data) local_unnamed_addr {
+; CHECK-LABEL: define dso_local void @check_const(
+; CHECK-SAME: ptr nocapture noundef readnone [[OUT:%.*]], ptr nocapture noundef readnone [[GENERIC_DATA:%.*]], ptr addrspace(5) [[LOCAL_DATA:%.*]]) local_unnamed_addr {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[GEN0:%.*]] = tail call i1 @llvm.nvvm.isspacep.const(ptr [[GENERIC_DATA]])
+; CHECK-NEXT:    [[STOREDV:%.*]] = zext i1 [[GEN0]] to i8
+; CHECK-NEXT:    store i8 [[STOREDV]], ptr addrspacecast (ptr addrspace(1) @gen to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @g1 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @g2 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @s1 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @s2 to ptr), align 1
+; CHECK-NEXT:    store i8 1, ptr addrspacecast (ptr addrspace(1) @c1 to ptr), align 1
+; CHECK-NEXT:    store i8 1, ptr addrspacecast (ptr addrspace(1) @c2 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @l to ptr), align 1
+; CHECK-NEXT:    ret void
+;
+entry:
+  ; No constant folding for generic pointers of unknown origin.
+  %gen0 = tail call i1 @llvm.nvvm.isspacep.const(ptr %generic_data)
+  %storedv = zext i1 %gen0 to i8
+  store i8 %storedv, ptr addrspacecast (ptr addrspace(1) @gen to ptr), align 1
+
+  %isg1 = tail call i1 @llvm.nvvm.isspacep.const(ptr addrspacecast (ptr addrspace(1) @global_data to ptr))
+  %isg18 = zext i1 %isg1 to i8
+  store i8 %isg18, ptr addrspacecast (ptr addrspace(1) @g1 to ptr), align 1
+
+  %global_data_asc = addrspacecast ptr addrspace(1) @global_data to ptr
+  %isg2 = tail call i1 @llvm.nvvm.isspacep.const(ptr %global_data_asc)
+  %isg28 = zext i1 %isg2 to i8
+  store i8 %isg28, ptr addrspacecast (ptr addrspace(1) @g2 to ptr), align 1
+
+  %iss1 = tail call i1 @llvm.nvvm.isspacep.const(ptr addrspacecast (ptr addrspace(3) @shared_data to ptr))
+  %iss18 = zext i1 %iss1 to i8
+  store i8 %iss18, ptr addrspacecast (ptr addrspace(1) @s1 to ptr), align 1
+
+  %shared_data_asc = addrspacecast ptr addrspace(3) @shared_data to ptr
+  %iss2 = tail call i1 @llvm.nvvm.isspacep.const(ptr %shared_data_asc)
+  %iss28 = zext i1 %iss2 to i8
+  store i8 %iss28, ptr addrspacecast (ptr addrspace(1) @s2 to ptr), align 1
+
+  %isc1 = tail call i1 @llvm.nvvm.isspacep.const(ptr addrspacecast (ptr addrspace(4) @const_data to ptr))
+  %isc18 = zext i1 %isc1 to i8
+  store i8 %isc18, ptr addrspacecast (ptr addrspace(1) @c1 to ptr), align 1
+
+  %const_data_asc = addrspacecast ptr addrspace(4) @const_data to ptr
+  %isc2 = tail call i1 @llvm.nvvm.isspacep.const(ptr %const_data_asc)
+  %isc28 = zext i1 %isc2 to i8
+  store i8 %isc28, ptr addrspacecast (ptr addrspace(1) @c2 to ptr), align 1
+
+  ; Local data can't have a constant address, so we can't have a constant ASC expression
+  ; We can only use an ASC instruction.
+  %local_data_asc = addrspacecast ptr addrspace(5) %local_data to ptr
+  %isl = call i1 @llvm.nvvm.isspacep.const(ptr nonnull %local_data_asc)
+  %isl8 = zext i1 %isl to i8
+  store i8 %isl8, ptr addrspacecast (ptr addrspace(1) @l to ptr), align 1
+
+  ret void
+}
+
+define dso_local void @check_local(ptr nocapture noundef readnone %out, ptr nocapture noundef readnone %generic_data, ptr addrspace(5) %local_data) local_unnamed_addr {
+; CHECK-LABEL: define dso_local void @check_local(
+; CHECK-SAME: ptr nocapture noundef readnone [[OUT:%.*]], ptr nocapture noundef readnone [[GENERIC_DATA:%.*]], ptr addrspace(5) [[LOCAL_DATA:%.*]]) local_unnamed_addr {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[GEN0:%.*]] = tail call i1 @llvm.nvvm.isspacep.local(ptr [[GENERIC_DATA]])
+; CHECK-NEXT:    [[STOREDV:%.*]] = zext i1 [[GEN0]] to i8
+; CHECK-NEXT:    store i8 [[STOREDV]], ptr addrspacecast (ptr addrspace(1) @gen to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @g1 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @g2 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @s1 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @s2 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @c1 to ptr), align 1
+; CHECK-NEXT:    store i8 0, ptr addrspacecast (ptr addrspace(1) @c2 to ptr), align 1
+; CHECK-NEXT:    store i8 1, ptr addrspacecast (ptr addrspace(1) @l to ptr), align 1
+; CHECK-NEXT:    ret void
+;
+entry:
+  ; No constant folding for generic pointers of unknown origin.
+  %gen0 = tail call i1 @llvm.nvvm.isspacep.local(ptr %generic_data)
+  %storedv = zext i1 %gen0 to i8
+  store i8 %storedv, ptr addrspacecast (ptr addrspace(1) @gen to ptr), align 1
+
+  %isg1 = tail call i1 @llvm.nvvm.isspacep.local(ptr addrspacecast (ptr addrspace(1) @global_data to ptr))
+  %isg18 = zext i1 %isg1 to i8
+  store i8 %isg18, ptr addrspacecast (ptr addrspace(1) @g1 to ptr), align 1
+
+  %global_data_asc = addrspacecast ptr addrspace(1) @global_data to ptr
+  %isg2 = tail call i1 @llvm.nvvm.isspacep.local(ptr %global_data_asc)
+  %isg28 = zext i1 %isg2 to i8
+  store i8 %isg28, ptr addrspacecast (ptr addrspace(1) @g2 to ptr), align 1
+
+  %iss1 = tail call i1 @llvm.nvvm.isspacep.local(ptr addrspacecast (ptr addrspace(3) @shared_data to ptr))
+  %iss18 = zext i1 %iss1 to i8
+  store i8 %iss18, ptr addrspacecast (ptr addrspace(1) @s1 to ptr), align 1
+
+  %shared_data_asc = addrspacecast ptr addrspace(3) @shared_data to ptr
+  %iss2 = tail call i1 @llvm.nvvm.isspacep.local(ptr %shared_data_asc)
+  %iss28 = zext i1 %iss2 to i8
+  store i8 %iss28, ptr addrspacecast (ptr addrspace(1) @s2 to ptr), align 1
+
+  %isc1 = tail call i1 @llvm.nvvm.isspacep.local(ptr addrspacecast (ptr addrspace(4) @const_data to ptr))
+  %isc18 = zext i1 %isc1 to i8
+  store i8 %isc18, ptr addrspacecast (ptr addrspace(1) @c1 to ptr), align 1
+
+  %const_data_asc = addrspacecast ptr addrspace(4) @const_data to ptr
+  %isc2 = tail call i1 @llvm.nvvm.isspacep.local(ptr %const_data_asc)
+  %isc28 = zext i1 %isc2 to i8
+  store i8 %isc28, ptr addrspacecast (ptr addrspace(1) @c2 to ptr), align 1
+
+  ; Local data can't have a constant address, so we can't have a constant ASC expression
+  ; We can only use an ASC instruction.
+  %local_data_asc = addrspacecast ptr addrspace(5) %local_data to ptr
+  %isl = call i1 @llvm.nvvm.isspacep.local(ptr nonnull %local_data_asc)
+  %isl8 = zext i1 %isl to i8
+  store i8 %isl8, ptr addrspacecast (ptr addrspace(1) @l to ptr), align 1
+
+  ret void
+}
+

@Artem-B Artem-B force-pushed the instcombin-as-checks branch from d2918e7 to 2e9d72d Compare October 24, 2024 19:06
@Artem-B
Member Author

Artem-B commented Oct 25, 2024

@nikic Are you OK with these changes?

Member

@AlexMaclean AlexMaclean left a comment

FWIW, this looks good to me barring some minor nits.

@goldsteinn
Contributor

AFAICT this is the only target-specific handling in InstSimplify, which, at least as I understand it, we attempt to keep as target-independent as possible, in a similar way to InstCombine, because it's for generalized middle-end canonicalization.

If there is a need for this sort of change, I think it should be done more similarly to how we do targetInstCombineIntrinsic with InstCombine, instead of adding all the target intrinsics directly to the InstructionSimplify file.

@Artem-B
Member Author

Artem-B commented Oct 28, 2024

I think it should be done more similarly to how we do targetInstCombineIntrinsic

This, indeed, sounds like a better place to handle these intrinsics. Will do.

@Artem-B
Member Author

Artem-B commented Oct 29, 2024

I think it should be done more similarly to how we do targetInstCombineIntrinsic

This, indeed, sounds like a better place to handle these intrinsics. Will do.

Done.

@Artem-B Artem-B force-pushed the instcombin-as-checks branch from 05e59de to 48c0ee3 Compare October 30, 2024 21:46
@Artem-B Artem-B merged commit 1cecc58 into llvm:main Oct 30, 2024
5 of 7 checks passed
@llvm-ci
Collaborator

llvm-ci commented Oct 30, 2024

LLVM Buildbot has detected a new failure on builder mlir-nvidia running on mlir-nvidia while building llvm at step 6 "test-build-check-mlir-build-only-check-mlir".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/138/builds/5665

Here is the relevant piece of the build log for the reference
Step 6 (test-build-check-mlir-build-only-check-mlir) failure: test (failure)
******************** TEST 'MLIR :: Integration/GPU/CUDA/all-reduce-and.mlir' FAILED ********************
Exit Code: 2

Command Output (stdout):
--
# RUN: at line 1
/vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt /vol/worker/mlir-nvidia/mlir-nvidia/llvm.src/mlir/test/Integration/GPU/CUDA/all-reduce-and.mlir  | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -gpu-lower-to-nvvm-pipeline  | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-cpu-runner    --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/lib/libmlir_cuda_runtime.so    --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/lib/libmlir_runner_utils.so    --entry-point-result=void  | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/FileCheck /vol/worker/mlir-nvidia/mlir-nvidia/llvm.src/mlir/test/Integration/GPU/CUDA/all-reduce-and.mlir
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt /vol/worker/mlir-nvidia/mlir-nvidia/llvm.src/mlir/test/Integration/GPU/CUDA/all-reduce-and.mlir
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -gpu-lower-to-nvvm-pipeline
# .---command stderr------------
# | mlir-opt: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.src/llvm/include/llvm/IR/InstrTypes.h:1327: llvm::Value *llvm::CallBase::getArgOperand(unsigned int) const: Assertion `i < arg_size() && "Out of bounds!"' failed.
# | PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
# | Stack dump:
# | 0.	Program arguments: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -gpu-lower-to-nvvm-pipeline
# | 1.	Running pass "function<eager-inv>(PromotePass,InstCombinePass<max-iterations=1;no-verify-fixpoint>,SimplifyCFGPass<bonus-inst-threshold=1;no-forward-switch-cond;switch-range-to-icmp;no-switch-to-lookup;keep-loops;no-hoist-common-insts;no-hoist-loads-stores-with-cond-faulting;no-sink-common-insts;speculate-blocks;simplify-cond-branch;no-speculate-unpredictables>)" on module "LLVMDialectModule"
# | 2.	Running pass "InstCombinePass<max-iterations=1;no-verify-fixpoint>" on function "main_kernel"
# | Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
# | 0  libLLVMSupport.so.20.0git              0x00007c56f48329a7 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 39
# | 1  libLLVMSupport.so.20.0git              0x00007c56f483053e llvm::sys::RunSignalHandlers() + 238
# | 2  libLLVMSupport.so.20.0git              0x00007c56f483307a
# | 3  libc.so.6                              0x00007c56f40fb520
# | 4  libc.so.6                              0x00007c56f414f9fc pthread_kill + 300
# | 5  libc.so.6                              0x00007c56f40fb476 raise + 22
# | 6  libc.so.6                              0x00007c56f40e17f3 abort + 211
# | 7  libc.so.6                              0x00007c56f40e171b
# | 8  libc.so.6                              0x00007c56f40f2e96
# | 9  libLLVMNVPTXCodeGen.so.20.0git         0x00007c56fb8077d4
# | 10 libLLVMNVPTXCodeGen.so.20.0git         0x00007c56fb837541 llvm::NVPTXTTIImpl::instCombineIntrinsic(llvm::InstCombiner&, llvm::IntrinsicInst&) const + 33
# | 11 libLLVMInstCombine.so.20.0git          0x00007c56f2046f46
# | 12 libLLVMInstCombine.so.20.0git          0x00007c56f20bd6c1
# | 13 libLLVMInstCombine.so.20.0git          0x00007c56f2062c04
# | 14 libLLVMInstCombine.so.20.0git          0x00007c56f20664c1
# | 15 libLLVMInstCombine.so.20.0git          0x00007c56f20657fd llvm::InstCombinePass::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) + 845
# | 16 libLLVMPasses.so.20.0git               0x00007c56fb40fc8d
# | 17 libLLVMCore.so.20.0git                 0x00007c56f589cdfa llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) + 442
# | 18 libLLVMNVPTXCodeGen.so.20.0git         0x00007c56fb8284bd
# | 19 libLLVMCore.so.20.0git                 0x00007c56f58a1822 llvm::ModuleToFunctionPassAdaptor::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) + 482
# | 20 libLLVMNVPTXCodeGen.so.20.0git         0x00007c56fb82826d
# | 21 libLLVMCore.so.20.0git                 0x00007c56f589baba llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) + 442
# | 22 libMLIRExecutionEngineUtils.so.20.0git 0x00007c56fb4b0677
# | 23 libMLIRTargetLLVM.so.20.0git           0x00007c56fb64b5ba mlir::LLVM::ModuleToObject::optimizeModule(llvm::Module&, int) + 314
# | 24 libMLIRTargetLLVM.so.20.0git           0x00007c56fb64c338 mlir::LLVM::ModuleToObject::run() + 488
# | 25 libMLIRNVVMTarget.so.20.0git           0x00007c56fb889101
# | 26 libMLIRGPUDialect.so.20.0git           0x00007c56f794e333 mlir::gpu::TargetAttrInterface::serializeToObject(mlir::Operation*, mlir::gpu::TargetOptions const&) const + 19
# | 27 libMLIRGPUTransforms.so.20.0git        0x00007c56fba2f41e mlir::gpu::transformGpuModulesToBinaries(mlir::Operation*, mlir::gpu::OffloadingLLVMTranslationAttrInterface, mlir::gpu::TargetOptions const&) + 702
# | 28 libMLIRGPUTransforms.so.20.0git        0x00007c56fba30f53
# | 29 libMLIRPass.so.20.0git                 0x00007c56f4d07cd4 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) + 676
# | 30 libMLIRPass.so.20.0git                 0x00007c56f4d08471 mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) + 337
# | 31 libMLIRPass.so.20.0git                 0x00007c56f4d0ab2b mlir::PassManager::run(mlir::Operation*) + 923
# | 32 libMLIROptLib.so.20.0git               0x00007c56fc939237
...

@llvm-ci
Collaborator

llvm-ci commented Oct 31, 2024

LLVM Buildbot has detected a new failure on builder premerge-monolithic-linux running on premerge-linux-1 while building llvm at step 6 "build-unified-tree".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/153/builds/13226

Here is the relevant piece of the build log for the reference
Step 6 (build-unified-tree) failure: build (failure)
...
28.151 [2120/58/991] Generating obj.libclc.dir/nvptx--/generic/lib/atomic/atomic_sub.cl.bc
28.172 [2119/58/992] Generating obj.libclc.dir/tahiti-amdgcn--/generic/lib/async/prefetch.cl.bc
28.206 [2118/58/993] Generating obj.libclc.dir/tahiti-amdgcn--/generic/lib/atomic/atomic_dec.cl.bc
28.215 [2117/58/994] Generating obj.libclc.dir/tahiti-amdgcn--/generic/lib/atomic/atomic_inc.cl.bc
28.223 [2116/58/995] Generating obj.libclc.dir/tahiti-amdgcn--/generic/lib/atomic/atomic_cmpxchg.cl.bc
28.266 [2115/58/996] Generating obj.libclc.dir/tahiti-amdgcn--/generic/lib/atomic/atomic_sub.cl.bc
28.269 [2114/58/997] Generating obj.libclc.dir/tahiti-amdgcn--/generic/lib/atomic/atomic_or.cl.bc
28.278 [2113/58/998] Linking CXX static library lib/libFortranRuntime.a
28.308 [2112/58/999] Generating obj.libclc.dir/tahiti-amdgcn--/generic/lib/cl_khr_global_int32_base_atomics/atom_dec.cl.bc
28.328 [2111/58/1000] Generating obj.libclc.dir/nvptx64--nvidiacl/ptx-nvidiacl/lib/workitem/get_local_size.cl.bc
FAILED: tools/libclc/obj.libclc.dir/nvptx64--nvidiacl/ptx-nvidiacl/lib/workitem/get_local_size.cl.bc /build/buildbot/premerge-monolithic-linux/build/tools/libclc/obj.libclc.dir/nvptx64--nvidiacl/ptx-nvidiacl/lib/workitem/get_local_size.cl.bc 
cd /build/buildbot/premerge-monolithic-linux/build/tools/libclc && /build/buildbot/premerge-monolithic-linux/build/bin/clang-20 -target nvptx64--nvidiacl -c -fno-builtin -nostdlib -D__CLC_INTERNAL -DCLC_NVPTX64 -I/build/buildbot/premerge-monolithic-linux/llvm-project/libclc/clc/include -Wno-bitwise-conditional-parentheses -I/build/buildbot/premerge-monolithic-linux/llvm-project/libclc/generic/include -I/build/buildbot/premerge-monolithic-linux/llvm-project/libclc/./ptx-nvidiacl/lib/workitem -MD -MF /build/buildbot/premerge-monolithic-linux/build/tools/libclc/obj.libclc.dir/nvptx64--nvidiacl/ptx-nvidiacl/lib/workitem/get_local_size.cl.bc.d -MT /build/buildbot/premerge-monolithic-linux/build/tools/libclc/obj.libclc.dir/nvptx64--nvidiacl/ptx-nvidiacl/lib/workitem/get_local_size.cl.bc -cl-no-stdinc -emit-llvm -o /build/buildbot/premerge-monolithic-linux/build/tools/libclc/obj.libclc.dir/nvptx64--nvidiacl/ptx-nvidiacl/lib/workitem/get_local_size.cl.bc -x cl /build/buildbot/premerge-monolithic-linux/llvm-project/libclc/./ptx-nvidiacl/lib/workitem/get_local_size.cl && /etc/cmake/bin/cmake -E cmake_transform_depfile Ninja gccdepfile /build/buildbot/premerge-monolithic-linux/llvm-project/llvm /build/buildbot/premerge-monolithic-linux/llvm-project/libclc /build/buildbot/premerge-monolithic-linux/build /build/buildbot/premerge-monolithic-linux/build/tools/libclc /build/buildbot/premerge-monolithic-linux/build/tools/libclc/obj.libclc.dir/nvptx64--nvidiacl/ptx-nvidiacl/lib/workitem/get_local_size.cl.bc.d /build/buildbot/premerge-monolithic-linux/build/CMakeFiles/d/9c5f386ad1084de712311c38bb855ff7d8d576eeda729619eea13a1df803c544.d
clang-20: /build/buildbot/premerge-monolithic-linux/llvm-project/llvm/include/llvm/IR/InstrTypes.h:1327: Value *llvm::CallBase::getArgOperand(unsigned int) const: Assertion `i < arg_size() && "Out of bounds!"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: /build/buildbot/premerge-monolithic-linux/build/bin/clang-20 -target nvptx64--nvidiacl -c -fno-builtin -nostdlib -D__CLC_INTERNAL -DCLC_NVPTX64 -I/build/buildbot/premerge-monolithic-linux/llvm-project/libclc/clc/include -Wno-bitwise-conditional-parentheses -I/build/buildbot/premerge-monolithic-linux/llvm-project/libclc/generic/include -I/build/buildbot/premerge-monolithic-linux/llvm-project/libclc/./ptx-nvidiacl/lib/workitem -MD -MF /build/buildbot/premerge-monolithic-linux/build/tools/libclc/obj.libclc.dir/nvptx64--nvidiacl/ptx-nvidiacl/lib/workitem/get_local_size.cl.bc.d -MT /build/buildbot/premerge-monolithic-linux/build/tools/libclc/obj.libclc.dir/nvptx64--nvidiacl/ptx-nvidiacl/lib/workitem/get_local_size.cl.bc -cl-no-stdinc -emit-llvm -o /build/buildbot/premerge-monolithic-linux/build/tools/libclc/obj.libclc.dir/nvptx64--nvidiacl/ptx-nvidiacl/lib/workitem/get_local_size.cl.bc -x cl /build/buildbot/premerge-monolithic-linux/llvm-project/libclc/./ptx-nvidiacl/lib/workitem/get_local_size.cl
1.	<eof> parser at end of file
2.	Optimizer
3.	Running pass "function<eager-inv>(mem2reg,instcombine<max-iterations=1;no-verify-fixpoint>,simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;switch-range-to-icmp;no-switch-to-lookup;keep-loops;no-hoist-common-insts;no-hoist-loads-stores-with-cond-faulting;no-sink-common-insts;speculate-blocks;simplify-cond-branch;no-speculate-unpredictables>)" on module "/build/buildbot/premerge-monolithic-linux/llvm-project/libclc/./ptx-nvidiacl/lib/workitem/get_local_size.cl"
4.	Running pass "instcombine<max-iterations=1;no-verify-fixpoint>" on function "_Z14get_local_sizej"
 #0 0x00005a1199026848 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /build/buildbot/premerge-monolithic-linux/llvm-project/llvm/lib/Support/Unix/Signals.inc:723:13
 #1 0x00005a119902437e llvm::sys::RunSignalHandlers() /build/buildbot/premerge-monolithic-linux/llvm-project/llvm/lib/Support/Signals.cpp:106:18
 #2 0x00005a1198f8e5e6 HandleCrash /build/buildbot/premerge-monolithic-linux/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:73:5
 #3 0x00005a1198f8e5e6 CrashRecoverySignalHandler(int) /build/buildbot/premerge-monolithic-linux/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:390:51
 #4 0x00007f7377a4a520 (/lib/x86_64-linux-gnu/libc.so.6+0x42520)
 #5 0x00007f7377a9e9fc pthread_kill (/lib/x86_64-linux-gnu/libc.so.6+0x969fc)
 #6 0x00007f7377a4a476 gsignal (/lib/x86_64-linux-gnu/libc.so.6+0x42476)
 #7 0x00007f7377a307f3 abort (/lib/x86_64-linux-gnu/libc.so.6+0x287f3)
 #8 0x00007f7377a3071b (/lib/x86_64-linux-gnu/libc.so.6+0x2871b)
 #9 0x00007f7377a41e96 (/lib/x86_64-linux-gnu/libc.so.6+0x39e96)
#10 0x00005a119689c443 getOperand /build/buildbot/premerge-monolithic-linux/llvm-project/llvm/include/llvm/IR/InstrTypes.h:2352:1
#11 0x00005a119689c443 llvm::CallBase::getArgOperand(unsigned int) const /build/buildbot/premerge-monolithic-linux/llvm-project/llvm/include/llvm/IR/InstrTypes.h:1328:12
#12 0x00005a1197773f21 operator llvm::Value * /build/buildbot/premerge-monolithic-linux/llvm-project/llvm/include/llvm/IR/Use.h:65:37
#13 0x00005a1197773f21 getCalledOperand /build/buildbot/premerge-monolithic-linux/llvm-project/llvm/include/llvm/IR/InstrTypes.h:1374:44
#14 0x00005a1197773f21 getCalledFunction /build/buildbot/premerge-monolithic-linux/llvm-project/llvm/include/llvm/IR/InstrTypes.h:1382:46
#15 0x00005a1197773f21 getIntrinsicID /build/buildbot/premerge-monolithic-linux/llvm-project/llvm/include/llvm/IR/IntrinsicInst.h:56:12
#16 0x00005a1197773f21 handleSpaceCheckIntrinsics /build/buildbot/premerge-monolithic-linux/llvm-project/llvm/lib/Target/NVPTX/NVPTXTargetTransformInfo.cpp:448:25
#17 0x00005a1197773f21 llvm::NVPTXTTIImpl::instCombineIntrinsic(llvm::InstCombiner&, llvm::IntrinsicInst&) const /build/buildbot/premerge-monolithic-linux/llvm-project/llvm/lib/Target/NVPTX/NVPTXTargetTransformInfo.cpp:473:40
#18 0x00005a1198c0f596 llvm::InstCombiner::targetInstCombineIntrinsic(llvm::IntrinsicInst&) /build/buildbot/premerge-monolithic-linux/llvm-project/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp:161:1
#19 0x00005a1198c74eb8 _M_is_engaged /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/optional:433:58
#20 0x00005a1198c74eb8 operator bool /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/optional:942:22
#21 0x00005a1198c74eb8 llvm::InstCombinerImpl::visitCallInst(llvm::CallInst&) /build/buildbot/premerge-monolithic-linux/llvm-project/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:3626:9
#22 0x00005a1198c28105 llvm::InstCombinerImpl::run() /build/buildbot/premerge-monolithic-linux/llvm-project/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp:5191:22
#23 0x00005a1198c2b85d combineInstructionsOverFunction(llvm::Function&, llvm::InstructionWorklist&, llvm::AAResults*, llvm::AssumptionCache&, llvm::TargetLibraryInfo&, llvm::TargetTransformInfo&, llvm::DominatorTree&, llvm::OptimizationRemarkEmitter&, llvm::BlockFrequencyInfo*, llvm::BranchProbabilityInfo*, llvm::ProfileSummaryInfo*, llvm::InstCombineOptions const&) /build/buildbot/premerge-monolithic-linux/llvm-project/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp:5509:9
#24 0x00005a1198c2ad53 llvm::InstCombinePass::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) /build/buildbot/premerge-monolithic-linux/llvm-project/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp:5564:8
#25 0x00005a1199889e0d llvm::detail::PassModel<llvm::Function, llvm::InstCombinePass, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) /build/buildbot/premerge-monolithic-linux/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:90:5
#26 0x00005a1198a5237a llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) /build/buildbot/premerge-monolithic-linux/llvm-project/llvm/include/llvm/IR/PassManagerImpl.h:85:8
#27 0x00005a1196b97d3d llvm::detail::PassModel<llvm::Function, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) /build/buildbot/premerge-monolithic-linux/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:90:5
#28 0x00005a1198a561a7 llvm::ModuleToFunctionPassAdaptor::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /build/buildbot/premerge-monolithic-linux/llvm-project/llvm/lib/IR/PassManager.cpp:129:23

Artem-B added a commit that referenced this pull request Oct 31, 2024
The change improves the code in general and, as a side effect, avoids
crashing on impossible address space casts guarded
by `__isGlobal/__isShared`, which partially fixes
#112760

It's still possible to trigger the issue by using explicit AS casts w/o
AS checks, but LLVM should no longer crash on valid code.

This is #112964 plus a small fix for the crash on an unintended argument
access, which was the root cause of the revert of the earlier version of the patch.
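The buildbot failures above all hit the same assertion, `i < arg_size() && "Out of bounds!"`: the handler read an argument operand before confirming it was looking at one of the intrinsics it handles. A hypothetical sketch of that bug pattern and the guard that fixes it (Python stand-ins for the LLVM classes; the names below are illustrative, not the actual API):

```python
# Hypothetical sketch of the out-of-bounds argument access and its fix.
SPACE_CHECK_INTRINSICS = {
    "llvm.nvvm.isspacep.global",
    "llvm.nvvm.isspacep.shared",
}

class Intrinsic:
    """Stand-in for an intrinsic call with a list of argument operands."""
    def __init__(self, name, args):
        self.name = name
        self.args = args

    def arg_operand(self, i):
        # Mirrors the LLVM assertion that fired in the buildbot logs.
        assert i < len(self.args), "Out of bounds!"
        return self.args[i]

def handle_space_check(intr):
    # Fixed version: bail out early for intrinsics we do not handle,
    # instead of unconditionally reading arg_operand(0). A zero-argument
    # intrinsic reaching the old code tripped the assertion above.
    if intr.name not in SPACE_CHECK_INTRINSICS:
        return None
    return intr.arg_operand(0)
```

The guard costs nothing for the common case and keeps unrelated intrinsics, including zero-argument ones, from ever reaching the argument access.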
smallp-o-p pushed a commit to smallp-o-p/llvm-project that referenced this pull request Nov 3, 2024
The change improves the code in general and, as a side effect, avoids crashing
on impossible address space casts guarded by `__isGlobal/__isShared`, which
partially fixes llvm#112760
It's still possible to trigger the issue by using explicit AS casts w/o
AS checks, but LLVM should no longer crash on valid code.
smallp-o-p pushed a commit to smallp-o-p/llvm-project that referenced this pull request Nov 3, 2024
The change improves the code in general and, as a side effect, avoids
crashing on impossible address space casts guarded
by `__isGlobal/__isShared`, which partially fixes
llvm#112760

It's still possible to trigger the issue by using explicit AS casts w/o
AS checks, but LLVM should no longer crash on valid code.

This is llvm#112964 plus a small fix for the crash on an unintended argument
access, which was the root cause of the revert of the earlier version of the patch.
NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this pull request Nov 4, 2024
The change improves the code in general and, as a side effect, avoids crashing
on impossible address space casts guarded by `__isGlobal/__isShared`, which
partially fixes llvm#112760
It's still possible to trigger the issue by using explicit AS casts w/o
AS checks, but LLVM should no longer crash on valid code.
NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this pull request Nov 4, 2024
The change improves the code in general and, as a side effect, avoids
crashing on impossible address space casts guarded
by `__isGlobal/__isShared`, which partially fixes
llvm#112760

It's still possible to trigger the issue by using explicit AS casts w/o
AS checks, but LLVM should no longer crash on valid code.

This is llvm#112964 plus a small fix for the crash on an unintended argument
access, which was the root cause of the revert of the earlier version of the patch.
Labels
backend:NVPTX llvm:analysis Includes value tracking, cost tables and constant folding llvm:support llvm:transforms
Successfully merging this pull request may close these issues.

[CUDA, NVPTX] LLVM crash with "Cannot cast between two non-generic address spaces"