[MLIR,Python] Support converting boolean numpy arrays to and from mlir attributes #113064

kasper0406 · 2024-10-19T20:43:08Z

Currently it is unsupported to:

Convert a MlirAttribute with type i1 to a numpy array
Convert a boolean numpy array to a MlirAttribute

Currently the entire Python application violently crashes with a quite poor error message pybind/pybind11#3336

The complication handling these conversions, is that MlirAttribute represent booleans as a bit-packed i1 type, whereas numpy represents booleans as a byte array with 8 bit used per boolean.

This PR proposes the following approach:

When converting a i1 typed MlirAttribute to a numpy array, we can not directly use the underlying raw data backing the MlirAttribute as a buffer to Python, as done for other types. Instead, a copy of the data is generated using numpy's unpackbits function, and the result is send back to Python.
When constructing a MlirAttribute from a numpy array, first the python data is read as a uint8_t to get it converted to the endianess used internally in mlir. Then the booleans are bitpacked using numpy's bitpack function, and the bitpacked array is saved as the MlirAttribute representation.

Please note that I am not sure if this approach is the desired solution. I'd appreciate any feedback.

llvmbot · 2024-10-19T20:43:44Z

@llvm/pr-subscribers-mlir

Author: Kasper Nielsen (kasper0406)

Changes

Currently it is unsupported to:

Convert a MlirAttribute with type i1 to a numpy array
Convert a boolean numpy array to a MlirAttribute

Currently the entire Python application violently crashes with a quite poor error message pybind/pybind11#3336

The complication handling these conversions, is that MlirAttribute represent booleans as a bit-packed i1 type, whereas numpy represents booleans as a byte array with 8 bit used per boolean.

This PR proposes the following approach:

When converting a i1 typed MlirAttribute to a numpy array, we can not directly use the underlying raw data backing the MlirAttribute as a buffer to Python, as done for other types. Instead, a copy of the data is generated and stored inside PyDenseElementsAttribute when the conversion is requested. The duplicate data is kept around as long as the PyDenseElementsAttribute element is, to ensure the pointer remains valid from Python.
When constructing a MlirAttribute from a numpy array, first the python data is read as a uint8_t to get it converted to the endianess used internally in mlir. Then the booleans are bitpacked, and the bitpacked array is saved as the MlirAttribute representation.

Please note that I am not sure if this approach is the desired solution. I'd appreciate any feedback.

Full diff: https://github.com/llvm/llvm-project/pull/113064.diff

2 Files Affected:

(modified) mlir/lib/Bindings/Python/IRAttributes.cpp (+159-96)
(modified) mlir/test/python/ir/array_attributes.py (+64)

diff --git a/mlir/lib/Bindings/Python/IRAttributes.cpp b/mlir/lib/Bindings/Python/IRAttributes.cpp
index ead81a76c0538d..043b7ed5867e8b 100644
--- a/mlir/lib/Bindings/Python/IRAttributes.cpp
+++ b/mlir/lib/Bindings/Python/IRAttributes.cpp
@@ -757,103 +757,9 @@ class PyDenseElementsAttribute
       throw py::error_already_set();
     }
     auto freeBuffer = llvm::make_scope_exit([&]() { PyBuffer_Release(&view); });
-    SmallVector<int64_t> shape;
-    if (explicitShape) {
-      shape.append(explicitShape->begin(), explicitShape->end());
-    } else {
-      shape.append(view.shape, view.shape + view.ndim);
-    }
 
-    MlirAttribute encodingAttr = mlirAttributeGetNull();
     MlirContext context = contextWrapper->get();
-
-    // Detect format codes that are suitable for bulk loading. This includes
-    // all byte aligned integer and floating point types up to 8 bytes.
-    // Notably, this excludes, bool (which needs to be bit-packed) and
-    // other exotics which do not have a direct representation in the buffer
-    // protocol (i.e. complex, etc).
-    std::optional<MlirType> bulkLoadElementType;
-    if (explicitType) {
-      bulkLoadElementType = *explicitType;
-    } else {
-      std::string_view format(view.format);
-      if (format == "f") {
-        // f32
-        assert(view.itemsize == 4 && "mismatched array itemsize");
-        bulkLoadElementType = mlirF32TypeGet(context);
-      } else if (format == "d") {
-        // f64
-        assert(view.itemsize == 8 && "mismatched array itemsize");
-        bulkLoadElementType = mlirF64TypeGet(context);
-      } else if (format == "e") {
-        // f16
-        assert(view.itemsize == 2 && "mismatched array itemsize");
-        bulkLoadElementType = mlirF16TypeGet(context);
-      } else if (isSignedIntegerFormat(format)) {
-        if (view.itemsize == 4) {
-          // i32
-          bulkLoadElementType = signless
-                                    ? mlirIntegerTypeGet(context, 32)
-                                    : mlirIntegerTypeSignedGet(context, 32);
-        } else if (view.itemsize == 8) {
-          // i64
-          bulkLoadElementType = signless
-                                    ? mlirIntegerTypeGet(context, 64)
-                                    : mlirIntegerTypeSignedGet(context, 64);
-        } else if (view.itemsize == 1) {
-          // i8
-          bulkLoadElementType = signless ? mlirIntegerTypeGet(context, 8)
-                                         : mlirIntegerTypeSignedGet(context, 8);
-        } else if (view.itemsize == 2) {
-          // i16
-          bulkLoadElementType = signless
-                                    ? mlirIntegerTypeGet(context, 16)
-                                    : mlirIntegerTypeSignedGet(context, 16);
-        }
-      } else if (isUnsignedIntegerFormat(format)) {
-        if (view.itemsize == 4) {
-          // unsigned i32
-          bulkLoadElementType = signless
-                                    ? mlirIntegerTypeGet(context, 32)
-                                    : mlirIntegerTypeUnsignedGet(context, 32);
-        } else if (view.itemsize == 8) {
-          // unsigned i64
-          bulkLoadElementType = signless
-                                    ? mlirIntegerTypeGet(context, 64)
-                                    : mlirIntegerTypeUnsignedGet(context, 64);
-        } else if (view.itemsize == 1) {
-          // i8
-          bulkLoadElementType = signless
-                                    ? mlirIntegerTypeGet(context, 8)
-                                    : mlirIntegerTypeUnsignedGet(context, 8);
-        } else if (view.itemsize == 2) {
-          // i16
-          bulkLoadElementType = signless
-                                    ? mlirIntegerTypeGet(context, 16)
-                                    : mlirIntegerTypeUnsignedGet(context, 16);
-        }
-      }
-      if (!bulkLoadElementType) {
-        throw std::invalid_argument(
-            std::string("unimplemented array format conversion from format: ") +
-            std::string(format));
-      }
-    }
-
-    MlirType shapedType;
-    if (mlirTypeIsAShaped(*bulkLoadElementType)) {
-      if (explicitShape) {
-        throw std::invalid_argument("Shape can only be specified explicitly "
-                                    "when the type is not a shaped type.");
-      }
-      shapedType = *bulkLoadElementType;
-    } else {
-      shapedType = mlirRankedTensorTypeGet(shape.size(), shape.data(),
-                                           *bulkLoadElementType, encodingAttr);
-    }
-    size_t rawBufferSize = view.len;
-    MlirAttribute attr =
-        mlirDenseElementsAttrRawBufferGet(shapedType, rawBufferSize, view.buf);
+    MlirAttribute attr = getAttributeFromBuffer(view, signless, explicitType, explicitShape, context);
     if (mlirAttributeIsNull(attr)) {
       throw std::invalid_argument(
           "DenseElementsAttr could not be constructed from the given buffer. "
@@ -963,6 +869,21 @@ class PyDenseElementsAttribute
         // unsigned i16
         return bufferInfo<uint16_t>(shapedType);
       }
+    } else if (mlirTypeIsAInteger(elementType) &&
+               mlirIntegerTypeGetWidth(elementType) == 1) {
+      // i1 / bool
+      if (!m_boolBuffer.has_value()) {
+        // Because i1's are bitpacked within MLIR, we need to convert it into the
+        // one bool per byte representation used by numpy.
+        // We allocate a new array to keep around for this purpose.
+        int64_t numBooleans = mlirElementsAttrGetNumElements(*this);
+        m_boolBuffer = SmallVector<bool, 8>(numBooleans);
+        for (int i = 0; i < numBooleans; i++) {
+          bool value = mlirDenseElementsAttrGetBoolValue(*this, i);
+          m_boolBuffer.value()[i] = value;
+        }
+      }
+      return bufferInfo<bool>(shapedType, "?", m_boolBuffer.value().data());
     }
 
     // TODO: Currently crashes the program.
@@ -1000,6 +921,8 @@ class PyDenseElementsAttribute
   }
 
 private:
+  std::optional<SmallVector<bool, 8>> m_boolBuffer;
+
   static bool isUnsignedIntegerFormat(std::string_view format) {
     if (format.empty())
       return false;
@@ -1016,14 +939,154 @@ class PyDenseElementsAttribute
            code == 'q';
   }
 
+  static MlirType getShapedType(std::optional<MlirType> bulkLoadElementType,
+                                std::optional<std::vector<int64_t>> explicitShape,
+                                Py_buffer& view) {
+    SmallVector<int64_t> shape;
+    if (explicitShape) {
+      shape.append(explicitShape->begin(), explicitShape->end());
+    } else {
+      shape.append(view.shape, view.shape + view.ndim);
+    }
+
+    if (mlirTypeIsAShaped(*bulkLoadElementType)) {
+      if (explicitShape) {
+        throw std::invalid_argument("Shape can only be specified explicitly "
+                                    "when the type is not a shaped type.");
+      }
+      return *bulkLoadElementType;
+    } else {
+      MlirAttribute encodingAttr = mlirAttributeGetNull();
+      return mlirRankedTensorTypeGet(shape.size(), shape.data(),
+                                     *bulkLoadElementType, encodingAttr);
+    }
+  }
+
+  static MlirAttribute getAttributeFromBuffer(Py_buffer& view,
+                                              bool signless,
+                                              std::optional<PyType> explicitType,
+                                              std::optional<std::vector<int64_t>> explicitShape,
+                                              MlirContext& context) {
+    // Detect format codes that are suitable for bulk loading. This includes
+    // all byte aligned integer and floating point types up to 8 bytes.
+    // Notably, this excludes, bool (which needs to be bit-packed) and
+    // other exotics which do not have a direct representation in the buffer
+    // protocol (i.e. complex, etc).
+    std::optional<MlirType> bulkLoadElementType;
+    if (explicitType) {
+      bulkLoadElementType = *explicitType;
+    } else {
+      std::string_view format(view.format);
+      if (format == "f") {
+        // f32
+        assert(view.itemsize == 4 && "mismatched array itemsize");
+        bulkLoadElementType = mlirF32TypeGet(context);
+      } else if (format == "d") {
+        // f64
+        assert(view.itemsize == 8 && "mismatched array itemsize");
+        bulkLoadElementType = mlirF64TypeGet(context);
+      } else if (format == "e") {
+        // f16
+        assert(view.itemsize == 2 && "mismatched array itemsize");
+        bulkLoadElementType = mlirF16TypeGet(context);
+      } else if (format == "?") {
+        // i1
+        // The i1 type needs to be bit-packed, so we will handle it seperately
+        return getAttributeFromBufferBoolBitpack(view, explicitShape, context);
+      } else if (isSignedIntegerFormat(format)) {
+        if (view.itemsize == 4) {
+          // i32
+          bulkLoadElementType = signless
+                                    ? mlirIntegerTypeGet(context, 32)
+                                    : mlirIntegerTypeSignedGet(context, 32);
+        } else if (view.itemsize == 8) {
+          // i64
+          bulkLoadElementType = signless
+                                    ? mlirIntegerTypeGet(context, 64)
+                                    : mlirIntegerTypeSignedGet(context, 64);
+        } else if (view.itemsize == 1) {
+          // i8
+          bulkLoadElementType = signless ? mlirIntegerTypeGet(context, 8)
+                                         : mlirIntegerTypeSignedGet(context, 8);
+        } else if (view.itemsize == 2) {
+          // i16
+          bulkLoadElementType = signless
+                                    ? mlirIntegerTypeGet(context, 16)
+                                    : mlirIntegerTypeSignedGet(context, 16);
+        }
+      } else if (isUnsignedIntegerFormat(format)) {
+        if (view.itemsize == 4) {
+          // unsigned i32
+          bulkLoadElementType = signless
+                                    ? mlirIntegerTypeGet(context, 32)
+                                    : mlirIntegerTypeUnsignedGet(context, 32);
+        } else if (view.itemsize == 8) {
+          // unsigned i64
+          bulkLoadElementType = signless
+                                    ? mlirIntegerTypeGet(context, 64)
+                                    : mlirIntegerTypeUnsignedGet(context, 64);
+        } else if (view.itemsize == 1) {
+          // i8
+          bulkLoadElementType = signless
+                                    ? mlirIntegerTypeGet(context, 8)
+                                    : mlirIntegerTypeUnsignedGet(context, 8);
+        } else if (view.itemsize == 2) {
+          // i16
+          bulkLoadElementType = signless
+                                    ? mlirIntegerTypeGet(context, 16)
+                                    : mlirIntegerTypeUnsignedGet(context, 16);
+        }
+      }
+      if (!bulkLoadElementType) {
+        throw std::invalid_argument(
+            std::string("unimplemented array format conversion from format: ") +
+            std::string(format));
+      }
+    }
+
+    MlirType type = getShapedType(bulkLoadElementType, explicitShape, view);
+    return mlirDenseElementsAttrRawBufferGet(type, view.len, view.buf);
+  }
+
+  // There is a complication for boolean numpy arrays, as numpy represent them as
+  // 8 bits per boolean, whereas MLIR bitpacks them into 8 booleans per byte.
+  // This function does the bit-packing respecting endianess.
+  static MlirAttribute getAttributeFromBufferBoolBitpack(Py_buffer& view,
+                                                         std::optional<std::vector<int64_t>> explicitShape,
+                                                         MlirContext& context) {
+    // First read the content of the python buffer as u8's, to correct for endianess
+    MlirType byteType = getShapedType(mlirIntegerTypeUnsignedGet(context, 8), explicitShape, view);
+    MlirAttribute intermediateAttr = mlirDenseElementsAttrRawBufferGet(byteType, view.len, view.buf);
+
+    // Pack the boolean array according to the i1 bitpacking layout
+    const int numPackedBytes = (view.len + 7) / 8;
+    SmallVector<uint8_t, 8> bitpacked(numPackedBytes);
+    for (int byteNum = 0; byteNum < numPackedBytes; byteNum++) {
+      uint8_t byte = 0;
+      for (int bitNr = 0; 8 * byteNum + bitNr < view.len; bitNr++) {
+        int pos = 8 * byteNum + bitNr;
+        uint8_t boolVal = mlirDenseElementsAttrGetUInt8Value(intermediateAttr, pos) << bitNr;
+        byte |= boolVal;
+      }
+      bitpacked[byteNum] = byte;
+    }
+
+    MlirType bitpackedType = getShapedType(mlirIntegerTypeGet(context, 1), explicitShape, view);
+    return mlirDenseElementsAttrRawBufferGet(bitpackedType, numPackedBytes, bitpacked.data());
+  }
+
   template <typename Type>
   py::buffer_info bufferInfo(MlirType shapedType,
-                             const char *explicitFormat = nullptr) {
+                             const char *explicitFormat = nullptr,
+                             Type* dataOverride = nullptr) {
     intptr_t rank = mlirShapedTypeGetRank(shapedType);
     // Prepare the data for the buffer_info.
     // Buffer is configured for read-only access below.
     Type *data = static_cast<Type *>(
         const_cast<void *>(mlirDenseElementsAttrGetRawData(*this)));
+    if (dataOverride != nullptr) {
+      data = dataOverride;
+    }
     // Prepare the shape for the buffer_info.
     SmallVector<intptr_t, 4> shape;
     for (intptr_t i = 0; i < rank; ++i)
diff --git a/mlir/test/python/ir/array_attributes.py b/mlir/test/python/ir/array_attributes.py
index 2bc403aace8348..16d7322cfe7119 100644
--- a/mlir/test/python/ir/array_attributes.py
+++ b/mlir/test/python/ir/array_attributes.py
@@ -326,6 +326,70 @@ def testGetDenseElementsF64():
         print(np.array(attr))
 
 
+### 1 bit/boolean integer arrays
+# CHECK-LABEL: TEST: testGetDenseElementsI1Signless
+@run
+def testGetDenseElementsI1Signless():
+    with Context():
+        array = np.array([True], dtype=np.bool_)
+        attr = DenseElementsAttr.get(array)
+        # CHECK: dense<true> : tensor<1xi1>
+        print(attr)
+        # CHECK: {{\[}} True]
+        print(np.array(attr))
+
+        array = np.array([[True, False, True], [True, True, False]], dtype=np.bool_)
+        attr = DenseElementsAttr.get(array)
+        # CHECK: dense<{{\[}}[true, false, true], [true, true, false]]> : tensor<2x3xi1>
+        print(attr)
+        # CHECK: {{\[}}[ True False True]
+        # CHECK: {{\[}} True True False]]
+        print(np.array(attr))
+
+        array = np.array([[True, True, False, False], [True, False, True, False]], dtype=np.bool_)
+        attr = DenseElementsAttr.get(array)
+        # CHECK: dense<{{\[}}[true, true, false, false], [true, false, true, false]]> : tensor<2x4xi1>
+        print(attr)
+        # CHECK: {{\[}}[ True True False False]
+        # CHECK: {{\[}} True False True False]]
+        print(np.array(attr))
+
+        array = np.array([
+            [True, True, False, False],
+            [True, False, True, False],
+            [False, False, False, False],
+            [True, True, True, True],
+            [True, False, False, True],
+        ], dtype=np.bool_)
+        attr = DenseElementsAttr.get(array)
+        # CHECK: dense<{{\[}}[true, true, false, false], [true, false, true, false], [false, false, false, false], [true, true, true, true], [true, false, false, true]]> : tensor<5x4xi1>
+        print(attr)
+        # CHECK: {{\[}}[ True True False False]
+        # CHECK: {{\[}} True False True False]
+        # CHECK: {{\[}}False False False False]
+        # CHECK: {{\[}} True True True True]
+        # CHECK: {{\[}} True False False True]]
+        print(np.array(attr))
+
+        array = np.array([
+            [True, True, False, False, True, True, False, False, False],
+            [False, False, False, True, False, True, True, False, True]
+        ], dtype=np.bool_)
+        attr = DenseElementsAttr.get(array)
+        # CHECK: dense<{{\[}}[true, true, false, false, true, true, false, false, false], [false, false, false, true, false, true, true, false, true]]> : tensor<2x9xi1>
+        print(attr)
+        # CHECK: {{\[}}[ True True False False True True False False False]
+        # CHECK: {{\[}}False False False True False True True False True]]
+        print(np.array(attr))
+
+        array = np.array([], dtype=np.bool_)
+        attr = DenseElementsAttr.get(array)
+        # CHECK: dense<> : tensor<0xi1>
+        print(attr)
+        # CHECK: {{\[}}]
+        print(np.array(attr))
+
+
 ### 16 bit integer arrays
 # CHECK-LABEL: TEST: testGetDenseElementsI16Signless
 @run

github-actions · 2024-10-19T20:46:36Z

✅ With the latest revision this PR passed the Python code formatter.

github-actions · 2024-10-19T20:46:36Z

✅ With the latest revision this PR passed the C/C++ code formatter.

kasper0406 · 2024-10-23T15:03:52Z

mlir/lib/Bindings/Python/IRAttributes.cpp

-    size_t rawBufferSize = view.len;
-    MlirAttribute attr =
-        mlirDenseElementsAttrRawBufferGet(shapedType, rawBufferSize, view.buf);
+    MlirAttribute attr = getAttributeFromBuffer(view, signless, explicitType,


I had to move the if-statements to the getAttributeFromBuffer method, as now the i1 case will not follow the usual flow, but instead call getBitpackedAttributeFromBooleanBuffer to construct the MlirAttribute.

mlir/lib/Bindings/Python/IRAttributes.cpp

makslevental · 2024-10-23T19:06:12Z

mlir/lib/Bindings/Python/IRAttributes.cpp

+    // First read the content of the python buffer as u8's, to correct for
+    // endianess


sorry don't understand? what/why does this need to be "corrected"? I can see you're setting little below but shouldn't you be preserving/using the endianness for the host arch (wherever the bindings are running...)? probably I don't understand something here...

From https://mlir.llvm.org/docs/BytecodeFormat/#fixed-width-integers my understanding is that MLIR attributes always have their data stored in a little-endian format. If that is not true for MlirAttributes in general, the code is wrong.

@stellaraccident this is the one i'm most uncertain about

That's for bytecode rather than in memory. If this is converting the in memory format, then that section doesn't apply.

Ack, one thing I don't understand though, is that if I follow the mlirDenseElementsAttrRawBufferGet function, it goes to DenseElementsAttr::getFromRawBuffer, and I believe it eventually ends up in BuiltinAttributes.cpp where it uses the writeBits and readBits functions respectively. Both of these have special handling for big endian, where it seems to convert the format.

It is possible that it ends up calling Base::get instead, which may be why the above is not true. My C++ navigation skills is currently failing me, to follow which code this Base::get call will execute.

Do you know of a way to test this? From reading around other llvm docs, it seems like user-mode qemu is the common approach for testing big-endian systems. Do any of you have experience with that and the Python API?

it goes to DenseElementsAttr::getFromRawBuffer

https://github.com/makslevental/llvm-project/blob/84aa02d3fa1f1f614c4f3c144ec118b2f05ae6b0/mlir/lib/IR/BuiltinAttributes.cpp#L1064

which does indeed hit Base::get

https://github.com/makslevental/llvm-project/blob/84aa02d3fa1f1f614c4f3c144ec118b2f05ae6b0/mlir/lib/IR/BuiltinAttributes.cpp#L1354

which then goes to static ConcreteT StorageUserBase::get

https://github.com/makslevental/llvm-project/blob/a9636b7f60f283926c66e96c036f5b5d9e57c026/mlir/include/mlir/IR/StorageUniquerSupport.h#L177

I think possibly you could just remove the explicit use of little in the various places and defer to host endiannes?

I mean really this is all academic because as I understand it, there are no extant big-endian systems we care about? Someone with more experience can comment on this hypothesis...

Thanks for the pointers!

Unfortunately, it will not work simply remove the "little" part from the numpy calls, as it defaults to big-endian and not the native byteorder.

As you mention, I doubt any big-endian users will ever call this code anyways. What do you think about just throwing a "this is unsupported" exception (which is the current behaviour) if someone calls it on a big-endian system?

I think that check can be made something like:

if (llvm::endianness::native == llvm::endianness::big) { throw py::type_error("Boolean types are unsupported on big-endian systems"); }

I could also use the above check to switch between "little"/"big" endian in the numpy call, but I am considering if throwing is actually better, given we have no way to really test it?

Ya throwing here is acceptable and reasonable (if any IBM customers ever show up we can discuss it then 😅)

mlir/lib/Bindings/Python/IRAttributes.cpp

mlir/lib/Bindings/Python/PybindUtils.h

mlir/lib/Bindings/Python/IRAttributes.cpp

makslevental · 2024-10-28T20:21:40Z

mlir/lib/Bindings/Python/IRAttributes.cpp

+    py::object packed_booleans =
+        packbits_func(unpackedArray, "bitorder"_a = "little");


................ you can do this ..................... wow color me shocked I never noticed/knew you could do "kwargs" like this on cpp side.

mlir/lib/Bindings/Python/IRAttributes.cpp

makslevental

Looks basically fine/good to me (modulo nits). @stellaraccident do you have any complaints/concerns here?

mlir/lib/Bindings/Python/IRAttributes.cpp

mlir/test/python/ir/array_attributes.py

makslevental

LGTM - sorry for the long arduous process. I'll try to be faster next time

kasper0406 · 2024-11-01T21:26:45Z

Awesome, thank you for the review and comments!
I'd appreciate if you can click the merge button, as I don't have the permissions to do so.

llvm-ci · 2024-11-02T08:56:54Z

LLVM Buildbot has detected a new failure on builder sanitizer-x86_64-linux-bootstrap-asan running on sanitizer-buildbot1 while building mlir at step 2 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/52/builds/3406

Here is the relevant piece of the build log for the reference

Step 2 (annotate) failure: 'python ../sanitizer_buildbot/sanitizers/zorg/buildbot/builders/sanitizers/buildbot_selector.py' (failure)
...
llvm-lit: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/lld-link
llvm-lit: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld.lld: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/ld.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/lld-link
llvm-lit: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 86836 of 86837 tests, 88 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..
FAIL: lld :: MachO/icf-options.s (86323 of 86836)
******************** TEST 'lld :: MachO/icf-options.s' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: rm -rf /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/tools/lld/test/MachO/Output/icf-options.s.tmp; mkdir /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/tools/lld/test/MachO/Output/icf-options.s.tmp
+ rm -rf /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/tools/lld/test/MachO/Output/icf-options.s.tmp
+ mkdir /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/tools/lld/test/MachO/Output/icf-options.s.tmp
RUN: at line 4: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/llvm-mc -filetype=obj -triple=x86_64-apple-darwin /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/lld/test/MachO/icf-options.s -o /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/tools/lld/test/MachO/Output/icf-options.s.tmp/main.o
+ /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/llvm-mc -filetype=obj -triple=x86_64-apple-darwin /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/lld/test/MachO/icf-options.s -o /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/tools/lld/test/MachO/Output/icf-options.s.tmp/main.o
RUN: at line 5: ld64.lld -arch x86_64 -platform_version macos 11.0 11.0 -syslibroot /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/lld/test/MachO/Inputs/MacOSX.sdk -lSystem -fatal_warnings -lSystem --icf=all -o /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/tools/lld/test/MachO/Output/icf-options.s.tmp/all /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/tools/lld/test/MachO/Output/icf-options.s.tmp/main.o 2>&1      | /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/FileCheck /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/lld/test/MachO/icf-options.s --check-prefix=DIAG-EMPTY --allow-empty
+ ld64.lld -arch x86_64 -platform_version macos 11.0 11.0 -syslibroot /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/lld/test/MachO/Inputs/MacOSX.sdk -lSystem -fatal_warnings -lSystem --icf=all -o /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/tools/lld/test/MachO/Output/icf-options.s.tmp/all /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/tools/lld/test/MachO/Output/icf-options.s.tmp/main.o
+ /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/FileCheck /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/lld/test/MachO/icf-options.s --check-prefix=DIAG-EMPTY --allow-empty
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/lld/test/MachO/icf-options.s:20:19: error: DIAG-EMPTY-NOT: excluded string found in input
# DIAG-EMPTY-NOT: {{.}}
                  ^
<stdin>:1:1: note: found here
==3059833==Unable to get registers from thread 3059503.
^

Input file: <stdin>
Check file: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/lld/test/MachO/icf-options.s

-dump-input=help explains the following input dump.

Input was:
<<<<<<
        1: ==3059833==Unable to get registers from thread 3059503. 
not:20     !                                                        error: no match expected
>>>>>>

--

********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90.. 
Slowest Tests:
--------------------------------------------------------------------------
246.73s: LLVM :: CodeGen/AMDGPU/sched-group-barrier-pipeline-solver.mir
Step 10 (stage2/asan check) failure: stage2/asan check (failure)
...
llvm-lit: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/lld-link
llvm-lit: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld.lld: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/ld.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/lld-link
llvm-lit: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 86836 of 86837 tests, 88 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..
FAIL: lld :: MachO/icf-options.s (86323 of 86836)
******************** TEST 'lld :: MachO/icf-options.s' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: rm -rf /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/tools/lld/test/MachO/Output/icf-options.s.tmp; mkdir /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/tools/lld/test/MachO/Output/icf-options.s.tmp
+ rm -rf /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/tools/lld/test/MachO/Output/icf-options.s.tmp
+ mkdir /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/tools/lld/test/MachO/Output/icf-options.s.tmp
RUN: at line 4: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/llvm-mc -filetype=obj -triple=x86_64-apple-darwin /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/lld/test/MachO/icf-options.s -o /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/tools/lld/test/MachO/Output/icf-options.s.tmp/main.o
+ /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/llvm-mc -filetype=obj -triple=x86_64-apple-darwin /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/lld/test/MachO/icf-options.s -o /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/tools/lld/test/MachO/Output/icf-options.s.tmp/main.o
RUN: at line 5: ld64.lld -arch x86_64 -platform_version macos 11.0 11.0 -syslibroot /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/lld/test/MachO/Inputs/MacOSX.sdk -lSystem -fatal_warnings -lSystem --icf=all -o /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/tools/lld/test/MachO/Output/icf-options.s.tmp/all /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/tools/lld/test/MachO/Output/icf-options.s.tmp/main.o 2>&1      | /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/FileCheck /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/lld/test/MachO/icf-options.s --check-prefix=DIAG-EMPTY --allow-empty
+ ld64.lld -arch x86_64 -platform_version macos 11.0 11.0 -syslibroot /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/lld/test/MachO/Inputs/MacOSX.sdk -lSystem -fatal_warnings -lSystem --icf=all -o /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/tools/lld/test/MachO/Output/icf-options.s.tmp/all /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/tools/lld/test/MachO/Output/icf-options.s.tmp/main.o
+ /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/FileCheck /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/lld/test/MachO/icf-options.s --check-prefix=DIAG-EMPTY --allow-empty
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/lld/test/MachO/icf-options.s:20:19: error: DIAG-EMPTY-NOT: excluded string found in input
# DIAG-EMPTY-NOT: {{.}}
                  ^
<stdin>:1:1: note: found here
==3059833==Unable to get registers from thread 3059503.
^

Input file: <stdin>
Check file: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/lld/test/MachO/icf-options.s

-dump-input=help explains the following input dump.

Input was:
<<<<<<
        1: ==3059833==Unable to get registers from thread 3059503. 
not:20     !                                                        error: no match expected
>>>>>>

--

********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90.. 
Slowest Tests:
--------------------------------------------------------------------------
246.73s: LLVM :: CodeGen/AMDGPU/sched-group-barrier-pipeline-solver.mir

makslevental · 2024-11-02T12:07:58Z

LLVM Buildbot has detected a new failure on builder sanitizer-x86_64-linux-bootstrap-asan running on sanitizer-buildbot1 while building mlir at step 2 "annotate".

Flake I'm pretty sure

…r attributes (llvm#113064) Currently it is unsupported to: 1. Convert a `MlirAttribute` with type `i1` to a numpy array 2. Convert a boolean numpy array to a `MlirAttribute` Currently the entire Python application violently crashes with a quite poor error message pybind/pybind11#3336 The complication handling these conversions, is that `MlirAttribute` represent booleans as a bit-packed `i1` type, whereas numpy represents booleans as a byte array with 8 bit used per boolean. This PR proposes the following approach: 1. When converting a `i1` typed `MlirAttribute` to a numpy array, we can not directly use the underlying raw data backing the `MlirAttribute` as a buffer to Python, as done for other types. Instead, a copy of the data is generated using numpy's unpackbits function, and the result is send back to Python. 2. When constructing a `MlirAttribute` from a numpy array, first the python data is read as a `uint8_t` to get it converted to the endianess used internally in mlir. Then the booleans are bitpacked using numpy's bitpack function, and the bitpacked array is saved as the `MlirAttribute` representation. Please note that I am not sure if this approach is the desired solution. I'd appreciate any feedback.

gribozavr · 2024-11-05T14:47:11Z

I'm afraid there is an actual ASan issue here. While the above one is a flake, I am getting an ASan failure that is real.

ASan report

==8566==ERROR: AddressSanitizer: heap-use-after-free on address 0x7b47396a8dd0 at pc 0x5647f3be84fc bp 0x7fff90c4d900 sp 0x7fff90c4d8f8
READ of size 1 at 0x7b47396a8dd0 thread T0
    #0 0x5647f3be84fb in _aligned_strided_to_contig_size1_srcstride0 <redacted>/numpy/_core/src/multiarray/lowlevel_strided_loops.c.src:227:17
    #1 0x5647f3a9dd61 in raw_array_assign_array <redacted>/numpy/_core/src/multiarray/array_assign_array.c:150:13
    #2 0x5647f3a9f4a2 in PyArray_AssignArray <redacted>/numpy/_core/src/multiarray/array_assign_array.c:444:13
    #3 0x5647f3acf4ce in PyArray_CopyInto <redacted>/numpy/_core/src/multiarray/ctors.c:2895:12
    #4 0x5647f3acf4ce in PyArray_FromArray <redacted>/numpy/_core/src/multiarray/ctors.c:1981:23
    #5 0x5647f3acee59 in PyArray_FromAny_int <redacted>/numpy/_core/src/multiarray/ctors.c:1620:25
    #6 0x5647f3acfa05 in PyArray_CheckFromAny_int <redacted>/numpy/_core/src/multiarray/ctors.c:1848:11
    #7 0x5647f3b49b6a in _array_fromobject_generic <redacted>/numpy/_core/src/multiarray/multiarraymodule.c:1681:28
    #8 0x5647f3b3fec4 in array_array <redacted>/numpy/_core/src/multiarray/multiarraymodule.c:1747:21
    #9 0x5647f53af212 in cfunction_vectorcall_FASTCALL_KEYWORDS python_runtime/<redacted>/v3_11/Objects/methodobject.c:443:24
    #10 0x5647f535cbf5 in _PyObject_VectorcallTstate python_runtime/<redacted>/v3_11/Include/internal/pycore_call.h:92:11
    #11 0x5647f535cbf5 in PyObject_Vectorcall python_runtime/<redacted>/v3_11/Objects/call.c:299:12
    #12 0x5647f54a1662 in _PyEval_EvalFrameDefault python_runtime/<redacted>/v3_11/Python/ceval.c
    #13 0x5647f549871f in _PyEval_EvalFrame python_runtime/<redacted>/v3_11/Include/internal/pycore_ceval.h:94:16
    #14 0x5647f549871f in _PyEval_Vector python_runtime/<redacted>/v3_11/Python/ceval.c:6442:24
    #15 0x5647f54985ce in PyEval_EvalCode python_runtime/<redacted>/v3_11/Python/ceval.c:1149:21
    #16 0x5647f5500d0f in run_eval_code_obj python_runtime/<redacted>/v3_11/Python/pythonrun.c:1741:9
    #17 0x5647f54ff126 in run_mod python_runtime/<redacted>/v3_11/Python/pythonrun.c:1762:19
    #18 0x5647f54fd45c in pyrun_file python_runtime/<redacted>/v3_11/Python/pythonrun.c:1657:15
    #19 0x5647f54fd45c in _PyRun_SimpleFileObject python_runtime/<redacted>/v3_11/Python/pythonrun.c:440:13
    #20 0x5647f54fcf1d in _PyRun_AnyFileObject python_runtime/<redacted>/v3_11/Python/pythonrun.c:79:15
    #21 0x5647f533c7d1 in pymain_run_file_obj python_runtime/<redacted>/v3_11/Modules/main.c:360:15
    #22 0x5647f533c7d1 in pymain_run_file python_runtime/<redacted>/v3_11/Modules/main.c:379:15
    #23 0x5647f533bf3d in pymain_run_python python_runtime/<redacted>/v3_11/Modules/main.c:601:21
    #24 0x5647f533bf3d in Py_RunMain python_runtime/<redacted>/v3_11/Modules/main.c:680:5
    #25 0x5647f533c1f1 in pymain_main python_runtime/<redacted>/v3_11/Modules/main.c:710:12
    #26 0x5647f533c24a in Py_BytesMain python_runtime/<redacted>/v3_11/Modules/main.c:734:12
    #27 0x5647e919845e in main devtools/python/launcher/python_interpreter.cc:67:10
    #28 0x7f27397923d3 in __libc_start_main (/usr/grte/v5/lib64/libc.so.6+0x613d3) (BuildId: 9a996398ce14a94560b0c642eb4f6e94)
    #29 0x5647e90be029 in _start /usr/grte/v5/debug-src/src/csu/../sysdeps/x86_64/start.S:120

0x7b47396a8dd0 is located 0 bytes inside of 8-byte region [0x7b47396a8dd0,0x7b47396a8dd8)
freed by thread T0 here:
    #0 0x5647e915dd06 in free <redacted>/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:51:3
    #1 0x5647f3a9c89b in _npy_free_cache <redacted>/numpy/_core/src/multiarray/alloc.c:145:5
    #2 0x5647f3a9c89b in default_free <redacted>/numpy/_core/src/multiarray/alloc.c:331:5
    #3 0x5647f3a9c9f4 in PyDataMem_UserFREE <redacted>/numpy/_core/src/multiarray/alloc.c:395:5
    #4 0x5647f3aac5ac in array_dealloc <redacted>/numpy/_core/src/multiarray/arrayobject.c:444:13
    #5 0x5647f53b5ca4 in _Py_Dealloc python_runtime/<redacted>/v3_11/Objects/object.c:2390:5
    #6 0x5647eb2ee09f in Py_DECREF python_runtime/<redacted>/v3_11/Include/object.h:538:9
    #7 0x5647eb2ee09f in Py_XDECREF python_runtime/<redacted>/v3_11/Include/object.h:602:9
    #8 0x5647eb2ee09f in dec_ref pybind11/<redacted>/include/pybind11/pytypes.h:284:9
    #9 0x5647eb2ee09f in ~object pybind11/<redacted>/include/pybind11/pytypes.h:380:17
    #10 0x5647eb2ee09f in (anonymous namespace)::PyDenseElementsAttribute::getBooleanBufferFromBitpackedAttribute() <redacted>/llvm-project/mlir/lib/Bindings/Python/IRAttributes.cpp:1095:3
    #11 0x5647eb2e09e8 in (anonymous namespace)::PyDenseElementsAttribute::accessBuffer() <redacted>/llvm-project/mlir/lib/Bindings/Python/IRAttributes.cpp:880:14
    #12 0x5647eb2e8d43 in operator() pybind11/<redacted>/include/pybind11/pybind11.h:1975:54
    #13 0x5647eb2e8d43 in operator() pybind11/<redacted>/include/pybind11/pybind11.h:1962:40
    #14 0x5647eb2e8d43 in pybind11::class_<(anonymous namespace)::PyDenseElementsAttribute, mlir::python::PyAttribute>& pybind11::class_<(anonymous namespace)::PyDenseElementsAttribute, mlir::python::PyAttribute>::def_buffer<pybind11::class_<(anonymous namespace)::PyDenseElementsAttribute, mlir::python::PyAttribute>& pybind11::class_<(anonymous namespace)::PyDenseElementsAttribute, mlir::python::PyAttribute>::def_buffer<pybind11::buffer_info, (anonymous namespace)::PyDenseElementsAttribute>(pybind11::buffer_info ((anonymous namespace)::PyDenseElementsAttribute::*)())::'lambda'((anonymous namespace)::PyDenseElementsAttribute&)>(pybind11::buffer_info&&)::'lambda'(_object*, void*)::__invoke(_object*, void*) pybind11/<redacted>/include/pybind11/pybind11.h:1957:13
    #15 0x5647eb227b6e in pybind11_getbuffer pybind11/<redacted>/include/pybind11/detail/class.h:592:16
    #16 0x5647f533dfb7 in PyObject_GetBuffer python_runtime/<redacted>/v3_11/Objects/abstract.c:390:15
    #17 0x5647f53a9733 in _PyManagedBuffer_FromObject python_runtime/<redacted>/v3_11/Objects/memoryobject.c:96:9
    #18 0x5647f53a95d2 in PyMemoryView_FromObject python_runtime/<redacted>/v3_11/Objects/memoryobject.c:797:42
    #19 0x5647f3accd58 in _array_from_array_like <redacted>/numpy/_core/src/multiarray/ctors.c:1443:32
    #20 0x5647f3aa2951 in PyArray_DiscoverDTypeAndShape_Recursive <redacted>/numpy/_core/src/multiarray/array_coercion.c:1041:32
    #21 0x5647f3aa3585 in PyArray_DiscoverDTypeAndShape <redacted>/numpy/_core/src/multiarray/array_coercion.c:1303:16
    #22 0x5647f3acebf5 in PyArray_FromAny_int <redacted>/numpy/_core/src/multiarray/ctors.c:1578:12
    #23 0x5647f3acfa05 in PyArray_CheckFromAny_int <redacted>/numpy/_core/src/multiarray/ctors.c:1848:11
    #24 0x5647f3b49b6a in _array_fromobject_generic <redacted>/numpy/_core/src/multiarray/multiarraymodule.c:1681:28
    #25 0x5647f3b3fec4 in array_array <redacted>/numpy/_core/src/multiarray/multiarraymodule.c:1747:21
    #26 0x5647f53af212 in cfunction_vectorcall_FASTCALL_KEYWORDS python_runtime/<redacted>/v3_11/Objects/methodobject.c:443:24
    #27 0x5647f535cbf5 in _PyObject_VectorcallTstate python_runtime/<redacted>/v3_11/Include/internal/pycore_call.h:92:11
    #28 0x5647f535cbf5 in PyObject_Vectorcall python_runtime/<redacted>/v3_11/Objects/call.c:299:12
    #29 0x5647f54a1662 in _PyEval_EvalFrameDefault python_runtime/<redacted>/v3_11/Python/ceval.c
    #30 0x5647f549871f in _PyEval_EvalFrame python_runtime/<redacted>/v3_11/Include/internal/pycore_ceval.h:94:16
    #31 0x5647f549871f in _PyEval_Vector python_runtime/<redacted>/v3_11/Python/ceval.c:6442:24
    #32 0x5647f54985ce in PyEval_EvalCode python_runtime/<redacted>/v3_11/Python/ceval.c:1149:21
    #33 0x5647f5500d0f in run_eval_code_obj python_runtime/<redacted>/v3_11/Python/pythonrun.c:1741:9
    #34 0x5647f54ff126 in run_mod python_runtime/<redacted>/v3_11/Python/pythonrun.c:1762:19
    #35 0x5647f54fd45c in pyrun_file python_runtime/<redacted>/v3_11/Python/pythonrun.c:1657:15
    #36 0x5647f54fd45c in _PyRun_SimpleFileObject python_runtime/<redacted>/v3_11/Python/pythonrun.c:440:13
    #37 0x5647f54fcf1d in _PyRun_AnyFileObject python_runtime/<redacted>/v3_11/Python/pythonrun.c:79:15
    #38 0x5647f533c7d1 in pymain_run_file_obj python_runtime/<redacted>/v3_11/Modules/main.c:360:15
    #39 0x5647f533c7d1 in pymain_run_file python_runtime/<redacted>/v3_11/Modules/main.c:379:15
    #40 0x5647f533bf3d in pymain_run_python python_runtime/<redacted>/v3_11/Modules/main.c:601:21
    #41 0x5647f533bf3d in Py_RunMain python_runtime/<redacted>/v3_11/Modules/main.c:680:5

previously allocated by thread T0 here:
    #0 0x5647e915dfa4 in malloc <redacted>/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:67:3
    #1 0x5647f3a9c780 in _npy_alloc_cache <redacted>/numpy/_core/src/multiarray/alloc.c:106:9
    #2 0x5647f3a9c780 in default_malloc <redacted>/numpy/_core/src/multiarray/alloc.c:291:12
    #3 0x5647f3a9c8ee in PyDataMem_UserNEW <redacted>/numpy/_core/src/multiarray/alloc.c:364:14
    #4 0x5647f3acb59f in PyArray_NewFromDescr_int <redacted>/numpy/_core/src/multiarray/ctors.c:879:20
    #5 0x5647f3acbb3d in PyArray_NewFromDescrAndBase <redacted>/numpy/_core/src/multiarray/ctors.c:1037:12
    #6 0x5647f3acbb3d in PyArray_NewFromDescr <redacted>/numpy/_core/src/multiarray/ctors.c:1022:12
    #7 0x5647f3abb3e8 in unpack_bits <redacted>/numpy/_core/src/multiarray/compiled_base.c:1866:28
    #8 0x5647f3abb3e8 in io_unpack <redacted>/numpy/_core/src/multiarray/compiled_base.c:2042:12
    #9 0x5647f53afa6a in cfunction_call python_runtime/<redacted>/v3_11/Objects/methodobject.c:542:18
    #10 0x5647f535c5ca in _PyObject_MakeTpCall python_runtime/<redacted>/v3_11/Objects/call.c:214:18
    #11 0x5647f535cc1c in _PyObject_VectorcallTstate python_runtime/<redacted>/v3_11/Include/internal/pycore_call.h:90:16
    #12 0x5647f535cc1c in PyObject_Vectorcall python_runtime/<redacted>/v3_11/Objects/call.c:299:12
    #13 0x5647f3aa970b in dispatcher_vectorcall <redacted>/numpy/_core/src/multiarray/arrayfunction_override.c:567:18
    #14 0x5647f535cd64 in _PyVectorcall_Call python_runtime/<redacted>/v3_11/Objects/call.c:257:24
    #15 0x5647f535cd64 in _PyObject_Call python_runtime/<redacted>/v3_11/Objects/call.c:328:16
    #16 0x5647f535ce8f in PyObject_Call python_runtime/<redacted>/v3_11/Objects/call.c:355:12
    #17 0x5647eb2e307e in call pybind11/<redacted>/include/pybind11/cast.h:1974:28
    #18 0x5647eb2e307e in pybind11::object pybind11::detail::object_api<pybind11::handle>::operator()<(pybind11::return_value_policy)1, pybind11::array_t<unsigned char, 16>&, pybind11::arg_v>(pybind11::array_t<unsigned char, 16>&, pybind11::arg_v&&) const pybind11/<redacted>/include/pybind11/cast.h:2110:75
    #19 0x5647eb2ed942 in (anonymous namespace)::PyDenseElementsAttribute::getBooleanBufferFromBitpackedAttribute() <redacted>/llvm-project/mlir/lib/Bindings/Python/IRAttributes.cpp:1089:9
    #20 0x5647eb2e09e8 in (anonymous namespace)::PyDenseElementsAttribute::accessBuffer() <redacted>/llvm-project/mlir/lib/Bindings/Python/IRAttributes.cpp:880:14
    #21 0x5647eb2e8d43 in operator() pybind11/<redacted>/include/pybind11/pybind11.h:1975:54
    #22 0x5647eb2e8d43 in operator() pybind11/<redacted>/include/pybind11/pybind11.h:1962:40
    #23 0x5647eb2e8d43 in pybind11::class_<(anonymous namespace)::PyDenseElementsAttribute, mlir::python::PyAttribute>& pybind11::class_<(anonymous namespace)::PyDenseElementsAttribute, mlir::python::PyAttribute>::def_buffer<pybind11::class_<(anonymous namespace)::PyDenseElementsAttribute, mlir::python::PyAttribute>& pybind11::class_<(anonymous namespace)::PyDenseElementsAttribute, mlir::python::PyAttribute>::def_buffer<pybind11::buffer_info, (anonymous namespace)::PyDenseElementsAttribute>(pybind11::buffer_info ((anonymous namespace)::PyDenseElementsAttribute::*)())::'lambda'((anonymous namespace)::PyDenseElementsAttribute&)>(pybind11::buffer_info&&)::'lambda'(_object*, void*)::__invoke(_object*, void*) pybind11/<redacted>/include/pybind11/pybind11.h:1957:13
    #24 0x5647eb227b6e in pybind11_getbuffer pybind11/<redacted>/include/pybind11/detail/class.h:592:16
    #25 0x5647f533dfb7 in PyObject_GetBuffer python_runtime/<redacted>/v3_11/Objects/abstract.c:390:15
    #26 0x5647f53a9733 in _PyManagedBuffer_FromObject python_runtime/<redacted>/v3_11/Objects/memoryobject.c:96:9
    #27 0x5647f53a95d2 in PyMemoryView_FromObject python_runtime/<redacted>/v3_11/Objects/memoryobject.c:797:42
    #28 0x5647f3accd58 in _array_from_array_like <redacted>/numpy/_core/src/multiarray/ctors.c:1443:32
    #29 0x5647f3aa2951 in PyArray_DiscoverDTypeAndShape_Recursive <redacted>/numpy/_core/src/multiarray/array_coercion.c:1041:32
    #30 0x5647f3aa3585 in PyArray_DiscoverDTypeAndShape <redacted>/numpy/_core/src/multiarray/array_coercion.c:1303:16
    #31 0x5647f3acebf5 in PyArray_FromAny_int <redacted>/numpy/_core/src/multiarray/ctors.c:1578:12
    #32 0x5647f3acfa05 in PyArray_CheckFromAny_int <redacted>/numpy/_core/src/multiarray/ctors.c:1848:11
    #33 0x5647f3b49b6a in _array_fromobject_generic <redacted>/numpy/_core/src/multiarray/multiarraymodule.c:1681:28
    #34 0x5647f3b3fec4 in array_array <redacted>/numpy/_core/src/multiarray/multiarraymodule.c:1747:21
    #35 0x5647f53af212 in cfunction_vectorcall_FASTCALL_KEYWORDS python_runtime/<redacted>/v3_11/Objects/methodobject.c:443:24
    #36 0x5647f535cbf5 in _PyObject_VectorcallTstate python_runtime/<redacted>/v3_11/Include/internal/pycore_call.h:92:11
    #37 0x5647f535cbf5 in PyObject_Vectorcall python_runtime/<redacted>/v3_11/Objects/call.c:299:12
    #38 0x5647f54a1662 in _PyEval_EvalFrameDefault python_runtime/<redacted>/v3_11/Python/ceval.c

SUMMARY: AddressSanitizer: heap-use-after-free <redacted>/numpy/_core/src/multiarray/lowlevel_strided_loops.c.src:227:17 in _aligned_strided_to_contig_size1_srcstride0
Shadow bytes around the buggy address:
  0x7b47396a8b00: fa fa fd fa fa fa 00 04 fa fa fd fd fa fa fd fd
  0x7b47396a8b80: fa fa fd fa fa fa fd fa fa fa fd fd fa fa fd fa
  0x7b47396a8c00: fa fa fd fd fa fa 00 04 fa fa fd fa fa fa fd fa
  0x7b47396a8c80: fa fa fd fd fa fa fd fa fa fa 00 04 fa fa fd fd
  0x7b47396a8d00: fa fa fd fa fa fa fd fa fa fa fd fa fa fa fd fa
=>0x7b47396a8d80: fa fa fd fa fa fa fd fa fa fa[fd]fd fa fa fd fa
  0x7b47396a8e00: fa fa fd fd fa fa 00 04 fa fa 00 fa fa fa 00 fa
  0x7b47396a8e80: fa fa 00 00 fa fa 00 00 fa fa 01 fa fa fa 00 fa
  0x7b47396a8f00: fa fa fd fd fa fa fd fd fa fa fd fd fa fa fd fd
  0x7b47396a8f80: fa fa fd fd fa fa fd fd fa fa fd fd fa fa fd fd
  0x7b47396a9000: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==8566==ABORTING
FileCheck error: '<stdin>' is empty.

IIUC what's happening here, there is a bug in getBooleanBufferFromBitpackedAttribute. It allocates a buffer on line 1089 as a local variable:

    py::object unpacked_booleans =
        unpackbits_func(packedArray, "bitorder"_a = "little");

then extracts a pointer into it:

    py::buffer_info pythonBuffer =
        unpacked_booleans.cast<py::buffer>().request();

finally, the pointer is returned from the function, but the buffer is then immediately freed as the function runs the destructors for local variables at }:

    return bufferInfo<bool>(shapedType, (bool *)pythonBuffer.ptr, "?");
  }

I'm going to revert the patch now.

makslevental · 2024-11-05T15:04:50Z

then extracts a pointer into it:

    py::buffer_info pythonBuffer =
        unpacked_booleans.cast<py::buffer>().request();

Ah I believe there's some API equivalent to release() because we do munge py:buffers elsewhere...

I'm going to revert the patch now.

Sure - you can send a PR and I'll stamp or you can just push (if you have push privileges)

…from mlir attributes (#113064)" This reverts commit fb7bf7a. There is an ASan issue here, see the discussion on #113064.

gribozavr · 2024-11-05T15:09:48Z

I pushed the revert as 0a68171.

makslevental · 2024-11-05T15:14:07Z

here's the dance you have to do to release

llvm-project/mlir/lib/Bindings/Python/IRAttributes.cpp

Lines 1228 to 1239 in 6b64f36

    
           int flags = PyBUF_STRIDES; 
        
           std::unique_ptr<Py_buffer> view = std::make_unique<Py_buffer>(); 
        
           if (PyObject_GetBuffer(buffer.ptr(), view.get(), flags) != 0) { 
        
             throw py::error_already_set(); 
        
           } 
        
           // This scope releaser will only release if we haven't yet transferred 
        
           // ownership. 
        
           auto freeBuffer = llvm::make_scope_exit([&]() { 
        
             if (view) 
        
               PyBuffer_Release(view.get()); 
        
           });

(or something like that)

but you will also have to think about https://pybind11.readthedocs.io/en/stable/advanced/functions.html#return-value-policies

I'll try to reason it out ~tomorrow

kasper0406 · 2024-11-05T23:50:23Z

Agh dang. I will try to give it a look tomorrow.
Are there any information available for how to get the ASan report? I tried building with -DLLVM_USE_SANITIZER=Address, but I do not see any issues running ninja -C build check-mlir with that option.

joker-eph · 2024-11-05T23:52:08Z

I'm afraid there is an actual ASan issue here. While the above one is a flake, I am getting an ASan failure that is real.

@gribozavr : is this an upstream test failing or something downstream here? If downstream can you provide a reproducer please?

…from mlir attributes (llvm#113064)" This reverts commit fb7bf7a. There is an ASan issue here, see the discussion on llvm#113064.

…r attributes (unrevert) (#115481) This PR re-introduces the functionality of #113064, which was reverted in 0a68171 due to memory lifetime issues. Notice that I was not able to re-produce the ASan results myself, so I have not been able to verify that this PR really fixes the issue. --- Currently it is unsupported to: 1. Convert a MlirAttribute with type i1 to a numpy array 2. Convert a boolean numpy array to a MlirAttribute Currently the entire Python application violently crashes with a quite poor error message pybind/pybind11#3336 The complication handling these conversions, is that MlirAttribute represent booleans as a bit-packed i1 type, whereas numpy represents booleans as a byte array with 8 bit used per boolean. This PR proposes the following approach: 1. When converting a i1 typed MlirAttribute to a numpy array, we can not directly use the underlying raw data backing the MlirAttribute as a buffer to Python, as done for other types. Instead, a copy of the data is generated using numpy's unpackbits function, and the result is send back to Python. 2. When constructing a MlirAttribute from a numpy array, first the python data is read as a uint8_t to get it converted to the endianess used internally in mlir. Then the booleans are bitpacked using numpy's bitpack function, and the bitpacked array is saved as the MlirAttribute representation.

…f18bd3aba Local branch amd-gfx cbff18b [AMDGPU] Rewrite GFX12 SGPR hazard handling to dedicated pass Remote branch main 0a68171 Revert "[MLIR,Python] Support converting boolean numpy arrays to and from mlir attributes (llvm#113064)"

kasper0406 added 3 commits October 19, 2024 19:55

[Python] Attempt at getting boolean types working

053d0b7

Refactorings

d5da538

Cleanups

52b49ac

kasper0406 requested review from ftynse, makslevental and stellaraccident as code owners October 19, 2024 20:43

llvmbot added the mlir label Oct 19, 2024

kasper0406 added 7 commits October 19, 2024 23:01

Fix style

6d3204c

More styles

f8a21fc

Use numpy to bitpack and unpack, to avoid additional fields

73df6fb

Small refactoring

93156b1

Fix styles

90868b8

Minor rename

d216d43

Code format

6543732

kasper0406 commented Oct 23, 2024

View reviewed changes

makslevental reviewed Oct 23, 2024

View reviewed changes

mlir/lib/Bindings/Python/IRAttributes.cpp Outdated Show resolved Hide resolved

makslevental reviewed Oct 23, 2024

View reviewed changes

mlir/lib/Bindings/Python/IRAttributes.cpp Outdated Show resolved Hide resolved

makslevental reviewed Oct 23, 2024

View reviewed changes

mlir/lib/Bindings/Python/IRAttributes.cpp Show resolved Hide resolved

makslevental reviewed Oct 23, 2024

View reviewed changes

mlir/lib/Bindings/Python/IRAttributes.cpp Outdated Show resolved Hide resolved

makslevental reviewed Oct 23, 2024

View reviewed changes

mlir/lib/Bindings/Python/PybindUtils.h Outdated Show resolved Hide resolved

Address comments

75c8264

makslevental reviewed Oct 28, 2024

View reviewed changes

mlir/lib/Bindings/Python/IRAttributes.cpp Outdated Show resolved Hide resolved

makslevental reviewed Oct 28, 2024

View reviewed changes

mlir/lib/Bindings/Python/IRAttributes.cpp Outdated Show resolved Hide resolved

makslevental reviewed Oct 28, 2024

View reviewed changes

Fix nits

e5b10a3

makslevental requested a review from jpienaar October 29, 2024 15:56

jpienaar reviewed Oct 30, 2024

View reviewed changes

mlir/lib/Bindings/Python/IRAttributes.cpp Outdated Show resolved Hide resolved

mlir/test/python/ir/array_attributes.py Outdated Show resolved Hide resolved

kasper0406 added 3 commits November 1, 2024 13:40

Fix comments

b65d7d6

Throw an exception if used on big-endian machines

c9b2100

Fix C++ formatting

a1ae520

makslevental approved these changes Nov 1, 2024

View reviewed changes

makslevental merged commit fb7bf7a into llvm:main Nov 2, 2024
8 checks passed

kasper0406 deleted the kn/boolean-types branch November 4, 2024 13:39

gribozavr added a commit that referenced this pull request Nov 5, 2024

Revert "[MLIR,Python] Support converting boolean numpy arrays to and …

0a68171

…from mlir attributes (#113064)" This reverts commit fb7bf7a. There is an ASan issue here, see the discussion on #113064.

kasper0406 restored the kn/boolean-types branch November 5, 2024 20:32

kasper0406 mentioned this pull request Nov 8, 2024

[MLIR,Python] Support converting boolean numpy arrays to and from mlir attributes (unrevert) #115481

Merged

		// First read the content of the python buffer as u8's, to correct for
		// endianess

		py::object packed_booleans =
		packbits_func(unpackedArray, "bitorder"_a = "little");

[MLIR,Python] Support converting boolean numpy arrays to and from mlir attributes #113064

[MLIR,Python] Support converting boolean numpy arrays to and from mlir attributes #113064

Uh oh!

Conversation

kasper0406 commented Oct 19, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Oct 19, 2024

Uh oh!

github-actions bot commented Oct 19, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented Oct 19, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

makslevental Nov 1, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

makslevental Nov 1, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

makslevental Oct 28, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

makslevental left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

makslevental left a comment

Choose a reason for hiding this comment

Uh oh!

kasper0406 commented Nov 1, 2024

Uh oh!

Uh oh!

llvm-ci commented Nov 2, 2024

Uh oh!

makslevental commented Nov 2, 2024

Uh oh!

gribozavr commented Nov 5, 2024

Uh oh!

makslevental commented Nov 5, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gribozavr commented Nov 5, 2024

Uh oh!

makslevental commented Nov 5, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

kasper0406 commented Nov 5, 2024

Uh oh!

joker-eph commented Nov 5, 2024

Uh oh!

kasper0406 commented Oct 19, 2024 •

edited

Loading

github-actions bot commented Oct 19, 2024 •

edited

Loading

github-actions bot commented Oct 19, 2024 •

edited

Loading

makslevental Nov 1, 2024 •

edited

Loading

makslevental Nov 1, 2024 •

edited

Loading

makslevental Oct 28, 2024 •

edited

Loading

makslevental left a comment •

edited

Loading

makslevental commented Nov 5, 2024 •

edited

Loading

makslevental commented Nov 5, 2024 •

edited

Loading