[libc] Add aligned_alloc #96586


Merged: 1 commit into llvm:main on Jul 3, 2024

Conversation

PiJoules (Contributor)

This adds support for aligned_alloc with the freelist allocator. This works by finding blocks large enough to hold the requested size plus some shift amount that's at most the requested alignment. Blocks that meet this requirement but aren't properly aligned can be split such that the usable_space of a new block is aligned properly. The "padding" block created will be merged with the previous block if one exists.
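
For illustration, the adjustment math works out like this (a sketch with made-up numbers; align_up, BLOCK_OVERHEAD, and usable_space are the names used in the patch, and BLOCK_OVERHEAD = 8 is assumed here):

#include <cstddef>
#include <cstdint>

// Round x up to the next multiple of a power-of-two alignment.
constexpr uintptr_t align_up(uintptr_t x, size_t alignment) {
  return (x + alignment - 1) & ~uintptr_t(alignment - 1);
}

// Suppose usable_space() starts at 0x1008 and the caller wants 64-byte
// alignment. The padding block needs room for its own header, so we align
// past start + BLOCK_OVERHEAD:
//   align_up(0x1008 + 8, 64) == 0x1040; adjustment == 0x1040 - 0x1008 == 0x38.
// Splitting at (adjustment - BLOCK_OVERHEAD) puts the new block's header at
// 0x1038, so its usable_space() lands at 0x1040, which is 64-byte aligned.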

@llvmbot (Member) commented Jun 25, 2024

@llvm/pr-subscribers-libc

Author: None (PiJoules)


Patch is 26.04 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/96586.diff

9 Files Affected:

  • (modified) libc/src/__support/block.h (+87)
  • (modified) libc/src/__support/freelist.h (+20)
  • (modified) libc/src/__support/freelist_heap.h (+45-6)
  • (added) libc/src/stdlib/aligned_alloc.h (+20)
  • (modified) libc/src/stdlib/freelist_malloc.cpp (+5)
  • (modified) libc/test/src/__support/CMakeLists.txt (+1)
  • (modified) libc/test/src/__support/block_test.cpp (+200)
  • (modified) libc/test/src/__support/freelist_heap_test.cpp (+81-5)
  • (modified) libc/test/src/__support/freelist_malloc_test.cpp (+18)
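
For context, the user-facing semantics this wires up (a minimal usage sketch; per the checks in aligned_allocate below, invalid arguments return a null pointer in this implementation):

#include <stdlib.h>

void example() {
  void *p = aligned_alloc(64, 128); // OK: 64 is a power of two and 128 % 64 == 0
  void *q = aligned_alloc(48, 128); // nullptr here: 48 is not a power of two
  void *r = aligned_alloc(64, 100); // nullptr here: 100 is not a multiple of 64
  free(p); // pointers from aligned_alloc are freed like any other allocation
  free(q);
  free(r);
}
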
diff --git a/libc/src/__support/block.h b/libc/src/__support/block.h
index 580f20e1ec4a4..8a7608e479e61 100644
--- a/libc/src/__support/block.h
+++ b/libc/src/__support/block.h
@@ -16,6 +16,7 @@
 #include "src/__support/CPP/optional.h"
 #include "src/__support/CPP/span.h"
 #include "src/__support/CPP/type_traits.h"
+#include "src/__support/libc_assert.h"
 
 #include <stdint.h>
 
@@ -261,6 +262,40 @@ class Block {
 
   constexpr Block(size_t prev_outer_size, size_t outer_size);
 
+  bool usable_space_is_aligned(size_t alignment) const {
+    return reinterpret_cast<uintptr_t>(usable_space()) % alignment == 0;
+  }
+
+  size_t extra_space_for_adjustment(size_t alignment) const {
+    if (usable_space_is_aligned(alignment))
+      return 0;
+
+    // We need to ensure we can always split this block into a "padding" block
+    // and the aligned block. To do this, we need enough extra space for at
+    // least one block.
+    //
+    // |block   |usable_space                          |
+    // |........|......................................|
+    //                            ^
+    //                            Alignment requirement
+    //
+    //
+    // |block   |space   |block   |usable_space        |
+    // |........|........|........|....................|
+    //                            ^
+    //                            Alignment requirement
+    //
+    uintptr_t start = reinterpret_cast<uintptr_t>(usable_space());
+    alignment = cpp::max(alignment, ALIGNMENT);
+    size_t adjustment = align_up(start + BLOCK_OVERHEAD, alignment) - start;
+    return adjustment;
+  }
+
+  bool can_allocate(size_t alignment, size_t size) const;
+
+  static void allocate(Block *&block, size_t alignment, size_t size,
+                       Block *&new_prev, Block *&new_next);
+
 private:
   /// Consumes the block and returns as a span of bytes.
   static ByteSpan as_bytes(Block *&&block);
@@ -357,6 +392,58 @@ void Block<OffsetType, kAlign>::free(Block *&block) {
   merge_next(block);
 }
 
+template <typename OffsetType, size_t kAlign>
+bool Block<OffsetType, kAlign>::can_allocate(size_t alignment,
+                                             size_t size) const {
+  if (usable_space_is_aligned(alignment) && inner_size() >= size)
+    return true; // Size and alignment constraints met.
+
+  // Either the alignment isn't met or we don't have enough size.
+  // If we don't meet the alignment, we can always adjust such that we do. If
+  // we meet the alignment but just don't have enough size, this check will
+  // fail anyway.
+  size_t adjustment = extra_space_for_adjustment(alignment);
+  if (inner_size() >= size + adjustment)
+    return true;
+
+  return false;
+}
+
+template <typename OffsetType, size_t kAlign>
+void Block<OffsetType, kAlign>::allocate(Block *&block, size_t alignment,
+                                         size_t size, Block *&new_prev,
+                                         Block *&new_next) {
+  if (!block->usable_space_is_aligned(alignment)) {
+    size_t adjustment = block->extra_space_for_adjustment(alignment);
+    size_t new_inner_size = adjustment - BLOCK_OVERHEAD;
+    LIBC_ASSERT(new_inner_size % ALIGNMENT == 0 &&
+                "The adjustment calculation should always return a new size "
+                "that's a multiple of ALIGNMENT");
+
+    Block *aligned_block = *Block::split(block, new_inner_size);
+    LIBC_ASSERT(aligned_block->usable_space_is_aligned(alignment) &&
+                "The aligned block isn't aligned somehow.");
+
+    Block *prev = block->prev();
+    if (prev) {
+      // If there is a block before this one, merge the newly created padding
+      // block into it.
+      merge_next(prev);
+    } else {
+      // Otherwise, this was the very first block in the chain, so the new
+      // padding block becomes the first block.
+      new_prev = block;
+    }
+
+    block = aligned_block;
+  }
+
+  // Now get a block for the requested size.
+  optional<Block *> next = Block::split(block, size);
+  if (next)
+    new_next = *next;
+}
+
 template <typename OffsetType, size_t kAlign>
 optional<Block<OffsetType, kAlign> *>
 Block<OffsetType, kAlign>::split(Block *&block, size_t new_inner_size) {
diff --git a/libc/src/__support/freelist.h b/libc/src/__support/freelist.h
index 0641ba93807d6..f8f5cd1878a32 100644
--- a/libc/src/__support/freelist.h
+++ b/libc/src/__support/freelist.h
@@ -66,6 +66,8 @@ template <size_t NUM_BUCKETS = 6> class FreeList {
   ///   A span with a size of 0.
   cpp::span<cpp::byte> find_chunk(size_t size) const;
 
+  template <typename Cond> cpp::span<cpp::byte> find_chunk_if(Cond op) const;
+
   /// Removes a chunk from this freelist.
   bool remove_chunk(cpp::span<cpp::byte> chunk);
 
@@ -111,6 +113,24 @@ bool FreeList<NUM_BUCKETS>::add_chunk(span<cpp::byte> chunk) {
   return true;
 }
 
+template <size_t NUM_BUCKETS>
+template <typename Cond>
+span<cpp::byte> FreeList<NUM_BUCKETS>::find_chunk_if(Cond op) const {
+  for (size_t i = 0; i < chunks_.size(); ++i) {
+    FreeListNode *node = chunks_[i];
+
+    while (node != nullptr) {
+      span<cpp::byte> chunk(reinterpret_cast<cpp::byte *>(node), node->size);
+      if (op(chunk))
+        return chunk;
+
+      node = node->next;
+    }
+  }
+
+  return {};
+}
+
 template <size_t NUM_BUCKETS>
 span<cpp::byte> FreeList<NUM_BUCKETS>::find_chunk(size_t size) const {
   if (size == 0)
diff --git a/libc/src/__support/freelist_heap.h b/libc/src/__support/freelist_heap.h
index 3569baf27bdaa..100b800f84ae6 100644
--- a/libc/src/__support/freelist_heap.h
+++ b/libc/src/__support/freelist_heap.h
@@ -24,6 +24,8 @@ namespace LIBC_NAMESPACE {
 using cpp::optional;
 using cpp::span;
 
+inline constexpr bool IsPow2(size_t x) { return x && (x & (x - 1)) == 0; }
+
 static constexpr cpp::array<size_t, 6> DEFAULT_BUCKETS{16,  32,  64,
                                                        128, 256, 512};
 
@@ -32,6 +34,9 @@ template <size_t NUM_BUCKETS = DEFAULT_BUCKETS.size()> class FreeListHeap {
   using BlockType = Block<>;
   using FreeListType = FreeList<NUM_BUCKETS>;
 
+  static constexpr size_t MIN_ALIGNMENT =
+      cpp::max(BlockType::ALIGNMENT, alignof(max_align_t));
+
   struct HeapStats {
     size_t total_bytes;
     size_t bytes_allocated;
@@ -55,6 +60,7 @@ template <size_t NUM_BUCKETS = DEFAULT_BUCKETS.size()> class FreeListHeap {
   }
 
   void *allocate(size_t size);
+  void *aligned_allocate(size_t alignment, size_t size);
   void free(void *ptr);
   void *realloc(void *ptr, size_t size);
   void *calloc(size_t num, size_t size);
@@ -74,6 +80,8 @@ template <size_t NUM_BUCKETS = DEFAULT_BUCKETS.size()> class FreeListHeap {
     freelist_.set_freelist_node(node, chunk);
   }
 
+  void *allocate_impl(size_t alignment, size_t size);
+
 private:
   span<cpp::byte> block_to_span(BlockType *block) {
     return span<cpp::byte>(block->usable_space(), block->inner_size());
@@ -109,20 +117,29 @@ struct FreeListHeapBuffer : public FreeListHeap<NUM_BUCKETS> {
 };
 
 template <size_t NUM_BUCKETS>
-void *FreeListHeap<NUM_BUCKETS>::allocate(size_t size) {
-  // Find a chunk in the freelist. Split it if needed, then return
-  auto chunk = freelist_.find_chunk(size);
+void *FreeListHeap<NUM_BUCKETS>::allocate_impl(size_t alignment, size_t size) {
+  // Find a chunk in the freelist. Split it if needed, then return.
+  auto chunk =
+      freelist_.find_chunk_if([alignment, size](span<cpp::byte> chunk) {
+        BlockType *block = BlockType::from_usable_space(chunk.data());
+        return block->can_allocate(alignment, size);
+      });
 
   if (chunk.data() == nullptr)
     return nullptr;
   freelist_.remove_chunk(chunk);
 
   BlockType *chunk_block = BlockType::from_usable_space(chunk.data());
+  LIBC_ASSERT(!chunk_block->used());
 
   // Split that chunk. If there's a leftover chunk, add it to the freelist
-  optional<BlockType *> result = BlockType::split(chunk_block, size);
-  if (result)
-    freelist_.add_chunk(block_to_span(*result));
+  BlockType *prev = nullptr;
+  BlockType *next = nullptr;
+  BlockType::allocate(chunk_block, alignment, size, prev, next);
+  if (next)
+    freelist_.add_chunk(block_to_span(next));
+  if (prev)
+    freelist_.add_chunk(block_to_span(prev));
 
   chunk_block->mark_used();
 
@@ -133,6 +150,28 @@ void *FreeListHeap<NUM_BUCKETS>::allocate(size_t size) {
   return chunk_block->usable_space();
 }
 
+template <size_t NUM_BUCKETS>
+void *FreeListHeap<NUM_BUCKETS>::allocate(size_t size) {
+  return allocate_impl(MIN_ALIGNMENT, size);
+}
+
+template <size_t NUM_BUCKETS>
+void *FreeListHeap<NUM_BUCKETS>::aligned_allocate(size_t alignment,
+                                                  size_t size) {
+  if (size == 0)
+    return nullptr;
+
+  // The alignment must be an integral power of two.
+  if (!IsPow2(alignment))
+    return nullptr;
+
+  // The size parameter must be an integral multiple of alignment.
+  if (size % alignment != 0)
+    return nullptr;
+
+  return allocate_impl(alignment, size);
+}
+
 template <size_t NUM_BUCKETS> void FreeListHeap<NUM_BUCKETS>::free(void *ptr) {
   cpp::byte *bytes = static_cast<cpp::byte *>(ptr);
 
diff --git a/libc/src/stdlib/aligned_alloc.h b/libc/src/stdlib/aligned_alloc.h
new file mode 100644
index 0000000000000..7f294c8114d49
--- /dev/null
+++ b/libc/src/stdlib/aligned_alloc.h
@@ -0,0 +1,20 @@
+//===-- Implementation header for aligned_alloc -----------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIBC_SRC_STDLIB_ALIGNED_ALLOC_H
+#define LLVM_LIBC_SRC_STDLIB_ALIGNED_ALLOC_H
+
+#include <stddef.h>
+
+namespace LIBC_NAMESPACE {
+
+void *aligned_alloc(size_t alignment, size_t size);
+
+} // namespace LIBC_NAMESPACE
+
+#endif // LLVM_LIBC_SRC_STDLIB_ALIGNED_ALLOC_H
diff --git a/libc/src/stdlib/freelist_malloc.cpp b/libc/src/stdlib/freelist_malloc.cpp
index 4d3c42ca90bab..684c447a204e4 100644
--- a/libc/src/stdlib/freelist_malloc.cpp
+++ b/libc/src/stdlib/freelist_malloc.cpp
@@ -7,6 +7,7 @@
 //===----------------------------------------------------------------------===//
 
 #include "src/__support/freelist_heap.h"
+#include "src/stdlib/aligned_alloc.h"
 #include "src/stdlib/calloc.h"
 #include "src/stdlib/free.h"
 #include "src/stdlib/malloc.h"
@@ -42,4 +43,8 @@ LLVM_LIBC_FUNCTION(void *, realloc, (void *ptr, size_t size)) {
   return freelist_heap->realloc(ptr, size);
 }
 
+LLVM_LIBC_FUNCTION(void *, aligned_alloc, (size_t alignment, size_t size)) {
+  return freelist_heap->aligned_allocate(alignment, size);
+}
+
 } // namespace LIBC_NAMESPACE
diff --git a/libc/test/src/__support/CMakeLists.txt b/libc/test/src/__support/CMakeLists.txt
index 082a002959c95..bbe856de39bea 100644
--- a/libc/test/src/__support/CMakeLists.txt
+++ b/libc/test/src/__support/CMakeLists.txt
@@ -8,6 +8,7 @@ add_libc_test(
     block_test.cpp
   DEPENDS
     libc.src.__support.CPP.array
+    libc.src.__support.CPP.bit
     libc.src.__support.CPP.span
     libc.src.__support.block
     libc.src.string.memcpy
diff --git a/libc/test/src/__support/block_test.cpp b/libc/test/src/__support/block_test.cpp
index 6614e4b583d3f..020f0ee5d910b 100644
--- a/libc/test/src/__support/block_test.cpp
+++ b/libc/test/src/__support/block_test.cpp
@@ -8,6 +8,7 @@
 #include <stddef.h>
 
 #include "src/__support/CPP/array.h"
+#include "src/__support/CPP/bit.h"
 #include "src/__support/CPP/span.h"
 #include "src/__support/block.h"
 #include "src/string/memcpy.h"
@@ -36,6 +37,7 @@ using SmallOffsetBlock = LIBC_NAMESPACE::Block<uint16_t>;
   template <typename BlockType> void LlvmLibcBlockTest##TestCase::RunTest()
 
 using LIBC_NAMESPACE::cpp::array;
+using LIBC_NAMESPACE::cpp::bit_ceil;
 using LIBC_NAMESPACE::cpp::byte;
 using LIBC_NAMESPACE::cpp::span;
 
@@ -567,3 +569,201 @@ TEST_FOR_EACH_BLOCK_TYPE(CanGetConstBlockFromUsableSpace) {
   const BlockType *block2 = BlockType::from_usable_space(ptr);
   EXPECT_EQ(block1, block2);
 }
+
+TEST_FOR_EACH_BLOCK_TYPE(CanAllocate) {
+  constexpr size_t kN = 1024;
+
+  // Ensure we can allocate everything up to the block size within this block.
+  for (size_t i = 0; i < kN - BlockType::BLOCK_OVERHEAD; ++i) {
+    alignas(BlockType::ALIGNMENT) array<byte, kN> bytes{};
+    auto result = BlockType::init(bytes);
+    ASSERT_TRUE(result.has_value());
+    BlockType *block = *result;
+
+    constexpr size_t kAlign = 1; // Effectively ignores alignment.
+    EXPECT_TRUE(block->can_allocate(kAlign, i));
+
+    // For each can_allocate, we should be able to do a successful call to
+    // allocate.
+    BlockType *prev = nullptr;
+    BlockType *next = nullptr;
+    BlockType::allocate(block, kAlign, i, prev, next);
+    EXPECT_NE(block, static_cast<BlockType *>(nullptr));
+  }
+
+  alignas(BlockType::ALIGNMENT) array<byte, kN> bytes{};
+  auto result = BlockType::init(bytes);
+  ASSERT_TRUE(result.has_value());
+  BlockType *block = *result;
+
+  // Given a block of size kN (assuming it's also a power of two), we should be
+  // able to allocate a block within it that's aligned to half its size. This is
+  // because regardless of where the buffer is located, we can always find a
+  // starting location within it that meets this alignment.
+  EXPECT_TRUE(block->can_allocate(kN / 2, 1));
+}
+
+TEST_FOR_EACH_BLOCK_TYPE(AllocateAlreadyAligned) {
+  constexpr size_t kN = 1024;
+
+  alignas(BlockType::ALIGNMENT) array<byte, kN> bytes{};
+  auto result = BlockType::init(bytes);
+  ASSERT_TRUE(result.has_value());
+  BlockType *block = *result;
+
+  // This should result in no new blocks.
+  constexpr size_t kAlignment = BlockType::ALIGNMENT;
+  constexpr size_t kExpectedSize = BlockType::ALIGNMENT;
+  EXPECT_TRUE(block->can_allocate(kAlignment, kExpectedSize));
+
+  BlockType *prev = nullptr;
+  BlockType *next = nullptr;
+  BlockType::allocate(block, BlockType::ALIGNMENT, kExpectedSize, prev, next);
+
+  // Since this is already aligned, there should be no previous block.
+  EXPECT_EQ(prev, static_cast<BlockType *>(nullptr));
+
+  // Ensure the block is aligned and has the size we expect.
+  EXPECT_NE(block, static_cast<BlockType *>(nullptr));
+  EXPECT_TRUE(block->usable_space_is_aligned(BlockType::ALIGNMENT));
+  EXPECT_EQ(block->inner_size(), kExpectedSize);
+
+  // Check the next block.
+  EXPECT_NE(next, static_cast<BlockType *>(nullptr));
+  EXPECT_EQ(block->next(), next);
+  EXPECT_EQ(next->next(), static_cast<BlockType *>(nullptr));
+  EXPECT_EQ(reinterpret_cast<byte *>(next) + next->outer_size(), &*bytes.end());
+}
+
+TEST_FOR_EACH_BLOCK_TYPE(AllocateNeedsAlignment) {
+  constexpr size_t kN = 1024;
+
+  alignas(kN) array<byte, kN> bytes{};
+  auto result = BlockType::init(bytes);
+  ASSERT_TRUE(result.has_value());
+  BlockType *block = *result;
+
+  // First, ensure the usable space is only aligned to the block alignment.
+  ASSERT_EQ(block->usable_space(), bytes.data() + BlockType::BLOCK_OVERHEAD);
+  ASSERT_EQ(block->prev(), static_cast<BlockType *>(nullptr));
+
+  // Now pick an alignment such that the usable space is not already aligned to
+  // it. We want to explicitly test that the block will split into one before
+  // it.
+  constexpr size_t kAlignment = bit_ceil(BlockType::BLOCK_OVERHEAD) * 8;
+  ASSERT_FALSE(block->usable_space_is_aligned(kAlignment));
+
+  constexpr size_t kSize = BlockType::ALIGNMENT;
+  EXPECT_TRUE(block->can_allocate(kAlignment, kSize));
+
+  BlockType *prev = nullptr;
+  BlockType *next = nullptr;
+  BlockType::allocate(block, kAlignment, kSize, prev, next);
+
+  // Check the previous block was created appropriately. Since this block is the
+  // first block, a new one should be made before this.
+  EXPECT_NE(prev, static_cast<BlockType *>(nullptr));
+  EXPECT_EQ(block->prev(), prev);
+  EXPECT_EQ(prev->next(), block);
+  EXPECT_EQ(prev->outer_size(), reinterpret_cast<uintptr_t>(block) -
+                                    reinterpret_cast<uintptr_t>(prev));
+
+  // Ensure the block is aligned and has the size we expect.
+  EXPECT_NE(next, static_cast<BlockType *>(nullptr));
+  EXPECT_TRUE(block->usable_space_is_aligned(kAlignment));
+
+  // Check the next block.
+  EXPECT_NE(next, static_cast<BlockType *>(nullptr));
+  EXPECT_EQ(block->next(), next);
+  EXPECT_EQ(next->next(), static_cast<BlockType *>(nullptr));
+  EXPECT_EQ(reinterpret_cast<byte *>(next) + next->outer_size(), &*bytes.end());
+}
+
+TEST_FOR_EACH_BLOCK_TYPE(PreviousBlockMergedIfNotFirst) {
+  constexpr size_t kN = 1024;
+
+  alignas(kN) array<byte, kN> bytes{};
+  auto result = BlockType::init(bytes);
+  ASSERT_TRUE(result.has_value());
+  BlockType *block = *result;
+
+  // Split the block roughly halfway and work on the second half.
+  auto result2 = BlockType::split(block, kN / 2);
+  ASSERT_TRUE(result2.has_value());
+  BlockType *newblock = *result2;
+  ASSERT_EQ(newblock->prev(), block);
+  size_t old_prev_size = block->outer_size();
+
+  // Now pick an alignment such that the usable space is not already aligned to
+  // it. We want to explicitly test that the block will split into one before
+  // it.
+  constexpr size_t kAlignment = bit_ceil(BlockType::BLOCK_OVERHEAD) * 8;
+  ASSERT_FALSE(newblock->usable_space_is_aligned(kAlignment));
+
+  // Ensure we can allocate in the new block.
+  constexpr size_t kSize = BlockType::ALIGNMENT;
+  EXPECT_TRUE(newblock->can_allocate(kAlignment, kSize));
+
+  BlockType *prev = nullptr;
+  BlockType *next = nullptr;
+  BlockType::allocate(newblock, kAlignment, kSize, prev, next);
+
+  // Now there should be no new previous block. Instead, the padding we did
+  // create should be merged into the original previous block.
+  EXPECT_EQ(prev, static_cast<BlockType *>(nullptr));
+  EXPECT_EQ(newblock->prev(), block);
+  EXPECT_EQ(block->next(), newblock);
+  EXPECT_GT(block->outer_size(), old_prev_size);
+}
+
+TEST_FOR_EACH_BLOCK_TYPE(CanRemergeBlockAllocations) {
+  // Finally, to ensure we made the split blocks correctly via allocate, we
+  // should be able to reconstruct the original block from the blocklets.
+  //
+  // This is the same setup as with the `AllocateNeedsAlignment` test case.
+  constexpr size_t kN = 1024;
+
+  alignas(kN) array<byte, kN> bytes{};
+  auto result = BlockType::init(bytes);
+  ASSERT_TRUE(result.has_value());
+  BlockType *block = *result;
+
+  // First, ensure the usable space is only aligned to the block alignment.
+  ASSERT_EQ(block->usable_space(), bytes.data() + BlockType::BLOCK_OVERHEAD);
+  ASSERT_EQ(block->prev(), static_cast<BlockType *>(nullptr));
+
+  // Now pick an alignment such that the usable space is not already aligned to
+  // it. We want to explicitly test that the block will split into one before
+  // it.
+  constexpr size_t kAlignment = bit_ceil(BlockType::BLOCK_OVERHEAD) * 8;
+  ASSERT_FALSE(block->usable_space_is_aligned(kAlignment));
+
+  constexpr size_t kSize = BlockType::ALIGNMENT;
+  EXPECT_TRUE(block->can_allocate(kAlignment, kSize));
+
+  BlockType *prev = nullptr;
+  BlockType *next = nullptr;
+  BlockType::allocate(block, kAlignment, kSize, prev, next);
+
+  // Check we have the appropriate blocks.
+  ASSERT_NE(prev, static_cast<BlockType *>(nullptr));
+  ASSERT_FALSE(prev->last());
+  ASSERT_EQ(block->prev(), prev);
+  EXPECT_NE(next, static_cast<BlockType *>(nullptr));
+  EXPECT_EQ(block->next(), next);
+  EXPECT_EQ(next->next(), static_cast<BlockType *>(nullptr));
+  ASSERT_TRUE(next->last());
+
+  // Now check for successful merges.
+  EXPECT_TRUE(BlockType::merge_next(prev));
+  EXPECT_EQ(prev->next(), next);
+  EXPECT_TRUE(BlockType::merge_next(prev));
+  EXPECT_EQ(prev->next(), static_cast<BlockType *>(nullptr));
+  EXPECT_TRUE(prev->last());
+
+  // We should have the original buffer.
+  EXPECT_EQ(reinterpret_cast<byte *>(prev), &*bytes.begin());
+  EXPECT_EQ(prev->outer_size(), bytes.size());
+  EXPECT_EQ(reinterpret_cast<byte *>(prev) + prev->outer_size(), &*bytes.end());
+}
diff --git a/libc/test/src/__support/freelist_heap_test.cpp b/libc/test/src/__support/freelist_heap_test.cpp
index a35cb5589ed62..16e283a4a8759 100644
--- a/libc/test/src/__support/freelist_heap_test.cpp
+++ b/libc/test/src/__support/freelist_heap_test.cpp
@@ -6,6 +6,7 @@
 ...
[truncated]

PiJoules referenced this pull request Jun 25, 2024
Summary:
This test fails due to alignment issues; it's likely that it's misaligned on other targets too and they just don't crash on it.
@PiJoules maybe we should run this with ubsan?
EXPECT_NE(next, static_cast<BlockType *>(nullptr));
EXPECT_EQ(block->next(), next);
EXPECT_EQ(next->next(), static_cast<BlockType *>(nullptr));
EXPECT_EQ(reinterpret_cast<byte *>(next) + next->outer_size(), &*bytes.end());
Member

Isn't end one past the end of the array? Intentional (vs. &bytes.back())?

Contributor Author

Yup, intentional, since this check asserts that the end of next (next + next->outer_size()) is exactly the end of the original buffer we used for the block (&*bytes.end()). I believe this is legal in C++ as long as the pointer is never dereferenced.

Contributor

But aren't you dereferencing that pointer? I'm pretty sure it's always UB to deref end(), even if it points into a real allocation, since they don't share the same provenance. Pointer provenance rules are pretty complicated, and C and C++ don't agree on the precise semantics. Newer standards have closed that gap to a large extent, but I don't think they precisely match. Peter Sewell has had a lot to say on the matter (https://www.cl.cam.ac.uk/~pes20/cerberus/cerberus-popl2019.pdf, https://www.cl.cam.ac.uk/~pes20/cerberus/clarifying-provenance-v4.html).

If you want to avoid UB altogether, maybe get the address of the last element w/ back(), then create an equivalent pointer to end() w/ pointer arithmetic. It may be even better to use uintptr_t, and document that you're checking some kind of layout invariant that your data structure is expected to maintain.
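
Something like this, for instance (a sketch of the uintptr_t variant, reusing the bytes and next names from the test above):

// Compare addresses as integers; no past-the-end pointer is formed or
// dereferenced, and the layout invariant being checked stays the same.
uintptr_t buffer_end =
    reinterpret_cast<uintptr_t>(bytes.data()) + bytes.size();
EXPECT_EQ(reinterpret_cast<uintptr_t>(next) + next->outer_size(), buffer_end);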

Member

Our cpp::array::end returns a T*, so indeed this looks like an explicit dereference to me.

Contributor Author

Ah, after much clarification, I did not know that &*ptr still counts as a dereference in C++. Apparently it's only in C where they cancel each other out. Updated.

Comment on lines 413 to 415
void Block<OffsetType, kAlign>::allocate(Block *&block, size_t alignment,
size_t size, Block *&new_prev,
Block *&new_next) {
Contributor

do you think this API would make more sense if you passed in a struct instead of the pointer references?

Suggested change
void Block<OffsetType, kAlign>::allocate(Block *&block, size_t alignment,
size_t size, Block *&new_prev,
Block *&new_next) {
struct BlockInfo { Block *block; Block *prev; Block *next; };
void Block<OffsetType, kAlign>::allocate(BlockInfo &block_info, size_t alignment,
                                         size_t size) {

I find the mutable references to pointers make for an odd API, and this reads a bit more naturally to me.

Feel free to disregard this though, since this comes down to style, and it could be this API is this shape for another reason.

Member

+1 please make the suggested change

Contributor Author

Added the struct but made the API slightly different. BlockInfo is the return type now representing the allocated aligned block, a potentially new previous block, and a new next block. The function doesn't take any references now.
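
Roughly this shape (a sketch of the revised API; the field comments are my assumptions, not the exact patch text):

// `allocate` can now split one block into up to three.
struct BlockInfo {
  Block *block; // the allocated, aligned block
  Block *prev;  // padding block created before it, if any (else nullptr)
  Block *next;  // leftover block after it, if any (else nullptr)
};

static BlockInfo allocate(Block *block, size_t alignment, size_t size);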



@ilovepi (Contributor) left a comment

Something I'm realizing should either be tested, or documented as a limitation, is the case where the caller aligns the allocation themselves, by bumping the allocation size up and then aligning the pointer to get the correct alignment. I've seen that pattern in many codebases, even in the Rust core libraries. If the allocator isn't able to free those correctly, we should make sure that is written down somewhere as a limitation. If it is supported, we should test it. I know that usage is often incorrect, but I've seen it enough to know that people will do that, even if aligned_alloc is available.


// This is the return type for `allocate` which can split one block into up to
// three blocks.
struct BlockInfo {
Contributor

hmm, I know I suggested BlockInfo, but this looks like some kind of ListNode, or header ... I'm not going to bikeshed on the naming here if you're happy w/ it, but I figured I should point that out, since I'm partially responsible.

Comment on lines 655 to 660
constexpr size_t kSize = BlockType::ALIGNMENT;
EXPECT_TRUE(block->can_allocate(kAlignment, kSize));

auto [aligned_block, prev, next] =
BlockType::allocate(block, kAlignment, kSize);
Contributor

Suggested change
constexpr size_t kSize = BlockType::ALIGNMENT;
EXPECT_TRUE(block->can_allocate(kAlignment, kSize));
auto [aligned_block, prev, next] =
BlockType::allocate(block, kAlignment, kSize);
EXPECT_TRUE(block->can_allocate(kAlignment, BlockType::ALIGNMENT));
auto [aligned_block, prev, next] =
BlockType::allocate(block, kAlignment, BlockType::ALIGNMENT);

It's not clear why you need the variable here, when you can use the original constant.

Contributor Author

The kSize value here doesn't really matter (it could be something like 10, but I just happened to pick BlockType::ALIGNMENT), so if for some reason someone decides to change it to a new value, they can just change kSize rather than copying the new value everywhere. But for this PR I'm not opposed to just using BlockType::ALIGNMENT in place of kSize.

Contributor

If you expect it to change in the future, then it's fine to leave as is. It's just that the extra variable didn't seem necessary, even if it's const.

Comment on lines +443 to +456
LIBC_ASSERT(maybe_aligned_block.has_value() &&
"This split should always result in a new block. The check in "
"`can_allocate` ensures that we have enough space here to make "
"two blocks.");
Member

Why don't we triple check block->can_allocate(alignment, size) within this LIBC_ASSERT then?

There's an implicit requirement on this method that the block has previously been checked via can_allocate. That requirement should be in a comment above the function definition (or declaration) rather than buried in an assert in the definition (prefer BOTH).

I almost wonder if we should explicitly check that and return early if can_allocate returns false, regardless of asserts being enabled or not? Though I guess the interface isn't designed to be fallible.

Contributor Author

Yeah the idea is that this function will always succeed if can_allocate returns true. I'll add that assert at the very start of this also.
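
i.e. something like this at the top of allocate (a sketch; the exact assert wording is assumed):

template <typename OffsetType, size_t kAlign>
auto Block<OffsetType, kAlign>::allocate(Block *block, size_t alignment,
                                         size_t size) -> BlockInfo {
  // Document and enforce the precondition up front.
  LIBC_ASSERT(block->can_allocate(alignment, size) &&
              "callers must check can_allocate before calling allocate");
  // ... splitting logic as in the patch; returns the resulting BlockInfo ...
}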

Comment on lines 452 to 453
Block *prev = original->prev();
if (prev) {
Member

Suggested change
Block *prev = original->prev();
if (prev) {
if (Block *prev = original->prev()) {

Contributor Author

Done

span<cpp::byte> FreeList<NUM_BUCKETS>::find_chunk_if(Cond op) const {
for (FreeListNode *node : chunks_) {
while (node != nullptr) {
span<cpp::byte> chunk(reinterpret_cast<cpp::byte *>(node), node->size);
Member

block.h has:

using ByteSpan = cpp::span<LIBC_NAMESPACE::cpp::byte>;

Perhaps useful to do that in libc/src/__support/freelist.h too. Or add more using statements, here and in the other headers. Isn't using in a header frowned upon though, since TUs that include that header get those using statements? Would be happy to remove them (in another, distinct PR).

Contributor Author

SG. I'll do that in a separate PR.

Comment on lines 448 to 450
Block *aligned_block = *maybe_aligned_block;
LIBC_ASSERT(aligned_block->is_usable_space_aligned(alignment) &&
"The aligned block isn't aligned somehow.");
Member

aligned_block isn't used until L463. Can we move these statements closer to the use? Perhaps we can even avoid creating a variable for aligned_block? Or do we need this assert before calling merge_next?

Contributor Author

It can be moved closer and after merge_next

@PiJoules (Contributor Author)

> Something I'm realizing should either be tested, or documented as a limitation, is the case where the caller aligns the allocation themselves […]

Is the case you mention something like this?

char *ptr = (char *)malloc(size + desired_alignment);
uintptr_t ptr_int = reinterpret_cast<uintptr_t>(ptr);
if (ptr_int % desired_alignment != 0) {
  ptr += (desired_alignment - (ptr_int % desired_alignment));
}
...
free(ptr);  // ptr may not be the exact same value as returned by malloc

@ilovepi (Contributor) commented Jul 1, 2024

Yeah, that's about right. I also did some looking at the projects where I'd seen this and it seems like at least a few of the codebases that used to do this no longer do (Rust compiler and a few networking libraries), so maybe this is less important than I thought? Probably fine to just document that as some kind of abuse of the API and call it a day, better still if there's something to quote from the standard.

@PiJoules (Contributor Author) commented Jul 2, 2024

I think for now we can mark this as not supported and document that free only accepts the pointer values explicitly returned by the other alloc functions.

@ilovepi (Contributor) commented Jul 2, 2024

SGTM
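
For reference, the usual portable form of that pattern stashes the original pointer so free() still receives a value malloc() returned (a generic sketch, not part of this patch; assumes alignment is a power of two):

#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Over-allocate, align manually, and store the original pointer just before
// the aligned region so it can be recovered when freeing.
void *aligned_malloc(size_t alignment, size_t size) {
  void *raw = malloc(size + alignment + sizeof(void *));
  if (!raw)
    return NULL;
  uintptr_t base = (uintptr_t)raw + sizeof(void *);
  uintptr_t aligned = (base + alignment - 1) & ~(uintptr_t)(alignment - 1);
  memcpy((void **)aligned - 1, &raw, sizeof(raw)); // remember the real start
  return (void *)aligned;
}

void aligned_free(void *ptr) {
  if (!ptr)
    return;
  void *raw;
  memcpy(&raw, (void **)ptr - 1, sizeof(raw));
  free(raw); // free exactly what malloc returned
}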

@ilovepi (Contributor) left a comment

LGTM. All my comments have been addressed and from what I can tell the implementation appears correct. Since it unblocks baremetal support, I think we should land it now, with the expectation that we'll be moving to more performant/robust implementations in the future.

PiJoules merged commit 3eebeb7 into llvm:main on Jul 3, 2024 (4 of 5 checks passed).
PiJoules deleted the aligned_alloc branch on July 3, 2024 at 23:26.
kbluck pushed a commit to kbluck/llvm-project that referenced this pull request Jul 6, 2024