Commit 9fa6e31

[projection] Add a new data structure for use in NewProjection called PointerIntEnum.
PointerIntEnum is a more powerful PointerIntPair data structure. It uses an enum with special cases to understand characteristics of the data, and then uses this information and some bit-packing tricks to represent:

1. As many pointer cases as the pointer's tagged bits allow. The cases are stored inline.
2. Inline indices up to 4095.
3. Out of line indices >= 4096.

It takes advantage of a trick that we already use in the runtime to distinguish pointers from indices: namely that modern OSes do not map the zero page.

I made unittests for all of the operations, so it is pretty well tested. I am going to use this in a subsequent commit to compress projection in the common case (the inline case) down to 1/3 of its size. The inline case is common because in most places where projection is used it targets relative offsets in an array, which are unlikely to be greater than a page. Mallocing memory for larger indices just lets us degrade gracefully.
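The packing described in the commit message can be illustrated with a minimal standalone sketch. It is not the code from this commit: the helper names (`packSmallIndex`, `packPointer`, `isSmallIndex`, `rawKind`, and so on) are hypothetical, and it assumes a 64-bit word with 3 kind bits for pointers and 4 kind bits for indices. Note that because the pointer is shifted right by 3 before the comparison against 4095, this sketch only classifies addresses of 32768 or more as pointers.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout (hypothetical values, chosen to match the header's comments):
// a 64-bit word, 3 kind bits for pointers, 4 kind bits for indices.
constexpr unsigned kPointerKindBits = 3;
constexpr unsigned kIndexKindBits = 4;
constexpr uint64_t kMaxSmallIndex = (uint64_t(1) << 12) - 1; // 4095
constexpr unsigned kIndexKindShift = 64 - kIndexKindBits;
constexpr unsigned kPointerKindShift = 64 - kPointerKindBits;

// Small index: payload in the low 12 bits, kind in the top 4 bits.
inline uint64_t packSmallIndex(unsigned kind, uint64_t index) {
  assert(index <= kMaxSmallIndex && kind < (1u << kIndexKindBits));
  return index | (uint64_t(kind) << kIndexKindShift);
}

// Pointer: the 3 zero alignment bits are shifted out, kind goes in the top 3.
inline uint64_t packPointer(unsigned kind, uint64_t address) {
  assert((address & 7) == 0 && kind < (1u << kPointerKindBits));
  return (address >> kPointerKindBits) | (uint64_t(kind) << kPointerKindShift);
}

// A word holds a small index when everything below the index kind bits
// is <= 4095; otherwise it is treated as a pointer.
inline bool isSmallIndex(uint64_t word) {
  const uint64_t payloadMask = (uint64_t(1) << kIndexKindShift) - 1;
  return (word & payloadMask) <= kMaxSmallIndex;
}

// Read the kind from the top 4 bits (index) or the top 3 bits (pointer).
inline unsigned rawKind(uint64_t word) {
  return isSmallIndex(word) ? unsigned(word >> kIndexKindShift)
                            : unsigned(word >> kPointerKindShift);
}

// Shifting left drops the kind bits and restores the original address.
inline uint64_t unpackPointer(uint64_t word) {
  return word << kPointerKindBits;
}

inline uint64_t unpackSmallIndex(uint64_t word) {
  return word & kMaxSmallIndex;
}
```

A caller would check `isSmallIndex` first and then use `unpackSmallIndex` or `unpackPointer` accordingly; the real class wraps the same decision inside `getRawKind()`.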
1 parent 748f713 commit 9fa6e31

3 files changed

+825
-0
lines changed

include/swift/Basic/PointerIntEnum.h

Lines changed: 316 additions & 0 deletions
@@ -0,0 +1,316 @@
1+
//===--- PointerIntEnum.h -------------------------------------------------===//
2+
//
3+
// This source file is part of the Swift.org open source project
4+
//
5+
// Copyright (c) 2014 - 2015 Apple Inc. and the Swift project authors
6+
// Licensed under Apache License v2.0 with Runtime Library Exception
7+
//
8+
// See http://swift.org/LICENSE.txt for license information
9+
// See http://swift.org/CONTRIBUTORS.txt for the list of Swift project authors
10+
//
11+
//===----------------------------------------------------------------------===//
12+
13+
#include "llvm/Support/PointerLikeTypeTraits.h"
14+
#include <cassert>
15+
#include <climits>
16+
#include <cstdlib>
17+
#include <cstring>
18+
#include <type_traits>
19+
20+
namespace swift {
21+
22+
/// A pointer sized ADT that is able to compactly represent a Swift like enum
23+
/// that can contain both Integer and Pointer payloads.
24+
///
25+
/// This is done by taking the ideas behind PointerIntPair and taking advantage
26+
/// of an additional property that we already use in the runtime: namely that on
27+
/// all modern OSes that we care about, the zero page is not allocated since it
28+
/// is used as a copy on write "zeroed" page. This enables us to distinguish
29+
/// whether or not we have a pointer or an index by restricting the size of
30+
/// indices to be less than 4096. Most modern OSes (including Darwin) do not map
31+
/// the zero page. That means that if the lower 61 bits of the uintptr_t are less
32+
/// than 4096, then we have an index; otherwise we have a pointer.
33+
///
34+
/// Given this limitation, we store integers of 4096 or more out of line in a
35+
/// malloced integer. This is a good trade-off for certain types of compiler
36+
/// optimizations like relative array indexing where it is unlikely for someone
37+
/// to address more than 1 page worth of items at a time. But it is important to
38+
/// degrade gracefully in such a case.
39+
///
40+
/// In order to support these use cases, the C++ enum class that we use to
41+
/// define our type must have a specific form:
42+
///
43+
/// enum class EnumTy : uint64_t {
44+
/// Invalid = 0,
45+
///
46+
/// // PointerKinds
47+
/// Ptr1 = 1,
48+
/// ...
49+
/// PtrN = N,
50+
/// LastPointerKind = PtrN,
51+
///
52+
/// // Index Kinds
53+
/// //
54+
/// // This is an index >= 4096, requiring us to malloc memory. It needs
55+
/// // to be able to be stored in at most 3 bits, implying it must be <= 7.
56+
///
57+
/// LargeIndex = 7,
58+
/// Index1 = 8,
59+
/// Index2 = 9,
60+
/// Index3 = 10,
61+
/// Index4 = 11,
62+
/// Index5 = 12,
63+
/// LastIndexKind = Index5,
64+
/// };
65+
///
66+
/// In words, we have the following requirements:
67+
///
68+
/// 1. An Invalid case must be defined as being zero.
69+
/// 2. We can have no more than N PointerKinds where N is the number of tagged
70+
/// pointer bits that we have.
71+
/// 3. LargeIndex must be equal to ((1 << NumTaggedBits)-1).
72+
/// 4. All index kinds must be greater than LargeIndex.
73+
///
74+
/// \tparam EnumTy The enum type that is used for cases
75+
/// \tparam PointerTy The pointer like type used for pointer payloads.
76+
/// \tparam NumPointerKindBits The number of bits that can be used for pointer
77+
/// kinds. Must be no more than the number of tagged bits in PointerTy.
78+
/// \tparam NumIndexKindBits The number of bits that can be used for index
79+
/// kinds.
80+
/// \tparam PtrTraits The pointer traits of PointerTy
81+
/// \tparam ScribbleMemory Instead of freeing any malloced memory, scribble the
82+
/// memory. This enables us to test that memory is properly being
83+
/// deallocated. Should only be set to true during unittesting.
84+
template <typename EnumTy, typename PointerTy, unsigned NumPointerKindBits,
85+
unsigned NumIndexKindBits,
86+
typename PtrTraits = llvm::PointerLikeTypeTraits<PointerTy>,
87+
bool ScribbleMemory = false>
88+
class PointerIntEnum {
89+
/// If we have stored a pointer, this gives the offset of the kind in Index.
90+
static constexpr unsigned PointerKindBitOffset =
91+
sizeof(uintptr_t) * CHAR_BIT - NumPointerKindBits;
92+
93+
/// This is a mask for the lower PointerBitWidth - NumPointerKindBits bits of
94+
/// Index.
95+
static constexpr uintptr_t PointerBitMask =
96+
(uintptr_t(1) << PointerKindBitOffset) - 1;
97+
98+
/// A bit mask used to grab index kind bits from a large index.
99+
static constexpr uint64_t IndexKindBitMask =
100+
(uint64_t(1) << NumIndexKindBits) - 1;
101+
102+
/// This is the offset to the index kind bits at the top of projection.
103+
static constexpr unsigned IndexKindBitOffset =
104+
sizeof(uintptr_t) * CHAR_BIT - NumIndexKindBits;
105+
106+
/// This is a mask that can be used to strip off the index kind from the top
107+
/// of Index.
108+
static constexpr uintptr_t IndexKindOffsetBitMask =
109+
(uintptr_t(1) << IndexKindBitOffset) - 1;
110+
111+
/// This is the max index that a projection can represent without
112+
/// mallocing. The zero page on modern OSes is never mapped, so we can use
113+
/// this value to determine if we have a pointer or an index.
114+
///
115+
/// We also use this as a mask to grab the index bits from a PointerIntEnum
116+
/// with an index kind.
117+
static constexpr uintptr_t MaxSmallIndex = (uintptr_t(1) << 12) - 1;
118+
119+
/// The pointer sized type used for the actual storage.
120+
///
121+
/// Never access this directly. Instead use the following methods:
122+
///
123+
/// * getRawKind(): Returns the actual kind stored in the kind bits. This
124+
/// means it will return LargeIndex.
125+
/// * getKind(): Same as RawKind except if the kind is LargeIndex, will
126+
/// discover the real underlying kind in the malloced memory.
127+
/// * getIndex(): Asserts if getKind() is a pointer storing kind.
128+
/// * getRawPointer(): Returns the underlying pointer as a void *. Asserts if
129+
/// getKind() is an index storing kind.
130+
/// * getPointer(): Returns the underlying pointer cast into
131+
/// PointerTy. Asserts if getKind() is an index storing kind.
132+
uintptr_t Index;
133+
134+
public:
135+
PointerIntEnum() : PointerIntEnum(EnumTy::Invalid, nullptr) {}
136+
137+
PointerIntEnum(EnumTy Kind, unsigned NewIndex) {
138+
initWithIndex(Kind, NewIndex);
139+
}
140+
PointerIntEnum(EnumTy Kind, PointerTy Ptr) {
141+
initWithPointer(Kind, PtrTraits::getAsVoidPointer(Ptr));
142+
}
143+
144+
PointerIntEnum(PointerIntEnum &&P) : Index() { std::swap(Index, P.Index); }
145+
PointerIntEnum(const PointerIntEnum &P) : Index() { *this = P; }
146+
147+
~PointerIntEnum() {
148+
// If we have a large index, free the index.
149+
if (getRawKind() != EnumTy::LargeIndex)
150+
return;
151+
freeMemory();
152+
}
153+
154+
PointerIntEnum &operator=(const PointerIntEnum &P) {
155+
// If we already have a large index, we need to free its memory.
156+
if (getRawKind() == EnumTy::LargeIndex)
157+
freeMemory();
158+
159+
auto NewRawKind = P.getRawKind();
160+
if (NewRawKind == EnumTy::LargeIndex ||
161+
NewRawKind > EnumTy::LastPointerKind) {
162+
initWithIndex(P.getKind(), P.getIndex());
163+
return *this;
164+
}
165+
166+
initWithPointer(P.getKind(), P.getRawPointer());
167+
return *this;
168+
}
169+
170+
void operator=(PointerIntEnum &&P) { std::swap(Index, P.Index); }
171+
172+
bool isValid() const { return getRawKind() != EnumTy::Invalid; }
173+
174+
bool operator==(const PointerIntEnum &Other) const {
175+
assert((isValid() && Other.isValid()) &&
176+
"Cannot compare invalid projections");
177+
auto Kind1 = getRawKind();
178+
179+
// First make sure that the raw kinds line up.
180+
if (Kind1 != Other.getRawKind()) {
181+
return false;
182+
}
183+
184+
// Then if we don't have a large index just compare index.
185+
if (Kind1 != EnumTy::LargeIndex)
186+
return Index == Other.Index;
187+
// Otherwise, we need to grab the actual index from the memory that
188+
// we malloced.
189+
return getKind() == Other.getKind() && getIndex() == Other.getIndex();
190+
}
191+
192+
bool operator!=(const PointerIntEnum &Other) const {
193+
return !(*this == Other);
194+
}
195+
196+
/// Convenience method for getting the real underlying kind, looking through LargeIndex.
197+
EnumTy getKind() const {
198+
// First grab the bits of the projection excluding the top NumIndexKindBits
199+
// bits. If these bits take on a value <= 4095, then we have a small index.
200+
if ((Index & IndexKindOffsetBitMask) <= MaxSmallIndex) {
201+
return EnumTy(unsigned(Index >> IndexKindBitOffset));
202+
}
203+
204+
// Otherwise, we have some sort of pointer. If the kind is not the kind for a
205+
// large index, return the kind.
206+
auto Kind = EnumTy(unsigned(Index >> PointerKindBitOffset));
207+
if (Kind != EnumTy::LargeIndex)
208+
return Kind;
209+
210+
// Ok, we *do* have an index type, but the index is >= 4096. Thus we know
211+
// that the Index is really a pointer to a single uint64_t value that was
212+
// allocated to store our index. Grab the kind from the low
213+
// NumIndexKindBits (currently 4) bits of the 64 bit word.
214+
uint64_t Value;
215+
memcpy(&Value, getRawPointer(), sizeof(Value));
216+
return EnumTy(unsigned(Value & IndexKindBitMask));
217+
}
218+
219+
/// Convenience method for getting the underlying index. Assumes that this
220+
/// projection is valid. Otherwise it asserts.
221+
unsigned getIndex() const {
222+
assert(unsigned(getRawKind()) > unsigned(EnumTy::LastPointerKind) &&
223+
"Not an index new projection kind");
224+
// Just return the bottom 12 bits if we have a small index.
225+
if (getRawKind() != EnumTy::LargeIndex)
226+
return unsigned(Index & MaxSmallIndex);
227+
228+
// Otherwise, we have a large index. Convert our index into a pointer
229+
uint64_t Value;
230+
memcpy(&Value, getRawPointer(), sizeof(Value));
231+
return unsigned(Value >> NumIndexKindBits);
232+
}
233+
234+
/// Convenience method for getting the raw underlying index as a pointer.
235+
void *getRawPointer() const {
236+
assert((unsigned(getRawKind()) <= unsigned(EnumTy::LastPointerKind) ||
237+
getRawKind() == EnumTy::LargeIndex) &&
238+
"Not a pointer projection kind");
239+
// We assume that all of the types of pointers that are stored are 8 byte
240+
// aligned. We store our pointer in the bottom 61 bits, so just shift left by
241+
// 3 to drop the kind bits and reinterpret_cast to a void *.
242+
return reinterpret_cast<void *>(Index << NumPointerKindBits);
243+
}
244+
245+
PointerTy getPointer() const {
246+
return PtrTraits::getFromVoidPointer(getRawPointer());
247+
}
248+
249+
/// Convenience method for getting the raw underlying kind. This means that we
250+
/// will return LargeIndex as a kind instead of returning the kind from the
251+
/// lower bits of the malloced large index.
252+
EnumTy getRawKind() const {
253+
// First grab the bits of the projection excluding the top NumIndexKindBits
254+
// bits. If these bits take on a value <= 4095, then we have a small index.
255+
if ((Index & IndexKindOffsetBitMask) <= MaxSmallIndex) {
256+
return EnumTy(unsigned(Index >> IndexKindBitOffset));
257+
}
258+
259+
// Otherwise, we have some sort of pointer.
260+
return EnumTy(unsigned(Index >> PointerKindBitOffset));
261+
}
262+
263+
private:
264+
/// Initialize this PointerIntEnum with the kind \p Kind and the index \p
265+
/// NewIndex.
266+
///
267+
/// This is an internal helper routine that should not be used directly since
268+
/// it does not properly handle freeing memory.
269+
void initWithIndex(EnumTy Kind, unsigned NewIndex) {
270+
// If new index is less than the max Small Index, then quickly initialize.
271+
if (NewIndex <= MaxSmallIndex) {
272+
// Initialize Index with NewIndex.
273+
Index = NewIndex;
274+
// We store the Kind in the upper 4 bits.
275+
Index |= uintptr_t(Kind) << IndexKindBitOffset;
276+
return;
277+
}
278+
279+
// We store the index, shifted to the left by 4 bits and the kind in the
280+
// bottom 4 bits.
281+
uint64_t FinalNewIndex = uint64_t(NewIndex) << NumIndexKindBits;
282+
FinalNewIndex |= unsigned(Kind);
283+
284+
// If we have a large index, malloc the memory and initialize it with our
285+
// new index.
286+
initWithPointer(EnumTy::LargeIndex, new uint64_t(FinalNewIndex));
287+
}
288+
289+
/// Initialize this PointerIntEnum with the kind \p Kind and the Pointer \p
290+
/// Ptr.
291+
///
292+
/// This is an internal helper routine that should not be used directly since
293+
/// it does not properly handle freeing memory.
294+
void initWithPointer(EnumTy Kind, void *Ptr) {
295+
// Make sure the pointer is at least 8 byte aligned.
296+
assert((uintptr_t(Ptr) & ((1 << NumPointerKindBits) - 1)) == 0);
297+
Index = uintptr_t(Ptr) >> NumPointerKindBits;
298+
Index |= uintptr_t(Kind) << PointerKindBitOffset;
299+
}
300+
301+
/// If we have an index payload that is greater than 4095, this routine frees
302+
/// the malloced memory.
303+
void freeMemory() {
304+
assert(getRawKind() == EnumTy::LargeIndex &&
305+
"Freeing memory of a non-large index enum");
306+
void *Ptr = getRawPointer();
307+
if (ScribbleMemory) {
308+
uint64_t SentinelValue = -1ULL;
309+
memcpy(Ptr, &SentinelValue, sizeof(SentinelValue));
310+
return;
311+
}
312+
delete reinterpret_cast<uint64_t *>(getRawPointer());
313+
}
314+
};
315+
316+
} // end swift namespace
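The enum shape required by the header's doc comment can be checked at compile time. The following is a sketch under stated assumptions, not part of the commit: `ExampleKind` and `hasPointerIntEnumShape` are hypothetical names, and the check for index kinds only looks at `LastIndexKind`, so it is a partial test of requirement 4.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical enum following the documented shape, assuming 3 tagged bits.
enum class ExampleKind : uint64_t {
  Invalid = 0,

  // Pointer kinds.
  Ptr1 = 1,
  Ptr2 = 2,
  LastPointerKind = Ptr2,

  // Index kinds.
  LargeIndex = 7,
  Index1 = 8,
  Index2 = 9,
  LastIndexKind = Index2,
};

// Hypothetical helper (not in PointerIntEnum.h): returns true when EnumTy
// satisfies the documented requirements for NumTaggedBits tagged bits.
template <typename EnumTy, unsigned NumTaggedBits>
constexpr bool hasPointerIntEnumShape() {
  return
      // 1. Invalid must be zero.
      uint64_t(EnumTy::Invalid) == 0 &&
      // 2. Pointer kinds must sit below LargeIndex so they fit in the
      //    tagged bits.
      uint64_t(EnumTy::LastPointerKind) < uint64_t(EnumTy::LargeIndex) &&
      // 3. LargeIndex must equal (1 << NumTaggedBits) - 1.
      uint64_t(EnumTy::LargeIndex) == (uint64_t(1) << NumTaggedBits) - 1 &&
      // 4. (Partial) the last index kind must be above LargeIndex.
      uint64_t(EnumTy::LastIndexKind) > uint64_t(EnumTy::LargeIndex);
}

static_assert(hasPointerIntEnumShape<ExampleKind, 3>(),
              "ExampleKind should match the documented shape for 3 bits");
```

Placing a `static_assert` like this next to each enum used with the class would turn a mis-numbered case into a compile error instead of a silent mis-decoding at run time.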

unittests/Basic/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
@@ -19,6 +19,7 @@ add_swift_unittest(SwiftBasicTests
 SuccessorMapTest.cpp
 Unicode.cpp
 BlotMapVectorTest.cpp
+PointerIntEnumTest.cpp

 ${generated_tests}
 )
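The contents of PointerIntEnumTest.cpp are not shown on this page. As an illustration of the kind of round trip such a test could exercise, here is a minimal standalone sketch of the out-of-line word that `initWithIndex` builds for large indices: the index shifted left by the kind bits, with the kind in the low bits. The function names are hypothetical and `kLargeIndexKindBits = 4` is an assumed value for `NumIndexKindBits`.

```cpp
#include <cassert>
#include <cstdint>

// Assumed value of NumIndexKindBits for this sketch.
constexpr unsigned kLargeIndexKindBits = 4;

// Build the 64-bit out-of-line word: index in the high bits, kind in the
// low kLargeIndexKindBits bits.
inline uint64_t encodeLargeIndex(unsigned kind, unsigned index) {
  assert(kind < (1u << kLargeIndexKindBits));
  return (uint64_t(index) << kLargeIndexKindBits) | kind;
}

// Recover the kind from the low bits of the out-of-line word.
inline unsigned largeIndexKind(uint64_t value) {
  return unsigned(value & ((uint64_t(1) << kLargeIndexKindBits) - 1));
}

// Recover the index by shifting the kind bits back out.
inline unsigned largeIndexValue(uint64_t value) {
  return unsigned(value >> kLargeIndexKindBits);
}
```

Because the word is 64 bits wide and the index is an `unsigned`, the shift never discards index bits on platforms where `unsigned` is 32 bits.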
