Commit 96a0c79

[basic] Add a simple vector backed 2 stage multi map.
I have been using this in a bunch of places in the compiler, and rather than implement it by hand over and over (and maybe mess it up), this commit adds a correct implementation.

This data structure is a map backed by a vector-like data structure. It has two phases:

1. An insertion phase, when the map is mutable and one inserts (key, value) pairs into the map. These are just appended to the storage array.
2. A frozen phase, when the map is immutable and one can perform map queries on the multimap.

The map transitions from the mutable, thawed phase to the immutable, frozen phase by performing a stable_sort of its internal storage by key only. Since this is a stable_sort, the relative insertion order of values is preserved when their keys are equal. Thus the sort creates contiguous regions in the array of values, all mapped to the same key, that are in insertion order. By finding the lower_bound for a given key, we are guaranteed to get the first element in that contiguous range. We can then search forward to find the end of the region, allowing us to return an ArrayRef to these internal values.

The reason I keep finding myself using this is that it lets one map a key to an array of values without storing small vectors in a map or using heap-allocated memory: all (key, value) pairs are stored inline (potentially in a single SmallVector, if one uses SmallFrozenMultiMap).
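The two phases described in the commit message can be sketched with the standard library alone. This is a minimal illustration, not the code added by this commit: the class name `TwoStageMultiMap`, the method `freeze`, and the choice to return a pair of iterators instead of an `ArrayRef` are assumptions made so the example has no LLVM dependency.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for the data structure described above.
template <typename Key, typename Value>
class TwoStageMultiMap {
  using Pair = std::pair<Key, Value>;
  std::vector<Pair> storage;
  bool frozen = false;

public:
  using const_iterator = typename std::vector<Pair>::const_iterator;

  // Phase 1: append the pair; no ordering is maintained yet.
  void insert(const Key &key, const Value &value) {
    assert(!frozen && "cannot insert once frozen");
    storage.emplace_back(key, value);
  }

  // Transition: sort by key only. Because the sort is stable, pairs with
  // equal keys keep their relative insertion order.
  void freeze() {
    std::stable_sort(storage.begin(), storage.end(),
                     [](const Pair &lhs, const Pair &rhs) {
                       return lhs.first < rhs.first;
                     });
    frozen = true;
  }

  // Phase 2: return the contiguous [begin, end) run of pairs stored under
  // `key`. The run is empty if the key is absent.
  std::pair<const_iterator, const_iterator> find(const Key &key) const {
    assert(frozen && "cannot query until frozen");
    // Binary search for the first pair whose key is not less than `key`.
    auto start = std::lower_bound(
        storage.begin(), storage.end(), key,
        [](const Pair &elt, const Key &k) { return elt.first < k; });
    // Linear scan forward to the end of the run of equal keys.
    auto end = std::find_if_not(start, storage.end(), [&](const Pair &elt) {
      return elt.first == key;
    });
    return {start, end};
  }
};
```

Inserting (2, 'a'), (1, 'b'), (2, 'c') and then freezing leaves the storage as (1, 'b'), (2, 'a'), (2, 'c'), so a query for key 2 yields 'a' followed by 'c', matching insertion order.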
1 parent d6eebe9 commit 96a0c79

3 files changed

+375
-0
lines changed

include/swift/Basic/FrozenMultiMap.h

Lines changed: 187 additions & 0 deletions
@@ -0,0 +1,187 @@
1+
//===--- FrozenMultiMap.h ----------------------------------*- C++ --------===//
2+
//
3+
// This source file is part of the Swift.org open source project
4+
//
5+
// Copyright (c) 2014 - 2020 Apple Inc. and the Swift project authors
6+
// Licensed under Apache License v2.0 with Runtime Library Exception
7+
//
8+
// See https://swift.org/LICENSE.txt for license information
9+
// See https://swift.org/CONTRIBUTORS.txt for the list of Swift project authors
10+
//
11+
//===----------------------------------------------------------------------===//
12+
///
13+
/// \file
14+
///
15+
/// A 2 stage multi-map. Initially the multimap is mutable and only supports
16+
/// insertion. Once frozen, the map is immutable and only supports map
17+
/// queries. Values for a given key are guaranteed to be in insertion order.
18+
///
19+
/// DISCUSSION: These restrictions flow from the internal implementation of the
20+
/// multi-map being a vector of (key, value) pairs. We form the map property by
21+
/// performing a stable_sort of the pairs by key in the process of freezing the
22+
/// map.
23+
///
24+
//===----------------------------------------------------------------------===//
25+
26+
#ifndef SWIFT_BASIC_FROZENMULTIMAP_H
27+
#define SWIFT_BASIC_FROZENMULTIMAP_H
28+
29+
#include "swift/Basic/LLVM.h"
30+
#include "swift/Basic/STLExtras.h"
31+
#include "llvm/ADT/SmallVector.h"
32+
#include <vector>
33+
34+
namespace swift {
35+
36+
template <typename Key, typename Value,
37+
typename VectorStorage = std::vector<std::pair<Key, Value>>>
38+
class FrozenMultiMap {
39+
VectorStorage storage;
40+
bool frozen = false;
41+
42+
private:
43+
struct PairToSecondElt;
44+
45+
public:
46+
using PairToSecondEltRange =
47+
TransformRange<ArrayRef<std::pair<Key, Value>>, PairToSecondElt>;
48+
49+
FrozenMultiMap() = default;
50+
51+
void insert(const Key &key, const Value &value) {
52+
assert(!isFrozen() && "Can not insert new keys once map is frozen");
53+
storage.emplace_back(key, value);
54+
}
55+
56+
Optional<PairToSecondEltRange> find(const Key &key) const {
57+
assert(isFrozen() &&
58+
"Can not perform a find operation until the map is frozen");
59+
// Since our array is sorted by key, first find the first pair with our
60+
// key as the first element.
61+
auto start = std::lower_bound(
62+
storage.begin(), storage.end(), std::make_pair(key, Value()),
63+
[&](const std::pair<Key, Value> &p1, const std::pair<Key, Value> &p2) {
64+
return p1.first < p2.first;
65+
});
66+
if (start == storage.end() || start->first != key) {
67+
return None;
68+
}
69+
70+
// Ok, we found our first element. Now scan forward until we find a pair
71+
// whose key is not our own key.
72+
auto end = std::find_if_not(
73+
start, storage.end(),
74+
[&](const std::pair<Key, Value> &pair) { return pair.first == key; });
75+
unsigned count = std::distance(start, end);
76+
ArrayRef<std::pair<Key, Value>> slice(&*start, count);
77+
return PairToSecondEltRange(slice, PairToSecondElt());
78+
}
79+
80+
bool isFrozen() const { return frozen; }
81+
82+
/// Set this map into its frozen state. This stable sorts the storage by key.
83+
void setFrozen() {
84+
std::stable_sort(storage.begin(), storage.end(),
85+
[&](const std::pair<Key, Value> &lhs,
86+
const std::pair<Key, Value> &rhs) {
87+
// Only compare the first entry so that we preserve
88+
// insertion order.
89+
return lhs.first < rhs.first;
90+
});
91+
frozen = true;
92+
}
93+
94+
unsigned size() const { return storage.size(); }
95+
bool empty() const { return storage.empty(); }
96+
97+
struct iterator : std::iterator<std::forward_iterator_tag,
98+
std::pair<Key, PairToSecondEltRange>> {
99+
using base_iterator = typename decltype(storage)::iterator;
100+
101+
FrozenMultiMap &map;
102+
base_iterator baseIter;
103+
Optional<std::pair<Key, PairToSecondEltRange>> currentValue;
104+
105+
iterator(FrozenMultiMap &map, base_iterator iter)
106+
: map(map), baseIter(iter), currentValue() {
107+
// If we are end, just return.
108+
if (iter == map.storage.end()) {
109+
return;
110+
}
111+
112+
// Otherwise, prime our first range.
113+
updateCurrentValue();
114+
}
115+
116+
void updateCurrentValue() {
117+
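base_iterator end = map.storage.end();
// If we have advanced to the end, there is no range left to form. Clear the
// cached value instead of calling std::next on the end iterator.
if (baseIter == end) {
currentValue = None;
return;
}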
118+
auto rangeEnd = std::find_if_not(std::next(baseIter), end,
119+
[&](const std::pair<Key, Value> &elt) {
120+
return elt.first == baseIter->first;
121+
});
122+
unsigned count = std::distance(baseIter, rangeEnd);
123+
ArrayRef<std::pair<Key, Value>> slice(&*baseIter, count);
124+
currentValue = {baseIter->first,
125+
PairToSecondEltRange(slice, PairToSecondElt())};
126+
}
127+
128+
iterator &operator++() {
129+
baseIter = std::find_if_not(std::next(baseIter), map.storage.end(),
130+
[&](const std::pair<Key, Value> &elt) {
131+
return elt.first == baseIter->first;
132+
});
133+
updateCurrentValue();
134+
return *this;
135+
}
136+
137+
iterator operator++(int) {
138+
auto tmp = *this;
139+
baseIter = std::find_if_not(std::next(baseIter), map.storage.end(),
140+
[&](const std::pair<Key, Value> &elt) {
141+
return elt.first == baseIter->first;
142+
});
143+
updateCurrentValue();
144+
return tmp;
145+
}
146+
147+
std::pair<Key, PairToSecondEltRange> operator*() const {
148+
return *currentValue;
149+
}
150+
151+
bool operator==(const iterator &RHS) const {
152+
return baseIter == RHS.baseIter;
153+
}
154+
155+
bool operator!=(const iterator &RHS) const {
156+
return baseIter != RHS.baseIter;
157+
}
158+
};
159+
160+
/// Return a range of (key, ArrayRef<Value>) pairs. The keys are guaranteed to
161+
/// be in key sorted order and the ArrayRef<Value> are in insertion order.
162+
llvm::iterator_range<iterator> getRange() const {
163+
assert(isFrozen() &&
164+
"Can not create range until data structure is frozen?!");
165+
auto *self = const_cast<FrozenMultiMap *>(this);
166+
iterator iter1 = iterator(*self, self->storage.begin());
167+
iterator iter2 = iterator(*self, self->storage.end());
168+
return llvm::make_range(iter1, iter2);
169+
}
170+
};
171+
172+
template <typename Key, typename Value, typename Storage>
173+
struct FrozenMultiMap<Key, Value, Storage>::PairToSecondElt {
174+
PairToSecondElt() {}
175+
176+
Value operator()(const std::pair<Key, Value> &pair) const {
177+
return pair.second;
178+
}
179+
};
180+
181+
template <typename Key, typename Value, unsigned SmallSize>
182+
using SmallFrozenMultiMap =
183+
FrozenMultiMap<Key, Value, SmallVector<std::pair<Key, Value>, SmallSize>>;
184+
185+
} // namespace swift
186+
187+
#endif
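The iterator above walks the sorted storage one run of equal keys at a time. The same walk can be sketched as a free function over a key-sorted `std::vector`. The names `KeyGroup` and `groupByKey` are hypothetical, and the fixed `int` key and value types are chosen only to keep the example short. The loop condition checks for the end before each call to `std::next`, which is the property an iterator of this kind has to maintain when it is incremented past the last group.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical summary of one run of equal keys: the key, the index of the
// run's first pair, and the number of pairs in the run.
struct KeyGroup {
  int key;
  std::size_t offset;
  std::size_t count;
};

// Assumes `sorted` is already ordered by key (for example by stable_sort).
std::vector<KeyGroup>
groupByKey(const std::vector<std::pair<int, int>> &sorted) {
  std::vector<KeyGroup> groups;
  auto it = sorted.begin();
  const auto end = sorted.end();
  // Testing `it != end` first guarantees std::next(it) is never applied to
  // the end iterator.
  while (it != end) {
    auto groupEnd =
        std::find_if_not(std::next(it), end,
                         [&](const std::pair<int, int> &elt) {
                           return elt.first == it->first;
                         });
    groups.push_back(
        {it->first,
         static_cast<std::size_t>(std::distance(sorted.begin(), it)),
         static_cast<std::size_t>(std::distance(it, groupEnd))});
    it = groupEnd;
  }
  return groups;
}
```

Each `KeyGroup` corresponds to one element that the range returned by `getRange()` would produce: a key paired with a contiguous slice of the storage.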

unittests/Basic/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
@@ -15,6 +15,7 @@ add_swift_unittest(SwiftBasicTests
1515
EncodedSequenceTest.cpp
1616
ExponentialGrowthAppendingBinaryByteStreamTests.cpp
1717
FileSystemTest.cpp
18+
FrozenMultiMapTest.cpp
1819
ImmutablePointerSetTest.cpp
1920
JSONSerialization.cpp
2021
OptionSetTest.cpp
unittests/Basic/FrozenMultiMapTest.cpp

Lines changed: 187 additions & 0 deletions
@@ -0,0 +1,187 @@
1+
//===--- FrozenMultiMapTest.cpp -------------------------------------------===//
2+
//
3+
// This source file is part of the Swift.org open source project
4+
//
5+
// Copyright (c) 2014 - 2020 Apple Inc. and the Swift project authors
6+
// Licensed under Apache License v2.0 with Runtime Library Exception
7+
//
8+
// See https://swift.org/LICENSE.txt for license information
9+
// See https://swift.org/CONTRIBUTORS.txt for the list of Swift project authors
10+
//
11+
//===----------------------------------------------------------------------===//
12+
13+
#define DEBUG_TYPE "swift-frozen-multi-map-test"
14+
15+
#include "swift/Basic/FrozenMultiMap.h"
16+
#include "swift/Basic/LLVM.h"
17+
#include "swift/Basic/Lazy.h"
18+
#include "swift/Basic/NullablePtr.h"
19+
#include "swift/Basic/Range.h"
20+
#include "swift/Basic/STLExtras.h"
21+
#include "llvm/ADT/Optional.h"
22+
#include "llvm/ADT/STLExtras.h"
23+
#include "llvm/ADT/SmallString.h"
24+
#include "llvm/ADT/StringExtras.h"
25+
#include "llvm/Support/Debug.h"
26+
#include "llvm/Support/raw_ostream.h"
27+
#include "gtest/gtest.h"
28+
#include <chrono>
#include <map>
29+
#include <random>
30+
#include <set>
31+
32+
using namespace swift;
33+
34+
namespace {
35+
36+
class Canary {
37+
static unsigned currentID;
38+
unsigned id;
39+
40+
public:
41+
static void resetIDs() { currentID = 0; }
42+
Canary(unsigned id) : id(id) {}
43+
Canary() {
44+
id = currentID;
45+
++currentID;
46+
}
47+
48+
unsigned getID() const { return id; }
49+
bool operator<(const Canary &other) const { return id < other.id; }
50+
51+
bool operator==(const Canary &other) const { return id == other.id; }
52+
53+
bool operator!=(const Canary &other) const { return !(*this == other); }
54+
};
55+
56+
unsigned Canary::currentID = 0;
57+
58+
} // namespace
59+
60+
TEST(FrozenMultiMapCustomTest, SimpleFind) {
61+
Canary::resetIDs();
62+
FrozenMultiMap<Canary, Canary> map;
63+
64+
auto key1 = Canary();
65+
auto key2 = Canary();
66+
map.insert(key1, Canary());
67+
map.insert(key1, Canary());
68+
map.insert(key1, Canary());
69+
map.insert(key2, Canary());
70+
map.insert(key2, Canary());
71+
72+
map.setFrozen();
73+
74+
EXPECT_EQ(map.size(), 5u);
75+
{
76+
auto r = map.find(key1);
77+
EXPECT_TRUE(r.hasValue());
78+
EXPECT_EQ(r->size(), 3u);
79+
EXPECT_EQ((*r)[0].getID(), 2u);
80+
EXPECT_EQ((*r)[1].getID(), 3u);
81+
EXPECT_EQ((*r)[2].getID(), 4u);
82+
}
83+
84+
{
85+
auto r = map.find(key2);
86+
EXPECT_TRUE(r.hasValue());
87+
EXPECT_EQ(r->size(), 2u);
88+
EXPECT_EQ((*r)[0].getID(), 5u);
89+
EXPECT_EQ((*r)[1].getID(), 6u);
90+
}
91+
}
92+
93+
TEST(FrozenMultiMapCustomTest, SimpleIter) {
94+
Canary::resetIDs();
95+
FrozenMultiMap<Canary, Canary> map;
96+
97+
auto key1 = Canary();
98+
auto key2 = Canary();
99+
map.insert(key1, Canary());
100+
map.insert(key1, Canary());
101+
map.insert(key1, Canary());
102+
map.insert(key2, Canary());
103+
map.insert(key2, Canary());
104+
105+
map.setFrozen();
106+
107+
EXPECT_EQ(map.size(), 5u);
108+
109+
auto range = map.getRange();
110+
111+
EXPECT_EQ(std::distance(range.begin(), range.end()), 2);
112+
113+
auto iter = range.begin();
114+
{
115+
auto p = *iter;
116+
EXPECT_EQ(p.first.getID(), key1.getID());
117+
EXPECT_EQ(p.second.size(), 3u);
118+
EXPECT_EQ(p.second[0].getID(), 2u);
119+
EXPECT_EQ(p.second[1].getID(), 3u);
120+
EXPECT_EQ(p.second[2].getID(), 4u);
121+
}
122+
123+
++iter;
124+
{
125+
auto p = *iter;
126+
EXPECT_EQ(p.first.getID(), key2.getID());
127+
EXPECT_EQ(p.second.size(), 2u);
128+
EXPECT_EQ(p.second[0].getID(), 5u);
129+
EXPECT_EQ(p.second[1].getID(), 6u);
130+
}
131+
}
132+
133+
TEST(FrozenMultiMapCustomTest, RandomAgainstStdMultiMap) {
134+
Canary::resetIDs();
135+
FrozenMultiMap<unsigned, unsigned> map;
136+
std::multimap<unsigned, unsigned> stdMultiMap;
137+
138+
auto seed =
139+
std::chrono::high_resolution_clock::now().time_since_epoch().count();
140+
std::mt19937 mt_rand(seed);
141+
142+
std::vector<unsigned> keyIdList;
143+
for (unsigned i = 0; i < 1024; ++i) {
144+
unsigned keyID = mt_rand() % 20;
145+
keyIdList.push_back(keyID);
146+
for (unsigned valueID = (mt_rand()) % 15; valueID < 15; ++valueID) {
147+
map.insert(keyID, valueID);
148+
stdMultiMap.emplace(keyID, valueID);
149+
}
150+
}
151+
152+
map.setFrozen();
153+
154+
// Then for each key.
155+
for (unsigned i : keyIdList) {
156+
// Make sure that we have the same elements in the same order for each key.
157+
auto range = *map.find(i);
158+
auto stdRange = stdMultiMap.equal_range(i);
159+
EXPECT_EQ(std::distance(range.begin(), range.end()),
160+
std::distance(stdRange.first, stdRange.second));
161+
auto modernStdRange = llvm::make_range(stdRange.first, stdRange.second);
162+
for (auto p : llvm::zip(range, modernStdRange)) {
163+
unsigned lhs = std::get<0>(p);
164+
unsigned rhs = std::get<1>(p).second;
165+
EXPECT_EQ(lhs, rhs);
166+
}
167+
}
168+
169+
// Then check that when we iterate over both ranges, we get the same order.
170+
{
171+
auto range = map.getRange();
172+
auto rangeIter = range.begin();
173+
auto stdRangeIter = stdMultiMap.begin();
174+
175+
while (rangeIter != range.end()) {
176+
auto rangeElt = *rangeIter;
177+
178+
for (unsigned i : indices(rangeElt.second)) {
179+
EXPECT_EQ(rangeElt.first, stdRangeIter->first);
180+
EXPECT_EQ(rangeElt.second[i], stdRangeIter->second);
181+
++stdRangeIter;
182+
}
183+
184+
++rangeIter;
185+
}
186+
}
187+
}
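The randomized test relies on `std::multimap` keeping equal keys in insertion order, which is the same order a stable sort by key produces. A standalone sketch of that equivalence follows. The function name `stableSortMatchesMultimap` is hypothetical, and unlike the test above it takes an explicit seed instead of reading the clock, on the assumption that a fixed seed is preferable when a failure needs to be reproduced.

```cpp
#include <algorithm>
#include <map>
#include <random>
#include <utility>
#include <vector>

// Returns true when stable-sorting a flat vector of pairs by key yields the
// same (key, value) sequence as iterating a std::multimap that was filled
// in the same order.
bool stableSortMatchesMultimap(unsigned seed) {
  std::mt19937 rng(seed);
  std::vector<std::pair<unsigned, unsigned>> flat;
  std::multimap<unsigned, unsigned> reference;

  for (unsigned i = 0; i < 1024; ++i) {
    unsigned key = rng() % 20;
    unsigned value = rng() % 15;
    flat.emplace_back(key, value);
    // Since C++11, an element with an equivalent key is inserted at the
    // upper bound of the existing range, so insertion order is preserved.
    reference.emplace(key, value);
  }

  std::stable_sort(flat.begin(), flat.end(),
                   [](const std::pair<unsigned, unsigned> &lhs,
                      const std::pair<unsigned, unsigned> &rhs) {
                     return lhs.first < rhs.first;
                   });

  // Compare member-wise because the multimap's value_type has a const key.
  return std::equal(flat.begin(), flat.end(), reference.begin(),
                    reference.end(), [](const auto &a, const auto &b) {
                      return a.first == b.first && a.second == b.second;
                    });
}
```

Calling the function with a few fixed seeds exercises different interleavings of keys while keeping every run repeatable.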
