Commit bf935a0

[clangd] Make categorical features 64 bit in DecisionForest Model.
CodeCompletionContext::Kind has 36 kinds, but the completion model only supported categorical features with a cardinality of up to 32. As a result, clangd tests were failing under ASan because of an overflow. This patch makes the completion model support categorical features with a cardinality of up to 64 by storing ENUM features as uint64_t instead of uint32_t. Verified that this fixes the ASan failures.

Latency: 6.7ms (old) vs. 6.8ms (new) per 1000 predictions.

Differential Revision: https://reviews.llvm.org/D97770
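A minimal sketch of the encoding issue, assuming ENUM features are stored one-hot (bit V set for value V) as the generated setter suggests. The helper names `encodeEnum` and `matchesAny` are hypothetical and do not exist in clangd; they only illustrate why the shift needs a 64-bit operand once a feature has more than 32 values.

```cpp
#include <cstdint>

// Hypothetical helper (not part of clangd): one-hot encodes an enum value,
// mirroring the generated setter body `Feature = 1LL << V`.
constexpr uint64_t encodeEnum(unsigned V) {
  // `1 << V` has type int (typically 32 bits), so shifting by 32 or more
  // is undefined behavior. `1LL` is at least 64 bits wide, which makes
  // shifts by 0..62 well defined.
  return 1LL << V;
}

// Hypothetical membership test in the style of a BIT(X)-based mask check.
constexpr bool matchesAny(uint64_t Encoded, uint64_t Mask) {
  return (Encoded & Mask) != 0;
}

// Value 35 (the highest of 36 kinds, counting from 0) needs bit 35,
// which does not fit in a uint32_t.
static_assert(encodeEnum(35) == 0x800000000ULL, "bit 35 fits in 64 bits");
```

With 36 context kinds the largest shift is 35, so a 64-bit field leaves headroom without changing the one-hot scheme.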
1 parent 0caf736 commit bf935a0

2 files changed: 7 additions, 6 deletions

clang-tools-extra/clangd/benchmarks/CompletionModel/DecisionForestBenchmark.cpp

Lines changed: 1 addition & 1 deletion
@@ -51,7 +51,7 @@ std::vector<Example> generateRandomDataset(int NumExamples) {
                             : RandInt(20));
     E.setSemaSaysInScope(FlipCoin(0.5)); // Boolean.
     E.setScope(RandInt(4)); // 4 Scopes.
-    E.setContextKind(RandInt(32)); // 32 Context kinds.
+    E.setContextKind(RandInt(36)); // 36 Context kinds.
     E.setIsInstanceMember(FlipCoin(0.5)); // Boolean.
     E.setHadContextType(FlipCoin(0.6)); // Boolean.
     E.setHadSymbolType(FlipCoin(0.6)); // Boolean.

clang-tools-extra/clangd/quality/CompletionModelCodegen.py

Lines changed: 6 additions & 5 deletions
@@ -131,18 +131,19 @@ class can be used to represent a code completion candidate.
                 feature, feature))
         elif f["kind"] == "ENUM":
             setters.append(
-                "void set%s(unsigned V) { %s = 1 << V; }" % (feature, feature))
+                "void set%s(unsigned V) { %s = 1LL << V; }" % (feature, feature))
         else:
             raise ValueError("Unhandled feature type.", f["kind"])
 
     # Class members represent all the features of the Example.
     class_members = [
-        "uint32_t %s = 0;" % f['name']
+        "uint%d_t %s = 0;"
+        % (64 if f["kind"] == "ENUM" else 32, f['name'])
         for f in features_json
     ]
     getters = [
-        "LLVM_ATTRIBUTE_ALWAYS_INLINE uint32_t get%s() const { return %s; }"
-        % (f['name'], f['name'])
+        "LLVM_ATTRIBUTE_ALWAYS_INLINE uint%d_t get%s() const { return %s; }"
+        % (64 if f["kind"] == "ENUM" else 32, f['name'], f['name'])
         for f in features_json
     ]
     nline = "\n "
@@ -245,7 +246,7 @@ def gen_cpp_code(forest_json, features_json, filename, cpp_class):
 
 %s
 
-#define BIT(X) (1 << X)
+#define BIT(X) (1LL << X)
 
 %s
 
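For illustration, here is roughly what the templates in CompletionModelCodegen.py might emit for one ENUM feature and one non-ENUM feature after this change. This is a hand-written sketch, not actual generator output: the class name `ExampleSketch` is hypothetical, the boolean setter body is an assumption (the diff only shows the ENUM setter), and `LLVM_ATTRIBUTE_ALWAYS_INLINE` is omitted so the snippet compiles without LLVM headers.

```cpp
#include <cstdint>

// Hand-written approximation of a generated class; names of the two
// features are taken from the benchmark's setContextKind and
// setIsInstanceMember calls.
class ExampleSketch {
public:
  // ENUM feature: one-hot encoded with a 64-bit shift.
  void setContextKind(unsigned V) { ContextKind = 1LL << V; }
  // Non-ENUM feature: setter body is assumed, the field stays 32 bits.
  void setIsInstanceMember(bool V) { IsInstanceMember = V; }

  // Getter width follows the field: uint64_t for ENUM, uint32_t otherwise.
  uint64_t getContextKind() const { return ContextKind; }
  uint32_t getIsInstanceMember() const { return IsInstanceMember; }

private:
  uint64_t ContextKind = 0;
  uint32_t IsInstanceMember = 0;
};
```

Only ENUM features widen to 64 bits; keeping the other fields at 32 bits is consistent with the small latency difference reported in the commit message.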
