-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[NFC] [clang] Use std::string instead of StringRef to reduce stack usage #114285
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Comparisons between StringRef and string literals are lowered by MSVC to happen on the stack. All string literals are allocated unique stack slots increasing the overall stack usage of calculateAttributeSpellingListIndex. Use of std::string forces allocation of a string type per literal, reducing the overall stack usage of the function. Change-Id: I7ac39fc96ab2e559318413fa26ef304c3fb0bd6e
@llvm/pr-subscribers-clang Author: Chinmay Deshpande (chinmaydd) ChangesComparisons between StringRef and string literals are lowered by MSVC to happen on the stack. All string literals are allocated unique stack slots increasing the overall stack usage of Full diff: https://github.com/llvm/llvm-project/pull/114285.diff 1 Files Affected:
diff --git a/clang/lib/Basic/Attributes.cpp b/clang/lib/Basic/Attributes.cpp
index 867d241a2cf847..0083a5ae0a55fb 100644
--- a/clang/lib/Basic/Attributes.cpp
+++ b/clang/lib/Basic/Attributes.cpp
@@ -157,8 +157,10 @@ unsigned AttributeCommonInfo::calculateAttributeSpellingListIndex() const {
// Both variables will be used in tablegen generated
// attribute spell list index matching code.
auto Syntax = static_cast<AttributeCommonInfo::Syntax>(getSyntax());
- StringRef Scope = normalizeAttrScopeName(getScopeName(), Syntax);
- StringRef Name = normalizeAttrName(getAttrName(), Scope, Syntax);
+ // We use std::string instead of StringRef to prevent local stack
+ // allocation of literal strings for comparison.
+ const std::string Scope = normalizeAttrScopeName(getScopeName(), Syntax);
+ const std::string Name = normalizeAttrName(getAttrName(), Scope, Syntax);
#include "clang/Sema/AttrSpellingListIndex.inc"
}
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't see how or why this would introduce stackusage. std::string temporaries should be avoided whenever possible
The code that's included afterwards contains a lot of comparisons with string literals:
IIUC, @chinmaydd claims that MSVC allocates these literals on the stack, and they take up more space than a |
That sounds like MSVC's problem to solve. Why does the amount of stack size matter |
Yeah, I'd like to know what the benefit is -- |
I was able to see the internal bug report. This shows up when some code on Windows runs clang (via a library call), and it ends up crashing due to stack overflow. The crash happens in this function, in "__chkstk" to be exact. Edit: I'm afraid that reducing stack usage in this function will only delay the problem until later. I don't know if the code can be set up to have more stack space allocated, but I think that this would be the proper thing to do. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you really want to work around MSVC's bug, you could change tablegen to emit the string literals as constant globals / StringLiteral
I spent a bit thinking about this, and this function is incredibly wasteful. It ends up re-doing a ton of work that is ALREADY done. It was convenient at the time to write it that way, but with a little intelligence at tablegen, we can simplify this function a million-times. Some things we can do (and should do all): 2- we could reduce the number of comparisons against the name:
3- STOP being dummies and re-doing all the work that the For cases where there are multiple spellings, we can compare length! For example: We already know it is ONE of those two, and just need to figure out which. IN SUMMARY: I am convinced somewhat we can just teach tablegen to be 'smarter' about this file and save ourselves ~100% of those string compares and replace them with numeric comparisons, or enum comparisons. The result is something that is: |
BTW, @chinmaydd : Please let me know if this is something you're able to do. I might be able to free up some cycles later this week/next week to put effort into this if not. |
Thank you everyone for the review. On my local setup, this function occupies close to 185KB of local stack (MSVC 14.38.33130), which is really wasteful. Like @kparzysz highlighted, we observed a crash in Another such function we observed was MSVC does not provide a flag such as @erichkeane thanks for taking the time to suggest a clear way forward. I'll write a patch that attempts to achieve the same. I'll close this PR for now. |
Great, let me know if you need help/advice! I thought about it at lunch and had some thoughts: 1- The 2- A LOT of our attributes have only 1 unique flattenedspelling (that is, only 1 variant to check), so we can optimize for that by just doing a 3- A LOT of our attributes only have 1 unique 4- I don't really see any/many that have duplicates of sizes (I see 5 that have ANY duplicate sizes), so this case just doing a 5- OF the very few that are rest, we have 2 options: a) We just revert to string checks. b) We try to get clever. 6- The function that is responsible for this file is 7- PLEASE when you go for review, put in a comment/gist/etc, the contents of the .inc file to make sure this is doing everything we want. |
Comparisons between StringRef and string literals are lowered by MSVC to happen on the stack. All string literals are allocated unique stack slots increasing the overall stack usage of
calculateAttributeSpellingListIndex
. Use ofstd::string
forces allocation of a string type per literal, reducing the overall stack usage of the function.