-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[clang] Implement statement printer for EmbedExpr #117770
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be notified. If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers. If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums. |
@llvm/pr-subscribers-clang Author: circl (circl-lastname) ChangesFixes #107869 Due to the other FIXMEs in I've used Full diff: https://github.com/llvm/llvm-project/pull/117770.diff 9 Files Affected:
diff --git a/clang/include/clang/AST/Expr.h b/clang/include/clang/AST/Expr.h
index 708c8656decbe0..95d341957c59a2 100644
--- a/clang/include/clang/AST/Expr.h
+++ b/clang/include/clang/AST/Expr.h
@@ -4917,19 +4917,24 @@ class EmbedExpr final : public Expr {
SourceLocation EmbedKeywordLoc;
IntegerLiteral *FakeChildNode = nullptr;
const ASTContext *Ctx = nullptr;
+ StringRef Filename;
+ bool IsAngled;
EmbedDataStorage *Data;
unsigned Begin = 0;
unsigned NumOfElements;
public:
- EmbedExpr(const ASTContext &Ctx, SourceLocation Loc, EmbedDataStorage *Data,
- unsigned Begin, unsigned NumOfElements);
+ EmbedExpr(const ASTContext &Ctx, SourceLocation Loc, StringRef Filename,
+ bool IsAngled, EmbedDataStorage *Data, unsigned Begin,
+ unsigned NumOfElements);
explicit EmbedExpr(EmptyShell Empty) : Expr(SourceLocExprClass, Empty) {}
SourceLocation getLocation() const { return EmbedKeywordLoc; }
SourceLocation getBeginLoc() const { return EmbedKeywordLoc; }
SourceLocation getEndLoc() const { return EmbedKeywordLoc; }
+ StringRef getFilename() const { return Filename; }
+ bool getIsAngled() const { return IsAngled; }
StringLiteral *getDataStringLiteral() const { return Data->BinaryData; }
EmbedDataStorage *getData() const { return Data; }
diff --git a/clang/include/clang/Lex/Preprocessor.h b/clang/include/clang/Lex/Preprocessor.h
index 3312d4ed1d798d..0a4ca35224497c 100644
--- a/clang/include/clang/Lex/Preprocessor.h
+++ b/clang/include/clang/Lex/Preprocessor.h
@@ -2761,7 +2761,8 @@ class Preprocessor {
// Binary data inclusion
void HandleEmbedDirective(SourceLocation HashLoc, Token &Tok,
const FileEntry *LookupFromFile = nullptr);
- void HandleEmbedDirectiveImpl(SourceLocation HashLoc,
+ void HandleEmbedDirectiveImpl(SourceLocation HashLoc, StringRef Filename,
+ bool IsAngled,
const LexEmbedParametersResult &Params,
StringRef BinaryContents);
@@ -3066,6 +3067,8 @@ class EmptylineHandler {
/// Helper class to shuttle information about #embed directives from the
/// preprocessor to the parser through an annotation token.
struct EmbedAnnotationData {
+ std::string Filename;
+ bool IsAngled;
StringRef BinaryData;
};
diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index 24abd5d95dd844..435bc4d4807bc4 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -7086,8 +7086,8 @@ class Sema final : public SemaBase {
SourceLocation RPLoc);
// #embed
- ExprResult ActOnEmbedExpr(SourceLocation EmbedKeywordLoc,
- StringLiteral *BinaryData);
+ ExprResult ActOnEmbedExpr(SourceLocation EmbedKeywordLoc, StringRef Filename,
+ bool IsAngled, StringLiteral *BinaryData);
// Build a potentially resolved SourceLocExpr.
ExprResult BuildSourceLocExpr(SourceLocIdentKind Kind, QualType ResultTy,
diff --git a/clang/lib/AST/Expr.cpp b/clang/lib/AST/Expr.cpp
index a4fb4d5a1f2ec4..eeea6da07d304f 100644
--- a/clang/lib/AST/Expr.cpp
+++ b/clang/lib/AST/Expr.cpp
@@ -2383,11 +2383,11 @@ APValue SourceLocExpr::EvaluateInContext(const ASTContext &Ctx,
}
EmbedExpr::EmbedExpr(const ASTContext &Ctx, SourceLocation Loc,
- EmbedDataStorage *Data, unsigned Begin,
- unsigned NumOfElements)
+ StringRef Filename, bool IsAngled, EmbedDataStorage *Data,
+ unsigned Begin, unsigned NumOfElements)
: Expr(EmbedExprClass, Ctx.IntTy, VK_PRValue, OK_Ordinary),
- EmbedKeywordLoc(Loc), Ctx(&Ctx), Data(Data), Begin(Begin),
- NumOfElements(NumOfElements) {
+ EmbedKeywordLoc(Loc), Ctx(&Ctx), Filename(Filename), IsAngled(IsAngled),
+ Data(Data), Begin(Begin), NumOfElements(NumOfElements) {
setDependence(ExprDependence::None);
FakeChildNode = IntegerLiteral::Create(
Ctx, llvm::APInt::getZero(Ctx.getTypeSize(getType())), getType(), Loc);
diff --git a/clang/lib/AST/StmtPrinter.cpp b/clang/lib/AST/StmtPrinter.cpp
index c8677d11b64e8d..da6b4cb5f66955 100644
--- a/clang/lib/AST/StmtPrinter.cpp
+++ b/clang/lib/AST/StmtPrinter.cpp
@@ -1202,7 +1202,19 @@ void StmtPrinter::VisitSourceLocExpr(SourceLocExpr *Node) {
}
void StmtPrinter::VisitEmbedExpr(EmbedExpr *Node) {
- llvm::report_fatal_error("Not implemented");
+ // FIXME: This doesn't handle offsets, prefixes, and if_empty yet
+ OS << NL << "#embed";
+ if (Node->getIsAngled()) {
+ OS << "<";
+ } else {
+ OS << "\"";
+ }
+ OS << Node->getFilename();
+ if (Node->getIsAngled()) {
+ OS << ">" << NL;
+ } else {
+ OS << "\"" << NL;
+ }
}
void StmtPrinter::VisitConstantExpr(ConstantExpr *Node) {
diff --git a/clang/lib/Lex/PPDirectives.cpp b/clang/lib/Lex/PPDirectives.cpp
index a23ad40884f249..62eda8401636c4 100644
--- a/clang/lib/Lex/PPDirectives.cpp
+++ b/clang/lib/Lex/PPDirectives.cpp
@@ -3871,8 +3871,8 @@ Preprocessor::LexEmbedParameters(Token &CurTok, bool ForHasEmbed) {
}
void Preprocessor::HandleEmbedDirectiveImpl(
- SourceLocation HashLoc, const LexEmbedParametersResult &Params,
- StringRef BinaryContents) {
+ SourceLocation HashLoc, StringRef Filename, bool IsAngled,
+ const LexEmbedParametersResult &Params, StringRef BinaryContents) {
if (BinaryContents.empty()) {
// If we have no binary contents, the only thing we need to emit are the
// if_empty tokens, if any.
@@ -3902,6 +3902,8 @@ void Preprocessor::HandleEmbedDirectiveImpl(
}
EmbedAnnotationData *Data = new (BP) EmbedAnnotationData;
+ Data->Filename = Filename;
+ Data->IsAngled = IsAngled;
Data->BinaryData = BinaryContents;
Toks[CurIdx].startToken();
@@ -4006,5 +4008,6 @@ void Preprocessor::HandleEmbedDirective(SourceLocation HashLoc, Token &EmbedTok,
if (Callbacks)
Callbacks->EmbedDirective(HashLoc, Filename, isAngled, MaybeFileRef,
*Params);
- HandleEmbedDirectiveImpl(HashLoc, *Params, BinaryContents);
+ HandleEmbedDirectiveImpl(HashLoc, Filename, isAngled, *Params,
+ BinaryContents);
}
diff --git a/clang/lib/Parse/ParseInit.cpp b/clang/lib/Parse/ParseInit.cpp
index 63b1d7bd9db53e..06ced273a65e92 100644
--- a/clang/lib/Parse/ParseInit.cpp
+++ b/clang/lib/Parse/ParseInit.cpp
@@ -451,7 +451,8 @@ ExprResult Parser::createEmbedExpr() {
StringLiteral *BinaryDataArg = CreateStringLiteralFromStringRef(
Data->BinaryData, Context.UnsignedCharTy);
- Res = Actions.ActOnEmbedExpr(StartLoc, BinaryDataArg);
+ Res = Actions.ActOnEmbedExpr(StartLoc, Data->Filename, Data->IsAngled,
+ BinaryDataArg);
}
return Res;
}
diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp
index c9d7444d5865a5..f5deacf709d606 100644
--- a/clang/lib/Sema/SemaExpr.cpp
+++ b/clang/lib/Sema/SemaExpr.cpp
@@ -16770,12 +16770,13 @@ ExprResult Sema::BuildSourceLocExpr(SourceLocIdentKind Kind, QualType ResultTy,
}
ExprResult Sema::ActOnEmbedExpr(SourceLocation EmbedKeywordLoc,
+ StringRef Filename, bool IsAngled,
StringLiteral *BinaryData) {
EmbedDataStorage *Data = new (Context) EmbedDataStorage;
Data->BinaryData = BinaryData;
return new (Context)
- EmbedExpr(Context, EmbedKeywordLoc, Data, /*NumOfElements=*/0,
- Data->getDataElementCount());
+ EmbedExpr(Context, EmbedKeywordLoc, Filename, IsAngled, Data,
+ /*NumOfElements=*/0, Data->getDataElementCount());
}
static bool maybeDiagnoseAssignmentToFunction(Sema &S, QualType DstType,
diff --git a/clang/lib/Sema/SemaInit.cpp b/clang/lib/Sema/SemaInit.cpp
index 7c03a12e812809..609ca34a8c1d20 100644
--- a/clang/lib/Sema/SemaInit.cpp
+++ b/clang/lib/Sema/SemaInit.cpp
@@ -525,9 +525,9 @@ class InitListChecker {
}
}
- Result = new (SemaRef.Context)
- EmbedExpr(SemaRef.Context, Embed->getLocation(), Embed->getData(),
- CurEmbedIndex, ElsCount);
+ Result = new (SemaRef.Context) EmbedExpr(
+ SemaRef.Context, Embed->getLocation(), Embed->getFilename(),
+ Embed->getIsAngled(), Embed->getData(), CurEmbedIndex, ElsCount);
CurEmbedIndex += ElsCount;
if (CurEmbedIndex >= Embed->getDataElementCount()) {
CurEmbed = nullptr;
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably want a changelog
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the fix!
I wonder if the new change be tested somehow?
Also, we probably need to update Serialization code with handling for the new fields of EmbedExpr
.
EmbedAnnotationData *Data = new (BP) EmbedAnnotationData; | ||
Data->Filename = Filename; | ||
Data->IsAngled = IsAngled; | ||
Data->BinaryData = BinaryContents; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I struggle to see how the current implementation actually prevents the memory leak. EmbedAnnotationData
is allocated first as if the std::string
inside was empty, then the string is assigned with an array of chars which will likely extend string's size, giving no chance to deallocate proper size later. I wonder if keeping a StringRef
in EmbedAnnotationData
is possible instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there such a thing as an OwningStringRef then? I am not sure how to guarantee the string exists, and then is deallocated along with the tokens. I thought std::string would be a ptr to the dynamically sized string data, but I haven't used the STL much
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None that I can think of... I don't think it would've helped though since I don't think the bump allocator calls the destructors.
I think we need to find a good place to store filename strings that come from embed directives. @AaronBallman any chance you have a good ideas about where can we store the filename string to manage the memory in a good way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar issue with the deserializer, not sure where to store the string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if it's something like a flexible array member and the EmbedAnnotationData is allocated to the right size the first time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where can we store the filename string to manage the memory in a good way?
Can’t we just copy the string data into the ASTContext’s allocator and store a StringRef to it, or am I missing something here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure you can do that during the lexing stage
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the way this may need to work is to find out where we're getting the file name from and seeing how long-lived that string is. It's passed here as a StringRef
, and it would nice if we can store it as a StringRef
in the EmbedAnnotationData
rather than as a std::string
. My guess is that the name comes from the FileManager
and is as long-lived as the SourceManager
which is owned by ASTContext
, but we'd need to verify that.
I put it in the C23 features section, if that's ok |
Fixes #107869
Due to the other FIXMEs in
#embed
code, this doesn't yet handle offsets, prefixes, and if_empty, however I feel it's a good start, and preferable to crashing.I've used
StringRef
to store the filename insideEmbedExpr
, andstd::string
to store the filename inside theannot_embed
token. Hopefully this prevents the memory leak fixed in #95802, but this assumes the token array exists while the AST does, if this is a false assumption I can try changing how the storage works.