[ELF] Reject error-prone meta characters in input section description #84130
Created using spr 1.3.5-bogner
@llvm/pr-subscribers-lld @llvm/pr-subscribers-lld-elf
Author: Fangrui Song (MaskRay)
Changes
Our lexing rule is loose and recognizes certain non-wildcard meta
Ideally, the lexer should be state-aware to report more errors like GNU
Full diff: https://github.com/llvm/llvm-project/pull/84130.diff
2 Files Affected:
diff --git a/lld/ELF/ScriptParser.cpp b/lld/ELF/ScriptParser.cpp
index f0ede1f43bbdb3..282f95bd04b085 100644
--- a/lld/ELF/ScriptParser.cpp
+++ b/lld/ELF/ScriptParser.cpp
@@ -717,9 +717,19 @@ SmallVector<SectionPattern, 0> ScriptParser::readInputSectionsList() {
StringMatcher SectionMatcher;
// Break if the next token is ), EXCLUDE_FILE, or SORT*.
- while (!errorCount() && peek() != ")" && peek() != "EXCLUDE_FILE" &&
- peekSortKind() == SortSectionPolicy::Default)
+ while (!errorCount() && peekSortKind() == SortSectionPolicy::Default) {
+ StringRef s = peek();
+ if (s == ")" || s == "EXCLUDE_FILE")
+ break;
+ // Detect common mistakes that certain non-wildcard meta characters used
+ // without a closing ')'.
+ if (s.size() == 1 && strchr("(){}", s[0])) {
+ skip();
+ setError("section pattern is expected");
+ break;
+ }
SectionMatcher.addPattern(unquote(next()));
+ }
if (!SectionMatcher.empty())
ret.push_back({std::move(excludeFilePat), std::move(SectionMatcher)});
diff --git a/lld/test/ELF/linkerscript/wildcards.s b/lld/test/ELF/linkerscript/wildcards.s
index 1eea27891dfc2c..24d4102559c95e 100644
--- a/lld/test/ELF/linkerscript/wildcards.s
+++ b/lld/test/ELF/linkerscript/wildcards.s
@@ -91,24 +91,31 @@ SECTIONS {
.text : { *([.]abc .ab[v-y] ) }
}
-## Test a few non-wildcard meta characters rejected by GNU ld.
+## Test a few non-wildcard characters rejected by GNU ld.
#--- lbrace.lds
-# RUN: ld.lld -T lbrace.lds a.o -o out
+# RUN: not ld.lld -T lbrace.lds a.o 2>&1 | FileCheck %s --check-prefix=ERR-LBRACE --match-full-lines --strict-whitespace
+# ERR-LBRACE:{{.*}}: section pattern is expected
+# ERR-LBRACE-NEXT:>>> .text : { *(.a* { ) }
+# ERR-LBRACE-NEXT:>>> ^
SECTIONS {
.text : { *(.a* { ) }
}
#--- lparen.lds
-## ( is recognized as a section name pattern. Note, ( is rejected by GNU ld.
-# RUN: ld.lld -T lparen.lds a.o -o out
-# RUN: llvm-objdump --section-headers out | FileCheck --check-prefix=SEC-NO %s
+# RUN: not ld.lld -T lparen.lds a.o 2>&1 | FileCheck %s --check-prefix=ERR-LPAREN --match-full-lines --strict-whitespace
+# ERR-LPAREN:{{.*}}: section pattern is expected
+# ERR-LPAREN-NEXT:>>> .text : { *(.a* ( ) }
+# ERR-LPAREN-NEXT:>>> ^
SECTIONS {
- .text : { *(.a* ( ) }
+ .text : { *(.a* ( ) }
}
#--- rbrace.lds
-# RUN: ld.lld -T rbrace.lds a.o -o out
+# RUN: not ld.lld -T rbrace.lds a.o 2>&1 | FileCheck %s --check-prefix=ERR-RBRACE --match-full-lines --strict-whitespace
+# ERR-RBRACE:{{.*}}: section pattern is expected
+# ERR-RBRACE-NEXT:>>> .text : { *(.a* } ) }
+# ERR-RBRACE-NEXT:>>> ^
SECTIONS {
.text : { *(.a* } ) }
}
A few small suggestions on the test and a comment.
# RUN: ld.lld -T lbrace.lds a.o -o out
# RUN: not ld.lld -T lbrace.lds a.o 2>&1 | FileCheck %s --check-prefix=ERR-LBRACE --match-full-lines --strict-whitespace
# ERR-LBRACE:{{.*}}: section pattern is expected
# ERR-LBRACE-NEXT:>>> .text : { *(.a* { ) }
Is it worth adding a case with no space around the disallowed character, for example .text : { *(.a*{)
As I understand it, (, ), {, and } are each lexed as a single token, so the spaces shouldn't matter.
Reading the following line, with the tests all separating the character by spaces, made me double-check that we catch more than just a single isolated character.
if (s.size() == 1 && strchr("(){}", s[0]))
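The tokenization point raised in this comment can be illustrated with a toy tokenizer. This is a minimal sketch, not lld's ScriptLexer: the names `tokenize` and `checkSpacingDoesNotMatter` and the rules (whitespace separates tokens; each of `(`, `)`, `{`, `}` is always its own token) are assumptions made for illustration, following the reviewer's description.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// Toy tokenizer (hypothetical, NOT lld's ScriptLexer). Assumed rules:
// whitespace separates tokens, and each of ( ) { } is a token by itself.
std::vector<std::string> tokenize(std::string_view text) {
  std::vector<std::string> tokens;
  std::string cur;
  // Push the token accumulated so far, if there is one.
  auto flush = [&] {
    if (!cur.empty()) {
      tokens.push_back(cur);
      cur.clear();
    }
  };
  for (char c : text) {
    if (c == ' ' || c == '\t' || c == '\n') {
      flush();
    } else if (c == '(' || c == ')' || c == '{' || c == '}') {
      flush();
      tokens.push_back(std::string(1, c));
    } else {
      cur.push_back(c);
    }
  }
  flush();
  return tokens;
}

// Under these rules the two spellings yield the same token stream, so a
// check on a single "{" token covers both.
void checkSpacingDoesNotMatter() {
  assert(tokenize("*(.a*{)") == tokenize("*(.a* { )"));
}
```

If the real lexer behaves like this sketch for these four characters, a test without spaces exercises the same parser path as the existing ones and mainly documents that fact.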
Thanks for the suggestion. Updated. Changed s.size() == 1 to !s.empty() to be clearer that we just guard against the "" special case.
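A small sketch of why the emptiness guard is needed, using `std::string_view` in place of `llvm::StringRef` so that it compiles on its own. The helper names `isSingleMetaChar` and `startsWithMetaChar` are hypothetical. The one library fact relied on is that `strchr` treats the terminating NUL as part of the searched string, so looking up `'\0'` returns a non-null pointer.

```cpp
#include <cassert>
#include <cstring>
#include <string_view>

// First revision of the check: the token is exactly one character long.
bool isSingleMetaChar(std::string_view s) {
  return s.size() == 1 && std::strchr("(){}", s[0]) != nullptr;
}

// Revised check: any non-empty token that starts with one of ( ) { }.
// The emptiness test comes first because s[0] is not valid on an empty view.
bool startsWithMetaChar(std::string_view s) {
  return !s.empty() && std::strchr("(){}", s[0]) != nullptr;
}

void compareChecks() {
  // Both forms reject a lone meta character and accept an ordinary pattern.
  assert(isSingleMetaChar("{") && startsWithMetaChar("{"));
  assert(!isSingleMetaChar(".a*") && !startsWithMetaChar(".a*"));
  // Both forms treat the empty token as "not a meta character".
  assert(!isSingleMetaChar("") && !startsWithMetaChar(""));
  // strchr finds the terminator itself, so a NUL character would "match".
  assert(std::strchr("(){}", '\0') != nullptr);
}
```

If each of these characters is always lexed as its own token, the two forms agree on every token the parser can see; the revised form only states the intent (skip the empty token) more directly.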
lld/ELF/ScriptParser.cpp (outdated)
StringRef s = peek();
if (s == ")" || s == "EXCLUDE_FILE")
break;
// Detect common mistakes that certain non-wildcard meta characters used
Suggest
// Detect common mistakes when certain non-wildcard meta characters are used without a closing )
Updated
LGTM, thanks for the updates.
LGTM, thanks!
The lexer is overly permissive. When parsing file patterns in an input
section description and there is a missing `)`, we would accept many
nonsensical tokens (e.g. `}`) as patterns, leading to confusion, e.g.
`*(SORT_BY_ALIGNMENT(SORT_BY_NAME(.text*)) } PROVIDE_HIDDEN(__code_end = .)`
(#81804).

Ideally, the lexer should be stateful to report more errors like GNU ld
and get rid of hacks like `ScriptLexer::maybeSplitExpr`, but that would
require a large rewrite of the lexer. For now, just reject certain
non-wildcard meta characters to detect common mistakes.
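
The effect of the new check can be sketched outside lld as a loop over an already-tokenized pattern list. This is a simplified stand-in, not the code in `ScriptParser::readInputSectionsList`: `PatternList`, `readPatterns`, and `exampleMissingParen` are hypothetical names, the token vector replaces the parser's `peek()`/`next()` calls, SORT handling and unquoting are left out, and the guard uses the `!s.empty()` form from the review discussion.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <optional>
#include <string>
#include <vector>

// Result of reading one pattern list: the patterns collected so far and an
// error message if a stray meta character was seen.
struct PatternList {
  std::vector<std::string> patterns;
  std::optional<std::string> error;
};

// Hypothetical stand-in for the parser loop: collect patterns starting at
// `pos` until ")" or EXCLUDE_FILE, rejecting tokens that start with (){}.
PatternList readPatterns(const std::vector<std::string> &tokens,
                         std::size_t pos) {
  PatternList result;
  for (; pos < tokens.size(); ++pos) {
    const std::string &s = tokens[pos];
    if (s == ")" || s == "EXCLUDE_FILE")
      break; // normal end of this pattern list
    if (!s.empty() && std::strchr("(){}", s[0])) {
      result.error = "section pattern is expected";
      break; // stop instead of treating e.g. "}" as a section pattern
    }
    result.patterns.push_back(s);
  }
  return result;
}

// A list with a missing ")" now yields an error when "}" is reached.
void exampleMissingParen() {
  PatternList r = readPatterns({".text*", "}", "PROVIDE_HIDDEN"}, 0);
  assert(r.error.has_value());
  assert(r.patterns.size() == 1);
}
```

Without the meta-character check, the same input would keep collecting `}` and the tokens after it as section patterns until some later `)` appeared, which is the confusing behavior described above.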