Skip to content

Support raw identifiers (backtick-delimited identifiers containing non-identifier characters). #76636

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Mar 12, 2025

Conversation

allevato
Copy link
Member

@allevato allevato commented Sep 22, 2024

Implementation of SE-0451.

Comment on lines +114 to +118
/// Returns true if this identifier contains non-identifier characters and
/// must always be escaped with backticks, even in contexts were other
/// escaped identifiers could omit backticks (like keywords as argument
/// labels).
bool mustAlwaysBeEscaped() const;
Copy link
Member

@rintaro rintaro Dec 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add some test cases for code completion checking mustAlwaysBeEscaped() items are actually escaped in code completion suggestions?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Those did need to be fixed.

@allevato
Copy link
Member Author

This PR is ready except for the issue mentioned in the description of specialized C++ templates getting escaped by backticks (and now since the backticks are added to the mangling, it changes their mangled names as well). After the holidays I'll take another look at how to handle these.

@allevato
Copy link
Member Author

allevato commented Jan 7, 2025

The issues with imported C++ template specializations have been fixed.

@allevato
Copy link
Member Author

allevato commented Jan 7, 2025

swiftlang/swift-syntax#2857

@swift-ci please test

@allevato
Copy link
Member Author

allevato commented Jan 7, 2025

@swift-ci please test

@allevato
Copy link
Member Author

allevato commented Jan 8, 2025

swiftlang/swift-syntax#2857

@swift-ci please test

@grynspan
Copy link
Contributor

grynspan commented Jan 8, 2025

Is there a named feature I can test for with hasFeature()? We may not be able to adopt until 6.3 without it because we risk breaking on older 6.2 toolchains that don't support raw identifiers.

@allevato
Copy link
Member Author

allevato commented Jan 8, 2025

Is there a named feature I can test for with hasFeature()? We may not be able to adopt until 6.3 without it because we risk breaking on older 6.2 toolchains that don't support raw identifiers.

Looks like LANGUAGE_FEATURE(RawIdentifiers, ...) works for this, without being an upcoming or experimental feature (it's neither). Thanks for the suggestion!

@allevato
Copy link
Member Author

allevato commented Jan 8, 2025

swiftlang/swift-syntax#2857

@swift-ci please test

@allevato
Copy link
Member Author

allevato commented Jan 8, 2025

swiftlang/swift-syntax#2857

@swift-ci please test

1 similar comment
@allevato
Copy link
Member Author

allevato commented Jan 9, 2025

swiftlang/swift-syntax#2857

@swift-ci please test

@allevato
Copy link
Member Author

allevato commented Jan 9, 2025

swiftlang/swift-syntax#2857

@swift-ci please test macOS platform

1 similar comment
@allevato
Copy link
Member Author

swiftlang/swift-syntax#2857

@swift-ci please test macOS platform

@allevato
Copy link
Member Author

@DougGregor @jckarter Can you review please, or forward along to someone who can? Thanks!

Copy link
Member

@rintaro rintaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parser and Index related changes looks good to me.
Deferring to others for other areas.

if (Name.mustAlwaysBeEscaped()) {
// Names that are raw identifiers must always be escaped regardless of
// their position.
shouldEscapeKeywords = true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This calls Lexer:: identifierMustAlwaysBeEscaped(StringRef) twice, in Name.mustAlwaysBeEscaped() and Builder.escapeKeyword() which is a waste. Can we avoid that? I'm fine with just manually wrapping it with back-ticks here, or you could factor out the escaping logic from Builder.escapeKeyword.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I've factored out a escapeWithBackticks method in the builder so it can be called directly.

Identifier Name) {
if (Name.mustAlwaysBeEscaped()) {
SmallString<16> buffer;
Builder.addBaseName(Builder.escapeKeyword(Name.str(), true, buffer));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you use swift-ide-test in test/IDE for testing these? Our primary place for testing code completion results is test/IDE/complete_*.swift.
test/SourceKit/CodeComplete is for SourceKit specific functionalities, like response format, ordering or such.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved the tests to test/IDE and updated them to use %batch-code-completion.

Copy link
Member

@rintaro rintaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parse/IDE/Index changes lgtm

@@ -382,18 +391,16 @@ class CodeCompletionResultBuilder {
#define KEYWORD(kw) .Case(#kw, true)
shouldEscape = llvm::StringSwitch<bool>(Word)
#include "swift/AST/TokenKinds.def"
.Default(false);
.Default(Lexer::identifierMustAlwaysBeEscaped(Word));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: indentation

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what clang-format wants, apparently, because it's treated as a continuation of the expression llvm::StringSwitch... above.

@DougGregor
Copy link
Member

@allevato I'm so sorry, I thought this review had been picked up before.

Copy link
Member

@DougGregor DougGregor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking great, please go ahead and merge

@@ -3062,7 +3062,8 @@ void ASTMangler::appendAnyGenericType(const GenericTypeDecl *decl,
// `ClassTemplateSpecializationDecl`'s name does not include information about
// template arguments, and in order to prevent name clashes we use the
// name of the Swift decl which does include template arguments.
appendIdentifier(nominal->getName().str());
appendIdentifier(nominal->getName().str(),
/*allowRawIdentifiers=*/false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like we should revisit the naming of imported class template specializations in light of raw identifiers, but I don't want to bring any of that discussion into this pull request.

if (auto attr = proto->getAttrs().getAttribute<ObjCAttr>()) {
auto attr = proto->getAttrs().getAttribute<ObjCAttr>();
if (attr) {
checkObjCNameValidity(proto, attr);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This checkObjCNameValidity should be outside of the if so that we diagnose, e.g.,

@objc
protocol `some protocol` { }

right?

[EDIT: No, I'm wrong, because you have a test like this. I'm not sure how it works, though]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just checking for the presence of any @objc attr, which may or may not have an explicit name (the latter is checked by checkObjCNameValidity), so I think this is working as intended. Unless I'm misunderstanding your comment?

func `+`(lhs: Int, rhs: Int) -> Int // expected-error{{expected identifier in function declaration}}

@propertyWrapper
struct `@PoorlyNamedWrapper`<`The Value`> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's hideous! But makes a nice test case, thanks

@allevato allevato enabled auto-merge March 11, 2025 20:01
@allevato
Copy link
Member Author

@swift-ci please test

@allevato allevato disabled auto-merge March 11, 2025 21:17
allevato and others added 9 commits March 11, 2025 17:18
Raw identifiers are backtick-delimited identifiers that can contain any
non-identifier character other than the backtick itself, CR, LF, or other
non-printable ASCII code units, and which are also not composed entirely
of operator characters.
This requires no additional work beyond just parsing the identifier because Clang module maps already support such names in quotes.
The original module names themselves must still be valid unescaped identifiers; most of the serialization logic in the compiler depends on the name of a module matching its name on the file system, and it would be very complex to turn escaped identifiers into file-safe names.
… JSON file.

For build systems that already generate these files, it makes sense to include the aliases so that the map file serves as a comprehensive index of how the module inputs are referenced.
`#fileID` never accounted for the possibility that someone one have
a module alias _itself_, so it always generated the module's real
(physical) name. This _technically_ changes the behavior of `#fileID`
for self-aliased modules, but since nobody would have ever had a reason
to do that before raw identifiers, it's unlikely that this change would
affect anyone in practice.
If a decl is exported to Objective-C (explicitly or implicitly), it
must be given an explicit name that is a valid Objective-C identifier.
Imported C++ template specializations receive identifiers that contain
their type signature; e.g., `X<Y, Z>`. Since this means the identifier
contains non-identifier characters, the new behavior was trying to
escape them with backticks in ASTPrinter, ASTMangler, and the runtime
metadata. This pulls that back to preserve the current behavior for
specifically those types.
This lets clients test `#if hasFeature(RawIdentifiers)` to
determine compiler support.
@allevato
Copy link
Member Author

@swift-ci please test

@allevato allevato enabled auto-merge March 11, 2025 22:17
@allevato
Copy link
Member Author

@swift-ci please test macOS platform

allevato added a commit to allevato/swift-driver that referenced this pull request Mar 12, 2025
This is the companion to the frontend changes for SE-0451 in
swiftlang/swift#76636. The first component
of a `-module-alias` flag is allowed to be a raw identifier (without
the backticks), which allows for uses like the following:

```
import `//some/module:name`
```

The *target* of the alias (the actual module name) must still be a
valid Swift identifier, since that corresponds to file system
artifacts that need stricter naming conventions.
@allevato allevato merged commit 68876a6 into swiftlang:main Mar 12, 2025
4 of 5 checks passed
@allevato allevato deleted the rich-identifiers branch March 12, 2025 19:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants