Skip to content

[Parse] Introduce /.../ regex literals #42119

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Apr 12, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
83 changes: 83 additions & 0 deletions include/swift/AST/DiagnosticEngine.h
Original file line number Diff line number Diff line change
Expand Up @@ -848,6 +848,7 @@ namespace swift {
friend class DiagnosticTransaction;
friend class CompoundDiagnosticTransaction;
friend class DiagnosticStateRAII;
friend class DiagnosticQueue;

public:
explicit DiagnosticEngine(SourceManager &SourceMgr)
Expand Down Expand Up @@ -1137,10 +1138,20 @@ namespace swift {
/// Send \c diag to all diagnostic consumers.
void emitDiagnostic(const Diagnostic &diag);

/// Handle a new diagnostic, which will either be emitted, or added to an
/// active transaction.
void handleDiagnostic(Diagnostic &&diag);

/// Clear any tentative diagnostics.
void clearTentativeDiagnostics();

/// Send all tentative diagnostics to all diagnostic consumers and
/// delete them.
void emitTentativeDiagnostics();

/// Forward all tentative diagnostics to a different diagnostic engine.
void forwardTentativeDiagnosticsTo(DiagnosticEngine &targetEngine);

public:
DiagnosticKind declaredDiagnosticKindFor(const DiagID id);

Expand Down Expand Up @@ -1333,6 +1344,78 @@ namespace swift {
}
};

/// Represents a queue of diagnostics that have their emission delayed until
/// the queue is destroyed. This is similar to DiagnosticTransaction, but
/// with a few key differences:
///
/// - The queue maintains its own diagnostic engine (which may be accessed
/// through `getDiags()`), and diagnostics must be specifically emitted
/// using that engine to be enqueued.
/// - It allows for non-LIFO transactions, as each queue operates
/// independently.
/// - A queue can be drained multiple times without having to be recreated
/// (unlike DiagnosticTransaction, it has no concept of "closing").
///
/// Note you may add DiagnosticTransactions to the queue's diagnostic engine,
/// but they must be closed before attempting to clear or emit the diagnostics
/// in the queue.
///
class DiagnosticQueue final {
/// The underlying diagnostic engine that the diagnostics will be emitted
/// by.
DiagnosticEngine &UnderlyingEngine;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me propose an alternate design for DiagnosticQueue:

  • There is no UnderlyingEngine member
  • DiagnosticQueue::DiagnosticQueue() just adds a ForwardingDiagnosticConsumer to QueueEngine
  • emit() just closes the QueueEngine's transaction, flushing the tentative diagnostics out to consumers

I think that would mean you wouldn't have to modify DiagnosticEngine at all. Does your design have advantages over that one? (It might—I'd like to hear about them.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I did play about with ForwardingDiagnosticConsumer a bit, the main issue is that is forwards directly to the consumers and bypasses the target engine altogether, so doesn't update any state on the engine (including hadAnyError!). Now, we should definitely fix that, although it's not clear to me whether all of its clients actually want the diagnostic to be fully re-evaluated by the target engine. There are a couple of other things:

  • We'd need to turn a DiagnosticInfo back into a Diagnostic to be re-evaluated by the target engine (as it may need to become a tentative diagnostic). We'd need special handling to make sure e.g warning-as-error is preserved if the original diagnostic engine had it set.
  • Because we'd be emitting the diagnostic in a dummy engine, we'd lose the caching for decl rendering (though we probably ought to cache that on the ASTContext or in the evaluator).

I don't think there's anything here that can't be solved, but it seemed less clean to me than creating a dummy diagnostic engine to effectively serve as a Diagnostic builder, which can then be passed onto a target engine. If we want ForwardingDiagnosticConsumer to properly support being able to forward-and-re-evaluate a diagnostic, then I agree we should change this to do that. But either way we'll need DiagnosticEngine changes I believe.


/// A temporary engine used to queue diagnostics.
DiagnosticEngine QueueEngine;

/// Whether the queued diagnostics should be emitted on the destruction of
/// the queue, or whether they should be cleared.
bool EmitOnDestruction;

public:
DiagnosticQueue(const DiagnosticQueue &) = delete;
DiagnosticQueue &operator=(const DiagnosticQueue &) = delete;

/// Create a new diagnostic queue with a given engine to forward the
/// diagnostics to.
explicit DiagnosticQueue(DiagnosticEngine &engine, bool emitOnDestruction)
: UnderlyingEngine(engine), QueueEngine(engine.SourceMgr),
EmitOnDestruction(emitOnDestruction) {
// Open a transaction to avoid emitting any diagnostics for the temporary
// engine.
QueueEngine.TransactionCount++;
}

/// Retrieve the engine which may be used to enqueue diagnostics.
DiagnosticEngine &getDiags() { return QueueEngine; }

/// Retrieve the underlying engine which will receive the diagnostics.
DiagnosticEngine &getUnderlyingDiags() { return UnderlyingEngine; }

/// Clear this queue and erase all diagnostics recorded.
void clear() {
assert(QueueEngine.TransactionCount == 1 &&
"Must close outstanding DiagnosticTransactions before draining");
QueueEngine.clearTentativeDiagnostics();
}

/// Emit all the diagnostics recorded by this queue.
void emit() {
assert(QueueEngine.TransactionCount == 1 &&
"Must close outstanding DiagnosticTransactions before draining");
QueueEngine.forwardTentativeDiagnosticsTo(UnderlyingEngine);
}

~DiagnosticQueue() {
if (EmitOnDestruction) {
emit();
} else {
clear();
}
QueueEngine.TransactionCount--;
}
};

inline void
DiagnosticEngine::diagnoseWithNotes(InFlightDiagnostic parentDiag,
llvm::function_ref<void(void)> builder) {
Expand Down
8 changes: 7 additions & 1 deletion include/swift/AST/DiagnosticsParse.def
Original file line number Diff line number Diff line change
Expand Up @@ -94,6 +94,9 @@ ERROR(forbidden_extended_escaping_string,none,
ERROR(regex_literal_parsing_error,none,
"%0", (StringRef))

ERROR(prefix_slash_not_allowed,none,
"prefix operator may not contain '/'", ())

//------------------------------------------------------------------------------
// MARK: Lexer diagnostics
//------------------------------------------------------------------------------
Expand Down Expand Up @@ -140,7 +143,10 @@ ERROR(lex_invalid_escape_delimiter,none,
ERROR(lex_invalid_closing_delimiter,none,
"too many '#' characters in closing delimiter", ())

ERROR(lex_unterminated_regex,none,
ERROR(lex_regex_literal_invalid_starting_char,none,
"regex literal may not start with %0; add backslash to escape",
(StringRef))
ERROR(lex_regex_literal_unterminated,none,
"unterminated regex literal", ())

ERROR(lex_invalid_unicode_scalar,none,
Expand Down
5 changes: 3 additions & 2 deletions include/swift/Basic/LangOptions.h
Original file line number Diff line number Diff line change
Expand Up @@ -564,8 +564,9 @@ namespace swift {
/// Enables dumping type witness systems from associated type inference.
bool DumpTypeWitnessSystems = false;

/// Enables `/.../` syntax regular-expression literals
bool EnableForwardSlashRegexLiterals = false;
/// Enables `/.../` syntax regular-expression literals. This requires
/// experimental string processing. Note this does not affect `#/.../#`.
bool EnableBareSlashRegexLiterals = false;

/// Sets the target we are building for and updates platform conditions
/// to match.
Expand Down
55 changes: 47 additions & 8 deletions include/swift/Parse/Lexer.h
Original file line number Diff line number Diff line change
Expand Up @@ -61,6 +61,16 @@ enum class LexerMode {
SIL
};

/// Whether or not the lexer should attempt to lex a `/.../` regex literal.
enum class LexerForwardSlashRegexMode {
/// No `/.../` regex literals will be lexed.
None,
/// A `/.../` regex literal will be lexed, but only if successful.
Tentative,
/// A `/.../` regex literal will always be lexed for a '/' character.
Always
};

/// Kinds of conflict marker which the lexer might encounter.
enum class ConflictMarkerKind {
/// A normal or diff3 conflict marker, initiated by at least 7 "<"s,
Expand All @@ -75,7 +85,10 @@ class Lexer {
const LangOptions &LangOpts;
const SourceManager &SourceMgr;
const unsigned BufferID;
DiagnosticEngine *Diags;

/// A queue of diagnostics to emit when a token is consumed. We want to queue
/// them, as the parser may backtrack and re-lex a token.
Optional<DiagnosticQueue> DiagQueue;

using State = LexerState;

Expand Down Expand Up @@ -109,6 +122,10 @@ class Lexer {
/// a .sil file.
const LexerMode LexMode;

/// Whether or not a `/.../` literal will be lexed.
LexerForwardSlashRegexMode ForwardSlashRegexMode =
LexerForwardSlashRegexMode::None;

/// True if we should skip past a `#!` line at the start of the file.
const bool IsHashbangAllowed;

Expand Down Expand Up @@ -154,6 +171,19 @@ class Lexer {

void initialize(unsigned Offset, unsigned EndOffset);

/// Retrieve the diagnostic engine for emitting diagnostics for the current
/// token.
DiagnosticEngine *getTokenDiags() {
return DiagQueue ? &DiagQueue->getDiags() : nullptr;
}

/// Retrieve the underlying diagnostic engine we emit diagnostics to. Note
/// this should only be used for diagnostics not concerned with the current
/// token.
DiagnosticEngine *getUnderlyingDiags() {
return DiagQueue ? &DiagQueue->getUnderlyingDiags() : nullptr;
}

public:
/// Create a normal lexer that scans the whole source buffer.
///
Expand Down Expand Up @@ -209,6 +239,10 @@ class Lexer {
LeadingTriviaResult = LeadingTrivia;
TrailingTriviaResult = TrailingTrivia;
}
// Emit any diagnostics recorded for this token.
if (DiagQueue)
DiagQueue->emit();

if (Result.isNot(tok::eof))
lexImpl();
}
Expand Down Expand Up @@ -298,12 +332,12 @@ class Lexer {
void restoreState(State S, bool enableDiagnostics = false) {
assert(S.isValid());
CurPtr = getBufferPtrForSourceLoc(S.Loc);
// Don't reemit diagnostics while readvancing the lexer.
llvm::SaveAndRestore<DiagnosticEngine*>
D(Diags, enableDiagnostics ? Diags : nullptr);

lexImpl();

// Don't re-emit diagnostics from readvancing the lexer.
if (DiagQueue && !enableDiagnostics)
DiagQueue->clear();

// Restore Trivia.
if (TriviaRetention == TriviaRetentionMode::WithTrivia)
LeadingTrivia = S.LeadingTrivia;
Expand Down Expand Up @@ -505,7 +539,7 @@ class Lexer {

void getStringLiteralSegments(const Token &Str,
SmallVectorImpl<StringSegment> &Segments) {
return getStringLiteralSegments(Str, Segments, Diags);
return getStringLiteralSegments(Str, Segments, getTokenDiags());
}

static SourceLoc getSourceLoc(const char *Loc) {
Expand All @@ -531,6 +565,11 @@ class Lexer {
void operator=(const SILBodyRAII&) = delete;
};

/// Attempt to re-lex a regex literal with forward slashes `/.../` from a
/// given lexing state. If \p mustBeRegex is set to true, a regex literal will
/// always be lexed. Otherwise, it will not be lexed if it may be ambiguous.
void tryLexForwardSlashRegexLiteralFrom(State S, bool mustBeRegex);

private:
/// Nul character meaning kind.
enum class NulCharacterKind {
Expand Down Expand Up @@ -595,8 +634,8 @@ class Lexer {
void lexStringLiteral(unsigned CustomDelimiterLen = 0);
void lexEscapedIdentifier();

/// Attempt to lex a regex literal, returning true if a regex literal was
/// lexed, false if this is not a regex literal.
/// Attempt to lex a regex literal, returning true if lexing should continue,
/// false if this is not a regex literal.
bool tryLexRegexLiteral(const char *TokStart);

void tryLexEditorPlaceholder();
Expand Down
14 changes: 14 additions & 0 deletions include/swift/Parse/Parser.h
Original file line number Diff line number Diff line change
Expand Up @@ -559,6 +559,11 @@ class Parser {
return f(backtrackScope);
}

/// Discard the current token. This will avoid interface hashing or updating
/// the previous loc. Only should be used if you've completely re-lexed
/// a different token at that position.
SourceLoc discardToken();

/// Consume a token that we created on the fly to correct the original token
/// stream from lexer.
void consumeExtraToken(Token K);
Expand Down Expand Up @@ -1752,8 +1757,17 @@ class Parser {
ParserResult<Expr>
parseExprPoundCodeCompletion(Optional<StmtKind> ParentKind);

UnresolvedDeclRefExpr *makeExprOperator(const Token &opToken);
UnresolvedDeclRefExpr *parseExprOperator();

/// Try re-lex a '/' operator character as a regex literal. This should be
/// called when parsing in an expression position to ensure a regex literal is
/// correctly parsed.
///
/// If \p mustBeRegex is set to true, a regex literal will always be lexed if
/// enabled. Otherwise, it will not be lexed if it may be ambiguous.
void tryLexRegexLiteral(bool mustBeRegex);

void validateCollectionElement(ParserResult<Expr> element);

//===--------------------------------------------------------------------===//
Expand Down
30 changes: 23 additions & 7 deletions lib/AST/DiagnosticEngine.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1040,24 +1040,40 @@ DiagnosticBehavior DiagnosticState::determineBehavior(const Diagnostic &diag) {

void DiagnosticEngine::flushActiveDiagnostic() {
assert(ActiveDiagnostic && "No active diagnostic to flush");
handleDiagnostic(std::move(*ActiveDiagnostic));
ActiveDiagnostic.reset();
}

void DiagnosticEngine::handleDiagnostic(Diagnostic &&diag) {
if (TransactionCount == 0) {
emitDiagnostic(*ActiveDiagnostic);
emitDiagnostic(diag);
WrappedDiagnostics.clear();
WrappedDiagnosticArgs.clear();
} else {
onTentativeDiagnosticFlush(*ActiveDiagnostic);
TentativeDiagnostics.emplace_back(std::move(*ActiveDiagnostic));
onTentativeDiagnosticFlush(diag);
TentativeDiagnostics.emplace_back(std::move(diag));
}
ActiveDiagnostic.reset();
}

void DiagnosticEngine::clearTentativeDiagnostics() {
TentativeDiagnostics.clear();
WrappedDiagnostics.clear();
WrappedDiagnosticArgs.clear();
}

void DiagnosticEngine::emitTentativeDiagnostics() {
for (auto &diag : TentativeDiagnostics) {
emitDiagnostic(diag);
}
TentativeDiagnostics.clear();
WrappedDiagnostics.clear();
WrappedDiagnosticArgs.clear();
clearTentativeDiagnostics();
}

void DiagnosticEngine::forwardTentativeDiagnosticsTo(
DiagnosticEngine &targetEngine) {
for (auto &diag : TentativeDiagnostics) {
targetEngine.handleDiagnostic(std::move(diag));
}
clearTentativeDiagnostics();
}

/// Returns the access level of the least accessible PrettyPrintedDeclarations
Expand Down
10 changes: 7 additions & 3 deletions lib/Frontend/CompilerInvocation.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -504,6 +504,13 @@ static bool ParseLangArgs(LangOptions &Opts, ArgList &Args,
Opts.EnableExperimentalStringProcessing |=
Args.hasArg(OPT_enable_experimental_string_processing);

// Whether '/.../' regex literals are enabled. This implies experimental
// string processing.
if (Args.hasArg(OPT_enable_bare_slash_regex)) {
Opts.EnableBareSlashRegexLiterals = true;
Opts.EnableExperimentalStringProcessing = true;
}

Opts.EnableExperimentalBoundGenericExtensions |=
Args.hasArg(OPT_enable_experimental_bound_generic_extensions);

Expand Down Expand Up @@ -1010,9 +1017,6 @@ static bool ParseLangArgs(LangOptions &Opts, ArgList &Args,
if (Args.hasArg(OPT_disable_requirement_machine_reuse))
Opts.EnableRequirementMachineReuse = false;

if (Args.hasArg(OPT_enable_bare_slash_regex))
Opts.EnableForwardSlashRegexLiterals = true;

if (Args.hasArg(OPT_enable_requirement_machine_opaque_archetypes))
Opts.EnableRequirementMachineOpaqueArchetypes = true;

Expand Down
Loading