
🍒 [analyzer] NFC: Don't regenerate duplicate HTML reports. #7966

Merged

Conversation

@haoNoQ haoNoQ commented Jan 12, 2024

This is a performance optimization for HTML diagnostics output mode.

Currently, HTML reports are generated incredibly inefficiently:

  • The HTMLRewriter is re-run from scratch on every file on every report. Each such re-run involves re-lexing the entire file and producing a syntax-highlighted webpage of the entire file, with text behind macros duplicated as pop-up macro expansion tooltips. Then, warning and note bubbles are injected into the page. Only the bubble part is different across reports; everything else can theoretically be cached.

  • Additionally, if duplicate reports are emitted (with the same issue hash), HTMLRewriter will be re-run even though the output file is going to be discarded due to filename collision. This is mostly an issue for path-insensitive bug reports because path-sensitive bug reports are already deduplicated by the BugReporter as part of searching for the shortest bug path. But on some translation units almost 80% of bug reports are dry-run here.

We only get away with all this because there are usually very few reports emitted per file. But if loud checkers are enabled, such as webkit.*, this may explode in complexity and even cause the compiler to run over the 32-bit SourceLocation addressing limit. (We're re-lexing everything each time, remember?)

This patch hotfixes the second problem. It also adds a FIXME for the first problem, which will require more yak shaving to solve.
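
For illustration only, the idea behind the second fix can be sketched as follows. This is a minimal, self-contained sketch and not the code from the patch: `ReportEmitter`, `emit`, and the `render` callback are hypothetical names, and the callback merely stands in for the expensive HTMLRewriter pass. It assumes that each report carries an issue hash and that two reports with the same hash would map to the same output file.

```cpp
#include <functional>
#include <string>
#include <unordered_set>
#include <utility>

// Hypothetical stand-in for an HTML diagnostics consumer; not Clang's real class.
class ReportEmitter {
public:
  // `render` stands in for the expensive re-lex + syntax-highlight pass.
  explicit ReportEmitter(std::function<std::string(const std::string &)> render)
      : Render(std::move(render)) {}

  // Returns true if a page was generated, false if the report was skipped
  // because a report with the same issue hash was already emitted.
  bool emit(const std::string &issueHash, const std::string &sourceFile) {
    // insert().second is false when the hash is already in the set, so the
    // duplicate is rejected *before* any rendering work is done.
    if (!SeenHashes.insert(issueHash).second)
      return false;
    LastPage = Render(sourceFile);
    ++RenderCount;
    return true;
  }

  unsigned renderCount() const { return RenderCount; }
  const std::string &lastPage() const { return LastPage; }

private:
  std::function<std::string(const std::string &)> Render;
  std::unordered_set<std::string> SeenHashes; // issue hashes emitted so far
  std::string LastPage;
  unsigned RenderCount = 0;
};
```

The point of the sketch is the ordering: the duplicate check happens before the renderer is invoked, so a report whose output would be discarded anyway costs one set lookup instead of a full re-lex of the file.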

rdar://120801986
(cherry picked from commit 721dd3b)

haoNoQ commented Jan 12, 2024

@swift-ci please test

@haoNoQ changed the title from "[analyzer] NFC: Don't regenerate duplicate HTML reports." to "🍒 [analyzer] NFC: Don't regenerate duplicate HTML reports." on Jan 12, 2024
@haoNoQ merged commit 7856a3d into swiftlang:stable/20230725 on Jan 12, 2024