Skip to content

Ignore backticks in documentation abstracts #936

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Feb 19, 2025

Conversation

natecook1000
Copy link
Member

@natecook1000 natecook1000 commented Feb 17, 2025

The linguistic tagger doesn't do a great job recognizing backticks as opening vs. closing quotes, so some valid abstracts that contain backticks were being rejected by the BeginDocumentationCommentWithOneLineSummary rule. This change removes any InlineCode nodes from the abstract paragraph before processing, since they aren't important to that judgment.

Fixes #924.

Copy link
Member

@ahoppen ahoppen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this cause issues when you have a code block that contains a period and would thus recognize it as a multi-sentence first paragraph?

Eg.

/// The same as `Foo.foo()`.
func bar() {}

If not, could you add a test case for that?

@allevato
Copy link
Member

We can assume that a sentence-ending period will never occur inside an inline code block, so I wonder if we should simply remove the content within backticks before doing this check, instead of removing just the backticks.

Since the briefSummary field is already a Paragraph node, we could do this in a structured way by just locally mutating that tree to remove any InlineCode children (and likewise for SymbolLink children, to handle the DocC-style double-backtick syntax).

The only thing I'm not sure about is how the linguistic tagger would handle a space before the period, because I'm assuming that removing the InlineCode node from this:

Foo bar `baz`.

would leave us with "Foo bar .", and I'm not familiar enough with that API to know if it would handle that. If that's a problem, then we could address that by instead of removing the InlineCode/SymbolLink children, replacing them with a plaintext word like "placeholder", and then run the linguistic tagger on that result.

The linguistic tagger doesn't do a great job recognizing backticks as opening vs. closing
quotes, so some valid abstracts that contain backticks were being rejected by the
`BeginDocumentationCommentWithOneLineSummary` rule. This change removes any backtick-
quoted sections from abstracts before processing, since they aren't important to that
judgment.
@natecook1000
Copy link
Member Author

I would have sworn I updated this commit before opening the PR, but here we are. @allevato, I switched to use a MarkupRewriter to remove InlineCode nodes, and it seems to work great! I added a test case using a double-backtick link (which I don't think should have periods, but just in case) and it seemed to do fine.

Copy link
Member

@allevato allevato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This turned out super elegant. Thanks for the fix!

@allevato allevato merged commit ed9d716 into swiftlang:main Feb 19, 2025
19 checks passed
@natecook1000 natecook1000 deleted the doc-abstract-fixes branch February 25, 2025 14:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

invalid BeginDocumentationCommentWithOneLineSummary warning
3 participants