Clarify use of contractions in diagnostic messages #116803
This dissuades contributors from using contractions when writing diagnostic wording for Clang. Contractions should be avoided because of the potential for visual confusion with single quoting syntactic constructs and because they can be harder to understand for non-native English speakers.
@llvm/pr-subscribers-clang
Author: Aaron Ballman (AaronBallman)
Full diff: https://github.com/llvm/llvm-project/pull/116803.diff
1 Files Affected:
diff --git a/clang/docs/InternalsManual.rst b/clang/docs/InternalsManual.rst
index f189cb4e6a2ac3..39d389b816f129 100644
--- a/clang/docs/InternalsManual.rst
+++ b/clang/docs/InternalsManual.rst
@@ -160,6 +160,10 @@ wording a diagnostic.
named in a diagnostic message. e.g., prefer wording like ``'this' pointer
cannot be null in well-defined C++ code`` over wording like ``this pointer
cannot be null in well-defined C++ code``.
+* Prefer diagnostic wording without contractions whenever possible. The single
+ quote in a contraction can be visually distracting due to its use with
+ syntactic constructs and contractions can be harder to understand for non-
+ native English speakers.
The Format String
^^^^^^^^^^^^^^^^^
Perhaps add the special case of cannot vs can not? Or is that already here somewhere?
> the special case of cannot vs can not?

As in?
cannot is a formally 'correct' way of saying it, and we just had a PR committed that changed our uses.
> cannot is a formally 'correct' way of saying it

Well, ‘cannot’ and ‘can not’ mean different things, and yeah, usually, ‘cannot’ is what you want. I don’t think ‘can not’ would be too common in a diagnostic because those are typically not about something you’re allowed not to do...
As a native speaker, (and looking in a dictionary), they are identical meaning (same as can't).
We DID have plenty of can not in both comments and diagnostics, but they were recently changed.
> they are identical meaning (same as can't).

So ‘cannot’ is identical to ‘can’t’, yes. ‘can not’ is a bit different in that the ‘can’ itself isn’t negated, but rather, the verb after it is, e.g. ‘I can not do that’ == ‘I am able / allowed to not do that’—which, arguably, this doesn’t come up too often because it’s a bit of an unusual thing to say in most circumstances, but if that meaning is intended, you’re supposed to write ‘can not’ and not ‘cannot’ (of course, from a descriptive point of view, one could argue that if people keep mistaking one for the other, there isn’t much of a point of differentiating the two, but I’m not sure we’re quite there yet).
Sorry for the rambling, but I like linguistics too much to be able to stop myself whenever topics like these come up. ;Þ
> We DID have plenty of can not in both comments and diagnostics, but they were recently changed.

I definitely believe that most of those should probably have been ‘cannot’, yeah. ‘can not’ is often a typo for ‘cannot’, but it is a valid syntactic construct—provided that that’s what the writer actually intended to write, of course.
They are also much easier to 'mis'/'mistake', so IDK if we want to point that out?
I don’t have a very strong opinion on this if the consensus is that this is a change for the better, but as someone with a background in linguistics, I’d argue that this seems like a weird thing to discourage—I don’t think the single quote is really distracting at all if it occurs in a common contraction (e.g. isn’t, aren’t, don’t, doesn’t, etc.), because you simply parse that as one word. Of course, I don’t think we should start writing ‘you’dn’t’ve’ or anything absurd like that, but I don’t think there’s anything wrong w/ normal contractions.
I also don’t think this is true: simple contractions are one of the first things we teach people, and I’ve yet to meet someone whose first language isn’t English and doesn’t know what e.g. ‘isn’t’ is supposed to mean. Is there any actual precedent for anyone being confused about this?
It's not a significant problem, it's more a "scanning the line to see where the syntax is" problem in that it's a visual distraction if you're trying to find the variable name being diagnosed for a complex expression and there are contractions in the wording.
I am not a linguist, but this is something I've heard many times over the years when talking about writing to a multilingual audience. e.g., https://techcomm.nz/Story?Action=View&Story_id=394 That said, I would not be surprised if we could find plenty of sources saying the opposite.
Hmm, I don’t know if there are any linguistic studies about this off the top of my head (I can only speak from my personal experience of never having encountered someone who’s had a problem w/ contractions, despite having talked to a lot of people whose first language wasn’t English), but my reaction to stuff like that is that there is a lot of nonsensical linguistic ‘advice’ out there... (nonsense in that it is not at all based on how language actually works or on how people actually talk; think things like ‘you shouldn’t end a sentence with a preposition’, which is, to put it bluntly, abject nonsense perpetrated by would-be grammarians who thought it sensible to apply Latin grammar to English, despite the two being completely different languages that diverged millennia ago)–sorry if I sound a bit mean here (also not talking about you here btw; that was mostly directed towards English teachers who don’t actually know English grammar...), but anyone with a background in linguistics will tell you that we have to put up w/ a lot of nonsense... So basically, if we actually get complaints from people that our diagnostics (or documentation, etc.) are confusing because they contain contractions, then sure, it’d make perfect sense to do something about it, but I have a feeling the confusing part about C++ compiler diagnostics is generally not the contractions ;Þ
Even if there was some subtle distinction between Second, using consistent wording maintains certain "look and feel". Maybe it's my personal experience only, but for me it looks more polished (as in "higher quality", or "more professional").
I don't think contractions are the confusing part of diagnostics, but I do think we want consistency between our diagnostics as much as possible and we use a mixture of both contractions and no contractions inconsistently (though that's improving). I fall on the side of avoiding contractions rather than including them. Do you have strong opinions on using contractions? Would you recommend we go the other direction and switch to consistently using contractions?
I guess that makes sense yeah (I personally don’t care that much about consistency wrt diagnostic wording, but I can also see why that’s something we’d want).
I don’t have strong opinions about this, no; linguistically, imo either way is fine (I’d just be a bad linguist if I didn’t argue against prescriptivism whenever it comes up ;Þ), but I don’t have a problem w/ picking one over the other for non-linguistic reasons. I mean, I would probably prefer it if we could write diagnostic messages w/o having to think too hard as to what the correct style is wrt things like these (because it’s what I think people will just naturally do), but if it’s just a matter of ‘we want to be consistent, so let’s always do X, even though that choice is more or less arbitrary’, then that’s equally valid. So in sum, enforcing one over the other is not what I’d want to do (and I just don’t think it’s all that necessary), but if we decide to go that route, then I’m fine w/ that too ;Þ
Here's another thing: could there be tools that try to parse the messages (e.g. something that runs clang and presents the messages to the user in some form)? Having a policy such as "single quotes only come in pairs" could make it easier. I don't know if that's something we should even take into consideration, it's just a thought.
Hmm, I think we have other formats that are better suited for that (don’t we have a flag that makes us print JSON diagnostics?), so I’d hope that no-one tries to just parse the diagnostics from the terminal, and even then, you could definitely hard-code common contractions imo, but that is an interesting question nonetheless.
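To make the "single quotes only come in pairs" idea concrete, here is a rough sketch of what a naive quote-pairing tool might do with a diagnostic string. None of this is Clang code: `quoted_constructs` and `has_unpaired_quote` are hypothetical helpers, and the second sample message is invented for illustration (the first is the wording quoted in the InternalsManual text above).

```python
import re

# Hypothetical helper, not part of Clang: treat every '...' span as a
# quoted syntactic construct, the way a naive parsing tool might.
QUOTED = re.compile(r"'([^']*)'")


def quoted_constructs(message: str) -> list[str]:
    """Return the text between each pair of single quotes, left to right."""
    return QUOTED.findall(message)


def has_unpaired_quote(message: str) -> bool:
    """True when the message has an odd number of single quotes."""
    return message.count("'") % 2 == 1


# Wording without a contraction: the quotes pair up cleanly.
clean = "'this' pointer cannot be null in well-defined C++ code"

# Invented wording with a contraction: the apostrophe in "isn't" is
# mistaken for an opening quote and the pairing goes wrong.
contracted = "variable 'x' isn't used; did you mean 'y'?"

print(quoted_constructs(clean))
print(quoted_constructs(contracted))
print(has_unpaired_quote(contracted))
```

With the contraction, the second "construct" the sketch extracts is a chunk of prose rather than an identifier, which is the failure mode the pairing policy would rule out. Hard-coding common contractions, as suggested above, would be another way around it.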
Yeah, I think we'd want to push folks towards using
While it is annoying to have to remember a list of rules about diagnostic messages, I think it's important that we aim for consistency because I think we want there to be one "voice" to things like diagnostics, documentation, and other communications with the user. (The docs don't have to be consistent with the diagnostics, but should be consistent with other documentation in Clang, etc.) I think that provides a better user experience than having multiple "voices" throughout the product. Here's where we're at currently for contractions vs long form (looking at sema, parse, and common diagnostics): so I think we have a general preference for long form over contractions. From spot-checking the uses of contractions, it seems that all uses could pretty easily be written just as clearly as the long form and it wouldn't be much churn (about 15-20 messages in total).
I don't see much benefit to having such a lopsided approach as we currently have. That said, the proposal is to "prefer", so it's guiding rather than purely prescriptive. Can you live with that?
I think that’s fine, yeah