Skip to content

Strengthen the sanity checks, and fix some minor style or bugs uncovered #579

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Aug 6, 2022

Conversation

Julian
Copy link
Member

@Julian Julian commented Jul 31, 2022

This looks a bit noisy, especially due to fixing across multiple drafts, but it does a few simple things to make the sanity checks a bit better. Only one is newish (mentioned below), the rest are really things we've always really "wanted" but was apparently missing from the sanity check. Specifically the changes here:

  • makes the sanity check use "subtests", which basically just means that when CI runs and a sanity test fails because one test in the suite is somehow invalid, the rest of the tests will still be sanity checked, and the PR author will see all the invalid tests not just the first one
  • case descriptions are checked for uniqueness too, not just test descriptions (this I could have sworn was already here, but apparently it wasn't, so there were a small number of tests which had non-unique descriptions, including a minor "bug" where some tests were running twice because they were accidentally copy pasted)
  • case descriptions are checked for length too, again same as test descriptions (the reasoning is similar for both, though given there are some long case descriptions, I left the length limit longer for now)
  • Prevent the use of "test that" or "X should do Y" in test descriptions. This is "new" in the sense that I've only mentioned this in PR reviews once or twice, but it's a stylistic thing which I don't like in test cases (in any language, but here too). Some reasoning is in the commit if that seems controversial, but there were only a few of these in the suite, which were easily removed. Essentially these are "noisy" words which usually can be removed without changing the meaning of the description.

A bit more detail is in each commit message if you're even more curious, but hopefully it's clear that each description changed is at least as good as before.

Feel free to ignore reviewing the Python code if you (future reviewer) aren't comfortable reading it (or feel free to press through and review it).

Julian added 13 commits July 31, 2022 21:44
A subtest in Python means that it will continue running the
other examples (tests, cases, whatever) and show an error for
each one, rather than simply stopping on the first invalid
(test, case, whatever).

This should make it easier for seeing what's wrong in CI runs
without needing to repeatedly run each time.

A further improvement would be grouping tests "horizontally" across
versions rather than just within a version, as often authors are copy
pasting tests across each version it applies to. With a subtest, they'll
see N failures for N versions, whereas it'd be better to see 1 failure
for e.g. test description which is too long across all N versions.
The reason we do this is both to force conciseness but also because
runners of the suite likely turn one or more of the combination of
case description and test description into an identifier within the
implementer's test framework. If descriptions are really long, it makes
things like copy pasting test names more annoying or difficult.

150 is quite long -- we use 70 for test descriptions, so in total this
means potentially 220 characters for the combined length. I'd like to
bring this down a bit but this is the number that passes all existing
descriptions at least.
The reason we do this for tests is to allow implementers to use
test descriptions as unique identifiers, and also because there's
a small chance a duplicate description is a copy paste error, or
some other test bug.

The same reasoning applies to cases.

0da2aaa is another good reason (which in theory applies to both test
cases and individual tests) where a case was accidentally duplicated a
second time during backporting (where it was already present).

See e.g. the first file fixed here, where descriptions should really
indicate what validator is being tested for clarity.

We don't however check directly for schema duplication in cases, as one
could imagine distinct scenarios being tested with the same underlying
schema.

Fixing further files will be done in successive commits.
Fixes a minor inefficiency in draft3 where separate
cases were used instead of just 2 tests.
These were fully duplicated accidentally (which the uniqueness
check now finds).
It makes descriptions longer (which we independently check for) but also
is stylistically a bit awkward.

https://jml.io/pages/test-docstrings.html is a good post on the subject,
with a sort of algorithm for removing this and related words.

Fix a first file (anchor) where descriptions are shortened and clarified
a bit.

Further files will be fixed in successive commits.
There aren't any currently (though one was just removed in
a previous 'should' commit).
@Julian Julian requested a review from a team as a code owner July 31, 2022 19:04
@handrews handrews self-assigned this Aug 5, 2022
Copy link
Contributor

@handrews handrews left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but would have been much easier to review as one PR for the python changes and one for all of the more-or-less trivial description changes.

@Julian
Copy link
Member Author

Julian commented Aug 6, 2022

Thanks!

@Julian Julian merged commit f827640 into main Aug 6, 2022
@Julian Julian deleted the sanity-check-strengthening branch August 6, 2022 07:11
Julian added a commit that referenced this pull request Aug 6, 2022
And fix those which fail.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants