bpo-47131: Speedup AST comparisons in test_unparse by using node traversal #32132
Conversation
You appear to only be modifying a test file in this PR — is that deliberate?
Yes, as stated in the BPO issue. Speeding up the test harness is good for the buildbots.
Apologies — I read the BPO description but not the BPO title!
(It would be good to make it clear in the PR title, as well as the BPO title, that this change is specifically related to speeding up the test suite — I assumed it was going to be a performance improvement to a stdlib function.)
@pablogsal @isidentical: Would you mind reviewing this AST change?
Lib/test/test_unparse.py (Outdated)

    @@ -130,7 +130,23 @@ class Foo: pass

    class ASTTestCase(unittest.TestCase):
        def assertASTEqual(self, ast1, ast2):
            self.assertEqual(ast.dump(ast1), ast.dump(ast2))
            missing = object()
            def compare(a, b):
Defining a function at each assertASTEqual() call sounds expensive. Can you pass the self parameter to compare() and move it to the class level (method or staticmethod), or even to the module level (so it is only defined once)?
For example:

    def _assertASTEqual(self, a, b):  # accept AST nodes, lists and objects
        ...  # your code

    def assertASTEqual(self, ast1, ast2):  # accept AST nodes
        self._assertASTEqual(ast1, ast2)
I timed three approaches to see exactly the costs involved. All test runs were performed using:

    amd64\python_d.exe -m test -uall test_unparse

All three versions (inner function, module-level function, and method) come in at 59s +/- 0.2s over 5 runs, contrasted with the current runtime of 95s.
My chosen approach, using the inner function, is modeled loosely after ast.dump(). An advantage of the inner function is that the missing value does not need to pollute the function signature or the global namespace. Ultimately, the design choice is just stylistic.
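To make the discussion concrete, the traversal approach looks roughly like the sketch below. It is pieced together from the diff hunks quoted in this thread, so details (names, exact checks) may differ from the merged code:

    import ast
    import unittest

    class ASTTestCase(unittest.TestCase):
        def assertASTEqual(self, ast1, ast2):
            # Redefined per call so the `missing` sentinel stays out of the
            # signature and module namespace (the stylistic choice discussed here).
            def traverse_compare(a, b, missing=object()):
                if a is b:
                    return  # same object (or both missing): trivially equal
                if isinstance(a, ast.AST):
                    self.assertEqual(type(a), type(b))
                    for field in a._fields:
                        traverse_compare(getattr(a, field, missing),
                                         getattr(b, field, missing))
                elif isinstance(a, list):
                    self.assertEqual(len(a), len(b))
                    for node1, node2 in zip(a, b):
                        traverse_compare(node1, node2)
                else:
                    self.assertEqual(a, b)
            traverse_compare(ast1, ast2)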
If you prefer to redefine the function at each call, would you mind adding a comment explaining why you prefer that?
My worry is mostly about performance. If there is no impact on performance, you can leave it as it is.
I do note, however, that in micro-benchmarks creating inner functions has a substantial runtime cost. But in this case, that time is dwarfed by the time spent calling the function.
Can you provide some before-and-after comparisons of the test running time for this PR, specifying exactly what you are running? That way we can try to reproduce it :)
Lib/test/test_unparse.py (Outdated)

                    traverse_compare(value1, value2)
            elif isinstance(a, list):
                if len(a) != len(b):
                    self.fail("len(a) != len(b)")
You likely want an f-string here, because the names "a" and "b" are meaningless in the error message.
@pablogsal Wouldn't assertNotEqual be better here?

      elif isinstance(a, list):
    -     if len(a) != len(b):
    -         self.fail("len(a) != len(b)")
    +     self.assertNotEqual(len(a), len(b))

Edit: ... since self.fail is used instead of assert anyway.
Not really, because you also want to show a and b somehow in the error, not just that 4 is different from 6.
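For illustration only (the helper name below is hypothetical, not from the PR), a failure message along those lines could surface the compared lists themselves:

    import unittest

    class _Example(unittest.TestCase):
        # Hypothetical helper: fail with the compared lists in the message,
        # so the report shows more than just "4 != 6".
        def _check_lengths(self, a, b):
            if len(a) != len(b):
                self.fail(f"lists differ in length ({len(a)} != {len(b)}):\n"
                          f"  a = {a!r}\n  b = {b!r}")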
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers, that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase "I have made the requested changes; please review again".
Lib/test/test_unparse.py (Outdated)

            elif isinstance(a, list):
                if len(a) != len(b):
                    self.fail("len(a) != len(b)")
                for node1, node2 in zip(a, b):
You could also use the new zip(a, b, strict=True) and skip the previous test.
I did originally, but using self.fail() from an except block results in a nested exception. I don't know of a way to do raise ... from None without the raise :)
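A sketch of the trade-off being described (the real code recurses with traverse_compare; this is simplified to assertEqual): if zip(..., strict=True) replaces the length check, the mismatch surfaces as a ValueError, and calling self.fail() inside the except block chains that ValueError into the failure report unless the failure is raised from None:

    import unittest

    class _Example(unittest.TestCase):
        def _compare_lists(self, a, b):
            try:
                for node1, node2 in zip(a, b, strict=True):
                    self.assertEqual(node1, node2)
            except ValueError:
                # A plain self.fail() here would print "During handling of the
                # above exception, another exception occurred" along with the
                # ValueError; an explicit raise ... from None suppresses that
                # chaining, which is the point being made above.
                raise self.failureException(
                    f"lengths differ: {len(a)} != {len(b)}") from None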
Lib/test/test_unparse.py (Outdated)

                for field in a._fields:
                    value1 = getattr(a, field, missing)
                    value2 = getattr(b, field, missing)
                    traverse_compare(value1, value2)
You can avoid the call if both are missing
True, but that adds an additional if for the successful case, which IMO should be the common path.
A function call is far more expensive than a conditional, especially if you compare them with is.
As I thought, adding a check for missing ended up being a statistically insignificant slowdown; however, it did lead to the realization of singleton checking, which results in a further 2% improvement.
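For context, the "singleton checking" referred to here amounts to an identity fast path at the top of the comparison; the exact placement below is my assumption, sketched for illustration:

    def traverse_compare(a, b, missing=object()):
        # Identity check first: it covers the both-fields-missing case as well
        # as shared singletons (None, True, False) and identical sub-objects,
        # skipping any further work for them.
        if a is b:
            return
        ...  # AST / list / plain-value comparisons follow here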
Lib/test/test_unparse.py (Outdated)

    # the `missing` value from polluting the "real" function
    # signature or module namespace.
You can make the "missing" value a default argument to achieve the same, and the load will be slightly faster than the DEREF. The function is not public, so that is not a problem.
In the same commit, I moved missing to an argument default of the inner function. I added the detail per Victor's request, but I'd gladly strike it just as well.
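The difference Pablo mentions can be observed with the dis module; the snippet below is only an illustrative comparison, not code from the PR:

    import dis

    def make_closure():
        missing = object()
        def compare(a, b):
            # `missing` is a free variable here: loaded with LOAD_DEREF
            return a is missing or b is missing
        return compare

    def compare_default(a, b, missing=object()):
        # `missing` is an ordinary local (default argument): loaded with LOAD_FAST
        return a is missing or b is missing

    dis.dis(make_closure())   # shows LOAD_DEREF for `missing`
    dis.dis(compare_default)  # shows LOAD_FAST for `missing`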
As stated in the BPO, there is a 33% improvement in runtime. My comment to Victor has the exact timings: 95s before, 59s after.
I've pushed new changes which should resolve all comments.
    # An AST comparison routine modeled after ast.dump(), but
    # instead of string building, it traverses the two trees
    # in lock-step.
    def traverse_compare(a, b, missing=object()):
Since you moved missing inside the function definition, you may now move the function definition to the class level (as a method) or the module level (as a function; in that case you should pass the testcase as an argument), to avoid the cost of defining a new function at each assertASTEqual() call.
If you are concerned about performance, using an inner function is faster than calling a method. True, the setup cost is greater, but that time is more than recovered after a few hundred comparisons, thanks to the attribute lookup each method call pays. Global (module-level) functions suffer the same fate as well, but require more comparisons before being overtaken.
After instrumenting test_unparse to get the number of calls to assertASTEqual and compares, I whipped up a reproducible minimal test script to show the timings. The script is here for you to run yourself.
Using main on Win10:

    100 -> inner: 29 usec    method: 27.4 usec
    200 -> inner: 56.8 usec  method: 48 usec
    300 -> inner: 76.3 usec  method: 86.4 usec

    CosmeticTestCase (14 compares)
        inner:  88.9 usec
        method: 77.9 usec
    DirectoryTestCase (400 compares)
        inner:  74.1 msec
        method: 79.5 msec
    UnparseTestCase (400 compares)
        inner:  91 msec
        method: 97.6 msec
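The linked script is not reproduced in this thread. As a rough idea of the kind of timeit comparison being described (the workload, helper names, and iteration counts below are stand-ins, not the actual benchmark), something like the following could be used:

    import ast
    import timeit

    # Stand-in workload: two identical ASTs parsed from this very file; the
    # real script presumably exercises the test_unparse inputs instead.
    with open(__file__) as fp:
        source = fp.read()
    tree1, tree2 = ast.parse(source), ast.parse(source)

    def compare_module_level(a, b, missing=object()):
        # Module-level variant: defined once, no closure.
        if a is b:
            return True
        if isinstance(a, ast.AST):
            if type(a) is not type(b):
                return False
            return all(compare_module_level(getattr(a, f, missing),
                                            getattr(b, f, missing))
                       for f in a._fields)
        if isinstance(a, list):
            return len(a) == len(b) and all(map(compare_module_level, a, b))
        return a == b

    def compare_inner(t1, t2):
        # Inner-function variant: the traversal is redefined on every call,
        # as in the PR.  (A method variant would attach it to a class.)
        missing = object()
        def traverse(a, b):
            if a is b:
                return True
            if isinstance(a, ast.AST):
                if type(a) is not type(b):
                    return False
                return all(traverse(getattr(a, f, missing),
                                    getattr(b, f, missing))
                           for f in a._fields)
            if isinstance(a, list):
                return len(a) == len(b) and all(map(traverse, a, b))
            return a == b
        return traverse(t1, t2)

    print("module:", timeit.timeit(lambda: compare_module_level(tree1, tree2), number=100))
    print("inner: ", timeit.timeit(lambda: compare_inner(tree1, tree2), number=100))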
As I wrote previously, if there is a good reason to define a "local" (nested?) function, please add a comment here for future readers. Otherwise, someone who doesn't like nested functions can move the function (during a random refactoring) without knowing that you ran benchmarks and the code is faster when written like that. It's very easy to lose information when code is read/modified 5, 10 or 20 years later. And yes, it happens to me often to read 20 years old Python code and I have to dig into Git history (with unpleasant issues about CVS, SVN and HG commit numbers) and very old bug tracker issues to attempt to rebuild the rationale for the existing code before feeling safe to change it.
The performance of the method/module-level code doesn't make sense to me; it should be as fast or faster. But my only god is benchmarks: I only trust benchmark numbers :-)
I'm sorry but I need to comment, as this is preventing me from doing further work. This discussion has left a sour taste in my mouth. I've been contributing to Python in some form or another (patches, reviews, PRs, buildbots, monetarily) for 20+ years. Out of the desire to see my buildbots (and by extension, all of the others) run quickly and failure free, I have offered an improvement that reduces runtime by 33% (depending on the machine, ~30s) and, at least on my buildbot, results in a total runtime reduction of 10%.

There have been quibbles over 0.1% differences, overlooked information that was available here or on BPO, and misunderstandings of just how some things work in practice. I personally try not to give uninformed advice and would hope others respond in kind. As gatekeepers to Python's source, I get that its quality must be preserved; however, uninformed pushback is not helpful. Or at least, it feels uninformed to me. Maybe it's just a lack of time to do more thorough responses; I get that Python development time can be a scarce resource. Would it not, then, be better spent encouraging more contributors to lessen that burden? One could assume, given that I run buildbots, that I care about testing? Good faith should mean I have no desire to degrade the test suite; quite the opposite.

Anyway, I've provided detailed testing of my approach. I would like to see this merged and believe that this implementation is currently as good as it can be without getting obscure and difficult to reason about. I have no desire to make any more changes. Merge or close; at this point, it matters little to me. Thank you for your consideration. If there is any part of this that people would like to respond to, I'm more than happy to have that conversation, public or private. This was not meant to offend anyone, but if I didn't write this, I would not be inclined to continue providing any PRs.
Thanks for that :-)
Don't assume ill intent on our part. Reviewing takes a lot of time. It's not just about reading the code, but changing context at each PR/issue, reading the surrounding code, understanding what's going on, trying to dig into the history, considering better alternatives, etc. I personally have limited bandwidth to care about the Windows buildbots, but I know that you spent a lot of energy on fixing tons of Windows issues that no one else cares about, so I'm forcing myself to spend more time on this specific PR :-) AST is out of my scope/knowledge; I prefer to leave it to the local experts, Pablo and Batuhan. Pablo has limited availability and Batuhan simply seems unavailable. Be patient, it will be fine ;-) While you have a laser focus on Windows buildbots, there are always more urgent blocker issues to fix, other bugs to fix, etc.
The price of merging any PR is that the person who merges it takes responsibility for regressions and future issues around the change for the next 5 years. When I don't want to take this responsibility, I prefer not to merge, or even not to review.
Contributing to Python is very frustrating for contributors, but also for core developers, who are usually seen as the persons in charge of reviews (whereas in practice anyone is free to review).
IMO the review so far has been productive; the changes have already made the code better.
Core developers are in charge of the merge and would be remiss not to do a review (as in making sure things work as they should, not strictly a GH code review) prior to merging anyway. I suppose some may see it as core-only, but it really is core-finally. I know time is precious, so I have no (well, minimal, TBH) issue with letting changes sit for weeks, months and, yes, even years.

I have no issues with changes that align with my stated goal (speed); Pablo pointed out some of those. The problem is the conflation of changes being requested. For example, changes were requested to further improve failure messages. Existing failure reporting (comparing string results) is abysmal at best, so anything I added is already an improvement, but just as a side effect. If improved error reporting is desired, it should have been done as a subsequent PR, but now I'm roped into doing that too just to have any hope of getting my speed-up merged. It took longer to develop the improved error reporting than to do the initial PR. Perfect is the enemy of good.

I hope you can see how frustrating it can be to be asked to do things that have nothing to do with your stated goal, just because of "hey, look at what's possible now!", and without stating what your actual goals are. Basically, state why you would want a particular change contrasted with the current state of affairs, not just the changed lines in isolation. I'm not here to downplay any effort already put forth, but my hope is that insight into where I've had frustrations will help improve interactions for everyone else as well.
Thanks for the PR and for addressing the changes @jkloth!
https://bugs.python.org/issue47131