Scale test improvements #5607
sum_y = 0
sum_prod = 0
sum_x_sq = 0
for i in range(n):
This may be simpler as
for a, b in zip(x, y):
    sum_x += a
    sum_y += b
    sum_prod += a * b
    sum_x_sq += a * a
Or even just using sum(x), sum(y), sum(a * b for a, b in zip(x, y)), since I can't imagine there'll be enough data for doing multiple passes to matter.
As we discussed before, the subtractions below may be subject to catastrophic cancellation/numerical instability, but I think this shouldn't be too bad: it's worst when the numbers are all similar and/or there are a lot of them, but the nature of this script should mean neither of those is true.
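For reference, a minimal sketch of what the sum-based fit could look like after this suggestion. The function name `linear_regression` and the error handling are assumptions made for illustration, not the script's actual code; the two subtractions are where the cancellation mentioned above could occur.

```python
def linear_regression(x, y):
    """Least-squares fit of y ~ slope * x + intercept using plain sums.

    Hypothetical helper for illustration; not taken from the script itself.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences of at least 2 points")

    # One pass per sum; the data sets are small, so multiple passes are fine.
    sum_x = sum(x)
    sum_y = sum(y)
    sum_prod = sum(a * b for a, b in zip(x, y))
    sum_x_sq = sum(a * a for a in x)

    # These subtractions can lose precision when the x values are all
    # close together or there are very many points.
    denom = n * sum_x_sq - sum_x * sum_x
    if denom == 0:
        raise ValueError("all x values are identical; slope is undefined")

    # float() keeps the division exact-ish under Python 2 integer inputs too.
    slope = (n * sum_prod - sum_x * sum_y) / float(denom)
    intercept = (sum_y - slope * sum_x) / float(n)
    return slope, intercept


if __name__ == "__main__":
    # Points lying exactly on y = 2x + 1.
    print(linear_regression([1, 2, 3, 4], [3, 5, 7, 9]))
```

If instability ever did become a concern, centering the data on the means before summing would be one common alternative, at the cost of an extra pass.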
Replaced the sums as suggested.
@@ -154,6 +199,9 @@ def main():
        '--multi-file', action='store_true',
        default=False, help='vary number of input files as well')
    parser.add_argument(
        '--sum-multi', action='store_true',
        default=False, help='simulate a multi-primary run and sum stats')
What is "multi-primary"?
When swiftc is invoked with N files, say swiftc A.swift B.swift C.swift, its driver turns around and runs swiftc -frontend --primary-file A.swift A.swift B.swift C.swift, swiftc -frontend --primary-file B.swift A.swift B.swift C.swift, and swiftc -frontend --primary-file C.swift A.swift B.swift C.swift.
We hope that in cases like this, it only does quadratic amounts of work parsing the files N*N times and then typechecks-and-translates each file once (=N units of work) but in some cases it does more. --sum-multi simulates such a run, and sums the statistics over all the sub-frontend-jobs.
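The shape of such a run can be sketched roughly as follows. This is an illustration only: `frontend_jobs` and `sum_stats` are hypothetical helpers, the command lines simply mirror the ones written in this comment, and the counter names in the usage example are made up.

```python
def frontend_jobs(files):
    """Build one frontend command per input file.

    Each job lists every file but names a different one as the primary,
    mirroring the command lines quoted in the comment above.
    """
    return [
        ["swiftc", "-frontend", "--primary-file", primary] + list(files)
        for primary in files
    ]


def sum_stats(per_job_stats):
    """Add up per-job counter dictionaries key by key."""
    total = {}
    for stats in per_job_stats:
        for name, value in stats.items():
            total[name] = total.get(name, 0) + value
    return total


if __name__ == "__main__":
    files = ["A.swift", "B.swift", "C.swift"]
    for job in frontend_jobs(files):
        print(" ".join(job))

    # Hypothetical counters: each of the N jobs parses all N files
    # but typechecks only its own primary.
    per_job = [{"files_parsed": len(files), "files_typechecked": 1}
               for _ in files]
    print(sum_stats(per_job))
```

With N = 3, the summed hypothetical counters come to 9 parses (N*N) and 3 typechecks (N), which is the hoped-for scaling described above.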
048c7fd to c6d38f5
@swift-ci please smoke test and merge
Handful of improvements to the scale-test infrastructure, split out of work on SR-2901 / PR #5588
Includes replacement of numpy with manual linear-regression code, as missing numpy caused failure of CI on previous PR.
Each change should be self-describing. None executed as part of the build yet.