
Scale test improvements #5607


Merged
merged 6 commits into from
Nov 2, 2016

Conversation

graydon commented Nov 2, 2016

Handful of improvements to the scale-test infrastructure, split out of work on SR-2901 / PR #5588

Includes replacing numpy with manual linear-regression code, since the missing numpy dependency caused CI to fail on the previous PR.

Each change should be self-describing. None of this is executed as part of the build yet.

sum_y = 0
sum_prod = 0
sum_x_sq = 0
for i in range(n):
Contributor

This may be simpler as

for a, b in zip(x, y):
    sum_x += a
    sum_y += b
    sum_prod += a * b
    sum_x_sq += a * a

Or even just using sum(x), sum(y), sum(a * b for a, b in zip(x, y)), since I can't imagine there'll be enough data for doing multiple passes to matter.

As we discussed before, the subtractions below may be subject to catastrophic cancellation/numerical instability, but I think this shouldn't be too bad: it's worst when the numbers are all similar and/or there are a lot of them, but the nature of this script should mean neither of those is true.
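
For reference, a minimal sketch of the sum-based fit being discussed, next to a mean-centered variant that is one common way to reduce the cancellation risk. The function names `fit_line_sums` and `fit_line_centered` and the sample data are assumptions made for illustration; this is not the code in the PR.

```python
def fit_line_sums(x, y):
    """Least-squares slope and intercept computed from plain sums."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_prod = sum(a * b for a, b in zip(x, y))
    sum_x_sq = sum(a * a for a in x)
    # Both the numerator and the denominator subtract quantities of similar
    # size; this is where cancellation could lose precision.
    slope = (n * sum_prod - sum_x * sum_y) / (n * sum_x_sq - sum_x * sum_x)
    intercept = (sum_y - slope * sum_x) / n
    return slope, intercept


def fit_line_centered(x, y):
    """Same fit, but subtracting the means before multiplying."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical sample data lying exactly on y = 2x + 5.
xs = [10, 20, 30, 40]
ys = [25, 45, 65, 85]
print(fit_line_sums(xs, ys))
print(fit_line_centered(xs, ys))
```

On small, well-spread inputs like these the two forms should agree, which matches the expectation above that the simpler sum-based version is adequate for this script.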

Contributor Author

Replaced the sums as suggested.

@@ -154,6 +199,9 @@ def main():
'--multi-file', action='store_true',
default=False, help='vary number of input files as well')
parser.add_argument(
'--sum-multi', action='store_true',
default=False, help='simulate a multi-primary run and sum stats')
Contributor

What is "multi-primary"?

Contributor Author

When swiftc is invoked with N files, say swiftc A.swift B.swift C.swift, its driver turns around and runs one frontend job per file, each with a different primary file: swiftc -frontend -primary-file A.swift A.swift B.swift C.swift, swiftc -frontend -primary-file B.swift A.swift B.swift C.swift, and swiftc -frontend -primary-file C.swift A.swift B.swift C.swift.

We hope that in cases like this it does only a quadratic amount of parsing work (N jobs each parsing all N files, so N*N parses) and then typechecks-and-translates each file once (N units of work), but in some cases it does more. --sum-multi simulates such a run and sums the statistics over all the sub-frontend-jobs.
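
A rough sketch of that idea, assuming each frontend job reports its statistics as a dictionary of counters. The helpers `frontend_jobs` and `sum_stats`, the counter names, and the single-dash spelling of the primary-file flag are illustrative assumptions, not the script's actual code; the command lines are only built here, never run.

```python
def frontend_jobs(files):
    """Build one frontend command line per primary file.

    Mirrors the invocation shape described above: every job is given
    all N files, with a different one marked as primary.
    """
    return [
        ["swiftc", "-frontend", "-primary-file", primary] + list(files)
        for primary in files
    ]


def sum_stats(per_job_stats):
    """Add up same-named counters across all frontend jobs."""
    totals = {}
    for stats in per_job_stats:
        for name, value in stats.items():
            totals[name] = totals.get(name, 0) + value
    return totals


files = ["A.swift", "B.swift", "C.swift"]
jobs = frontend_jobs(files)

# Hypothetical counters for the hoped-for case: each job parses every
# file but typechecks only its own primary file.
fake_stats = [
    {"files_parsed": len(files), "files_typechecked": 1} for _ in jobs
]
totals = sum_stats(fake_stats)
print(totals)
```

With N = 3 the summed parse count grows as N*N while the typecheck count grows as N, which is the baseline that a summed run would be compared against.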

@graydon graydon force-pushed the scale-test-improvements branch from 048c7fd to c6d38f5 Compare November 2, 2016 21:05

graydon commented Nov 2, 2016

@swift-ci please smoke test and merge

@swift-ci swift-ci merged commit cc9dd47 into swiftlang:master Nov 2, 2016
@graydon graydon deleted the scale-test-improvements branch January 18, 2017 22:35

3 participants