Skip to content

[Serialization] Do less work checking if a value can be deserialized. #9666

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

jrose-apple
Copy link
Contributor

@jrose-apple jrose-apple commented May 16, 2017

Previously we recorded the canonical type of the declaration and made sure we could deserialize that, but that's a lot of extra work building up intermediate types that we mostly don't need. Instead, record smaller types that represent the possible points of failure—right now, just the nominal types that are referenced by the value (function, variable/constant, subscript, or initializer). I chose to use types instead of declarations here because types can potentially encode more complicated constraints later (such as generic types checking that their arguments still conform).

This gains us back 20% of type-checking time on a compile-time microbenchmark: let _ = [1, 2]. I expect the effect is less dramatic the more expressions you have, since we only need to deserialize things once.

@jrose-apple jrose-apple requested a review from DougGregor May 16, 2017 22:31
@jrose-apple
Copy link
Contributor Author

@swift-ci Please test

@swift-ci
Copy link
Contributor

Build failed
Jenkins build - Swift Test OS X Platform
Git Commit - 1d905b1fb95801557380c844be2e5dff359d024b
Test requested by - @jrose-apple

@jrose-apple
Copy link
Contributor Author

@swift-ci Please test macOS

@jrose-apple jrose-apple requested a review from graydon May 16, 2017 23:02
Copy link
Member

@DougGregor DougGregor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor suggestions for improvement.

nameAndDependencyIDs.slice(numNameComponentsBiased);
} else {
name = baseName;
nameAndDependencyIDs = nameAndDependencyIDs.drop_front();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change of meaning of nameAndDependencyIDs here really threw me. The code is correct, but kinda hard to read. What about just defining

auto dependencyIDs = nameAndDependencyIDs.slice(numNameComponentsBiased);

?

SmallVector<Type, 2> result;
ty.visit([&](CanType next) {
if (auto *nominal = next->getAnyNominal())
result.push_back(nominal->getDeclaredType());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about getDeclaredInterfaceType here, so we don't have any archetypes in there that would force deserialization of a generic environment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getDeclaredType should already be using UnboundGenericType, no? I wanted the absolute smallest type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

…but that probably doesn't make sense for nested types. Okay, getDeclaredInterfaceType it is.

if (auto *nominal = next->getAnyNominal())
result.push_back(nominal->getDeclaredType());
});
return result;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should you unique the types in result?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good idea.

Previously we recorded the canonical type of the declaration and made
sure we could deserialize that, but that's a lot of extra work
building up intermediate types that we mostly don't need. Instead,
record smaller types that represent the possible points of failure---
right now, just the nominal types that are referenced by the value
(function, variable/constant, subscript, or initializer). I chose to
use types instead of declarations here because types can potentially
encode more complicated constraints later (such as generic types
checking that their arguments still conform).

This gains us back 20% of type-checking time on a compile-time
microbenchmark: `let _ = [1, 2]`. I expect the effect is less dramatic
the more expressions you have, since we only need to deserialize
things once.
@jrose-apple jrose-apple force-pushed the deserialization-recovery-without-the-whole-type branch from 1d905b1 to fe4f8d4 Compare May 17, 2017 00:07
@jrose-apple
Copy link
Contributor Author

jrose-apple commented May 17, 2017

macOS full test on prior version (passed!)

@swift-ci Please smoke test

Copy link
Member

@DougGregor DougGregor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you!

@jrose-apple
Copy link
Contributor Author

jrose-apple commented May 17, 2017

https://ci.swift.org/job/swift-PR-osx-smoke-test/8231/

[ RUN      ] TestDateInterval.test_EncodingRoundTrip_JSON
stdout>>> check failed at /Users/buildnode/jenkins/workspace/swift-PR-osx-smoke-test/branch-master/swift/test/stdlib/Inputs/FoundationTestsShared.swift, line 19
stdout>>> expected: 0000-12-30 00:00:00 +0000 to 2017-05-17 01:04:59 +0000 (of type Foundation.DateInterval)
stdout>>> actual: 0000-12-30 00:00:00 +0000 to 2017-05-17 01:04:59 +0000 (of type Foundation.DateInterval)
stdout>>> Decoded DateInterval <0000-12-30 00:00:00 +0000 to 2017-05-17 01:04:59 +0000> not equal to original <0000-12-30 00:00:00 +0000 to 2017-05-17 01:04:59 +0000>
stdout>>> check failed at /Users/buildnode/jenkins/workspace/swift-PR-osx-smoke-test/branch-master/swift/test/stdlib/Inputs/FoundationTestsShared.swift, line 19
stdout>>> expected: 2017-05-17 01:04:59 +0000 to 4001-01-01 00:00:00 +0000 (of type Foundation.DateInterval)
stdout>>> actual: 2017-05-17 01:04:59 +0000 to 4001-01-01 00:00:00 +0000 (of type Foundation.DateInterval)
stdout>>> Decoded DateInterval <2017-05-17 01:04:59 +0000 to 4001-01-01 00:00:00 +0000> not equal to original <2017-05-17 01:04:59 +0000 to 4001-01-01 00:00:00 +0000>
[     FAIL ] TestDateInterval.test_EncodingRoundTrip_JSON

@itaiferber?

@jrose-apple
Copy link
Contributor Author

@swift-ci Please smoke test macOS

@itaiferber
Copy link
Contributor

@jrose-apple Strange... I'll disable the test.

@itaiferber
Copy link
Contributor

#9682

@itaiferber
Copy link
Contributor

@jrose-apple Should be clear now.

@jrose-apple
Copy link
Contributor Author

@swift-ci Please smoke test macOS

@jrose-apple jrose-apple merged commit 4a6fe94 into swiftlang:master May 17, 2017
@jrose-apple jrose-apple deleted the deserialization-recovery-without-the-whole-type branch May 17, 2017 16:02
jrose-apple added a commit to jrose-apple/swift that referenced this pull request May 17, 2017
…swiftlang#9666)

Previously we recorded the canonical type of the declaration and made
sure we could deserialize that, but that's a lot of extra work
building up intermediate types that we mostly don't need. Instead,
record smaller types that represent the possible points of failure---
right now, just the nominal types that are referenced by the value
(function, variable/constant, subscript, or initializer). I chose to
use types instead of declarations here because types can potentially
encode more complicated constraints later (such as generic types
checking that their arguments still conform).

This gains us back 20% of type-checking time on a compile-time
microbenchmark: `let _ = [1, 2]`. I expect the effect is less dramatic
the more expressions you have, since we only need to deserialize
things once.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants