Skip to content

Runtime: Compact the uniquing header for foreign metadata records. #13539

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Dec 21, 2017

Conversation

jckarter
Copy link
Contributor

@jckarter jckarter commented Dec 19, 2017

We can reduce the uniquing header from 3–4 pointer-sized words down to 1–2 32-bit words + one pointer:

  • The initialization function (when present) and name are always emitted into the same binary image, so we can use relative references to shrink these down to 32-bit fields.
  • We don't ever simultaneously need the initialization flags and the initialized uniqued pointer. (Keeping the "initialization function" flag bit theoretically lets us turn a "consume" load into a "relaxed" load, but that makes no practical difference on most contemporary architectures.) 12 flag bits Ought To Be Enough For Anyone and lets us reliably tell a valid pointer from a flag set, so overlap the initialization flags with the eventual invasive cache value.

The invasive cache is left inline, since we've decided we're not going to make the rest of type metadata records ever be true-const, so they'll already be sitting on a dirty page.

rdar://problem/22527141

@jckarter
Copy link
Contributor Author

@swift-ci Please test

@jckarter
Copy link
Contributor Author

@swift-ci Please test source compatibility

@jckarter
Copy link
Contributor Author

@rjmccall @gparker42 Mind checking my work?

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - e2782b752d8d8b2a2e52f4c25eaa704884f30799

@jckarter jckarter force-pushed the squeeze-foreign-metadata-header branch from e2782b7 to ddbfba4 Compare December 19, 2017 23:07
@jckarter
Copy link
Contributor Author

@swift-ci Please test

1 similar comment
@jckarter
Copy link
Contributor Author

@swift-ci Please test

@jckarter
Copy link
Contributor Author

@swift-ci Please test source compatibility

1 similar comment
@jckarter
Copy link
Contributor Author

@swift-ci Please test source compatibility

@swift-ci
Copy link
Contributor

Build failed
Swift Test OS X Platform
Git Sha - e2782b752d8d8b2a2e52f4c25eaa704884f30799

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - e2782b752d8d8b2a2e52f4c25eaa704884f30799

@gparker42
Copy link
Contributor

"we're not going to make the rest of type metadata records ever be true-const, so they'll already be sitting on a dirty page"

We ought to discuss that elsewhere. dyld's capabilities mean that "not true-const" and "dirty page" are moving farther apart.

return getFlags() & HeaderPrefix::HasInitializationFunction;
}
/// Return the initialization function for this metadata record if it has
/// one and has not been initialized.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment should clarify the behavior if the record has no initialization function or has already been initialized (undefined return, instead of nil return).

@jckarter
Copy link
Contributor Author

@gparker42 I think I see what you're alluding to. I can move the cache out of line; that would allow us to compact the inline header another 32 bits, and fit it into 64 bits in the common case where there's no init function, and we'd still have room in the relative pointer to the out-of-line cache to store the "has init function" bit and have a second bit left over for future expansion. That would allow foreign value type metadata to be constant-after-relocations, in turn for a word of BSS that takes a step longer dependency chain to read in the fast path. Is that reasonable?

@gparker42
Copy link
Contributor

No need to do anything now. I expect to make an "optimize for dyld optimization" pass across the ABI next month.

Also this cache may itself be a candidate for future dyld optimization. In a shared cache or an app cache we may be able to unique the metadata pointers and set the correct cache value in advance. Then this data isn't dirty after all.

@slavapestov
Copy link
Contributor

compact the inline header another 32 bits, and fit it into 64 bits in the common case

Fewer dirty pages is certainly a worthy goal, but should we really be prioritizing bit packing at this level?

@gparker42
Copy link
Contributor

Yes.

@jckarter
Copy link
Contributor Author

OK, I'll leave it as is for now, after updating the comment. (Test failures appear to be unrelated upstream breakage.)

@jckarter jckarter force-pushed the squeeze-foreign-metadata-header branch from ddbfba4 to 7f8adf5 Compare December 20, 2017 17:32
@jckarter
Copy link
Contributor Author

@swift-ci Please test

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - ddbfba4f2f24675a6ba595a2694ff7adb24a10fd

@swift-ci
Copy link
Contributor

Build failed
Swift Test OS X Platform
Git Sha - ddbfba4f2f24675a6ba595a2694ff7adb24a10fd

@jckarter
Copy link
Contributor Author

@swift-ci Please test source compatibility

We can reduce the uniquing header from 3–4 pointer-sized words down to 1–2 32-bit words + one pointer:

- The initialization function (when present) and name are always emitted into the same binary image, so we can use relative references to shrink these down to 32-bit fields.
- We don't ever simultaneously need the initialization flags and the initialized uniqued pointer. (Keeping the "initialization function" flag bit theoretically lets us turn a "consume" load into a "relaxed" load, but that makes no practical difference on most contemporary architectures.) 12 flag bits Ought To Be Enough For Anyone and lets us reliably tell a valid pointer from a flag set, so overlap the initialization flags with the eventual invasive cache value.

The invasive cache is left inline, since we've decided we're not going to make the rest of type metadata records ever be true-const, so they'll already be sitting on a dirty page. A dynamic linker that was sufficiently Swift-optimized to precalculate the other load-time-initialized entries in metadata could likely precompute the invasive cache value as well.

rdar://problem/22527141
@jckarter jckarter force-pushed the squeeze-foreign-metadata-header branch from 7f8adf5 to 05fc210 Compare December 20, 2017 22:09
@jckarter
Copy link
Contributor Author

@swift-ci Please test

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - 7f8adf55469c5b0bfe45cb7b250a141ebedd5e8c

@swift-ci
Copy link
Contributor

Build failed
Swift Test OS X Platform
Git Sha - 7f8adf55469c5b0bfe45cb7b250a141ebedd5e8c

@jckarter jckarter merged commit c283a69 into swiftlang:master Dec 21, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants