-
Notifications
You must be signed in to change notification settings - Fork 10.5k
Fix an AccessedStorage assert for SIL global variables. #40996
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@swift-ci test |
This PR is currently blocked because a locally defined SILGlobalVariable incorrectly has Public linkage:
As defined in This is why I was using the global's VarDecl before, which does have correct linkage, but isn't available when parsing SIL. To reproduce, inspect SILLinkage passed to
This happens because
Then, in
Is the |
Build failed |
Build failed |
I'm not sure I'm following here: why doesn't the global get
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
If the global is defined with an initializer, such as Instead, we walk the AST as follows
And This looks like an obvious bug, but I'm very nervous about changing linkage of all our non-public globals from HiddenExternal to Hidden if this is the way it's been for several years. I'll need to do more investigation to see why we never noticed the issue before. |
@eeckstein I finally tracked this down using watchpoints. CrossModuleOptimization changes SILLinkage. I would never imagine this was possible, because... The semantics of global variables are expressed by the API The foundation of compiler architecture is that all passes agree on program semantics. So the idea that an optimization pass changes semantics in the middle of the pipeline is so fundamentally flawed that it would never occur to me. I honestly don't know what to make of it. I can't proceed without knowing the intended semantics. Does external linkage affect program semantics or not? Those semantics need to be decided during SILGen. The result of Here's a potential example of two well-defined APIs: isVisibleExternally() - Semantics: can the entity be accessed outside this SILModule? hasExternalLinkage() - No program semantics. Mechanically, is the symbol referenced outside this SILModule. This cannot be used to affect optimization of code using the symbol. It can only affect whether the definition is emitted. The implementation will boil down to the supported compilation modes. Will WMO and CMO ever be enabled together? If not, there needs to be a check in ParseSILArgs and assertions in SILModule. If so, then CMO should never affect semantics. Can CMO assume that dependent modules will be recompiled whenever certain whole-module analyses change? That's a very tricky dependence problem to solve. I don't think the infrastructure exists for it. |
140edff
to
eb86d4e
Compare
@atrick I think you are confusing a few things here.
Of course it can - like any other properties in SIL which are modified throughout the pass pipeline. Let me give you an example of why this works and why
The CMO pass is in the middle of the pass pipeline. Let's assume it will decide to serialize
WMO is a pre-requisite for CMO. That's enforced. |
|
I want to confirm that, with or without CMO, any change to the imported module requires recompiling the dependent modules, completely independent from what was actually serialized. So I don't understand how the linkage of a symbol could affect semantics.
This only works under two conditions:
The currently assumed semantics of isPossiblyUsedExternally are:
This has to do with visibility, not linkage! Those semantics are important for all symbols, data or code. Because isPossiblyUsedExternally has semantics that affect compilation (beyond the mechanics of linking symbols), it's result must be determined by the contents of the module, not the decisions made by CMO. In general, semantics can be refined. isPossiblyUsedExternally could start out as true and later be determined to be false. This works because the new semantics are a superset of the old. All assumptions made under the old semantics are still valid under the new semantics. This isn't relvant to CMO, because CMO shouldn't change visibility. I'm just mentioning it for completeness. The point is, we definitely can't "unrefine" semantics. If it's ever allowable for a symbol to be visible externally, then isPossiblyUsedExternally must have always been false. |
Ok. So here's the answer confirmed by @eeckstein . I'm not sure how to make this all clear in the API yet:
Specifically, Public visibility is necessary for correctness only when the externally serialized code contains a reference to that symbol. That reference must have been visible to the module prior to changing linkage. So, the overall program semantics are preserved. My fundamental misunderstanding of visibility semantics was that I had the meaning of "used externally" backward. I thought "external visibility" meant that a symbol could be referenced outside the module and would not be visible when recompiling the defining module. Logically, that is not the case with internal symbols. Rather, the symbol is still semantically restricted to its defining module. However, some of the module's contents have become invisible to the optimizer even though those contents still affect semantics. Those "external contents" will simply happen to be emitted in a separate module. Recompilation of the dependent module will ensure that those "external contents" are updated when any of the defining module's semantics change (even if the definition of those external contents themselves don't change). To be clear: isPossiblyUsedExternally does not mean that external uses exist which may not be recompiled when the defining module changes. isPossiblyUsedExternally must become more conservative when functions are serialized then deleted from the defining module. Otherwise, references to internal symbols may become invisible to the optimizer even though those references still logically exist within the module. |
@eeckstein now on to the next questions... Problem 1: The cross-module optimization pass now disables whole module optimizations. That happens even when WMO has been enabled and CMO has been disabled. There's no option to get back the original WMO behavior. Any WMO optimization that relies on isPossiblyUsedExternally will be broken. So, even though the current PR makes sense, I can't make the change without breaking an optimization that we currently rely on. WMO has been the default for some time now and this is the main technique we recommend for removing things like useless exclusivity checks on internal symbols. Problem 2: Passes, like the one in this PR, AccessEnforcementWMO, check the visibility of VarDecl. Changing the visibility in SILLinkage in a way that contradicts the AST is particularly confusing and dangerous. We need a separate concept for SIL-level external visibility, call it something other than Public, and make it clear that SIL passes can only rely on SIL visibility, not AST visibility. That also avoids the misleading people into thinking SILLinkage Public visibility has the same semantics as AST Public visibility (it took me a long time here to realize they're working under completely different assumptions). AST Public references are visible in a way that will not be discovered when recompiling the defining module. |
To clarify: that's not true in general. Actually CMO requires WMO. I guess you mean that if CMO decides to serialize a function (based on a heuristic), it disables WMO optimizations for the referenced private/internal functions and globals. Yes, that can result in performance regressions. What we can do is to move the CMO and serialization passes later in the pipeline, where all the important optimizations have been done already. We would do that only for CMO builds but not for library-evolution builds.
We can rename the existing SILLinkage and related APIs to make it more clear. And/or add source code comments. |
Right. We could move serialization for "CMO builds" only. I am concerned that serializing later creates a paradox. Serializing after dropping semantic attributes breaks that form of "cross module" optimization. But maybe that's most important for modules that directly inline standard library routines (as opposed to stdlib routines being inlined across two modules). Since the standard library is itself resilient, it will be serialized early, so that might not be a big deal. But what is a "CMO build"? It appears to be any build that is not resilient. The "CMO compiler option" does not actually determine whether CMO is enabled. That's actually the "library evolution" option. That was confusing to me. Short term options for this PR:
I don't think we should ship a release where there's no way to remove exclusivity checks for internal symbols. I think that's a very effective optimization. Manually disabling checks for specific properties is unsafe and doesn't replace that.
If SIL optimizations don't need to make a distinction between public vs. exported by CMO, then we don't need a new SILLinkage mode. It's already disturbingly large and confusing, and I'm not sure how their semantics overlap with their mechanics. I would love to just rename SILLinkage::Public to SILLinkage::Exported and define it as "symbols that are either public, or have been exported via CMO". Then we can explain why it's safe for non-resilient builds to change linkage form Hidden to Exported in the middle of the pipeline. i.e. it does not invalidate any previous compiler assumptions, which may be embedded anywhere in the SIL by that time--this is true precisely because the symbol is never treated as public from the point-of-view of module dependencies which drive recompilation. |
Allow round-tripping access to global variables. Previously, AccessedStorage asserted that global variables were always associated with a VarDecl. This was to ensure that AccessEnforcmentWMO always recognized the global. Failing to recognize access to a global will cause a miscompile. SILGlobalVariable now has all the information needed by SIL. Particularly, the 'isLet' flag. Simply replace VarDecl with SILGlobalVariable in AccessEnforcmentWMO to eliminate the need for the assert.
@eeckstein fixed the issue described above here:
|
@swift-ci test |
Allow round-tripping access to global variables. Previously, AccessedStorage asserted that global variables were always associated with a VarDecl. This was to ensure that AccessEnforcmentWMO always recognized the global. Failing to recognize access to a global will cause a miscompile.
SILGlobalVariable now has all the information needed by SIL. Particularly, the 'isLet' flag. Simply replace VarDecl with SILGlobalVariable in AccessEnforcmentWMO to eliminate the need for the assert.
Note that I didn't create a new class for the DisjointAccessLocationKey because that requires lots of PointerLikeTypeTraits and DenseMap boilerplate. There are only a few simple locally defined functions that operate on the type.