Skip to content

COWArrayOpts: make the optimization work again for two-dimensional arrays. #19714

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Oct 5, 2018

Conversation

eeckstein
Copy link
Contributor

With removing of pinning and with addressors, the pattern matching did not work anymore.
The good thing is that the SIL is now much simpler and we can handle the 2D case without pattern matching at all.
This removes a lot of code from COWArrayOpts.

rdar://problem/43863081

@eeckstein
Copy link
Contributor Author

@swift-ci test

@eeckstein
Copy link
Contributor Author

@swift-ci benchmark

@swift-ci
Copy link
Contributor

swift-ci commented Oct 4, 2018

Build failed
Swift Test Linux Platform
Git Sha - 6671443044b2e70db01190dbd4d0b799d73a324c

@swift-ci
Copy link
Contributor

swift-ci commented Oct 4, 2018

Build comment file:

Performance: -O

TEST OLD NEW DELTA RATIO
Regression
IterateData 1615 1778 +10.1% 0.91x (?)
Improvement
Array2D 12296 6907 -43.8% 1.78x
Walsh 398 352 -11.6% 1.13x

Code size: -O

TEST OLD NEW DELTA RATIO
Improvement
RandomShuffle.o 3856 3521 -8.7% 1.10x
Array2D.o 4320 4000 -7.4% 1.08x
RC4.o 5231 5023 -4.0% 1.04x
Walsh.o 9544 9336 -2.2% 1.02x

Performance: -Osize

TEST OLD NEW DELTA RATIO
Regression
IterateData 1589 1798 +13.2% 0.88x (?)
Improvement
Array2D 11998 6909 -42.4% 1.74x
Walsh 967 884 -8.6% 1.09x

Code size: -Osize

TEST OLD NEW DELTA RATIO
Improvement
RandomShuffle.o 3815 3735 -2.1% 1.02x
Array2D.o 4243 4163 -1.9% 1.02x

Performance: -Onone

TEST OLD NEW DELTA RATIO
Regression
ObjectiveCBridgeStubDateMutation 743 800 +7.7% 0.93x

Code size: Swift libraries

TEST OLD NEW DELTA RATIO
Improvement
libswiftSwiftOnoneSupport.dylib 221184 217088 -1.9% 1.02x
How to read the data The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false alarms. Unexpected regressions which are marked with '(?)' are probably noise. If you see regressions which you cannot explain you can try to run the benchmarks again. If regressions still show up, please consult with the performance team (@eeckstein).

Hardware Overview
  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB

@eeckstein
Copy link
Contributor Author

@swift-ci smoke test

2 similar comments
@eeckstein
Copy link
Contributor Author

@swift-ci smoke test

@eeckstein
Copy link
Contributor Author

@swift-ci smoke test

Copy link
Contributor

@atrick atrick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes look good to me except for the points below. I'd also feel more comfortable if @aschwaighofer takes a look at it too.

makeMutableCalls.push_back(MakeMutableCall);
}

for (ArraySemanticsCall MakeMutableCall : makeMutableCalls) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[style] it makes a little more sense to "pop" the values out of makeMutableCalls so it doesn't contain dangling references. Or at least comment that the loop consumes the array.

The other option that I like, especially when it's not so easy to collect all the input, is to have any transformation that deletes instructions return a valid iterator to the next instruction, possibly pointing to new replacement instructions, or to the end of the block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I needed to do it this way (for good reasons) in an earlier version. But not anymore in this version. I'll change it back to the original code.

}
if (auto *IA = dyn_cast<IndexAddrInst>(V)) {
SILBasicBlock *IndexBlock = IA->getIndex()->getParentBlock();
if (IndexBlock && !DomTree->dominates(IndexBlock, Preheader))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[code clarity] It's not clear why there needs to be a dominance check here. (I assume it has something to do with bounds checking but I'm not sure why).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's because all projections have to be loop invariant. index_addr is different from other projection instructions because it has a second operand which has to be checked separately.
I'll add a comment

Preheader->getTerminator(), DomTree);
/// Return the underlying Array address after stripping off all address
/// projections. Returns an invalid SILValue if the array base does not dominate
/// the loop.
Copy link
Contributor

@atrick atrick Oct 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[question] getArrayAddressBase is only called for the self of a make_mutable call. How could an Array self be produced by any of the projection instructions handled below? Is this helper entirely meant to handle 2d arrays? Is there a test case that exercises getArrayAddressBase?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's especially important for 2d arrays. But it can an appear whenever the array is contained in another struct/class.
There are test cases in cowarray_opt.sil for this.

LLVM_DEBUG(llvm::dbgs() << " Skipping Array: does not dominate loop!\n");
return false;
}

SmallVector<unsigned, 4> AccessPath;
SILValue ArrayContainer = getAccessPath(CurrentArrayAddr, AccessPath);
bool arrayContainerIsUnique = checkUniqueArrayContainer(ArrayContainer);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[follow-up] checkUniqueArrayContainer should work in a lot more places now that we have exclusive access markers. Should we file a bug for that or just add a TODO here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add a todo

bool arrayContainerIsUnique = checkUniqueArrayContainer(ArrayContainer);

StructUseCollector StructUses;
StructUses.collectUses(ArrayContainer, AccessPath);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[design] I don't understand the motivation for moving collectUses early. Now, for large, complicated loops, a lot of work is done up front, and when the array is not unique, we throw the work away. I supposed you could just guard the call to collectUses with a check fro arrayContainerIsUnique.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The motivation was that it is now also needed in case of hasLoopOnlyDestructorSafeArrayOperations() == true.
But you are right, I can move it past the arrayContainerIsUnique check.

This option is very old. The same effect can now be achieved with pass-manager options, like -sil-print-before
…rays.

With removing of pinning and with addressors, the pattern matching did not work anymore.
The good thing is that the SIL is now much simpler and we can handle the 2D case without pattern matching at all.
This removes a lot of code from COWArrayOpts.

rdar://problem/43863081
@eeckstein
Copy link
Contributor Author

@swift-ci smoke test

1 similar comment
@eeckstein
Copy link
Contributor Author

@swift-ci smoke test

Copy link
Contributor

@aschwaighofer aschwaighofer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code changes look good to me.

But there seem to be no 2d array sil tests anymore. They all look like they test one dimensional cases to me?

@atrick
Copy link
Contributor

atrick commented Oct 5, 2018

LGTM. I agree with Arnold that's it's nice to have SIL tests whenever there's a fair amount of extra complexity in the pass to handle that SIL.

@eeckstein
Copy link
Contributor Author

The 2d tests start at 'sil @hoist_array2d'

@eeckstein
Copy link
Contributor Author

They should (more or less) test all the extra logic which I added (which was not much)

@eeckstein
Copy link
Contributor Author

@swift-ci smoke test OS X platform

1 similar comment
@eeckstein
Copy link
Contributor Author

@swift-ci smoke test OS X platform

@eeckstein eeckstein merged commit ab980ff into swiftlang:master Oct 5, 2018
@eeckstein eeckstein deleted the fix-cowopt branch October 5, 2018 17:16
rposts pushed a commit to linux-on-ibm-z/swift that referenced this pull request Oct 5, 2018
COWArrayOpts: make the optimization work again for two-dimensional arrays.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants