Skip to content

[sycl-post-link][NFC] Small cleanups and cosmetic changes #4986

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

mlychkov
Copy link
Contributor

@mlychkov mlychkov commented Nov 18, 2021

This is first part of major tool refactoring.
I created a separate review for it to simplify review of the whole change set.

Signed-off-by: Mikhail Lychkov [email protected]

@mlychkov mlychkov changed the title [sycl-post-link][NFC] Small cleanups and cosmetic changes [sycl-post-link] Refactor modules splitting sequence Nov 18, 2021
@mlychkov mlychkov changed the title [sycl-post-link] Refactor modules splitting sequence [sycl-post-link][NFC] Small cleanups and cosmetic changes Nov 18, 2021
@mlychkov mlychkov marked this pull request as ready for review November 18, 2021 14:52
@mlychkov
Copy link
Contributor Author

This PR is a part of changes reordering in PR #4828.

Copy link
Contributor

@kbobrovs kbobrovs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To my taste, std::vector<const Function *> is more descriptive than FuncPtrVector, yet short enough. The same applies to ModuleUPtr and StringRefVector, couple others.
But my preference is not strong enough to hold this.

kbobrovs
kbobrovs previously approved these changes Nov 20, 2021
Copy link
Contributor

@kbobrovs kbobrovs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. LGTM.

I should've mentioned that renaming is OK in case it specializes the meaning of the container for some specific purpose. While std::vector<const Function *> -> FuncPtrVector does not add clarity in my subjective view, e.g. std::vector<const Function *> -> EntryPointGroup would.

As a side note, map is generally slower than unordered_map. Besides, I don't even see any real use of the map which would prevent replacing it by a vector of <Name, EntryPointGroup> pairs and avoiding log(N) insertion/lookup penalty. You mentioned that there are thousands of entry points, so since you are refactoring anyway, this could be another 2 cents (separate PR OK).

@kbobrovs
Copy link
Contributor

Besides, I don't even see any real use of the map which would prevent replacing it by a vector of <Name, EntryPointGroup> pairs and avoiding log(N) insertion/lookup penalty.

I now do see it :) This is entry point grouping by source ID or kernel name. So my note about using vector should be disregarded. map vs unordered_map can still be considered.

Copy link
Contributor

@kbobrovs kbobrovs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just found a memory leak:
auto *SDLReqMaskLegacyPass = new SYCLDeviceLibReqMaskPass();

@mlychkov, I actually went ahead and implemented the refactoring we talked about, which is needed to enable single-module SYCL and ESIMD mix (ESIMD part needs to undergo different set of optimizations today, which is TODO for future), yet greatly simplifies the logic. I've fixed the leak as well.
Using your suggestion, I made Module splitting sequential not to keep all modules in memory. I will push the PR soon, we can discuss and converge on next steps on Monday.

@kbobrovs
Copy link
Contributor

Just found a memory leak:
auto *SDLReqMaskLegacyPass = new SYCLDeviceLibReqMaskPass();

I take it back, pass manager takes care of deallocation

@mlychkov
Copy link
Contributor Author

I now do see it :) This is entry point grouping by source ID or kernel name. So my note about using vector should be disregarded. map vs unordered_map can still be considered.

Thank you, it is a good catch for improvement. However, I agree that it would be better to do in a separate PR, because it may change order of modules splitting and so it may require to update sequence of checks in some LIT tests. I'll leave a TODO comment about this improvement.

@kbobrovs
Copy link
Contributor

@mlychkov, here is the PR which I mentioned - #5001. Will try to reach you tomorrow to discuss.

@mlychkov
Copy link
Contributor Author

mlychkov commented Nov 23, 2021

I now do see it :) This is entry point grouping by source ID or kernel name. So my note about using vector should be disregarded. map vs unordered_map can still be considered.

Thank you, it is a good catch for improvement. However, I agree that it would be better to do in a separate PR, because it may change order of modules splitting and so it may require to update sequence of checks in some LIT tests. I'll leave a TODO comment about this improvement.

I've just realized that there is no guarantee for std::unordered_map that the order of elements is the same between application executions. At least on different platforms. Output ordering doesn't matter in practical use, but may require to rewrite tests so that they do not depend on sycl-post-link tool output order. I think that this improvement hardly will accelerate execution significantly (vast majority of time is spent on input IR parsing, splitting with processing and saving IR into a file).
Thus, I think that container for these data should not be changed.

@bader bader merged commit 73ec27f into intel:sycl Nov 24, 2021
@kbobrovs
Copy link
Contributor

I now do see it :) This is entry point grouping by source ID or kernel name. So my note about using vector should be disregarded. map vs unordered_map can still be considered.

Thank you, it is a good catch for improvement. However, I agree that it would be better to do in a separate PR, because it may change order of modules splitting and so it may require to update sequence of checks in some LIT tests. I'll leave a TODO comment about this improvement.

I've just realized that there is no guarantee for std::unordered_map that the order of elements is the same between application executions. At least on different platforms. Output ordering doesn't matter in practical use, but may require to rewrite tests so that they do not depend on sycl-post-link tool output order. I think that this improvement hardly will accelerate execution significantly (vast majority of time is spent on input IR parsing, splitting with processing and saving IR into a file). Thus, I think that container for these data should not be changed.

I see, makes sense. Maybe a comment that map is used for execution determinism would make sense.

@mlychkov mlychkov deleted the private/mlychkov/sycl_post_link_1mod_refactoring_v2 branch November 24, 2021 06:51
@mlychkov
Copy link
Contributor Author

mlychkov commented Nov 24, 2021

I now do see it :) This is entry point grouping by source ID or kernel name. So my note about using vector should be disregarded. map vs unordered_map can still be considered.

Thank you, it is a good catch for improvement. However, I agree that it would be better to do in a separate PR, because it may change order of modules splitting and so it may require to update sequence of checks in some LIT tests. I'll leave a TODO comment about this improvement.

I've just realized that there is no guarantee for std::unordered_map that the order of elements is the same between application executions. At least on different platforms. Output ordering doesn't matter in practical use, but may require to rewrite tests so that they do not depend on sycl-post-link tool output order. I think that this improvement hardly will accelerate execution significantly (vast majority of time is spent on input IR parsing, splitting with processing and saving IR into a file). Thus, I think that container for these data should not be changed.

I see, makes sense. Maybe a comment that map is used for execution determinism would make sense.

On the other hand, I think that key value in this map is required only on grouping step only.
We might group entry points in map and at the end just move values into std::vector<EntryPointGroup>.
Using vector container may also help in parallelization of splitting input module.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants