Skip to content

[sycl-post-link] Refactor cloned modules processing #4828

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

mlychkov
Copy link
Contributor

@mlychkov mlychkov commented Oct 27, 2021

ValueToValueMapTy map is not fully release on destruction the data that is
allocated on CloneModule call. Also split modules may utilize tens of megabytes
and their amount may be significant. It all may lead to overflow of RAM memory.

Rewrite module splitting in a way that each split module is processed and
immediately saved into a file without temporary collecting all modules in RAM.
Also try to share one ValueToValueMapTy instance for all cloned modules. It
may reduce memory utilization, but in some cases it also may loose sensitive data.
Add an option "-reduce-memory-usage" that turns on value map sharing mode.
This mode is off by default.

Signed-off-by: Mikhail Lychkov [email protected]

@mlychkov mlychkov marked this pull request as draft October 27, 2021 10:24
@mlychkov
Copy link
Contributor Author

Temporary moved to draft mode to check with users that new solution doesn't break result applications functionality.

@mlychkov mlychkov force-pushed the private/mlychkov/sycl_post_link_1mod_refactoring branch from 012fa96 to 05a4b07 Compare October 27, 2021 10:37
ValueToValueMapTy map is not fully release on destruction the data that is
allocated on CloneModule call. Also split modules may utilize tens of megabytes
and their amount may be significant. It all may lead to overflow of RAM memory.

Rewrite module splitting in a way that each split module is processed and
immediately saved into a file without temporary collecting all modules in RAM.

Signed-off-by: Mikhail Lychkov <[email protected]>
@mlychkov mlychkov force-pushed the private/mlychkov/sycl_post_link_1mod_refactoring branch from 05a4b07 to 5d80d3b Compare October 27, 2021 10:37
Signed-off-by: Mikhail Lychkov <[email protected]>
@mlychkov mlychkov requested a review from sndmitriev October 28, 2021 11:10
std::string ResModuleFile{};
bool CanReuseInputModule = !SyclAndEsimdCode && !IsEsimd &&
!IsLLVMUsedRemoved && !SpecConstsMet &&
(!DoSplit || GlobalsSet.size() <= 1);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last condition is equal to ResultModules.size() == 1 in previous implementation.
I'd like just to clarify if this is correct to reuse input module when split is requested and GlobalsSet.size() == 1. This means that split scope is global, or split scope is per_source and all entry points are in 1 source file, or split is per_kernel and there is only 1 entry point. Anyway in such a situation CloneModule is called and some code optimizations are performed during splitting.
May we reuse input module in this case or this case should be excluded from check?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May we reuse input module in this case or this case should be excluded from check?

I think it is ok to reuse input module in this case, because absolutely the same situation as one of you described here can happen with the current implementation as well - and we will reuse input module in that case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then there are 2 concerns:

  1. If we can reuse input module when splitting is used and GlobalsSet.size() == 1 then splitting is useless because the result is not saved. We might save resources and just use input module instead of split one.
  2. The results when we can reuse input module and when for example llvm.used or spec const is added into the same input module will differ a lot because in the first case we will split module and just drop the result reusing input module, but in second case we will split module, optimize it and save into a file.

Comment on lines 504 to 521
// This map is shared between calls to fix memory leaks after its deletion and
// reduce number of allocations and deallocations of memory for Module objects
// during llvm::CloneModule call.
static ValueToValueMapTy VMap;

// During cloning of Module ValueMap tries to reuse already created old to new
// Value mappings and asserts if a new Value pointer is nullptr. It may happen
// after one cloned module is destroyed and we try to reuse ValueToValueMapTy
// object for the next module to clone.
// Proper solution is to improve CloneModule to support multiple clonings from
// one source Module.
// Current workaround is to drop all records that contain nullptr values.
for (auto It = VMap.begin(), End = VMap.end(); It != End;) {
if (It->second)
++It;
else
VMap.erase(It++);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still looks like a hack/workaround to me. I don't want to block the commit with this, but it would be great if we find root cause of ValueToValueMapTy not being deallocated properly to check if there is a more elegant solution to the problem

@mlychkov
Copy link
Contributor Author

mlychkov commented Nov 1, 2021

/summary:run

@mlychkov
Copy link
Contributor Author

mlychkov commented Nov 3, 2021

/summary:run

@mlychkov
Copy link
Contributor Author

/summary:run

@mlychkov
Copy link
Contributor Author

mlychkov commented Nov 17, 2021

Looks like test failure is a flaky known issue with llvm_test_suite_sycl/reduction_placeholder test abort after pass on L0 GPU target.

@kbobrovs
Copy link
Contributor

Also try to share one ValueToValueMapTy instance for all cloned modules. It
may reduce memory utilization, but in some cases it also may loose sensitive data.
Add an option "-reduce-memory-usage" that turns on value map sharing mode.
This mode is off by default.

Does loosing sensitive data lead to incorrect results/behavior in some cases? If so I don't think we should add this option.
Is it possible to automatically see if it is safe and apply the memory reduction only in that case (w/o option)?

using TableFiles = std::map<StringRef, string_vector>;

TableFiles processOneModule(std::unique_ptr<Module> M, bool IsEsimd,
bool SyclAndEsimdCode) {
struct SplitModeInfo {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is quite hard to reverse engineer the overall algorithm/design. Can it be discussed online?

How much memory saving ReduceMemUsage brings?

@mlychkov
Copy link
Contributor Author

The patch is reorganized in a set of newer PRs to simplify review process.

@mlychkov mlychkov closed this Nov 24, 2021
@mlychkov mlychkov deleted the private/mlychkov/sycl_post_link_1mod_refactoring branch November 24, 2021 13:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants