Skip to content

[SYCL][L0] Fix the Pi Platform Cache memory leak #2881

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

bso-intel
Copy link
Contributor

Previously we created piPlatformCache from the heap and never deallocated it because this cache should persist until the program ends.
This caused memory leak.
This PR fix this memory leak by turning it into a static vector which will be automatically deallocated when the user program finishes execution.

@@ -811,11 +811,11 @@ pi_result piPlatformsGet(pi_uint32 NumEntries, pi_platform *Platforms,
// runtime from a global destructor, and such a call could eventually
// access these variables. Therefore, there is no safe time when
// "PiPlatformsCache" and "PiPlatformsCacheMutex" could be deleted.
static auto PiPlatformsCache = new std::vector<pi_platform>;
static auto PiPlatformsCacheMutex = new std::mutex;
static std::vector<pi_platform> PiPlatformsCache;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Read the comment above why we used "new". To avoid the leak we need to implement proper tear-down process, similar to how SYCL RT does it now: #2516

Copy link
Contributor Author

@bso-intel bso-intel Dec 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I think I misunderstood you.
This PR is only fixing the memory leak for the PiPlatformCache and PiPlatformCacheMutex objects.
PiPlatformCache is a vector of pi_platform pointers which are allocated on the heap.
This PR is not intended to fix the leak of pi_platforms that are allocated somewhere else.
If there is an API called piPlatformRelease, I can easily implement a deallocation.
But, there is no such PI API, so SYCL RT never informs to plugins that platform_impl objects are deallocated so the underlying pi_platform can be deallocated, too.
Anyway, I am still working on the memory leak of pi_platform objects, which is not intended to fix in this PR.
This PR is intended only to fix the memory leak of vector itself.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please read lines 809-813 to see why it is not OK to use static globals for the cache (non-trivial data object): the order of static objects destruction is undefined.

Copy link
Contributor Author

@bso-intel bso-intel Dec 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. then I did not misunderstand you.
Let me explain what is going on with this static variable.
First of all PiPlatformsCache is already a "static" variable.
static auto PiPlatformsCache = new std::vector<pi_platform>;
Secondly, this is not a global variable. It is a local static variable. It seems like misunderstanding.
This "static" pointer variable points to the vector that is allocated on the heap.
As you may know, the heap allocated variable is not automatically deallocated unless we call "delete" for this pointer variable, right?
My PR changes this static pointer variable to a static local variable.
The static local variable persists in the static memory region (which is not on stack or heap) even after the function returns, but it will be automatically deallocated when the whole program process terminates.
This way the static local variable is deallocated when the program terminates.
As I said, this PR is not intended to deallocate the memory allocated for the pi_platform vector itself.
Hope this makes sense.

Copy link
Contributor Author

@bso-intel bso-intel Dec 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me help you understand how my PR fixes the leak in PiPlatformsCache with an example.
Here is a simple code that shows the memory leaks for the local static variable.

#include <vector>

int main() {
  int *a = new int;    // leak
  int *b = new int;    // leak
  int *c = new int;    // leak
  static auto PiPlatformsCache = new std::vector<int*>;   // leak
  PiPlatformsCache->push_back(a);
  PiPlatformsCache->push_back(b);
  PiPlatformsCache->push_back(c);
  return 0;
}

This code allocates 3 integers, a, b, c on the heap, and then create a new vector, vec, on the heap.
Since there was no “delete” for these heap variables, all those heap memory are leaked.

==26717== HEAP SUMMARY:
==26717==     in use at exit: 68 bytes in 5 blocks
==26717==   total heap usage: 8 allocs, 3 frees, 72,796 bytes allocated
==26717== LEAK SUMMARY:
==26717==    definitely lost: 0 bytes in 0 blocks
==26717==    indirectly lost: 0 bytes in 0 blocks
==26717==      possibly lost: 0 bytes in 0 blocks
==26717==    still reachable: 68 bytes in 5 blocks
==26717==         suppressed: 0 bytes in 0 blocks

Now I changed the variable ‘PiPlatformsCache’ to a local static vector.

#include <vector>

int main() {
  int *a = new int;    // leak
  int *b = new int;    // leak
  int *c = new int;     // leak
  static std::vector<int*> PiPlatformsCache;    // NO leak
  PiPlatformsCache.push_back(a);
  PiPlatformsCache.push_back(b);
  PiPlatformsCache.push_back(c);
  return 0;
}

Now, valgrind only detects memory leak for a, b, c.

==26709== HEAP SUMMARY:
==26709==     in use at exit: 12 bytes in 3 blocks
==26709==   total heap usage: 7 allocs, 4 frees, 72,772 bytes allocated
==26709==
==26709== 4 bytes in 1 blocks are definitely lost in loss record 1 of 3
==26709==    at 0x4C305B3: operator new(unsigned long) (vg_replace_malloc.c:342)
==26709==    by 0x10923A: main (simple2.cpp:4)
==26709==
==26709== 4 bytes in 1 blocks are definitely lost in loss record 2 of 3
==26709==    at 0x4C305B3: operator new(unsigned long) (vg_replace_malloc.c:342)
==26709==    by 0x109248: main (simple2.cpp:5)
==26709==
==26709== 4 bytes in 1 blocks are definitely lost in loss record 3 of 3
==26709==    at 0x4C305B3: operator new(unsigned long) (vg_replace_malloc.c:342)
==26709==    by 0x109256: main (simple2.cpp:6)
==26709==
==26709== LEAK SUMMARY:
==26709==    definitely lost: 12 bytes in 3 blocks
==26709==    indirectly lost: 0 bytes in 0 blocks
==26709==      possibly lost: 0 bytes in 0 blocks
==26709==    still reachable: 0 bytes in 0 blocks
==26709==         suppressed: 0 bytes in 0 blocks

Does it makes sense to you now?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The static local variable persists in the static memory region (which is not on stack or heap) even after the function returns, but it will be automatically deallocated when the whole program process terminates.
This way the static local variable is deallocated when the program terminates.

That's the problem. The order of destruction of such static object (at program termination) is undefined. There might be other globals that depend on this one, but are deallocated later than this one. Allocating with new "resolves" this ordering problem by introducing an informed leak.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the problem. The order of destruction of such static object (at program termination) is undefined.

Could you tell me where you got this info from?
As far as I know, the static variables have a well-defined construction and destruction order.
The destruction order of static variables is the reverse of the construction order of them.
And, all global variables are constructed before main() starts, whereas static variables are constructed only when the specific scope is reached (i.e., the defining function is called).
I know "global" objects have undefined destruction order, but static variables are not global variables.

@bso-intel
Copy link
Contributor Author

I am redoing the fix in #2942. So abandon this PR.

@bso-intel bso-intel closed this Dec 23, 2020
@bso-intel bso-intel deleted the platform-leak branch June 15, 2022 17:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants