Skip to content

[libc][stdlib] Remove LIBC_THREAD_LOCAL from rand_next #96245

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

PiJoules
Copy link
Contributor

srand and rand do not need to be threadsafe, so it should be permissible to remove thise from rand_next whose only users are srand and rand. This is helpful for the baremetal runtimes where we still need to setup the TLS machinery.

@llvmbot
Copy link
Member

llvmbot commented Jun 20, 2024

@llvm/pr-subscribers-libc

Author: None (PiJoules)

Changes

srand and rand do not need to be threadsafe, so it should be permissible to remove thise from rand_next whose only users are srand and rand. This is helpful for the baremetal runtimes where we still need to setup the TLS machinery.


Full diff: https://github.com/llvm/llvm-project/pull/96245.diff

2 Files Affected:

  • (modified) libc/src/stdlib/rand_util.cpp (+1-1)
  • (modified) libc/src/stdlib/rand_util.h (+1-1)
diff --git a/libc/src/stdlib/rand_util.cpp b/libc/src/stdlib/rand_util.cpp
index 1f3dbce9c7860..4aa03bfe3eba9 100644
--- a/libc/src/stdlib/rand_util.cpp
+++ b/libc/src/stdlib/rand_util.cpp
@@ -18,7 +18,7 @@ ThreadLocal<unsigned long> rand_next;
 #else
 // C standard 7.10p2: If 'rand' is called before 'srand' it is to proceed as if
 // the 'srand' function was called with a value of '1'.
-LIBC_THREAD_LOCAL unsigned long rand_next = 1;
+unsigned long rand_next = 1;
 #endif
 
 } // namespace LIBC_NAMESPACE
diff --git a/libc/src/stdlib/rand_util.h b/libc/src/stdlib/rand_util.h
index cadd6b5cdcbb8..b32302f486dce 100644
--- a/libc/src/stdlib/rand_util.h
+++ b/libc/src/stdlib/rand_util.h
@@ -34,7 +34,7 @@ template <typename T> class ThreadLocal {
 
 extern ThreadLocal<unsigned long> rand_next;
 #else
-extern LIBC_THREAD_LOCAL unsigned long rand_next;
+extern unsigned long rand_next;
 #endif
 
 } // namespace LIBC_NAMESPACE

@nickdesaulniers nickdesaulniers requested a review from jhuber6 June 20, 2024 22:46
Copy link
Member

@nickdesaulniers nickdesaulniers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like you can change this for GPU, too?

@jhuber6
Copy link
Contributor

jhuber6 commented Jun 20, 2024

You know, looking into this it's baffling that POSIX made rand_r to be reentrant but made it take an unsigned *r instead of some opaque type defined by the implementation, so it's effectively unusable.

For the GPU, it's unlikely people would be able to use this seriously if it's not thread safe, but the thread local hack isn't really serviceable either. It's prefer a functional reentrant version above anything else. I thought LIBC_THREAD_LOCAL was just empty on platforms that didn't support it (baremetal) so this is a no-op?

@petrhosek petrhosek requested a review from frobtech June 20, 2024 22:53
@nickdesaulniers
Copy link
Member

I thought LIBC_THREAD_LOCAL was just empty on platforms that didn't support it (baremetal) so this is a no-op?

This code is also the only user of ThreadLocal, so we could probably delete src/stdlib/rand_util.h if we remove ThreadLocal.

@jhuber6
Copy link
Contributor

jhuber6 commented Jun 20, 2024

I thought LIBC_THREAD_LOCAL was just empty on platforms that didn't support it (baremetal) so this is a no-op?

This code is also the only user of ThreadLocal, so we could probably delete src/stdlib/rand_util.h if we remove ThreadLocal.

I thought errno used it?

@michaelrj-google
Copy link
Contributor

thread local is used in a couple places: errno, strerror, and strsignal

@nickdesaulniers
Copy link
Member

nickdesaulniers commented Jun 20, 2024

I'm talking about the class ThreadLocal we have in src/stdlib/rand_util.h, of which rand is the only user. I think that can get deleted, too, in this PR.

@SchrodingerZhu
Copy link
Contributor

Thread-safety

All interfaces defined by this specification will be thread-safe, except that the following interfaces need not be thread-safe:
POSIX Interfaces

asctime()
ctime()
getc_unlocked()
getchar_unlocked()
getgrgid()
getgrnam()
getlogin()
getopt()
getpwnam()
getpwuid()
gmtime()
localtime()
putc_unlocked()
putchar_unlocked()
rand()
readdir()
strtok()
ttyname()

X/Open Interfaces

basename()
catgets()
dbm_clearerr()
dbm_close()
dbm_delete()
dbm_error()
dbm_fetch()
dbm_firstkey()
dbm_nextkey()
dbm_open()
dbm_store()
dirname()
drand48()
ecvt()
encrypt()
endgrent()
endpwent()
endutxent()
fcvt()
gamma()
gcvt()
getdate()
getenv()
getgrent()
getpwent()
getutxent()
getutxid()
getutxline()
getw()
l64a()
lgamma()
lrand48()
mrand48()
nl_langinfo()
ptsname()
putenv()
pututxline()
setgrent()
setkey()
setpwent()
setutxent()
strerror()

The interfaces ctermid() and tmpnam() need not be thread-safe if passed a NULL argument.

The interfaces in the Legacy Feature Group need not be thread-safe.

Implementations will provide internal synchronisation as necessary in order to satisfy this requirement.

@SchrodingerZhu
Copy link
Contributor

SchrodingerZhu commented Jun 20, 2024

I am curious about what MT-Safe means exactly here.

For example, this function https://android.googlesource.com/platform/bionic/+/android-4.2.2_r1/libc/bionic/tdestroy.c is clearly not to be called concurrently on the same tree, while POSIX and this manpage claims it is MT-safe. I assume it is safe in the sense that it can be invoked concurrently on different trees in the same time, but there is no clarification on this.

srand and rand do not need to be threadsafe, so it should be permissible
to remove thise from rand_next whose only users are srand and rand. This
is helpful for the baremetal runtimes where we still need to setup the
TLS machinery.
@PiJoules PiJoules force-pushed the remove-thread_local-from-rand_next branch from 0aef51f to 97dda8d Compare June 25, 2024 18:48
@PiJoules
Copy link
Contributor Author

I'm talking about the class ThreadLocal we have in src/stdlib/rand_util.h, of which rand is the only user. I think that can get deleted, too, in this PR.

Done

@PiJoules PiJoules requested a review from nickdesaulniers June 25, 2024 18:49
@jhuber6
Copy link
Contributor

jhuber6 commented Jun 25, 2024

Looking at some other implementations, it seems like ISO C states that the random seed is per-process, but that multiple threads should be not able to interfere with it? I was looking at glibc and this seems to be the case https://github.com/bminor/glibc/blob/master/stdlib/random.c#L287.

So, this to me seems like we'd need to implement some kind of RMW atomic on the underlying state, not just get rid of the thread_local.

Honestly, I might try to write it this way since it would solve some issues on the GPU and it's easier than implementing THREAD_LOCAL. After that the only thread local thing is errno, but I'm just going to state that errno doesn't exist on the GPU.

@nickdesaulniers
Copy link
Member

Looking at some other implementations, it seems like ISO C states that the random seed is per-process, but that multiple threads should be not able to interfere with it? I was looking at glibc and this seems to be the case https://github.com/bminor/glibc/blob/master/stdlib/random.c#L287.

So, this to me seems like we'd need to implement some kind of RMW atomic on the underlying state, not just get rid of the thread_local.

FWIW, Bionic uses NetBSD's srandom, which uses a mutex_t to guard access to the random seed.
https://android.googlesource.com/platform/bionic/+/refs/heads/main/libc/upstream-netbsd/common/lib/libc/stdlib/random.c#304

@jhuber6
Copy link
Contributor

jhuber6 commented Jun 25, 2024

I feel like updating this with a CAS loop is a good middle-ground #96692. This also lets the GPU implementation actually work, since previously I couldn't statically initialize it to 1. Though, it's not like CAS loops are cheap, but I didn't want to use actual locks since they're a PITA to make work on the GPU (see the RPC implementation).

@frobtech
Copy link
Contributor

I feel like updating this with a CAS loop is a good middle-ground #96692. This also lets the GPU implementation actually work, since previously I couldn't statically initialize it to 1. Though, it's not like CAS loops are cheap, but I didn't want to use actual locks since they're a PITA to make work on the GPU (see the RPC implementation).

I concur that the single atomic is the best approach for this case.

@jhuber6
Copy link
Contributor

jhuber6 commented Jun 27, 2024

Thanks for bringing this to attention, I think we ended up with a good solution.

@jhuber6 jhuber6 closed this Jun 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants