This repository was archived by the owner on May 28, 2025. It is now read-only.

Commit 0f14bea

Optimize char_try_from_u32
The optimization was proposed by @falk-hueffner in https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/Micro-optimizing.20char.3A.3Afrom_u32/near/272146171, and I simplified it a bit and added an explanation of why the optimization is correct.
1 parent 73a7423 commit 0f14bea

1 file changed: +14 -1 lines changed

library/core/src/char/convert.rs

Lines changed: 14 additions & 1 deletion
@@ -271,7 +271,20 @@ impl FromStr for char {
 
 #[inline]
 const fn char_try_from_u32(i: u32) -> Result<char, CharTryFromError> {
-    if (i > MAX as u32) || (i >= 0xD800 && i <= 0xDFFF) {
+    // This is an optimized version of the check
+    // (i > MAX as u32) || (i >= 0xD800 && i <= 0xDFFF),
+    // which can also be written as
+    // i >= 0x110000 || (i >= 0xD800 && i < 0xE000).
+    //
+    // The XOR with 0xD800 permutes the ranges such that 0xD800..0xE000 is
+    // mapped to 0x0000..0x0800, while keeping all the high bits outside 0xFFFF the same.
+    // In particular, numbers >= 0x110000 stay in this range.
+    //
+    // Subtracting 0x800 causes 0x0000..0x0800 to wrap, meaning that a single
+    // unsigned comparison against 0x110000 - 0x800 will detect both the wrapped
+    // surrogate range as well as the numbers originally larger than 0x110000.
+    //
+    if (i ^ 0xD800).wrapping_sub(0x800) >= 0x110000 - 0x800 {
         Err(CharTryFromError(()))
     } else {
         // SAFETY: checked that it's a legal unicode value
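
The equivalence argued in the comment can be spot-checked outside of `core`. The following is a minimal standalone sketch, not part of the commit: the function names `is_invalid_naive`, `is_invalid_optimized`, and `first_mismatch` are hypothetical, and `char::MAX` stands in for the `MAX` constant used in `convert.rs`. It compares the removed and added conditions on the boundary values of each range, including `0x11D800`, which XORs to exactly `0x110000` and so lands on the comparison threshold after the subtraction.

```rust
/// The condition from the removed line: out of range or a surrogate.
fn is_invalid_naive(i: u32) -> bool {
    (i > char::MAX as u32) || (i >= 0xD800 && i <= 0xDFFF)
}

/// The condition from the added line: XOR, wrapping subtract, one comparison.
fn is_invalid_optimized(i: u32) -> bool {
    (i ^ 0xD800).wrapping_sub(0x800) >= 0x110000 - 0x800
}

/// Hypothetical helper: first value on which the two conditions disagree.
fn first_mismatch(values: &[u32]) -> Option<u32> {
    values
        .iter()
        .copied()
        .find(|&i| is_invalid_naive(i) != is_invalid_optimized(i))
}

fn main() {
    // Edges of the valid ranges, the surrogate range, and the out-of-range region.
    let boundaries = [
        0, 0x7FF, 0x800, 0xD7FF, 0xD800, 0xDFFF, 0xE000, 0xFFFF,
        0x10000, 0x10FFFF, 0x110000, 0x11D800, u32::MAX,
    ];

    assert_eq!(first_mismatch(&boundaries), None);

    // Cross-check against the standard library's own conversion.
    for &i in &boundaries {
        assert_eq!(is_invalid_optimized(i), char::from_u32(i).is_none());
    }

    println!("both checks agree on all boundary values");
}
```

Looping over `0..=u32::MAX` instead of the boundary list would turn this into an exhaustive check, at the cost of a longer run time.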
