[libc++] Fix bitset conversion functions and refactor constructor #121348

winner245 · 2024-12-30T17:21:36Z

This patch addresses several implementation issues in bitset's conversion functions to_ullong and to_ulong, and refactors its converting constructor __bitset(unsigned long long __v) to a more generic and elegant implementation.

to_ullong:

llvm-project/libcxx/include/bitset

Lines 384 to 385 in b7637a8

    
    for (size_t __i = 1; __i < sizeof(unsigned long long) / sizeof(__storage_type); ++__i)
      __r |= static_cast<unsigned long long>(__first_[__i]) << (sizeof(__storage_type) * CHAR_BIT);

The existing implementation incorrectly concatenates multiple words using a fixed shifting amount __bits_per_word inside a loop. The correct shifting amount should be __i * __bits_per_word, which depends on the loop index __i. This is now correctly implemented in this patch.
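To make the shifting issue concrete, here is a minimal standalone sketch, not the libc++ code itself. The names `word_t`, `result_t`, `concat_fixed_shift`, and `concat_indexed_shift` are hypothetical, and a 16-bit word is assumed purely so that the multi-word case is visible on any host.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins: a 16-bit storage word and a 64-bit result, so
// that one result spans four words regardless of the host platform.
using word_t   = std::uint16_t;
using result_t = std::uint64_t;

constexpr std::size_t bits_per_word    = sizeof(word_t) * CHAR_BIT;
constexpr std::size_t words_per_result = sizeof(result_t) / sizeof(word_t);

// Fixed shift: every word after the first is shifted by the same amount,
// so words 1, 2 and 3 are OR-ed on top of each other.
constexpr result_t concat_fixed_shift(const word_t* words) {
  result_t r = words[0];
  for (std::size_t i = 1; i < words_per_result; ++i)
    r |= static_cast<result_t>(words[i]) << bits_per_word;
  return r;
}

// Indexed shift: word i is shifted by i * bits_per_word, so each word
// lands in its own slot of the result.
constexpr result_t concat_indexed_shift(const word_t* words) {
  result_t r = words[0];
  for (std::size_t i = 1; i < words_per_result; ++i)
    r |= static_cast<result_t>(words[i]) << (i * bits_per_word);
  return r;
}
```

With the words `{0x1111, 0x2222, 0x3333, 0x4444}`, the indexed variant yields `0x4444333322221111`, while the fixed-shift variant collapses the upper three words into the same 16-bit slot.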

Additionally, the previous implementation relied on tag-dispatching, which became difficult to follow due to the combination of multiple tags. This patch improves readability by replacing tag-dispatching with constexpr if, placing conditions in-line for a more straightforward and maintainable approach.
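As a rough illustration of the readability argument, the sketch below contrasts the two styles on a toy decision. It is not taken from the patch: `layout`, `classify_impl`, `classify_tagged`, and `classify_inline` are hypothetical names, and the sketch assumes a C++17 compiler for `if constexpr`.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

enum class layout { single_word, multi_word };

// Tag-dispatching style: the condition is computed at the call site and
// the actual behavior lives in separate overloads selected by the tag.
inline layout classify_impl(std::true_type) { return layout::single_word; }
inline layout classify_impl(std::false_type) { return layout::multi_word; }

template <std::size_t ResultBytes, std::size_t WordBytes>
layout classify_tagged() {
  return classify_impl(std::integral_constant<bool, (ResultBytes <= WordBytes)>());
}

// if constexpr style: the condition and both branches sit in one place,
// and only the selected branch is instantiated.
template <std::size_t ResultBytes, std::size_t WordBytes>
layout classify_inline() {
  if constexpr (ResultBytes <= WordBytes)
    return layout::single_word;
  else
    return layout::multi_word;
}
```

Both functions give the same answer; the difference is only that the second keeps the condition next to the code it guards, which matters more once two tags are combined.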

to_ulong:

llvm-project/libcxx/include/bitset

Lines 527 to 530 in b7637a8

    
    template <size_t _Size>
    inline _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX23 unsigned long long __bitset<1, _Size>::to_ullong() const {
      return __first_;
    }

The to_ulong function in the one-word specialization __bitset<1, _Size> is not standard-conforming: it unconditionally returns __first_, whereas the standard requires it to throw std::overflow_error if the value cannot be represented in unsigned long, which can happen when sizeof(size_t) > sizeof(unsigned long). On LLP64 platforms (e.g., Windows and MinGW), size_t is 64 bits wide and unsigned long is 32 bits wide (see https://en.wikipedia.org/wiki/64-bit_computing#64-bit_data_models), so in this case we should throw std::overflow_error. This is now fixed in this PR.
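The required check can be sketched on any host by simulating the LLP64 widths with fixed-size types. This is only an illustration under that assumption: `word_t`, `narrow_t`, and `to_narrow_checked` are hypothetical names, not libc++ internals.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical stand-ins for an LLP64 target: a 64-bit storage word
// (playing the role of size_t) and a 32-bit result (unsigned long).
using word_t   = std::uint64_t;
using narrow_t = std::uint32_t;

// Returns the word as the narrower type, or throws if any bit above the
// width of narrow_t is set.
inline narrow_t to_narrow_checked(word_t word) {
  if (word > std::numeric_limits<narrow_t>::max())
    throw std::overflow_error("value does not fit in the result type");
  return static_cast<narrow_t>(word);
}
```

A value such as `0xFFFFFFFF` converts cleanly, while `0x100000000` has a bit set beyond the 32-bit result and triggers the exception.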

bitset(unsigned long long v):

The converting constructor is now refactored using a variadic template and an internal implementation of integer_sequence (to support C++03). This approach eliminates conditional preprocessing statements, improving readability and maintainability.
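The general idea of initializing all words from one pack expansion can be sketched as follows. This is an approximation under stated assumptions: it uses `std::index_sequence` from C++14 instead of the internal C++03-compatible sequence mentioned above, a 16-bit `word_t` chosen for illustration, and the hypothetical helpers `split_impl` and `split_words`.

```cpp
#include <array>
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>
#include <utility>

using word_t = std::uint16_t; // hypothetical narrow storage word

constexpr std::size_t bits_per_word   = sizeof(word_t) * CHAR_BIT;
constexpr std::size_t words_per_value = sizeof(unsigned long long) / sizeof(word_t);

// Expands to { word 0 of v, word 1 of v, ... }: one shift per index in
// the pack, with no preprocessor conditionals on the word count.
template <std::size_t... I>
constexpr std::array<word_t, sizeof...(I)> split_impl(unsigned long long v, std::index_sequence<I...>) {
  return {{static_cast<word_t>(v >> (I * bits_per_word))...}};
}

// Splits v into as many words as an unsigned long long can hold.
constexpr std::array<word_t, words_per_value> split_words(unsigned long long v) {
  return split_impl(v, std::make_index_sequence<words_per_value>());
}
```

Because the number of initializers follows from the index sequence, the same code covers any ratio between `unsigned long long` and the word type.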

frederick-vs-ja

Ah. I think the change is actually "Fix possible out of range access in bitset::to_ullong implementation". The current title and PR description are not descriptive enough to me.

The intended change looks good to me. But I guess we need far more changes to properly support 16-bit platforms. Has this been discussed before?

libcxx/include/bitset

github-actions · 2025-01-02T22:22:42Z

✅ With the latest revision this PR passed the C/C++ code formatter.

winner245 · 2025-06-11T15:57:48Z

This patch does not only address the sizeof(size_t) == 2 case. It addresses all cases where the ratio sizeof(unsigned long long) / sizeof(size_t) != 1, and it fixes several issues in the previous implementation. The previous implementation of to_ullong, for example, already considered the cases where sizeof(unsigned long long) / sizeof(size_t) > 1, which include sizeof(size_t) == 2 as a special case. However, it is incorrect in that it concatenates the words with a fixed shifting amount __bits_per_word inside a for loop, where the correct shifting amount is __i * __bits_per_word. Also, the previous implementation of __bitset<1, _Size>::to_ulong is incorrect for LLP64, where size_t is 64 bits wide and unsigned long is 32 bits wide: there we should throw std::overflow_error.

This patch fixes the previous implementation for the different scenarios with sizeof(unsigned long long) / sizeof(size_t) != 1. It also refactors the implementation to be more readable. Previously, the implementation was based on tag-dispatching, which is very misleading given that we have more than one tag in to_ullong. For example, it's really hard to understand what the different combinations of the tags to_ullong(false_type | true_type, false_type | true_type) mean, and we have to jump around to figure it out. This patch makes it easier to read by using constexpr if with the conditions placed in-line.

philnik777 · 2025-06-11T16:06:13Z

@winner245 You're still adding branches here which are effectively dead code AFAICT. I'm not against fixing bugs, but I am against introducing effectively dead code. Given that your commit message starts with that case, it seems like it is the most important part of the patch. If it's not, I'd rip these parts out to make the actual intention clear.

ldionne · 2025-06-11T16:26:34Z

@winner245 You're still adding branches here which are effectively dead code AFAICT. I'm not against fixing bugs, but I am against introducing effectively dead code. Given that your commit message starts with that case, it seems like it is the most important part of the patch. If it's not, I'd rip these parts out to make the actual intention clear.

I think I'm not following fully here. Looking at the code after the patch, I don't see what part of it is dead code. We do have platforms where size_t is 32 bits, which means the ratio between unsigned long long (64 bits) and size_t is 2. It is true that we don't officially support any platform where size_t is 16 bits, but I don't think this patch introduces any specific code to handle this. It just handles all sizes generically in a correct way, and this happens to also work for a 16 bit size_t. Perhaps the commit message needs to be adjusted to explain the motivation more clearly?

If I missed something and there is indeed dead code being introduced, then I agree that we shouldn't be checking in dead code -- but I'm not seeing anything dead in the latest version of the patch.

philnik777

If I missed something and there is indeed dead code being introduced, then I agree that we shouldn't be checking in dead code -- but I'm not seeing anything dead in the latest version of the patch.

I've left a comment where I think we're introducing dead code.

Edit:

Perhaps the commit message needs to be adjusted to explain the motivation more clearly?

I think so. Starting with something that merely happens to work now and explaining it at length doesn't exactly scream "falls out of this patch" but rather "this is the main goal of this patch".

philnik777 · 2025-06-11T16:36:03Z

libcxx/include/bitset

+  _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX23 unsigned long to_ulong() const { return 0UL; }
+  _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX23 unsigned long long to_ullong() const { return 0ULL; }


Unrelated change?

philnik777 · 2025-06-11T16:37:00Z

libcxx/test/std/utilities/template.bitset/bitset.members/to_ullong.pass.cpp

+// UNSUPPORTED: no-exceptions
+


We should only guard the cases where exceptions are required. Otherwise we lose all coverage for to_ullong with exceptions disabled.

Thanks! I agree we should only guard the test cases which actually check exceptions. I am now using #ifndef TEST_HAS_NO_EXCEPTIONS around the test cases.
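A minimal sketch of that guarding pattern follows. In the libc++ test suite TEST_HAS_NO_EXCEPTIONS comes from the test support headers; here it is simply left undefined so the guarded part is compiled, and `to_ullong_checks_pass` is a hypothetical helper rather than the actual test.

```cpp
#include <bitset>
#include <cassert>
#include <climits>
#include <stdexcept>

inline bool to_ullong_checks_pass() {
  // This part needs no exceptions, so it stays outside the guard and
  // keeps running when exceptions are disabled.
  std::bitset<8> small(0x2A);
  if (small.to_ullong() != 0x2AULL)
    return false;

#ifndef TEST_HAS_NO_EXCEPTIONS
  // Only the overflow case depends on exceptions: a bitset one bit wider
  // than unsigned long long, with every bit set, cannot be converted.
  std::bitset<sizeof(unsigned long long) * CHAR_BIT + 1> wide;
  wide.set();
  try {
    (void)wide.to_ullong();
    return false; // conversion should have thrown
  } catch (const std::overflow_error&) {
  }
#endif
  return true;
}
```

Compared with a file-level `UNSUPPORTED: no-exceptions`, this keeps the non-throwing checks active in every configuration.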

philnik777 · 2025-06-11T16:37:32Z

libcxx/include/bitset

+
+  // unsigned long may span multiple words which are concatenated to form the result
+  template <typename _StorageType                                            = __storage_type,
+            __enable_if_t<sizeof(_StorageType) < sizeof(unsigned long), int> = 0>


This is never true AFAICT. On all 32-bit platforms unsigned long is at most 32 bits wide, and _StorageType (i.e. size_t) should always be 32 bits wide as well.

On 64 bit platforms sizeof(size_t) is always eight, and I'm quite certain there is no platform where unsigned long is larger than that.

As suggested, I've removed this code branch, and I am now focusing only on the case where sizeof(size_t) >= sizeof(unsigned long).

philnik777 · 2025-06-11T16:39:58Z

libcxx/include/bitset

@@ -80,7 +80,7 @@ public:
    constexpr bool operator[](size_t pos) const;
    reference operator[](size_t pos);            // constexpr since C++23
    unsigned long to_ulong() const;              // constexpr since C++23
-    unsigned long long to_ullong() const;        // constexpr since C++23
+    unsigned long long to_ullong() const;        // since C++11, constexpr since C++23


This seems incorrect?

This is LWG694. I don't think LWG694 retroactively applied to C++03 because long long was added in C++11 as a new feature. But since libc++ supports many C++11 features in C++03 mode, perhaps it's also fine not to mention "since C++11".

Yeah, we don't mention "since C++11" in most places, since we back-port a humongous amount of code from C++11.

Yeah, this function was in C++11 standard mode, but we backported it to C++03. So I've now removed the "since C++11" description.

ldionne · 2025-06-11T16:56:08Z

libcxx/include/bitset

+    // TODO: This is a workaround for a gdb test failure (gdb_pretty_printer_test.sh.cpp) in
+    // stage1 CI (generic-gcc, gcc-14, g++-14), due to the __bits_per_word name lookup failure
+    // if not referenced in the constructor initializer list.
+    // See: https://github.com/llvm/llvm-project/actions/runs/15071518915/job/42368867929?pr=121348#logs


Suggested change

-    // TODO: This is a workaround for a gdb test failure (gdb_pretty_printer_test.sh.cpp) in
-    // stage1 CI (generic-gcc, gcc-14, g++-14), due to the __bits_per_word name lookup failure
-    // if not referenced in the constructor initializer list.
-    // See: https://github.com/llvm/llvm-project/actions/runs/15071518915/job/42368867929?pr=121348#logs
+    // TODO: We must refer to __bits_per_word in order to work around an issue with the GDB pretty-printers.
+    // Without it, the pretty-printers complain about a missing __bits_per_word member. This must
+    // be investigated.

libcxx/include/bitset

philnik777

This resolves my concerns, thanks! I'll leave the rest of the review to Louis.

philnik777 · 2025-06-13T15:57:22Z

libcxx/include/bitset

+                  "bitset only supports platforms where sizeof(size_t) >= sizeof(unsigned long), such as 32-bit and "
+                  "64-bit platforms");


Suggested change

-                  "bitset only supports platforms where sizeof(size_t) >= sizeof(unsigned long), such as 32-bit and "
-                  "64-bit platforms");
+                  "libc++ only supports platforms where sizeof(size_t) >= sizeof(unsigned long), such as 32-bit and "
+                  "64-bit platforms. If you're interested in supporting a platform where that is not the case, please contact the libc++ developers.");

Otherwise people might think that we have no interest in supporting their platform.
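For context, a message like this presumably belongs to a compile-time check of roughly the following shape. The sketch is an assumption about the form of the check, not the code in the patch, and `word_covers_ulong` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>

// True when one storage word (size_t) can hold every unsigned long value.
constexpr bool word_covers_ulong = sizeof(std::size_t) >= sizeof(unsigned long);

// Compilation stops here, with a readable message, on a platform that
// violates the assumption, instead of miscompiling the conversions.
static_assert(word_covers_ulong,
              "this sketch assumes sizeof(size_t) >= sizeof(unsigned long), "
              "as on common 32-bit and 64-bit platforms");
```

The wording of the message is what a user on an unusual platform would see first, which is why the suggestion above adds a pointer to the developers.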

winner245 · 2025-06-13T15:59:06Z

Perhaps the commit message needs to be adjusted to explain the motivation more clearly?

I think so. Starting with something that merely happens to work now and explaining it at length doesn't exactly scream "falls out of this patch" but rather "this is the main goal of this patch".

I've completely rewritten the commit message, which now has a much clearer motivation.

ldionne

LGTM, thanks!

…vm#121348) This patch addresses several implementation issues in `bitset`'s conversion functions `to_ullong` and `to_ulong`, and refactors its converting constructor `__bitset(unsigned long long __v)` to a more generic and elegant implementation.

winner245 force-pushed the improve-to_ullong branch from 4d20d1e to 5751c13 Compare December 30, 2024 19:14

frederick-vs-ja reviewed Dec 31, 2024

winner245 changed the title ~~[libc++] Improve bitset::to_ullong Implementation~~ [libc++] Fix possible out of range access in bitset::to_ullong implementation Dec 31, 2024

winner245 force-pushed the improve-to_ullong branch 10 times, most recently from 94e34bb to f85d6e7 Compare January 2, 2025 22:19

winner245 force-pushed the improve-to_ullong branch 16 times, most recently from 67e021d to c86557e Compare January 3, 2025 19:13

philnik777 reviewed Jun 11, 2025

ldionne reviewed Jun 11, 2025

winner245 changed the title ~~[libc++] Fix possible out of range access in bitset~~ [libc++] Fix bitset conversion functions and refactor constructor Jun 13, 2025

winner245 force-pushed the improve-to_ullong branch from cf28025 to 958a1c3 Compare June 13, 2025 15:52

philnik777 reviewed Jun 13, 2025

winner245 force-pushed the improve-to_ullong branch from 958a1c3 to 746e44b Compare June 13, 2025 16:31

winner245 added 11 commits June 16, 2025 10:46

Improve bitset::to_ullong Implementation

d48457d

Make variable constant expression

cdf3f5a

Apply @frederick-vs-ja suggestions to support 16-bit platforms

d34f4b6

Fix gdb.error

958f952

Fix to_ulong to throw overflow_error as expected

53283ec

Address ldionne's review comments

82297b4

Fix preprocessing

e5c0ef0

Address ldionne's comments

4073667

Rebase

a6bbda6

Address philnik777's comments

c85dcc9

Add <stdexcept> header

db4565b

winner245 force-pushed the improve-to_ullong branch from 746e44b to db4565b Compare June 16, 2025 14:46

ldionne requested changes Jun 18, 2025

Address further comments by ldionne

3700be1

winner245 force-pushed the improve-to_ullong branch from 3f7d57b to 3700be1 Compare June 18, 2025 21:45

ldionne approved these changes Jun 24, 2025

ldionne merged commit d807661 into llvm:main Jun 24, 2025
268 of 279 checks passed

winner245 deleted the improve-to_ullong branch June 24, 2025 21:57
