-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[libc++] Use __wrap_iter in string_view and array in the unstable ABI #74482
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-libcxx Author: Louis Dionne (ldionne) Changesstd::string_view and std::array iterators don't have to be raw pointers, and in fact other implementations don't represent them as raw pointers. Them being raw pointers in libc++ makes it easier for users to write non-portable code. This is bad in itself, but this is even worse when considering efforts like hardening where we want an easy ability to swap for a different iterator type. If users depend on iterators being raw pointers, this becomes a build break. Hence, this patch enables the use of __wrap_iter in the unstable ABI, creating a long term path towards making this the default. This patch may break code that assumes these iterators are raw pointers for people compiling with the unstable ABI. Full diff: https://github.com/llvm/llvm-project/pull/74482.diff 3 Files Affected:
diff --git a/libcxx/include/__config b/libcxx/include/__config
index 3de4d610e8cde..636f46b048554 100644
--- a/libcxx/include/__config
+++ b/libcxx/include/__config
@@ -168,6 +168,12 @@
// pointer from 16 to 8. This changes the output of std::string::max_size,
// which makes it ABI breaking
# define _LIBCPP_ABI_STRING_8_BYTE_ALIGNMENT
+// Define std::array iterators to be __wrap_iters instead of raw pointers, which
+// prevents people from relying on a non-portable implementation detail.
+# define _LIBCPP_ABI_USE_WRAP_ITER_IN_STD_ARRAY
+// Define std::basic_string_view iterators to be __wrap_iters instead of raw pointers,
+// which prevents people from relying on a non-portable implementation detail.
+# define _LIBCPP_ABI_USE_WRAP_ITER_IN_STD_STRING_VIEW
# elif _LIBCPP_ABI_VERSION == 1
# if !(defined(_LIBCPP_OBJECT_FORMAT_COFF) || defined(_LIBCPP_OBJECT_FORMAT_XCOFF))
// Enable compiling copies of now inline methods into the dylib to support
diff --git a/libcxx/include/array b/libcxx/include/array
index fc5371ebae21a..988464844f866 100644
--- a/libcxx/include/array
+++ b/libcxx/include/array
@@ -120,6 +120,7 @@ template <size_t I, class T, size_t N> const T&& get(const array<T, N>&&) noexce
#include <__config>
#include <__fwd/array.h>
#include <__iterator/reverse_iterator.h>
+#include <__iterator/wrap_iter.h>
#include <__tuple/sfinae_helpers.h>
#include <__type_traits/conditional.h>
#include <__type_traits/is_array.h>
@@ -168,10 +169,15 @@ struct _LIBCPP_TEMPLATE_VIS array
typedef _Tp value_type;
typedef value_type& reference;
typedef const value_type& const_reference;
- typedef value_type* iterator;
- typedef const value_type* const_iterator;
typedef value_type* pointer;
typedef const value_type* const_pointer;
+#if defined(_LIBCPP_ABI_USE_WRAP_ITER_IN_STD_ARRAY)
+ typedef __wrap_iter<pointer> iterator;
+ typedef __wrap_iter<const_pointer> const_iterator;
+#else
+ typedef pointer iterator;
+ typedef const_pointer const_iterator;
+#endif
typedef size_t size_type;
typedef ptrdiff_t difference_type;
typedef _VSTD::reverse_iterator<iterator> reverse_iterator;
diff --git a/libcxx/include/string_view b/libcxx/include/string_view
index 881e321c507f6..8af67d5e5bb11 100644
--- a/libcxx/include/string_view
+++ b/libcxx/include/string_view
@@ -215,6 +215,7 @@ namespace std {
#include <__iterator/concepts.h>
#include <__iterator/iterator_traits.h>
#include <__iterator/reverse_iterator.h>
+#include <__iterator/wrap_iter.h>
#include <__memory/pointer_traits.h>
#include <__ranges/concepts.h>
#include <__ranges/data.h>
@@ -279,10 +280,12 @@ public:
using const_pointer = const _CharT*;
using reference = _CharT&;
using const_reference = const _CharT&;
-#ifdef _LIBCPP_ABI_BOUNDED_ITERATORS
+#if defined(_LIBCPP_ABI_BOUNDED_ITERATORS)
using const_iterator = __bounded_iter<const_pointer>;
+#elif defined(_LIBCPP_ABI_USE_WRAP_ITER_IN_STD_STRING_VIEW)
+ using const_iterator = __wrap_iter<const_pointer>;
#else
- using const_iterator = const_pointer; // See [string.view.iterators]
+ using const_iterator = const_pointer;
#endif
using iterator = const_iterator;
using const_reverse_iterator = _VSTD::reverse_iterator<const_iterator>;
|
CC @llvm/libcxx-vendors since this can break some code. CC @EricWF for awareness in particular. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
e59fbd2
to
c4df4d1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I very much like this change for its hardening improvements. I'm going to check how much code this impacts internally right now: would it be possible to hold off merging this for a day or so, so that I can get details on what we need to do, please?
Yup, of course, the intent was to upload this and let folks see how much impact it has. Please let me know once you have some information. I think Google is going to be the main way we can get information here since most vendors ship the stable ABI. |
Excellent, thank you! I'll give you an answer as soon as possible (likely tomorrow). |
f329084
to
582203c
Compare
✅ With the latest revision this PR passed the C/C++ code formatter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That got a lot bigger. Maybe we should split the array
and string_view
changes?
This patch removes assumptions that std::array's iterators are raw pointers in the source code and in our test suite. While this is true right now, this doesn't have to be true and ion the future we might want to enable bounded iterators in std::array, which would require this change. This is a pre-requisite for landing llvm#74482
You're right, I debated with myself and decided to keep it in the same patch, but it's cleaner not to. This is now #74624, I'll rebase this after landing that other patch (please take a look!). |
As requested in llvm/llvm-project#74482. NOKEYCHECK=True GitOrigin-RevId: 953ac952eb17ef3a370699958f760ee81acdbe1a
#74624) This patch removes assumptions that std::array's iterators are raw pointers in the source code and in our test suite. While this is true right now, this doesn't have to be true and ion the future we might want to enable bounded iterators in std::array, which would require this change. This is a pre-requisite for landing #74482
582203c
to
596c73d
Compare
@cjdb I think this should be good to test now! Make sure to apply it on top of |
596c73d
to
a58530a
Compare
Pinging reviewers. |
@cjdb Do you think it'd be possible to take this for a run internally to see the impact? This is not a must for LLVM 18 but I'd like to unblock this. |
Sorry, I think this got lost in the haze that is December. I'll get an answer ASAP. |
Oh yeah, no worries! This is not an emergency, I just wanted to avoid forgetting this forever :) |
a58530a
to
bace6cd
Compare
Gentle ping @cjdb , did you have any luck running this internally? |
I spoke to @cjdb offline who told me there was a non-insignificant number of build failures when enabling this. However, this change can easily be disabled temporarily (while the code base migrates) by not defining Our ABI flags have that benefit that one can pick and choose what's opted into, so let's use this here. Hence, I think it is reasonable to move forward with the change, and vendors who envision breakage should simply disable these ABI macros until they are ready to transition. CC @llvm/libcxx-vendors again |
std::string_view and std::array iterators don't have to be raw pointers, and in fact other implementations don't represent them as raw pointers. Them being raw pointers in libc++ makes it easier for users to write non-portable code. This is bad in itself, but this is even worse when considering efforts like hardening where we want an easy ability to swap for a different iterator type. If users depend on iterators being raw pointers, this becomes a build break. Hence, this patch enables the use of __wrap_iter in the unstable ABI, creating a long term path towards making this the default. This patch may break code that assumes these iterators are raw pointers for people compiling with the unstable ABI.
bace6cd
to
ba38951
Compare
Thanks for the migration pathway! |
@rnk @ldionne could you please provide a more detailed way to work around this? Is the suggestion to modify __config and undefine _LIBCPP_ABI_USE_WRAP_ITER_IN_STD_ARRAY and _LIBCPP_ABI_USE_WRAP_ITER_IN_STD_STRING_VIEW? Unfortunately this cannot be done in __config_site because the macros are defined after its inclusion. |
You can either modify |
If it's causing lots of pain, would it be possible to make this opt-in for two releases, then switch it to opt-out (and document that we'll be doing this no later than two releases time)? That'll give folks time to address this issue without being broken today. To avoid having users need to either modify #if defined(_LIBCPP_ABI_BOUNDED_ITERATORS)
using const_iterator = __bounded_iter<const_pointer>;
#elif defined(_LIBCPP_ABI_USE_WRAP_ITER_IN_STD_STRING_VIEW)
using const_iterator = __wrap_iter<const_pointer>;
#else
using const_iterator = const_pointer;
#endif to #if defined(_LIBCPP_ABI_BOUNDED_ITERATORS)
using const_iterator = __bounded_iter<const_pointer>;
#elif !defined(_LIBCPP_ABI_NO_USE_WRAP_ITER_IN_STD_STRING_VIEW) // macro adds "NO"
using const_iterator = const_pointer;
#else
using const_iterator = __wrap_iter<const_pointer>;
#endif Apologies for not thinking of this when we chatted last week. |
I have a patch that's slightly different to the forward-fix above, and I think lets us all win here. |
llvm/llvm-project#74482 changes the iterator type for std::string_view and std::array so that they're no longer raw pointers. This CL fixes builds for ChromeOS, MacOS, Android, and iOS. This CL does *not* fix Windows and Fuchsia - Windows is fixed in https://crrev.com/c/5362998, and Fuchsia is being fixed upstream. These changes were tested in a sample libc++ roll in https://crrev.com/c/5362421. AX-Relnotes: n/a Bug: 328308661 Change-Id: I7681d93b8b59a389c6c349821d25c3c93cb87bc1 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/5365796 Reviewed-by: Nico Weber <[email protected]> Owners-Override: Nico Weber <[email protected]> Commit-Queue: Alan Zhao <[email protected]> Cr-Commit-Position: refs/heads/main@{#1272559}
llvm/llvm-project#74482 changes the iterator type for std::string_view and std::array so that they're no longer raw pointers. This CL fixes builds for ChromeOS, MacOS, Android, and iOS. This CL does *not* fix Windows and Fuchsia - Windows is fixed in https://crrev.com/c/5362998, and Fuchsia is being fixed upstream. These changes were tested in a sample libc++ roll in https://crrev.com/c/5362421. AX-Relnotes: n/a Bug: 328308661 Change-Id: I7681d93b8b59a389c6c349821d25c3c93cb87bc1 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/5365796 Reviewed-by: Nico Weber <[email protected]> Owners-Override: Nico Weber <[email protected]> Commit-Queue: Alan Zhao <[email protected]> Cr-Commit-Position: refs/heads/main@{#1272559} NOKEYCHECK=True GitOrigin-RevId: 9b60fb150439bf1170ab74f6230169b48aff1c07
Do you have a PR for something like this? I think it would also be nice to have a macro we can set in our build that doesn't involve editing the |
@PiJoules Please see #84226 (comment). |
llvm/llvm-project#74482 changes the iterator type for std::string_view and std::array so that they're no longer raw pointers. This CL fixes builds for ChromeOS, MacOS, Android, and iOS. This CL does *not* fix Windows and Fuchsia - Windows is fixed in https://crrev.com/c/5362998, and Fuchsia is being fixed upstream. These changes were tested in a sample libc++ roll in https://crrev.com/c/5362421. AX-Relnotes: n/a Bug: 328308661 Change-Id: I7681d93b8b59a389c6c349821d25c3c93cb87bc1 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/5365796 Reviewed-by: Nico Weber <[email protected]> Owners-Override: Nico Weber <[email protected]> Commit-Queue: Alan Zhao <[email protected]> Cr-Commit-Position: refs/heads/main@{#1272559} CrOS-Libchrome-Original-Commit: 9b60fb150439bf1170ab74f6230169b48aff1c07
…BI_VERSION for Fuchsia With llvm/llvm-project#74482, the unstable ABI has been changed such that std::array::iterator and std::string_view::iterator are no longer raw pointers. However, parts of the Fuchsia SDK still rely on this behavior (fxbug.dev/328282937). To temporarily unblock the roll, we explicitly set the individual ABI flags for Fuchsia so we have all the unstable ABI features except for that change. Bug: chromium:328308661 Change-Id: I2a43bc64b9cca9907363943410273332c83e7384 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/5347218 Commit-Queue: Alan Zhao <[email protected]> Reviewed-by: Hans Wennborg <[email protected]> Reviewed-by: Nico Weber <[email protected]> Cr-Commit-Position: refs/heads/main@{#1275860}
llvm/llvm-project#74482 changes the iterator type for std::string_view and std::array so that they're no longer raw pointers. Bug:chromium:328308661 Change-Id: I943a5e62c0ec075345c73d2651e0739b399dca12
…ters llvm/llvm-project#74482 changes the iterator type for std::string_view and std::array so that they're no longer raw pointers. Bug: 328308661 Change-Id: I75d4084ec7f2e4915595d6c334f3b7151b2a4011 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/5383638 Auto-Submit: Alan Zhao <[email protected]> Reviewed-by: Lily Chen <[email protected]> Reviewed-by: Rune Lillesveen <[email protected]> Commit-Queue: Lily Chen <[email protected]> Cr-Commit-Position: refs/heads/main@{#1276269}
…BI_VERSION for Fuchsia With llvm/llvm-project#74482, the unstable ABI has been changed such that std::array::iterator and std::string_view::iterator are no longer raw pointers. However, parts of the Fuchsia SDK still rely on this behavior (fxbug.dev/328282937). To temporarily unblock the roll, we explicitly set the individual ABI flags for Fuchsia so we have all the unstable ABI features except for that change. Bug: chromium:328308661 Change-Id: I2a43bc64b9cca9907363943410273332c83e7384 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/5347218 Commit-Queue: Alan Zhao <[email protected]> Reviewed-by: Hans Wennborg <[email protected]> Reviewed-by: Nico Weber <[email protected]> Cr-Commit-Position: refs/heads/main@{#1275860} NOKEYCHECK=True GitOrigin-RevId: 39b8d508e03a109d05787683aa25ee6ce03b790e
llvm/llvm-project#74482 in upstream libc++ changed the return type of various iterator functions (*.begin(), *.end(), etc) from raw pointers to iterator objects. The return types for these functions aren't guaranteed to be raw pointers, but we have a bunch of c++ code that assumes they are. We can get pointers from these iterators by simply dereferencing them and taking their address (`&*iter`). Bug: 328282937 Reviewed-on: https://fuchsia-review.googlesource.com/c/fuchsia/+/1010227 Owners-Override: James Robinson <[email protected]> API-Review: Adam Barth <[email protected]> GitOrigin-RevId: 89c40478c5fd4dfd5e64196c364da1b026fb3fde Change-Id: I3ff0576ca2bee11f2ff9fb845c5eb0815d374d75 Reviewed-on: https://pigweed-review.googlesource.com/c/pigweed/pigweed/+/221014 Reviewed-by: Faraaz Sareshwala <[email protected]> Lint: Lint 🤖 <[email protected]> Pigweed-Auto-Submit: Faraaz Sareshwala <[email protected]> Commit-Queue: Rob Mohr <[email protected]> Reviewed-by: Leonard Chan <[email protected]> Pigweed-Auto-Submit: Jason Graffius <[email protected]>
…rom iterators to raw pointers llvm/llvm-project#74482 in upstream libc++ changed the return type of various iterator functions (*.begin(), *.end(), etc) from raw pointers to iterator objects. The return types for these functions aren't guaranteed to be raw pointers, but we have a bunch of c++ code that assumes they are. We can get pointers from these iterators by simply dereferencing them and taking their address (`&*iter`). Original-Bug: 328282937 Original-Reviewed-on: https://fuchsia-review.googlesource.com/c/fuchsia/+/1010227 Owners-Override: James Robinson <[email protected]> GitOrigin-RevId: 89c40478c5fd4dfd5e64196c364da1b026fb3fde Original-Reviewed-on: https://pigweed-review.googlesource.com/c/pigweed/pigweed/+/221014 Lint: Lint 🤖 <[email protected]> https://pigweed.googlesource.com/pigweed/pigweed pigweed, pw_toolchain Rolled-Commits: 2a5f339c40c1236..97bdfeb645a92f7 Roller-URL: https://ci.chromium.org/b/8742552018064493681 GitWatcher: ignore CQ-Do-Not-Cancel-Tryjobs: true Change-Id: I05e1edd181db20ddb6109eb12b466672ecda551b Reviewed-on: https://pigweed-review.googlesource.com/c/pigweed/quickstart/bazel/+/222862 Bot-Commit: Pigweed Roller <[email protected]> Commit-Queue: Pigweed Roller <[email protected]> Lint: Lint 🤖 <[email protected]>
…rom iterators to raw pointers llvm/llvm-project#74482 in upstream libc++ changed the return type of various iterator functions (*.begin(), *.end(), etc) from raw pointers to iterator objects. The return types for these functions aren't guaranteed to be raw pointers, but we have a bunch of c++ code that assumes they are. We can get pointers from these iterators by simply dereferencing them and taking their address (`&*iter`). Original-Bug: 328282937 Original-Reviewed-on: https://fuchsia-review.googlesource.com/c/fuchsia/+/1010227 Owners-Override: James Robinson <[email protected]> GitOrigin-RevId: 89c40478c5fd4dfd5e64196c364da1b026fb3fde Original-Reviewed-on: https://pigweed-review.googlesource.com/c/pigweed/pigweed/+/221014 Lint: Lint 🤖 <[email protected]> https://pigweed.googlesource.com/pigweed/pigweed pigweed, pw_toolchain Rolled-Commits: 2a5f339c40c1236..97bdfeb645a92f7 Roller-URL: https://ci.chromium.org/b/8742552022802041249 GitWatcher: ignore CQ-Do-Not-Cancel-Tryjobs: true Change-Id: I3a065d4b1c1052ec16d4c3f89b8ad6cecf82b669 Reviewed-on: https://pigweed-review.googlesource.com/c/pigweed/showcase/rp2/+/222874 Commit-Queue: Pigweed Roller <[email protected]> Lint: Lint 🤖 <[email protected]> Bot-Commit: Pigweed Roller <[email protected]>
std::string_view and std::array iterators don't have to be raw pointers,
and in fact other implementations don't represent them as raw pointers.
Them being raw pointers in libc++ makes it easier for users to write
non-portable code. This is bad in itself, but this is even worse when
considering efforts like hardening where we want an easy ability to
swap for a different iterator type. If users depend on iterators being
raw pointers, this becomes a build break.
Hence, this patch enables the use of __wrap_iter in the unstable ABI,
creating a long term path towards making this the default. This patch
may break code that assumes these iterators are raw pointers for
people compiling with the unstable ABI.
This patch also removes several assumptions that array iterators are
raw pointers in the code base and in the test suite.