Skip to content

Stacked Borrows: raw pointers inherit the tag from their parent pointer #142170

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

RalfJung
Copy link
Member

@RalfJung RalfJung commented Jun 7, 2025

The biggest design mistake of SB in my opinion was the decision to have raw pointers be distinct from the reference they are derived from. This has multiple undesirable consequences:

  • Within a function, a let mut x and &raw mut x are not allowed to be used in an interleaving way, which is quite surprising and requires dedicated work-arounds e.g. in c2rust.
  • Casting a &mut T to a *const T results in a read-only pointer, which regularly startles people.
  • On the implementation side, this means we have to recognize all places where "safe" pointers turn into raw pointers, which turns out to be tricky for Box.

TB fully mitigates this by having a raw pointer inherit the tag from its parent pointer. I think we should do the same for SB. However, this requires a bit of hackery to prevent code like this from becoming UB:

fn foo(x: &mut [i32]) {
  let ptr = x.as_mut_ptr();
  let len = x.len();
  assert!(len > 0);
  ptr.write(0);
}

Under SB, x.len() causes a read of the entire slice which invalidates ptr since that is derived from a child of x (created as part of the implicit retag when calling as_mut_ptr).

So this experiment adds some special magic hackery to len to avoid x.len() from retagging anything. len is already special in invisible ways by having the function call be replaced by a MIR op; this extends that magic for the purposes of our alias tracking.

We also do one further change to the SB access logic: writing to a Unique does not invalidate SRW above this item. This, too, matches Tree Borrows. It does not affect local reasoning since owners of a Unique know whether any SRW have been pushed on top.

This is definitely not the final answer, but it is a minimal step that lets us make SB slightly cleaner, thus unlocking some patterns in the ecosystem that Miri has so far rejected, and some cleanup in the compiler. Longer-term plans include a new attribute one can put on functions to avoid retags -- putting that attribute on len or as_mut_ptr would fix the above example.

The main potential downside of this PR is that it will lead Miri to accept code like let x = 5; (&raw const x).cast_mut().write(0); by default. This is discussed in rust-lang/unsafe-code-guidelines#400. Depending on how likely we are to make this UB in the future, we may want Miri to keep people away from those patterns. However, rejecting such code in an operational way is non-trivial; the only proposal we have is arguably a hack. Also, note that destructors will receive an &mut self pointing to such an immutable local variable, so operationally, it seems a bit futile to pretend that they are immutable.

TODO:

  • Check which further TB tests should be moved to "both borrows" now.
  • Entirely remove RetagKind::Raw.
  • Add mir-opt tests checking the MIR around x.len() looks the way we want it to.

try-job: *gnu*aux

@rustbot rustbot added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Jun 7, 2025
@RalfJung
Copy link
Member Author

RalfJung commented Jun 7, 2025

@bors2 try

@rust-bors
Copy link

rust-bors bot commented Jun 7, 2025

⌛ Trying commit 65c6154 with merge 3942c14

To cancel the try build, run the command @bors2 try cancel.

rust-bors bot added a commit that referenced this pull request Jun 7, 2025
Stacked Borrows: raw pointers inherit the tag from their parent pointer

The biggest design mistake of SB in my opinion was the decision to have raw pointers be distinct from the reference they are derived from. This has multiple undesirable consequences:
- Within a function, a `let mut x` and `&raw mut x` are not allowed to be used in an interleaving way, which is quite surprising and requires dedicated work-arounds e.g. in c2rust.
- Casting a `&mut T` to a `*const T` results in a read-only pointer, which regularly startles people.
- On the implementation side, this means we have to recognize all places where "safe" pointers turn into raw pointers, which turns out to be tricky for `Box`.

TB fully mitigates this by having a raw pointer inherit the tag from its parent pointer. I think we should do the same for SB. However, this requires a bit of hackery to prevent code like this from becoming UB:
```rust
fn foo(x: &mut [i32]) {
  let ptr = x.as_mut_ptr();
  let len = x.len();
  assert!(len > 0);
  ptr.write(0);
}
```
Under SB, `x.len()` causes a read of the entire slice which invalidates `ptr` since that is derived from a child of `x` (created as part of the implicit retag when calling `as_mut_ptr`).

So this experiment adds some special magic hackery to `len` to avoid `x.len()` from retagging anything. `len` is already special in invisible ways by having the function call be replaced by a MIR op; this extends that magic for the purposes of our alias tracking.

This is definitely not the final answer, but it is a minimal step that lets us make SB slightly cleaner, thus unlocking some patterns in the ecosystem that Miri has so far rejected, and some cleanup in the compiler. Longer-term plans include a new attribute one can put on functions to avoid retags -- putting that attribute on `len` *or* `as_mut_ptr` would fix the above example.

TODO:
- Check which further TB tests should be moved to "both borrows" now.
- Entirely remove `RetagKind::Raw`.
- Add mir-opt tests checking the MIR around `x.len()` looks the way we want it to.

try-job: `*gnu*aux`
@rust-log-analyzer

This comment has been minimized.

@bjorn3
Copy link
Member

bjorn3 commented Jun 7, 2025

So this experiment adds some special magic hackery to len to avoid x.len() from retagging anything. len is already special in invisible ways by having the function call be replaced by a MIR op; this extends that magic for the purposes of our alias tracking.

It seems to me like a cleaner way to handle this would be to add an attribute that can be used to indicate that a given function argument should not be retagged (and thus also should lose noalias on the LLVM side). Kinda like how we already have #[may_dangle] for drop impls (as unstable attribute).

@RalfJung RalfJung force-pushed the sb-raw-retag branch 2 times, most recently from ba46876 to 7816394 Compare June 7, 2025 17:19
@RalfJung
Copy link
Member Author

RalfJung commented Jun 7, 2025

It seems to me like a cleaner way to handle this would be to add an attribute that can be used to indicate that a given function argument should not be retagged (and thus also should lose noalias on the LLVM side). Kinda like how we already have #[may_dangle] for drop impls (as unstable attribute).

That's the longer-term solution I mentioned. I don't know how to implement the caller-side behavior of this, since the retags come from the reborrows that are inserted during MIR building. (The callee-side retags of function arguments are easy to suppress, comparatively.)

Another form this attribute could take is that we could have functions that take a raw pointer receiver, and we arrange things so that they get called even if the receiver is actually a reference, implicitly coercing that to a raw pointer. That would neatly prevent the implicit reborrows during MIR building (I hope), but it needs cooperation of method resolution which I really don't want to touch.^^ This restricts the attribute to functions that can safely work on raw pointers, but that does not seem entirely unreasonable -- it definitely applies to our main candidates (as_mut_ptr, as_ptr, len).

@rust-bors
Copy link

rust-bors bot commented Jun 7, 2025

💔 Test failed

@RalfJung
Copy link
Member Author

RalfJung commented Jun 7, 2025

That failure was interesting. Previously,we happened to accept code that does something like this on an interior mutable local variable

  • &mut local as *mut _, store this pointer
  • (&local).get()
  • use the previously stored pointer

This used to work thanks to the "quirk" where invalidating a mutable reference with a "foreign read" in SB does not necessarily invalidate all its derived raw pointers. With this PR, that no longer works since the raw pointer is identical to the mutable reference, so the code needs to be adjusted a little. Arguably this code was highly sketchy before...
(In TB, this worked fine since the variable is never written to.)

@bors2 try

@rust-bors
Copy link

rust-bors bot commented Jun 7, 2025

⌛ Trying commit 8ae194b with merge afbfda1

To cancel the try build, run the command @bors2 try cancel.

rust-bors bot added a commit that referenced this pull request Jun 7, 2025
Stacked Borrows: raw pointers inherit the tag from their parent pointer

The biggest design mistake of SB in my opinion was the decision to have raw pointers be distinct from the reference they are derived from. This has multiple undesirable consequences:
- Within a function, a `let mut x` and `&raw mut x` are not allowed to be used in an interleaving way, which is quite surprising and requires dedicated work-arounds e.g. in c2rust.
- Casting a `&mut T` to a `*const T` results in a read-only pointer, which regularly startles people.
- On the implementation side, this means we have to recognize all places where "safe" pointers turn into raw pointers, which turns out to be tricky for `Box`.

TB fully mitigates this by having a raw pointer inherit the tag from its parent pointer. I think we should do the same for SB. However, this requires a bit of hackery to prevent code like this from becoming UB:
```rust
fn foo(x: &mut [i32]) {
  let ptr = x.as_mut_ptr();
  let len = x.len();
  assert!(len > 0);
  ptr.write(0);
}
```
Under SB, `x.len()` causes a read of the entire slice which invalidates `ptr` since that is derived from a child of `x` (created as part of the implicit retag when calling `as_mut_ptr`).

So this experiment adds some special magic hackery to `len` to avoid `x.len()` from retagging anything. `len` is already special in invisible ways by having the function call be replaced by a MIR op; this extends that magic for the purposes of our alias tracking.

We also do one further change to the SB access logic: writing to a `Unique` does not invalidate SRW above this item. This, too, matches Tree Borrows. It does not affect local reasoning since owners of a `Unique` know whether any SRW have been pushed on top.

This is definitely not the final answer, but it is a minimal step that lets us make SB slightly cleaner, thus unlocking some patterns in the ecosystem that Miri has so far rejected, and some cleanup in the compiler. Longer-term plans include a new attribute one can put on functions to avoid retags -- putting that attribute on `len` *or* `as_mut_ptr` would fix the above example.

The main potential downside of this PR is that it will lead Miri to accept code like `let x = 5; (&raw const x).cast_mut().write(0);` by default. This is discussed in rust-lang/unsafe-code-guidelines#400. Depending on how likely we are to make this UB in the future, we may want Miri to keep people away from those patterns. However, rejecting such code in an operational way is non-trivial; the only proposal we have is [arguably a hack](rust-lang/unsafe-code-guidelines#400 (comment)). Also, note that destructors will receive an `&mut self` pointing to such an immutable local variable, so operationally, it seems a bit futile to pretend that they are immutable.

TODO:
- Check which further TB tests should be moved to "both borrows" now.
- Entirely remove `RetagKind::Raw`.
- Add mir-opt tests checking the MIR around `x.len()` looks the way we want it to.

try-job: `*gnu*aux`
@rust-bors
Copy link

rust-bors bot commented Jun 7, 2025

☀️ Try build successful (CI)
Build commit: afbfda1 (afbfda1c386d2ff220c491e3e3d479a6dbb18002)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants