Skip to content

Commit a500351

Browse files
Auto merge of #142170 - RalfJung:sb-raw-retag, r=<try>
Stacked Borrows: raw pointers inherit the tag from their parent pointer The biggest design mistake of SB in my opinion was the decision to have raw pointers be distinct from the reference they are derived from. This has multiple undesirable consequences: - Within a function, a `let mut x` and `&raw mut x` are not allowed to be used in an interleaving way, which is quite surprising and requires dedicated work-arounds e.g. in c2rust. - Casting a `&mut T` to a `*const T` results in a read-only pointer, which regularly startles people. - On the implementation side, this means we have to recognize all places where "safe" pointers turn into raw pointers, which turns out to be tricky for `Box`. TB fully mitigates this by having a raw pointer inherit the tag from its parent pointer. I think we should do the same for SB. However, this requires a bit of hackery to prevent code like this from becoming UB: ```rust fn foo(x: &mut [i32]) { let ptr = x.as_mut_ptr(); let len = x.len(); assert!(len > 0); ptr.write(0); } ``` Under SB, `x.len()` causes a read of the entire slice which invalidates `ptr` since that is derived from a child of `x` (created as part of the implicit retag when calling `as_mut_ptr`). So this experiment adds some special magic hackery to `len` to avoid `x.len()` from retagging anything. `len` is already special in invisible ways by having the function call be replaced by a MIR op; this extends that magic for the purposes of our alias tracking. We also do one further change to the SB access logic: writing to a `Unique` does not invalidate SRW above this item. This, too, matches Tree Borrows. It does not affect local reasoning since owners of a `Unique` know whether any SRW have been pushed on top. This is definitely not the final answer, but it is a minimal step that lets us make SB slightly cleaner, thus unlocking some patterns in the ecosystem that Miri has so far rejected, and some cleanup in the compiler. Longer-term plans include a new attribute one can put on functions to avoid retags -- putting that attribute on `len` *or* `as_mut_ptr` would fix the above example. The main potential downside of this PR is that it will lead Miri to accept code like `let x = 5; (&raw const x).cast_mut().write(0);` by default. This is discussed in rust-lang/unsafe-code-guidelines#400. Depending on how likely we are to make this UB in the future, we may want Miri to keep people away from those patterns. However, rejecting such code in an operational way is non-trivial; the only proposal we have is [arguably a hack](rust-lang/unsafe-code-guidelines#400 (comment)). Also, note that destructors will receive an `&mut self` pointing to such an immutable local variable, so operationally, it seems a bit futile to pretend that they are immutable. TODO: - Check which further TB tests should be moved to "both borrows" now. - Entirely remove `RetagKind::Raw`. - Add mir-opt tests checking the MIR around `x.len()` looks the way we want it to. try-job: `*gnu*aux`
2 parents 0b65d0d + add928f commit a500351

File tree

62 files changed

+372
-373
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

62 files changed

+372
-373
lines changed

compiler/rustc_middle/src/mir/mod.rs

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1108,7 +1108,7 @@ pub enum LocalInfo<'tcx> {
11081108
/// A temporary created during evaluating `if` predicate, possibly for pattern matching for `let`s,
11091109
/// and subject to Edition 2024 temporary lifetime rules
11101110
IfThenRescopeTemp { if_then: HirId },
1111-
/// A temporary created during the pass `Derefer` to avoid it's retagging
1111+
/// A temporary created during the pass `Derefer` to avoid its retagging
11121112
DerefTemp,
11131113
/// A temporary created for borrow checking.
11141114
FakeBorrow,

compiler/rustc_mir_transform/src/add_retag.rs

Lines changed: 7 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,6 @@
44
//! of MIR building, and only after this pass we think of the program has having the
55
//! normal MIR semantics.
66
7-
use rustc_hir::LangItem;
87
use rustc_middle::mir::*;
98
use rustc_middle::ty::{self, Ty, TyCtxt};
109

@@ -28,7 +27,6 @@ fn may_contain_reference<'tcx>(ty: Ty<'tcx>, depth: u32, tcx: TyCtxt<'tcx>) -> b
2827
// References and Boxes (`noalias` sources)
2928
ty::Ref(..) => true,
3029
ty::Adt(..) if ty.is_box() => true,
31-
ty::Adt(adt, _) if tcx.is_lang_item(adt.did(), LangItem::PtrUnique) => true,
3230
// Compound types: recurse
3331
ty::Array(ty, _) | ty::Slice(ty) => {
3432
// This does not branch so we keep the depth the same.
@@ -153,6 +151,13 @@ impl<'tcx> crate::MirPass<'tcx> for AddRetag {
153151
}
154152
}
155153
Rvalue::Ref(..) => None,
154+
// Also skip coercions whose source is already a local. This is crucial
155+
// to avoid unnecessary retags in `my_array.len()` calls.
156+
Rvalue::Cast(
157+
CastKind::PointerCoercion(_, CoercionSource::Implicit),
158+
Operand::Copy(p) | Operand::Move(p),
159+
_,
160+
) if p.as_local().is_some() => None,
156161
_ => {
157162
if needs_retag(place) {
158163
Some(RetagKind::Default)

compiler/rustc_mir_transform/src/lib.rs

Lines changed: 1 addition & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -628,6 +628,7 @@ fn run_runtime_lowering_passes<'tcx>(tcx: TyCtxt<'tcx>, body: &mut Body<'tcx>) {
628628
fn run_runtime_cleanup_passes<'tcx>(tcx: TyCtxt<'tcx>, body: &mut Body<'tcx>) {
629629
let passes: &[&dyn MirPass<'tcx>] = &[
630630
&lower_intrinsics::LowerIntrinsics,
631+
&lower_slice_len::LowerSliceLenCalls,
631632
&remove_place_mention::RemovePlaceMention,
632633
&simplify::SimplifyCfg::PreOptimizations,
633634
];
@@ -671,9 +672,6 @@ pub(crate) fn run_optimization_passes<'tcx>(tcx: TyCtxt<'tcx>, body: &mut Body<'
671672
&check_null::CheckNull,
672673
// Before inlining: trim down MIR with passes to reduce inlining work.
673674

674-
// Has to be done before inlining, otherwise actual call will be almost always inlined.
675-
// Also simple, so can just do first.
676-
&lower_slice_len::LowerSliceLenCalls,
677675
// Perform instsimplify before inline to eliminate some trivial calls (like clone
678676
// shims).
679677
&instsimplify::InstSimplify::BeforeInline,

compiler/rustc_mir_transform/src/lower_slice_len.rs

Lines changed: 86 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -2,14 +2,18 @@
22
//! It should run before inlining!
33
44
use rustc_hir::def_id::DefId;
5+
use rustc_index::IndexVec;
56
use rustc_middle::mir::*;
6-
use rustc_middle::ty::TyCtxt;
7+
use rustc_middle::ty::adjustment::PointerCoercion;
8+
use rustc_middle::ty::{Ty, TyCtxt};
79

810
pub(super) struct LowerSliceLenCalls;
911

1012
impl<'tcx> crate::MirPass<'tcx> for LowerSliceLenCalls {
1113
fn is_enabled(&self, sess: &rustc_session::Session) -> bool {
12-
sess.mir_opt_level() > 0
14+
// This pass also removes some otherwise-problematic retags, so we
15+
// always enable it when retags matter.
16+
sess.mir_opt_level() > 0 || sess.opts.unstable_opts.mir_emit_retag
1317
}
1418

1519
fn run_pass(&self, tcx: TyCtxt<'tcx>, body: &mut Body<'tcx>) {
@@ -23,7 +27,7 @@ impl<'tcx> crate::MirPass<'tcx> for LowerSliceLenCalls {
2327
let basic_blocks = body.basic_blocks.as_mut_preserves_cfg();
2428
for block in basic_blocks {
2529
// lower `<[_]>::len` calls
26-
lower_slice_len_call(block, slice_len_fn_item_def_id);
30+
lower_slice_len_call(tcx, &mut body.local_decls, block, slice_len_fn_item_def_id);
2731
}
2832
}
2933

@@ -32,7 +36,12 @@ impl<'tcx> crate::MirPass<'tcx> for LowerSliceLenCalls {
3236
}
3337
}
3438

35-
fn lower_slice_len_call<'tcx>(block: &mut BasicBlockData<'tcx>, slice_len_fn_item_def_id: DefId) {
39+
fn lower_slice_len_call<'tcx>(
40+
tcx: TyCtxt<'tcx>,
41+
local_decls: &mut IndexVec<Local, LocalDecl<'tcx>>,
42+
block: &mut BasicBlockData<'tcx>,
43+
slice_len_fn_item_def_id: DefId,
44+
) {
3645
let terminator = block.terminator();
3746
if let TerminatorKind::Call {
3847
func,
@@ -50,11 +59,12 @@ fn lower_slice_len_call<'tcx>(block: &mut BasicBlockData<'tcx>, slice_len_fn_ite
5059
// perform modifications from something like:
5160
// _5 = core::slice::<impl [u8]>::len(move _6) -> bb1
5261
// into:
53-
// _5 = PtrMetadata(move _6)
62+
// _5 = PtrMetadata(move _6);
5463
// goto bb1
5564

5665
// make new RValue for Len
57-
let r_value = Rvalue::UnaryOp(UnOp::PtrMetadata, arg.node.clone());
66+
let arg = arg.node.clone();
67+
let r_value = Rvalue::UnaryOp(UnOp::PtrMetadata, arg.clone());
5868
let len_statement_kind = StatementKind::Assign(Box::new((*destination, r_value)));
5969
let add_statement =
6070
Statement { kind: len_statement_kind, source_info: terminator.source_info };
@@ -64,5 +74,75 @@ fn lower_slice_len_call<'tcx>(block: &mut BasicBlockData<'tcx>, slice_len_fn_ite
6474

6575
block.statements.push(add_statement);
6676
block.terminator_mut().kind = new_terminator_kind;
77+
78+
if tcx.sess.opts.unstable_opts.mir_emit_retag {
79+
// If we care about retags, we want there to be *no retag* for this `len` call.
80+
// That means we have to clean up the MIR a little: the MIR will now often be:
81+
// _6 = &*_4;
82+
// _5 = PtrMetadata(move _6);
83+
// Change that to:
84+
// _6 = &raw const *_4;
85+
// _5 = PtrMetadata(move _6);
86+
let len = block.statements.len();
87+
if len >= 2
88+
&& let &mut StatementKind::Assign(box (rebor_dest, ref mut rebor_r)) =
89+
&mut block.statements[len - 2].kind
90+
&& let Some(retag_local) = rebor_dest.as_local()
91+
&& !local_decls[retag_local].is_user_variable()
92+
// The LHS of this previous assignment is the `arg` from above.
93+
&& matches!(arg, Operand::Copy(p) | Operand::Move(p) if p.as_local() == Some(retag_local))
94+
// The RHS is a reference-taking operation.
95+
&& let Rvalue::Ref(_, BorrowKind::Shared, rebor_orig) = *rebor_r
96+
{
97+
let orig_inner_ty = rebor_orig.ty(local_decls, tcx).ty;
98+
// Change the previous statement to use `&raw` instead of `&`.
99+
*rebor_r = Rvalue::RawPtr(RawPtrKind::FakeForPtrMetadata, rebor_orig);
100+
// Change the type of the local to match.
101+
local_decls[retag_local].ty = Ty::new_ptr(tcx, orig_inner_ty, Mutability::Not);
102+
}
103+
// There's a second pattern we need to recognize for `array.len()` calls. There,
104+
// the MIR is:
105+
// _4 = &*_3;
106+
// _6 = move _4 as &[T] (PointerCoercion(Unsize, Implicit));
107+
// StorageDead(_4);
108+
// _5 = PtrMetadata(move _6);
109+
// Change that to:
110+
// _4 = &raw const *_3;
111+
// _6 = move _4 as *const [T] (PointerCoercion(Unsize, Implicit));
112+
// StorageDead(_4);
113+
// _5 = PtrMetadata(move _6);
114+
if len >= 4
115+
&& let [reborrow_stmt, unsize_stmt, storage_dead_stmt] =
116+
block.statements.get_disjoint_mut([len - 4, len - 3, len - 2]).unwrap()
117+
// Check that the 3 statements have the right shape.
118+
&& let &mut StatementKind::Assign(box (rebor_dest, ref mut rebor_r)) = &mut reborrow_stmt.kind
119+
&& let Some(retag_local) = rebor_dest.as_local()
120+
&& !local_decls[retag_local].is_user_variable()
121+
&& let &mut StatementKind::Assign(box (unsized_dest, ref mut unsize_r)) = &mut unsize_stmt.kind
122+
&& let Some(unsized_local) = unsized_dest.as_local()
123+
&& !local_decls[unsized_local].is_user_variable()
124+
&& storage_dead_stmt.kind == StatementKind::StorageDead(retag_local)
125+
// And check that they have the right operands.
126+
&& matches!(arg, Operand::Copy(p) | Operand::Move(p) if p.as_local() == Some(unsized_local))
127+
&& let &mut Rvalue::Cast(
128+
CastKind::PointerCoercion(PointerCoercion::Unsize, CoercionSource::Implicit),
129+
ref cast_op,
130+
ref mut cast_ty,
131+
) = unsize_r
132+
&& let Operand::Copy(unsize_src) | Operand::Move(unsize_src) = cast_op
133+
&& unsize_src.as_local() == Some(retag_local)
134+
&& let Rvalue::Ref(_, BorrowKind::Shared, rebor_orig) = *rebor_r
135+
{
136+
let orig_inner_ty = rebor_orig.ty(local_decls, tcx).ty;
137+
let unsize_inner_ty = cast_ty.builtin_deref(true).unwrap();
138+
// Change the reborrow to use `&raw` instead of `&`.
139+
*rebor_r = Rvalue::RawPtr(RawPtrKind::FakeForPtrMetadata, rebor_orig);
140+
// Change the unsize coercion in the same vein.
141+
*cast_ty = Ty::new_ptr(tcx, unsize_inner_ty, Mutability::Not);
142+
// Change the type of the locals to match.
143+
local_decls[retag_local].ty = Ty::new_ptr(tcx, orig_inner_ty, Mutability::Not);
144+
local_decls[unsized_local].ty = *cast_ty;
145+
}
146+
}
67147
}
68148
}

library/alloctests/tests/sync.rs

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -332,15 +332,15 @@ fn weak_self_cyclic() {
332332
#[test]
333333
fn drop_arc() {
334334
let mut canary = AtomicUsize::new(0);
335-
let x = Arc::new(Canary(&mut canary as *mut AtomicUsize));
335+
let x = Arc::new(Canary(&raw mut canary));
336336
drop(x);
337337
assert!(canary.load(Acquire) == 1);
338338
}
339339

340340
#[test]
341341
fn drop_arc_weak() {
342342
let mut canary = AtomicUsize::new(0);
343-
let arc = Arc::new(Canary(&mut canary as *mut AtomicUsize));
343+
let arc = Arc::new(Canary(&raw mut canary));
344344
let arc_weak = Arc::downgrade(&arc);
345345
assert!(canary.load(Acquire) == 0);
346346
drop(arc);

0 commit comments

Comments
 (0)