-
Notifications
You must be signed in to change notification settings - Fork 13.5k
Prefer GEP instructions over weird pointer casting #21115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
r? @huonw (rust_highfive has picked a reviewer for you, use r? to override) |
@@ -630,8 +630,7 @@ fn bind_subslice_pat(bcx: Block, | |||
let vec_datum = match_datum(val, vec_ty); | |||
let (base, len) = vec_datum.get_vec_base_and_len(bcx); | |||
|
|||
let slice_byte_offset = Mul(bcx, vt.llunit_size, C_uint(bcx.ccx(), offset_left)); | |||
let slice_begin = tvec::pointer_add_byte(bcx, base, slice_byte_offset); | |||
let slice_begin = InBoundsGEP(bcx, base, &[C_uint(bcx.ccx(), offset_left)]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, doesn't this change things: previously it was using size == 1 for ()
, but this new code appears to be using 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And using 0 is correct here, otherwise we get a slice that actually points to data past the original slice. Not so good. I'll add a test for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought it doesn't matter where pointers to zero-sized types point to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At least the previous code could cause UB AFAICT. We used an inbounds GEP that created out of bounds addresses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tbu- it's likely that we can regard any pointer-to-zero-sized-type as valid, but that doesn't mean that it is valid/sensible for this specific pointer to move outside the object from which it supposedly came; for one, how we did it was UB, as @dotdash said, and, Rust code can care about the actual real addresses of pointers so this is incorrect in that respect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@huonw Rust code can care about addresses of zero-sized-type pointers, however the whole standard library assumes that it doesn't. It's inserting fake values everywhere, it often casts some numerical values to such a pointer. E.g. like that:
mem::transmute(1us)
There are two places left where we used to only know the byte size of/offset into an array and had to cast to i8 and back to get the right addresses. But by now, we always know the sizes in terms of the number of elements in the array. In fact we have to add an extra Mul instruction so we can use the weird cast-to-u8 code. So we should really just embrace our new knowledge and use simple GEPs to do the address calculations. Additionally, the pointer calculations in bind_subslice_pat don't handle zero-sized types correctly, producing slices that point outside the array that is being matched against. Using GEP fixes that as well. Fixes rust-lang#3729
There are two places left where we used to only know the byte size of/offset into an array and had to cast to i8 and back to get the right addresses. But by now, we always know the sizes in terms of the number of elements in the array. In fact we have to add an extra Mul instruction so we can use the weird cast-to-u8 code. So we should really just embrace our new knowledge and use simple GEPs to do the address calculations. Fixes rust-lang#3729
There are two places left where we used to only know the byte
size of/offset into an array and had to cast to i8 and back to get the
right addresses. But by now, we always know the sizes in terms of the
number of elements in the array. In fact we have to add an extra Mul
instruction so we can use the weird cast-to-u8 code. So we should really
just embrace our new knowledge and use simple GEPs to do the address
calculations.
Fixes #3729