Skip to content

improve usage of unsafe in gix-pack #1233

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 4, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 5 additions & 1 deletion gix-pack/src/cache/delta/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,11 @@ pub mod traverse;
///
pub mod from_offsets;

/// An item stored within the [`Tree`]
/// An item stored within the [`Tree`] whose data is stored in a pack file, identified by
/// the offset of its first (`offset`) and last (`next_offset`) bytes.
///
/// It represents either a root entry, or one that relies on a base to be resolvable,
/// alongside associated `data` `T`.
pub struct Item<T> {
/// The offset into the pack file at which the pack entry's data is located.
pub offset: crate::data::Offset,
Expand Down
17 changes: 10 additions & 7 deletions gix-pack/src/cache/delta/traverse/resolve.rs
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,7 @@ use crate::{
data::EntryRange,
};

mod node {
mod root {
use crate::cache::delta::{traverse::util::ItemSliceSync, Item};

/// An item returned by `iter_root_chunks`, allowing access to the `data` stored alongside nodes in a [`Tree`].
Expand All @@ -26,7 +26,10 @@ mod node {
impl<'a, T: Send> Node<'a, T> {
/// SAFETY: The child_items must be unique among between users of the `ItemSliceSync`.
#[allow(unsafe_code)]
pub(crate) unsafe fn new(item: &'a mut Item<T>, child_items: &'a ItemSliceSync<'a, Item<T>>) -> Self {
pub(in crate::cache::delta::traverse) unsafe fn new(
item: &'a mut Item<T>,
child_items: &'a ItemSliceSync<'a, Item<T>>,
) -> Self {
Node { item, child_items }
}
}
Expand Down Expand Up @@ -68,7 +71,7 @@ mod node {
}
}

pub(crate) struct State<'items, F, MBFN, T: Send> {
pub(in crate::cache::delta::traverse) struct State<'items, F, MBFN, T: Send> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can do pub(super)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think the invariant should be mentioned on the fields that have them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and it's those fields that should be visibility restricted

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot, I tried to apply this feedback in 696cbb8 (shoehorned into my current feature-PR).

pub delta_bytes: Vec<u8>,
pub fully_resolved_delta_bytes: Vec<u8>,
pub progress: Box<dyn Progress>,
Expand All @@ -78,7 +81,7 @@ pub(crate) struct State<'items, F, MBFN, T: Send> {
}

#[allow(clippy::too_many_arguments)]
pub(crate) fn deltas<T, F, MBFN, E, R>(
pub(in crate::cache::delta::traverse) fn deltas<T, F, MBFN, E, R>(
objects: gix_features::progress::StepShared,
size: gix_features::progress::StepShared,
item: &mut Item<T>,
Expand Down Expand Up @@ -118,9 +121,9 @@ where
// each node is a base, and its children always start out as deltas which become a base after applying them.
// These will be pushed onto our stack until all are processed
let root_level = 0;
// SAFETY: The child items are unique
// SAFETY: The child items are unique, as `item` is the root of a tree of dependent child items.
#[allow(unsafe_code)]
let root_node = unsafe { node::Node::new(item, child_items) };
let root_node = unsafe { root::Node::new(item, child_items) };
let mut nodes: Vec<_> = vec![(root_level, root_node)];
while let Some((level, mut base)) = nodes.pop() {
if should_interrupt.load(Ordering::Relaxed) {
Expand Down Expand Up @@ -240,7 +243,7 @@ fn deltas_mt<T, F, MBFN, E, R>(
objects: gix_features::progress::StepShared,
size: gix_features::progress::StepShared,
progress: &dyn Progress,
nodes: Vec<(u16, node::Node<'_, T>)>,
nodes: Vec<(u16, root::Node<'_, T>)>,
resolve: F,
resolve_data: &R,
modify_base: MBFN,
Expand Down
22 changes: 19 additions & 3 deletions gix-pack/src/cache/delta/traverse/util.rs
Original file line number Diff line number Diff line change
@@ -1,6 +1,22 @@
use std::marker::PhantomData;

pub(crate) struct ItemSliceSync<'a, T>
/// SAFETY: This type is used to allow access to a size-optimized vec of items that form a
/// tree, and we need to access it concurrently with each thread taking its own root node,
/// and working its way through all the reachable leaves.
///
/// The tree was built by decoding a pack whose entries refer to its bases only by OFS_DELTA -
/// they are pointing backwards only which assures bases have to be listed first, and that each entry
/// only has a single parent.
///
/// REF_DELTA entries aren't supported here, and cause immediate failure - they are expected to have
/// been resolved before as part of the thin-pack handling.
///
/// If we somehow would allow REF_DELTA entries to point to an in-pack object, then in theory malicious packs could
/// cause all kinds of graphs as they can point anywhere in the pack, but they still can't link an entry to
/// more than one base. And that's what one would really have to do for two threads to encounter the same child.
///
/// Thus I believe it's impossible for this data structure to end up in a place where it violates its assumption.
pub(in crate::cache::delta::traverse) struct ItemSliceSync<'a, T>
where
T: Send,
{
Expand All @@ -14,7 +30,7 @@ impl<'a, T> ItemSliceSync<'a, T>
where
T: Send,
{
pub fn new(items: &'a mut [T]) -> Self {
pub(in crate::cache::delta::traverse) fn new(items: &'a mut [T]) -> Self {
ItemSliceSync {
items: items.as_mut_ptr(),
#[cfg(debug_assertions)]
Expand All @@ -25,7 +41,7 @@ where

// SAFETY: The index must point into the slice and must not be reused concurrently.
#[allow(unsafe_code)]
pub unsafe fn get_mut(&self, index: usize) -> &'a mut T {
pub(in crate::cache::delta::traverse) unsafe fn get_mut(&self, index: usize) -> &'a mut T {
#[cfg(debug_assertions)]
if index >= self.len {
panic!("index out of bounds: the len is {} but the index is {index}", self.len);
Expand Down