Skip to content

Relax self-borrow lifetime in Branch.upstream() #400

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

emlun
Copy link
Contributor

@emlun emlun commented Feb 12, 2019

The lifetime requirement on &self in the Branch.upstream() method causes the self-borrow to be extended to the lifetime of the repository, which makes it hard to use. It seems like this lifetime can be relaxed.

However, I'm quite new to Rust and even newer to this codebase, so I could very well be missing something here. But at least cargo test still succeeds after this. 🙂

Below is a motivating code sample along with more details about the issue, copied from the commit message.


The following code currently does not compile:

fn get_upstreams<'repo>(repo: &'repo git2::Repository) ->
  Vec<(git2::Branch<'repo>, Option<git2::Branch<'repo>>)>
{
  repo
    .branches(None)   // pub fn branches(&self, filter: Option<git2::BranchType>) -> Result<git2::Branches, git2::Error>
    .unwrap()         // git2::Branches<'a>: Iterator<Item = Result<(git2::Branch<'repo>, git2::BranchType), git2::Error>>
    .flatten()        // Iterator<Item = (git2::Branch<'repo>, git2::BranchType)>
    .map(|(branch, _): (git2::Branch<'repo>, _)| {
      let upstream: Option<git2::Branch<'repo>> =
        branch
        .upstream()   // pub fn upstream<'a>(&'a self) -> Result<Branch<'a>, Error>
        .ok();
      (branch, upstream)
    })                // Iterator<Item = (git2::Branch<'repo>, git2::Branch<'repo>)>
  .collect()
}

Error:

error[E0597]: `branch` does not live long enough
  --> src/config/data/project.rs:49:55
   |
49 |           let upstream: Option<git2::Branch<'repo>> = branch.upstream().ok();
   |                                                       ^^^^^^ borrowed value does not live long enough
...
58 |         })
   |         - borrowed value only lives until here
   |
note: borrowed value must be valid for the lifetime 'repo as defined on the method body at 39:30...
  --> src/config/data/project.rs:39:30
   |
39 |   fn local_branches_internal<'repo>(&self, repo: &'repo git2::Repository) -> Result<Vec<(Branch, git2::Branch<'repo>, Option<git2::Branch<'repo>>)>, git2::Error> {

This is because the .upstream() method is declared with the same self-borrow lifetime as the return value:

    pub fn upstream<'a>(&'a self) -> Result<Branch<'a>, Error> { /* ... */ }

which means that the .upstream() call is still borrowing self even
after it returns, even though it returns an owned value and not a reference. Relaxing the self-borrow lifetime allows the above code sample to compile successfully. The lifetime of the (maybe) returned upstream Branch will also be that of the repository, but otherwise unrelated to the lifetime of the Branch that constructed it.

The following code currently does not compile:

```rust
fn get_upstreams<'repo>(repo: &'repo git2::Repository) ->
  Vec<(git2::Branch<'repo>, Option<git2::Branch<'repo>>)>
{
  repo
    .branches(None)   // pub fn branches(&self, filter: Option<git2::BranchType>) -> Result<git2::Branches, git2::Error>
    .unwrap()         // git2::Branches<'a>: Iterator<Item = Result<(git2::Branch<'repo>, git2::BranchType), git2::Error>>
    .flatten()        // Iterator<Item = (git2::Branch<'repo>, git2::BranchType)>
    .map(|(branch, _): (git2::Branch<'repo>, _)| {
      let upstream: Option<git2::Branch<'repo>> =
        branch
        .upstream()   // pub fn upstream<'a>(&'a self) -> Result<Branch<'a>, Error>
        .ok();
      (branch, upstream)
    })                // Iterator<Item = (git2::Branch<'repo>, git2::Branch<'repo>)>
  .collect()
}
```

Error:

```
error[E0597]: `branch` does not live long enough
  --> src/config/data/project.rs:49:55
   |
49 |           let upstream: Option<git2::Branch<'repo>> = branch.upstream().ok();
   |                                                       ^^^^^^ borrowed value does not live long enough
...
58 |         })
   |         - borrowed value only lives until here
   |
note: borrowed value must be valid for the lifetime 'repo as defined on the method body at 39:30...
  --> src/config/data/project.rs:39:30
   |
39 |   fn local_branches_internal<'repo>(&self, repo: &'repo git2::Repository) -> Result<Vec<(Branch, git2::Branch<'repo>, Option<git2::Branch<'repo>>)>, git2::Error> {
```

This is because the `.upstream()` method is declared with the same
self-borrow lifetime as the return value:

```rust
    pub fn upstream<'a>(&'a self) -> Result<Branch<'a>, Error> { /* ... */ }
```

which means that the `.upstream()` call is still borrowing `self` even
after it returns, even though it returns an owned value and not a
reference. Relaxing the self-borrow lifetime allows the above code
sample to compile successfully. The lifetime of the (maybe) returned
upstream `Branch` will also be that of the repository, but otherwise
unrelated to the lifetime of the `Branch` that constructed it.
@emlun emlun force-pushed the relax-branch-upstream-self-borrow-lifetime branch from c3a84e2 to f4b5516 Compare February 13, 2019 00:01
@alexcrichton
Copy link
Member

@joshtriplett can you take a look at this?

@emlun
Copy link
Contributor Author

emlun commented Apr 12, 2019

Any movement on this?

@alexcrichton
Copy link
Member

Sorry for the delay, but I can at least add my thoughts here. This new signature is memory unsafe since it returns an arbitrary lifetime that isn't connected to anything else. I don't have a chance to investigate this more closely, but that'll probably need fixing before merging.

@emlun
Copy link
Contributor Author

emlun commented May 16, 2019

Oh right, of course. Looks like I mixed things up when I said

The lifetime of the (maybe) returned upstream Branch will also be that of the repository, but otherwise unrelated to the lifetime of the Branch that constructed it.

I suppose the signature to represent that would be

pub fn upstream(&self) -> Result<Branch<'repo>, Error>

Does that make sense? The lifetime of the returned Branch doesn't need to be related to the lifetime of self, does it?

@alexcrichton
Copy link
Member

That would tie it to something yeah but the C code needs to be reviewed to make sure that's the actual lifetime of the object returned.

@emlun
Copy link
Contributor Author

emlun commented May 17, 2019

Ok. I'm afraid I don't know enough C (namely, close to none at all) to do that, but here's the beginning of the call stack as far as I can tell. It looks to me like a git_reference is probably valid for as long as its parent git_refdb * is alive, whatever that is.

Is this useful? I would gladly dig further into this, but I would need some guidance on how to go about tracing the memory management.


  • git2::Branch::upstream(&self) calls raw::git_branch_upstream

  • git_branch_upstream calls git_reference_lookup:

    int git_branch_upstream(
            git_reference **tracking_out,
            const git_reference *branch)
    {
            int error;
            git_buf tracking_name = GIT_BUF_INIT;
    
            if ((error = git_branch_upstream_name(&tracking_name,
                    git_reference_owner(branch), git_reference_name(branch))) < 0)
                            return error;
    
            error = git_reference_lookup(
                    tracking_out,
                    git_reference_owner(branch),
                    git_buf_cstr(&tracking_name));
    
            git_buf_free(&tracking_name);
            return error;
    }
  • git_reference is defined in refs.h:

    struct git_reference {
        git_refdb *db;
        git_ref_t type;
    
        union {
            git_oid oid;
            char *symbolic;
        } target;
    
        git_oid peel;
        char name[GIT_FLEX_ARRAY];
    };
  • git_reference_lookup calls git_reference_lookup_resolved:

    int git_reference_lookup(git_reference **ref_out,
            git_repository *repo, const char *name)
    {
            return git_reference_lookup_resolved(ref_out, repo, name, 0);
    }
  • git_reference_lookup_resolved calls git_refdb_lookup:

    int git_reference_lookup_resolved(
            git_reference **ref_out,
            git_repository *repo,
            const char *name,
            int max_nesting)
    {
            git_refname_t scan_name;
            git_ref_t scan_type;
            int error = 0, nesting;
            git_reference *ref = NULL;
            git_refdb *refdb;
    
            /* code omitted for brevity... */
    
                    if ((error = git_refdb_lookup(&ref, refdb, scan_name)) < 0)
                            return error;
    
            /* code omitted for brevity... */
    
            *ref_out = ref;
            return 0;
    }
  • git_refdb_lookup calls a polymorphic lookup method:

    int git_refdb_lookup(git_reference **out, git_refdb *db, const char *ref_name)
    {
            git_reference *ref;
            int error;
    
            assert(db && db->backend && out && ref_name);
    
            error = db->backend->lookup(&ref, db->backend, ref_name);
            if (error < 0)
                    return error;
    
            GIT_REFCOUNT_INC(db);
            ref->db = db;
    
            *out = ref;
            return 0;
    }
  • lookup is declared in struct git_refdb_backend in refdb_backend.h:

            /**
            * Queries the refdb backend for a given reference.  A refdb
            * implementation must provide this function.
            */
            int (*lookup)(
                    git_reference **out,
                    git_refdb_backend *backend,
                    const char *ref_name);
  • refdb_fs.c seems to be the only implementation in the standard distribution, and implements the method with refdb_fs_backend__lookup:

    int git_refdb_backend_fs(
            git_refdb_backend **backend_out,
            git_repository *repository)
    {
            int t = 0;
            git_buf gitpath = GIT_BUF_INIT;
            refdb_fs_backend *backend;
    
            backend = git__calloc(1, sizeof(refdb_fs_backend));
            GITERR_CHECK_ALLOC(backend);
    
            backend->repo = repository;
    
            /* code omitted for brevity... */
    
            backend->parent.lookup = &refdb_fs_backend__lookup;
    
            /* code omitted for brevity... */
    
            *backend_out = (git_refdb_backend *)backend;
            return 0;
    
    fail:
            git_buf_free(&gitpath);
            git__free(backend->gitpath);
            git__free(backend->commonpath);
            git__free(backend);
            return -1;
    }
  • refdb_fs_backend__lookup calls loose_lookup and packed_lookup:

    static int refdb_fs_backend__lookup(
            git_reference **out,
            git_refdb_backend *_backend,
            const char *ref_name)
    {
            refdb_fs_backend *backend = (refdb_fs_backend *)_backend;
            int error;
    
            assert(backend);
    
            if (!(error = loose_lookup(out, backend, ref_name)))
                    return 0;
    
            /* only try to lookup this reference on the packfile if it
            * wasn't found on the loose refs; not if there was a critical error */
            if (error == GIT_ENOTFOUND) {
                    giterr_clear();
                    error = packed_lookup(out, backend, ref_name);
            }
    
            return error;
    }
  • loose_lookup calls git_reference__alloc_symbolic or git_reference__alloc:

    static int loose_lookup(
        git_reference **out,
        refdb_fs_backend *backend,
        const char *ref_name)
    {
        git_buf ref_file = GIT_BUF_INIT;
        int error = 0;
        const char *ref_dir;
    
        /* code omitted for brevity... */
    
        else if (git__prefixcmp(git_buf_cstr(&ref_file), GIT_SYMREF) == 0) {
            const char *target;
    
            git_buf_rtrim(&ref_file);
    
            if (!(target = loose_parse_symbolic(&ref_file)))
                error = -1;
            else if (out != NULL)
                *out = git_reference__alloc_symbolic(ref_name, target);
        } else {
            git_oid oid;
    
            if (!(error = loose_parse_oid(&oid, ref_name, &ref_file)) &&
                out != NULL)
                *out = git_reference__alloc(ref_name, &oid, NULL);
        }
    
        git_buf_free(&ref_file);
        return error;
    }
  • packed_lookup also calls git_reference__alloc

  • git_reference__alloc_symbolic and git_reference__alloc call alloc_ref:

    git_reference *git_reference__alloc_symbolic(
        const char *name, const char *target)
    {
        git_reference *ref;
    
        assert(name && target);
    
        ref = alloc_ref(name);
        if (!ref)
            return NULL;
    
        ref->type = GIT_REF_SYMBOLIC;
    
        if ((ref->target.symbolic = git__strdup(target)) == NULL) {
            git__free(ref);
            return NULL;
        }
    
        return ref;
    }
    
    git_reference *git_reference__alloc(
        const char *name,
        const git_oid *oid,
        const git_oid *peel)
    {
        git_reference *ref;
    
        assert(name && oid);
    
        ref = alloc_ref(name);
        if (!ref)
            return NULL;
    
        ref->type = GIT_REF_OID;
        git_oid_cpy(&ref->target.oid, oid);
    
        if (peel != NULL)
            git_oid_cpy(&ref->peel, peel);
    
        return ref;
    }
  • alloc_ref calls git_calloc and memcpy:

    static git_reference *alloc_ref(const char *name)
    {
        git_reference *ref = NULL;
        size_t namelen = strlen(name), reflen;
    
        if (!GIT_ADD_SIZET_OVERFLOW(&reflen, sizeof(git_reference), namelen) &&
            !GIT_ADD_SIZET_OVERFLOW(&reflen, reflen, 1) &&
            (ref = git__calloc(1, reflen)) != NULL)
            memcpy(ref->name, name, namelen + 1);
    
        return ref;
    }

@alexcrichton
Copy link
Member

Ok I think that roughly looks good to me, want to update the PR?

@emlun
Copy link
Contributor Author

emlun commented May 22, 2019

Done!

@alexcrichton alexcrichton merged commit 9fbec03 into rust-lang:master May 22, 2019
@emlun emlun deleted the relax-branch-upstream-self-borrow-lifetime branch May 26, 2019 17:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants