Skip to content

Fix for issue #607 [git clean removes content of symlinks] #1414

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

halldorfannar
Copy link

@halldorfannar halldorfannar commented Dec 29, 2017

The problem this PR solves has been reported in issue #607.

There was already existing code in the readlink function of mingw.c that correctly handled mount points (AKA 'junctions') see line 2813 of mingw.c:

/* get target path for symlinks or mount points (aka 'junctions') */
       switch (b->ReparseTag) {
       case IO_REPARSE_TAG_SYMLINK:
              wbuf = (WCHAR*) (((char*) b->SymbolicLinkReparseBuffer.PathBuffer)
                           + b->SymbolicLinkReparseBuffer.SubstituteNameOffset);
              *(WCHAR*) (((char*) wbuf)
                           + b->SymbolicLinkReparseBuffer.SubstituteNameLength) = 0;
              break;
       case IO_REPARSE_TAG_MOUNT_POINT:
              wbuf = (WCHAR*) (((char*) b->MountPointReparseBuffer.PathBuffer)
                           + b->MountPointReparseBuffer.SubstituteNameOffset);
              *(WCHAR*) (((char*) wbuf)
                           + b->MountPointReparseBuffer.SubstituteNameLength) = 0;
              break;
       default:
              errno = EINVAL;
              return -1;
       }

It was therefore rather straight-forward to replicate that check into the two other locations where links are handled.

It should also be noted that junctions (mount points) work like symbolic links (they are soft). That is, if you rename the target directory (junctions only work for directories) the link doesn't update - i.e. it points to a path that has vanished. This is exactly like symbolic links. If this was a hard link the link would automatically update because hard links point to the underlying file data (inode), they don't point to a path.

Let's say you agree that this should be fixed then it's still valid to to reject the patch if the fix isn't correctly implemented. For instance, how do we know that the check for junction (mount point) that is being used in this pull request is correct? To support that claim I offer this: https://msdn.microsoft.com/en-us/library/aa363940.aspx. I can also point back to the code in mingw.c that already exists in Git for Windows and I cited above.

Testing

Use a standard Windows command prompt for the following (except the Git commands)

  1. Make directories in temp to act as link targets, i.e. mkdir %temp%\junction-target and mkdir %temp%\symlink-target%
  2. Put files into both directories, e.g. echo Testing >%temp%\junction-target\test.txt and echo Testing >%temp%\symlink-target\test.txt.
  3. Change current directory into the root of a git repo (you can use your git for windows repo for this)
  4. Make a directory name test-links, i.e. mkdir test-links.
  5. Change current directory to test-links, i.e. cd test-links.
  6. Create a junction soft link, i.e. mklink /j junction %temp%\junction-target
  7. Create a symbolic soft link, i.e. mklink /d symlink %temp%\symlink-target (you need admin privileges to perform this operation so run an elevated command prompt)
  8. Perform a directory listing to see that everything looks as expected, dir
  9. Now switch to your Git command line console. Inside the test-links folder execute git clean -xdfn
  10. If you are running Git prior to this patch you will notice that junction is listed as a directory (notice the trailing slash) while symlink is listed as a file/link (no trailing slash). If you are running with this patch you will notice that both are treated the same.
  11. Now execute git clean -xdf.
  12. If you are using git prior to this patch you will notice that the contents of %temp%\junction-target have been deleted but the contents of %temp%\symlink-target are intact. If you are running git with this patch you will notice that both target folders in %temp% have been left intact.

…nks]

There was already code in the readlink function of mingw.c that correctly handled mount points (AKA 'junctions') so it was rather straight-forward to replicate that code into the two other locations where links are handled. I tested before and after these changes and git clean now handles junctions like other soft links.
@BurtHarris
Copy link

BurtHarris commented Dec 29, 2017

@halldorfannar thanks for the effort you've put into this. This PR clearly puts the fix in the win32 translation layer where it belongs.

I'd like to test in the scenario I ran into the issue under. I'm still such a git beginner that I don't understand how to merge this change into my local repo. Can anyone enlighten me as to the git commands needed to do that?

@halldorfannar
Copy link
Author

@BurtHarris It may be easiest for you to just apply it as a patch. You can always turn a GitHub commit into a patch by adding .patch to the end. In this case the URL is:

https://github.com/git-for-windows/git/commit/bcbe608bc020bcc47845e3cdf81f1fe8883eebf0.patch

Download this file and you can then apply it to your current branch by doing:

git am <path to the downloaded .patch file>

Hope this helps.

@BurtHarris
Copy link

BurtHarris commented Dec 29, 2017

Relevant authoritative MSDN documentation at Mounted Folders and Reparse Point Tags. As far as I know, the later has this is the most comprehensive list of possible types of reparse points out there.

It might make sense to use the IsReparseTagNameSurrogate macro to check a reparse point is semantically equivalant to a symlink, but it has been so long since I've done this kind of code I'm not sure.

@halldorfannar
Copy link
Author

halldorfannar commented Dec 29, 2017

@BurtHarris Yes, good point about the IsReparseTagNameSurrogate. This seems to have been used as a replacement for the same checks in Python: https://bugs.python.org/issue23407. There is one thing that is bothering me though. I just discovered this pull request: #437. Looks like it is exactly what I have made. Based on that I worry that we may have to come up with a specific solution that only applies to git clean since that is the problem that has lead us all to issue #607. @hypersw made a suggestion to change the deletion behavior in git clean that would lead to junction links being removed but the contents left unharmed. That may have a greater chance of getting accepted.

@BurtHarris
Copy link

BurtHarris commented Dec 29, 2017

Yes, I tried implementing @hypersw's suggestion myself, but I've had trouble getting the tests to pass with that change. I'll do a little cleanup and post if you can take a look.

@BurtHarris
Copy link

BurtHarris commented Dec 29, 2017

@halldorfannar Please have a look at #1415. I don't know the development workflow here well, so it's probably not good as stands, but the idea was to build a proof of concept of what @hypersw suggested.

@dscho
Copy link
Member

dscho commented Jan 2, 2018

@halldorfannar thank you for your contribution!

I am not quite convinced, though, that it does the right thing, as junction points are just too different from symbolic links to count as the same.

In fact, I did think that this was the right direction to take, and implemented essentially the same change as you did, opening this Pull Request for discussion: #437

As you can see in particular from this insightful comment by @kblees, there were issues with this approach, and I tend towards trying to find a different solution, still, that treats junctions as mount points instead of symbolic links. See my comment here: #1415 (review)

(Essentially, I'd like to introduce a new function is_mount_point(const char *path) and use that to treat them specially.)

@halldorfannar
Copy link
Author

@dscho Thanks for reviewing my PR. I agree with you. After going deeper into this rabbit hole I have come to the same conclusion. I think the is_mount_point strategy is a good one and it allows us to address the problem where it's causing users pain right now (git clean and potentially git rm) and we can avoid or later decide what we want to do with general versioning behavior for junction points. Who will create a PR for is_mount_point and make the relevant changes to git clean?

@dscho
Copy link
Member

dscho commented Jan 3, 2018

Who will create a PR for is_mount_point and make the relevant changes to git clean?

Ideally, you ;-)

If you get it done before January 17th, it will even make it into Git for Windows v2.16.0!

@dscho
Copy link
Member

dscho commented Jan 3, 2018

@halldorfannar BTW you do not even need to create another Pull Request; just force-push the fix-607 branch and the Pull Request will be updated.

@halldorfannar
Copy link
Author

halldorfannar commented Jan 3, 2018

@dscho Great! I'm happy to help with this. I'm on the road currently but will be back in my office on the 10th where I have macOS, Linux and Windows so I can test the changes locally before pushing them.

@pwagland
Copy link

pwagland commented Feb 6, 2018

I see that this missed the 2.16.0 cutoff… is there any chance that this will be possible in the 2.17.0 timeframe? I am happy to test out any fixes that are made if that is required and/or helpful.

@halldorfannar
Copy link
Author

@pwagland What is the date for 2.17.0 cutoff?

@dscho
Copy link
Member

dscho commented Feb 7, 2018

What is the date for 2.17.0 cutoff?

There is not really any cut-off for Git for Windows versions. If there are critical bugs, I try to release a fixed version as quickly as possible.

In this case, I would not necessarily consider it a critical bug, but big enough to warrant including it into the next Git for Windows version (which will be any of v2.17.0, v2.16.2 or v2.16.1(4)). (I do want to slow down, though, as bugs in v2.16.1 and v2.16.1(2) required quick hot-fixes, and I really do not want to overload Git for Windows users with updates.)

In short: whenever you get to it will be fine.

@BurtHarris
Copy link

Does this obsolete my PR #1415? If so I'll go ahead and close it.

@dscho
Copy link
Member

dscho commented Feb 20, 2018

Does this obsolete my PR #1415? If so I'll go ahead and close it.

@BurtHarris to the contrary! Starting with #1414 (comment), I think that @halldorfannar and I agreed that the is_mount_point() idea (as outlined in this comment on your PR) is the way to go.

If you have time and work on this, that would be awesome! I just assumed that you were no longer interested because you did not answer at least my latest question.

@dscho
Copy link
Member

dscho commented Dec 11, 2018

Superseded by #1976.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants