File Rename Pain

Jan 22, 2013 at 2:57 PM
Edited Jan 22, 2013 at 4:12 PM


I have been testing Git-TF to see whether it will serve our purposes, as we are currently having significant issues with TFS in a distributed environment. To that end I've been trialling the various merge-style issues that we come up against in TFS from time to time. One of these is the file rename, and this is where I am having problems.

What I want to achieve is to maintain the commit history from the Git side, treating each commit to the main branch as a TFS checkin at the end of the day. A Git merge is problematic here: Git-TF does not add the commits from the automatically merged branch into TFS as separate changesets, so you end up with only the final merge commit, hence losing the history in TFS. As a result, the workflow I am proposing is based around git pull --rebase into local repositories and git tf pull --rebase from TFS. I am using this structure:

  Joe---->[TFS]      [Shared Git repo]
            |         ^ (2)  |       \
            |        /       |        \
            |       /        |         \
            V (1)  /         V (3)      V (4)
       [Git-TF Repo]   [Bob's Repo]   [Charlie's Repo]

 The issue comes about when Bob renames and modifies a file, and Joe modifies the same file. Assume Bob has pushed the change up into the 'Shared Git Repo'. If I pull the change from the 'Shared Git Repo' into the Git-TF repo, then do a git tf pull --deep --rebase, Git does not realise that the two files are a match and does not provide the option to merge. If instead, I pull the changes from TFS first and then from the Shared Git Repo, then I do get the option to merge the changes, but I can't then push the result back into TFS as the rebase has changed the SHA1 hash. If I don't use --rebase, then when I push the result into TFS, I lose the commit history.
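
As a rough aid for checking whether Git itself regards a commit like Bob's as a rename, the sketch below builds a throwaway repository, renames and edits a file in one commit, and asks git diff -M how it classifies the change. It does not reproduce the Git-TF behaviour described above; the file names and contents are made up for illustration.

```shell
#!/bin/sh
# Sketch only: a throwaway repository with made-up file names.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master

# A ten-line file, so that a small edit leaves most of the content unchanged.
for i in 1 2 3 4 5 6 7 8 9 10; do echo "line $i"; done > old.txt
git add old.txt
git commit -q -m "add old.txt"

# The equivalent of Bob's change: rename and modify in the same commit.
git mv old.txt new.txt
echo "line 11" >> new.txt
git commit -q -am "rename and modify"

# -M turns on rename detection; a status starting with R means Git paired
# old.txt with new.txt instead of reporting a delete plus an add.
status=$(git diff -M --name-status HEAD~1 HEAD)
echo "$status"
```

If the status starts with D and A rather than R, the edit was large enough that Git no longer considers the two files similar.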

Is there any way out of this conundrum? 

I think the ideal solution would be for Git-TF to be a little smarter about Git merges: walk the 'generated branches' and check in those changes before checking in the merge result. Then I wouldn't need to use --rebase everywhere. Either that, or actually detect that the change is in a renamed file when doing the pull.

Or am I just missing something? I am new to git.

Jan 23, 2013 at 10:08 PM
Edited Jan 23, 2013 at 10:14 PM

Hi l_o_l,

Thanks for the interesting conundrum. :-) Unfortunately, because of the nature of TFS, Git-TF has to maintain a linear history of changes. However, I think I've found a workaround for your scenario. I played with it a bit and it seems to work for me. Could you please give it a try and check whether it's satisfactory for you?

Let me explain the workaround in a few points.

1. You have to maintain a linear history not only in the [Git-TF Repo] but in the [Shared Git repo] as well.

2. The [Git-TF Repo] has to be 'a mirror' of the [TFS]; that means that, in the former, you should use only the git tf pull and git tf checkin commands (NB! without the --rebase option).

3. You will do git pull --rebase in the [Shared Git repo], downloading fresh changes from the [Git-TF Repo], and in local repositories, downloading fresh changes from the [Shared Git repo].

4. To be able to pull changes from the [Git-TF Repo] into the [Shared Git repo] (and rebase) the latter should be 'a normal' (NB! not a bare) repository.

5. To be able to push changes from local directories into the [Shared Git Repo], the master branch in it should not be the current one, i.e. you should create an auxiliary branch in it (e.g. _hidden_) that you use only to switch to after you finish repository synchronization.

6. For convenience you might define
    i. The [Shared Git Repo] as origin for local repositories, i.e. the local repository config file could contain:

[remote "origin"]
	url = Shared Git Repo
	fetch = +refs/heads/*:refs/remotes/origin/*
[branch "master"]
	remote = origin
	merge = refs/heads/master

     ii. The [Git-TF Repo] as origin for the [Shared Git Repo], i.e. the [Shared Git Repo] config file could contain:

[remote "origin"]
	url = Git-TF Repo
	fetch = +refs/heads/*:refs/remotes/origin/*
[branch "master"]
	remote = origin
	merge = refs/heads/master

     iii. The [Shared Git Repo] as origin for the [Git-TF Repo], i.e. the [Git-TF Repo] config file could contain:

[remote "origin"]
	url = Shared Git Repo
	fetch = +refs/heads/*:refs/remotes/origin/*
[branch "master"]
	remote = origin
	merge = refs/heads/master
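
The setup in points 4 to 6 can be tried out with throwaway repositories. The sketch below is only an illustration: the directory names (git-tf-repo, shared, bob) are hypothetical, and a plain git init stands in for the real [Git-TF Repo], which would normally come from git tf clone. It shows that a push to master succeeds when the non-bare shared repository is parked on _hidden_.

```shell
#!/bin/sh
# Sketch only: throwaway stand-ins for the three kinds of repository.
# All directory names here are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Stand-in for the [Git-TF Repo]; in real use this comes from `git tf clone`.
git init -q git-tf-repo
git -C git-tf-repo symbolic-ref HEAD refs/heads/master
echo "hello" > git-tf-repo/file.txt
git -C git-tf-repo add file.txt
git -C git-tf-repo commit -q -m "Initial changeset"

# [Shared Git Repo]: a normal (non-bare) clone, so its origin is the Git-TF repo.
git clone -q git-tf-repo shared
# Park the working tree on an auxiliary branch so pushes to master are accepted.
git -C shared checkout -q -b _hidden_

# A developer's local repository: its origin is the shared repository.
git clone -q --branch master shared bob
echo "bob" >> bob/file.txt
git -C bob commit -q -am "Bob's change"
git -C bob push -q origin master

# The Git-TF repo uses the shared repository as its origin (point 6.iii).
git -C git-tf-repo remote add origin "$work/shared"
git -C git-tf-repo pull -q --ff-only origin master

git -C git-tf-repo log --oneline
```

Cloning sets up the [remote "origin"] and [branch "master"] entries shown above automatically; only the Git-TF repo needs its origin added by hand.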

  With this repository setup you could follow a workflow like the following:

A) During the day the [Shared Git Repo] is checked out from its _hidden_ branch
     - switch to the [Shared Git Repo]
     - git checkout _hidden_

B) When a developer wishes to push changes to the [Shared Git Repo], (s)he does the following in the local repository
     - git pull --rebase
     - git push

C) At the end of the day the repository admin does the following
     - switch to the [Git-TF Repo]
     - git tf pull --deep (after that the [Git-TF Repo] contains all [TFS] changes with the full history)
     - switch to the [Shared Git Repo]
     - git checkout master
     - git pull --rebase
     - resolve conflicts if needed (after that the [Shared Git Repo] contains all [Git-TF Repo] changes with the full history and all local commits on the top)
     - git checkout _hidden_
     - switch to the [Git-TF Repo]
     - git pull (after that the [Git-TF Repo] contains all [Shared Git Repo] changes with the full history)
     - git tf checkin (after that the [TFS] contains all [Git-TF Repo] changes with the full history)
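
Step C could be wrapped in a small script along the following lines. This is a sketch under assumptions, not a tested tool: GIT_TF_REPO, SHARED_REPO, DRY_RUN and the helper run_in are names invented here, the paths are placeholders, and by default the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch of step C. The variable names and paths are hypothetical;
# DRY_RUN=1 (the default here) prints the commands instead of running them.
GIT_TF_REPO=${GIT_TF_REPO:-/path/to/git-tf-repo}
SHARED_REPO=${SHARED_REPO:-/path/to/shared-repo}
DRY_RUN=${DRY_RUN:-1}

# Run a command inside a repository, or just print it in dry-run mode.
# Stops the script if a command fails (e.g. a rebase conflict to resolve by hand).
run_in() {
    repo=$1
    shift
    if [ "$DRY_RUN" = 1 ]; then
        echo "($repo) $*"
    else
        (cd "$repo" && "$@") || exit 1
    fi
}

end_of_day_sync() {
    run_in "$GIT_TF_REPO" git tf pull --deep     # TFS -> Git-TF repo
    run_in "$SHARED_REPO" git checkout master
    run_in "$SHARED_REPO" git pull --rebase      # local commits replayed on top
    run_in "$SHARED_REPO" git checkout _hidden_  # free master for pushes again
    run_in "$GIT_TF_REPO" git pull               # take the shared repo's master
    run_in "$GIT_TF_REPO" git tf checkin         # Git-TF repo -> TFS
}

end_of_day_sync
```

With DRY_RUN=0 the script stops at the first failing command, so a rebase conflict in the shared repository still has to be resolved by hand before the remaining steps are run.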

I hope this approach might help you.

Jan 24, 2013 at 4:37 PM

Hi arukhlin,

Thank you very much for your help. Your solution certainly solves the problem I had as described. With all the re-basing, though, I can see that we will need to be careful to avoid hitting the issues described in this article:

For example, when I created a "Topic Branch" in Git, and attempted to merge the changes into master, the fact that the master had been re-based caused a world of pain. The solution, of course, is to re-base the master onto the branch before doing any merge down. With clear process, I think this will be a workable solution.
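
One reading of that fix, sketched in a throwaway repository: the topic branch is replayed on top of the rewritten master and master is then fast-forwarded, so no merge commit is created. The branch named upstream stands in for changes arriving from TFS, and commit_file is a helper invented for this sketch.

```shell
#!/bin/sh
# Sketch only: throwaway repository; branch and file names are made up.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master

# Helper (hypothetical): add one new file and commit it.
commit_file() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit_file base
git branch upstream            # stand-in for what later arrives from TFS
commit_file local-work         # a commit that exists only in Git so far
git checkout -q -b topic
commit_file topic-work         # the topic branch forks from the old master

# An upstream change arrives and master is rebased on top of it.
git checkout -q upstream
commit_file upstream-change
git checkout -q master
git rebase -q upstream

# Replay the topic branch onto the rewritten master, then fast-forward.
# Git skips the old copy of local-work because the same patch is already there.
git checkout -q topic
git rebase -q master
git checkout -q master
git merge -q --ff-only topic

git log --oneline master
```

The resulting master is linear, which is what a later checkin without squashing history would need.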

Thanks again for your help.




Jul 10, 2013 at 3:30 PM
Just to update:

This does not really work in practice. For this approach to work, the code needs to be checked in from Git to TFS every time we want to take changes from TFS, or else the rebase merge conflicts must be re-resolved each time (soon leading to an unmanageable amount of overhead). One key reason we wanted to use Git was to allow downstream to pull and merge changes more frequently than changes are committed to the parent trunk. In essence, the Git repository is a separate branch, but this workflow demands that every time we take a change from the trunk, we must also push our changes back to the trunk! We could, of course, simply create branches in TFS and be done, but we have networking and permission issues between sites that were the initial driver for using Git.

So the approach we have adopted instead is to never rebase the master branch. This means we lose individual commits into TFS, but the history is stored within Git itself. The mind-set shift is that Git is now a second source control, not just a view into the master. Not ideal, and not what we originally intended, but I do not believe it is possible to do this any other way.

Fundamentally, you cannot rebase the master branch, but to maintain linear history, you must. Perhaps it is just not possible to use Git as a "view" into TFS as the "master" repository. This was how I understood Perforce Fusion was to work, and I had hoped the Microsoft solution would be similar in function.