Monday, August 13, 2018

Migration from SVN to Git

Step 0 : Define migration strategy



Option 1 [recommended] : If possible, do not migrate the SVN history to Git at all. Keep the old repositories archived and start over in Git from the tip of the trunk.

Option 2 : Migrate trunk and tags only.

Option 3 [can be complex] : Migrate the entire repository.

I picked option 3 because my colleagues wanted to keep all of the history. IMO this is not necessary as long as the old repositories are still accessible. Plus, the migrated history is not perfectly identical to SVN and can be more confusing than helpful.


Step 1 : Create a file with authors


Git expects authors in the form Firstname Name <mail>, so the SVN identifiers need to be mapped to that format.
To build the initial list, run the following command from your SVN working copy:

svn log -q | awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' | sort -u > authors.txt
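The one-liner above is dense, so here is the same pipeline spelled out as a commented function. This is only a sketch: the function name authors_from_log is made up for illustration, and it reads the output of svn log -q on standard input.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn `svn log -q` output (on stdin) into authors-file lines.
authors_from_log() {
  awk -F '|' '
    /^r/ {                            # revision lines look like: r42 | MMO | 2018-08-10 ...
      sub("^ ", "", $2)               # strip the space after the first "|"
      sub(" $", "", $2)               # strip the space before the second "|"
      print $2 " = " $2 " <" $2 ">"   # placeholder entry, to be edited by hand
    }
  ' | sort -u                         # keep one line per distinct author
}

# Usage, from an SVN working copy:
#   svn log -q | authors_from_log > authors.txt
```

Each author ends up as a placeholder line such as MMO = MMO <MMO>, which is what the next step asks you to complete.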


Now edit the file and fill in the missing information so that each line looks like this:

MMO = Mickey Mouse <mickey.mouse@disney.com>
DDU = Donald Duck <donald.duck@disney.com>
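With many authors it is easy to miss an entry. As a rough check, the sketch below lists the identifiers that still carry the generated placeholder (ID = ID <ID>). The function name unfilled_authors is hypothetical, and the check assumes the file still uses the " = " separator produced by the command above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print identifiers whose entry in the authors file ($1)
# is still the generated placeholder "ID = ID <ID>".
unfilled_authors() {
  awk -F ' = ' '$2 == ($1 " <" $1 ">") { print $1 }' "$1"
}

# Usage:
#   unfilled_authors authors.txt
```

An empty output means every entry has been edited; it does not verify that the names and addresses themselves are right.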

Step 2 : Migrate dependencies first


If your main project uses svn:externals, they need to be converted to submodules (other options exist, such as subtrees or third-party git externals tools, but in most cases submodules will do just fine).

To do that, repeat the following commands for each dependency repository:

git svn clone --trunk=<trunk_dir> --branches=<branches_subdir> --tags=<tags_subdir> --no-metadata --authors-file=authors.txt <svn_repo_url> .


Note: if the SVN repository follows the standard trunk/branches/tags layout, replace --trunk=<trunk_dir> --branches=<branches_subdir> --tags=<tags_subdir> with --stdlayout.

Now export the branch list to a file.

git branch -r > branches.txt


Filter the list of branches in branches.txt and keep only those you want to migrate.
Move the tag entries to a separate file (tags.txt) for now; they are handled separately below.

Now for each entry in branches.txt, do:

git checkout -b <local_branch_name> <remote_branch_name>


Use the branch name in branches.txt for <remote_branch_name>.
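Typing the checkout for every branch gets tedious, so here is a sketch that prints one command per entry of branches.txt for review. The function name branch_checkout_commands is made up, and the sketch assumes the remote names look like origin/<name> and that <name> is an acceptable local branch name without spaces or shell metacharacters.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one `git checkout -b` command per line of the
# branch list given as $1. Assumes remote names of the form "origin/<name>".
branch_checkout_commands() {
  # `read` strips the leading spaces that `git branch -r` puts on each line.
  while read -r remote || [ -n "$remote" ]; do
    [ -n "$remote" ] || continue          # skip blank lines
    printf 'git checkout -b %s %s\n' "${remote#origin/}" "$remote"
  done < "$1"
}

# Usage: review the output first, then run it.
#   branch_checkout_commands branches.txt
#   branch_checkout_commands branches.txt | sh
```

Printing the commands instead of running them directly makes it easy to spot a wrong name before anything is created.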

For each tag, do the following:

git checkout -b <local_branch_name> <remote_branch_name>
git tag <local_branch_name>
git checkout master
git branch -D <local_branch_name> 

git-svn imports SVN tags as remote branches, so we check each one out as a temporary local branch, create a real Git tag on it, and then delete the branch.
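The four commands can be generated for every entry of tags.txt in the same way. In this sketch the function name tag_commands is hypothetical, the remote names are assumed to contain tags/<name>, and the temporary branch is called tag-import (my choice, so that it never shares a name with the tag being created).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the tag-conversion commands for each line of the
# tag list given as $1. Assumes remote names of the form "origin/tags/<name>".
tag_commands() {
  while read -r remote || [ -n "$remote" ]; do
    [ -n "$remote" ] || continue                    # skip blank lines
    tag=${remote##*tags/}                           # keep what follows "tags/"
    printf 'git checkout -b tag-import %s\n' "$remote"
    printf 'git tag %s\n' "$tag"                    # tag the checked-out commit
    printf 'git checkout master\n'
    printf 'git branch -D tag-import\n'             # drop the temporary branch
  done < "$1"
}

# Usage: review the output first, then run it.
#   tag_commands tags.txt
#   tag_commands tags.txt | sh
```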

Now run the following command to remove the empty commits that the migration leaves behind:

git filter-branch --prune-empty -f -- --all


Delete the trunk branch:

git branch -D trunk


Configure remote git repository

I assume here that you have already created a remote Git repository to host your code.

git remote add origin <git_remote_repo_url>


Finally push the repo

git push origin --mirror


Step 3 : Migrate the main project repository


Repeat the steps from Step 2, up to and including the deletion of the trunk branch.
From there, we need to resolve the externals.

We will start with the master branch, but these steps need to be repeated for each branch.

git checkout master


Run the following command to export the list of externals to a file.

git svn show-externals --id=origin/trunk > externals.txt


Note: In my case, this command did not work for other branches, and it is particularly slow. I got the same result by running svn from the command line in the old repository, once for each branch:

svn propget svn:externals -R <externals_dir> > externals.txt


Suppose our svn:externals lived in a folder named Externals. For each submodule, do the following:

git submodule add --force <dependency_git_remote_repo_url> ./Externals/<submodule_dir>


Then we need to pin the submodule to the commit that matches the version used in SVN.

cd ./Externals/<submodule_dir>


For tags:
git checkout <commit_id_or_tag_label>


For branches:
git checkout -b <local_name> <remote_git_branch_name>


cd ../..


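Repeat these steps for all submodules. Once done, stage the submodule directories so that the commits you just checked out are the ones recorded in the main repository, then commit:

git add .gitmodules ./Externals
git commit -m "Migrated svn:externals properties to .gitmodules"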
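With more than a handful of externals, the per-submodule steps can be generated from a small mapping file. This is a sketch under my own assumptions: the function name submodule_commands and the file submodules.txt are hypothetical, each line of that file holds "<submodule_dir> <git_url> <ref>" separated by spaces, and <ref> is a tag, commit id or remote branch name. It uses a plain checkout of <ref>, which leaves the submodule on a detached HEAD at that commit, rather than creating a local branch.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the submodule commands for each line of the
# mapping file given as $1. Line format (an assumption): "<dir> <git_url> <ref>".
submodule_commands() {
  while read -r dir url ref || [ -n "$dir" ]; do
    [ -n "$dir" ] || continue                       # skip blank lines
    printf 'git submodule add --force %s ./Externals/%s\n' "$url" "$dir"
    printf 'git -C ./Externals/%s checkout %s\n' "$dir" "$ref"
    # Staging the directory records the commit that was just checked out.
    printf 'git add ./Externals/%s\n' "$dir"
  done < "$1"
}

# Usage: review the output first, then run it from the root of the main repository.
#   submodule_commands submodules.txt
#   submodule_commands submodules.txt | sh
```

The commit itself is left out on purpose, so that the result can be inspected with git status before it is recorded.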


Repeat these steps for all branches. Before starting on each branch, clean up the leftovers from the previous one to avoid confusing error messages:

rm -rf ./Externals


Finally, when all branches have been processed, push the repository to remote server:

git remote add origin <git_remote_repo_url>

git push origin --mirror

 
biz.