Importing legacy code from Subversion to Git

Atlassian has a very good tutorial on this. I suggest you read it first. Here is a simple example.

Install git-svn

A modern git installation may not include git-svn by default. On Ubuntu you can easily install it:

sudo apt-get install git-svn
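A quick way to confirm the install worked is to ask Git for the git-svn version. This is only a sketch: the variable name git_svn_status is my own, and the check simply reports whether the git svn subcommand can be run on this machine.

```shell
# Report whether the "git svn" subcommand is usable on this machine.
if git svn --version >/dev/null 2>&1; then
    git_svn_status="available"
else
    git_svn_status="missing"
fi
echo "git-svn is $git_svn_status"
```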

Authors text file

To keep the history readable, you’ll need to tell git svn the real names and email addresses of the authors, using a text file that maps each Subversion username to a Git identity. There are tools to generate authors.txt, but for a small-ish team it might be easier to create it manually. Here’s a sample.

sarah = Sarah Woodall <sarah.woodall@mycompany.com>
otheruser = Other User <other.user@mycompany.com>
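If typing the list by hand feels error-prone, a skeleton can be generated from the Subversion log and then edited. The sketch below assumes the usual "rNNN | user | date" lines printed by svn log --quiet. The function name make_authors_skeleton is hypothetical, and the real names and the @mycompany.com addresses it emits are placeholders that still need correcting by hand.

```shell
# Read `svn log --quiet` output on stdin and print one authors.txt line
# per distinct Subversion username. Names and emails are placeholders.
make_authors_skeleton() {
    awk -F ' [|] ' '/^r[0-9]+ [|] / {
        print $2 " = " $2 " <" $2 "@mycompany.com>"
    }' | sort -u
}

# Possible usage (needs the svn client and access to the repo):
#   svn log --quiet http://svnrepolocationURL/svn/mysvnrepo/ | make_authors_skeleton > authors.txt
```

Each line then needs its right-hand side replaced with the person's real name and address, as in the sample above.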

Subtree import recipe

This is how I imported a subdirectory of an old Subversion repo as a subtree of an existing Git repo, bringing in all relevant history with it. This assumes we want to import to the target repo’s default branch.

  1. Use git svn to create a temporary local git repo containing the data you want from Subversion
  2. Move the files into their own subdirectory
  3. Tag it: we’ll use this tag to define what gets exported
  4. Separately, clone the target Git repo, and move into it to do the rest
  5. Make a new subdirectory within it, as a home for the new subtree
  6. Export the files as they stand today (no history, just the files) from the temporary repo into the new subdirectory of the target repo
  7. Add and commit these new files to Git locally
  8. Bring in just the history, without affecting the files, by using git pull -s ours
  9. Finally, push the result

Here is the full sequence of commands:

git svn clone --authors-file=authors.txt --trunk=mysubtree/ http://svnrepolocationURL/svn/mysvnrepo/
cd mysvnrepo
mkdir mysubtree
mv *.cpp *.hpp .cproject .project mysubtree
git add -A
git commit -m"Moved mysubtree into own directory"
git tag -m"Tag current state of mysubtree ready for migration" MIGRATION_MYSUBTREE
cd ..
git clone https://myuser@myrepolocationURL/mytargetrepo.git
cd mytargetrepo
mkdir mysubtree
cd mysubtree
(cd ../../mysvnrepo && git archive MIGRATION_MYSUBTREE:mysubtree) | tar -xf -
cd ..
git add mysubtree
git commit -m"Initial commit of mysubtree, no history"
git pull -s ours ../mysvnrepo MIGRATION_MYSUBTREE
git push
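To see what the "ours" pull in Step 8 does without touching a real Subversion server, the recipe can be rehearsed with two throwaway local repos. This is only a sketch under several assumptions of mine: a plain git init repo stands in for the git svn clone, the file main.cpp and the commit messages are invented, and the pull is given --allow-unrelated-histories, --no-rebase and --no-edit because two freshly created repos share no commits and the script has to run non-interactively on a recent Git. It checks that the pull leaves the files untouched while making the imported commit an ancestor of HEAD.

```shell
set -e
work=$(mktemp -d)   # scratch area; delete it afterwards if you like
cd "$work"

# Stand-in for the temporary repo that git svn clone would create.
git init -q mysvnrepo
cd mysvnrepo
git config user.name "Sarah Woodall"
git config user.email "sarah.woodall@mycompany.com"
mkdir mysubtree
echo 'int main() { return 0; }' > mysubtree/main.cpp
git add mysubtree
git commit -q -m "Old history, files already under mysubtree/"
git tag -m "Tag current state of mysubtree ready for migration" MIGRATION_MYSUBTREE
imported=$(git rev-parse 'MIGRATION_MYSUBTREE^{commit}')
cd ..

# Stand-in for the existing target repo.
git init -q mytargetrepo
cd mytargetrepo
git config user.name "Sarah Woodall"
git config user.email "sarah.woodall@mycompany.com"
echo "target" > README
git add README
git commit -q -m "Existing target content"

# Steps 5 to 7: export only the contents of mysubtree/ (so the files are
# not nested twice) and commit them as a plain snapshot.
mkdir mysubtree
(cd ../mysvnrepo && git archive MIGRATION_MYSUBTREE:mysubtree) | (cd mysubtree && tar -xf -)
git add mysubtree
git commit -q -m "Initial commit of mysubtree, no history"
before=$(git rev-parse 'HEAD^{tree}')

# Step 8: merge in the old history while keeping our files exactly as they are.
git pull -q --no-rebase --no-edit -s ours --allow-unrelated-histories \
    ../mysvnrepo MIGRATION_MYSUBTREE
after=$(git rev-parse 'HEAD^{tree}')

# Show the combined history for inspection.
git log --oneline --graph
```

Because the "ours" strategy keeps the current tree, the before and after values are identical. Only the commit graph changes.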

Snag

My first attempt at the above (before I added Step 2) produced an error message from Git about unrelated histories at the “pull” stage. I tried to force the issue by using “--allow-unrelated-histories”. This got the job done, but Git didn’t see the new files and the old ones as being the same, so the history, although present in the repo and browsable using Sourcetree, isn’t related to the files (so “git log <filename>” doesn’t show what it should).