Notes on using multiple git repositories in a build. Having a good way to do this is a requirement for implementing a clean layered approach.
Collected from various emails:
- Looks, feels and behaves like a standard git repo for the end user, no special tools needed.
- Can be checked out by the end user with one easy command.
- Contains full history for all the components in bisectable commits.
- The user can submit changes back easily with standard git workflow.
- Tool should make it easy to add or change the pull URL for a repo.
- It's part of git; no extra tools, languages, etc. need to be installed.
- Locks submodules to a specific commit, so that subprojects can proceed at their own pace while the superproject moves them forward as they are verified and tested.
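To make the "locks submodules to a specific commit" point concrete, here is a minimal sketch run in a throwaway directory. The repository names (lib, super), the demo identity and the commit messages are made up for illustration, and the protocol.file.allow setting is only there because recent git versions refuse local-path submodule clones by default; it would not be needed with a normal remote URL.

```shell
#!/bin/sh
# Sketch: a superproject pins a submodule to one verified commit.
set -e

# Throwaway identity so the commits work on a machine with no git config (assumed values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# A stand-in subproject with two commits, v1 (verified) and v2 (latest).
git init -q lib
(
    cd lib
    echo v1 > README
    git add README
    git commit -q -m "lib v1"
    echo v2 > README
    git commit -q -a -m "lib v2"
)
v1=$(git -C lib rev-parse HEAD~1)

# The superproject adds the subproject as a submodule (this checks out v2)...
git init -q super
cd super
git -c protocol.file.allow=always submodule --quiet add "$work/lib" lib

# ...then moves the submodule back to the verified commit and records that.
git -C lib checkout -q "$v1"
git add lib
git commit -q -m "Pin lib to verified commit"

# The superproject tree stores the exact commit as a mode-160000 entry.
pinned=$(git ls-tree HEAD lib | awk '{print $3}')
echo "superproject pins lib at $pinned"
```

The subproject can keep adding commits after this; the superproject stays on the pinned commit until someone checks out a newer one in the submodule, runs "git add lib" and commits again.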
- Regarding git submodules, I haven't used them in a couple of years but the last time I did, the workflow was terrible. It kept rolling back submodule versions because someone forgot to do a full "git submodule update" and then did a "git commit -a" in the superproject; we were forever forgetting to commit and push the superproject after changing submodules; it seemed to delight in finding new ways to create merge conflicts; etc.
- http://book.git-scm.com/5_submodules.html (pitfalls, about 2/3 way through page)
- Submodule was committed and pushed, but superproject wasn't committed and/or pushed, so other developers kept using old submodule.
- Superproject was committed and pushed, but subproject wasn't committed and/or pushed. As a result other developers couldn't clone/update.
- Submodules are checked out as detached HEADs. I can't count the number of times I accidentally committed on top of that damned HEAD and had to go back later to create a new branch and cherry-pick my change over.
- If you accidentally commit on top of the detached HEAD, "git submodule update" will silently eat your changes. This led to people being afraid to run "git submodule update", which made the following problem more frequent.
- (Note: git does not eat your changes; they are still there and can be found with git reflog. It is agreed this is a pain, but it is not too bad to recover. See http://bec-systems.com/site/696/git-submodules-what-to-do-when-you-commit-to-no-branch)
- Developer 1 changed subproject A and properly pushed the superproject. Developer 2 did a "git pull" on the superproject but not a "git submodule update". Developer 2 then changed subproject B, committed, pushed, and did a "git commit -a" in the superproject, not realizing this rolls the superproject's recorded version of submodule A back to the old one.
- The record of subproject revisions, and to a lesser extent the .gitmodules file, are essentially hot-spots for unrelated changes, just like the old checksums.ini. It just means more merge commits if people pull instead of rebase.
- I don't know if this is still true, but git just couldn't handle conflicts in the subproject revisions. It would abort the merge, hard.
- Most of the problems would be solved by rigorously using the tools the way they were designed, but I didn't meet anyone who was capable of doing this to the extent required, and recovering from problems was painful and time consuming.
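The detached-HEAD pitfall above, and the note that the commit can still be found with git reflog, can be reproduced in a plain throwaway repository with no submodule involved. This is only a sketch: the branch name "rescued", the demo identity and the file contents are made up, and "git checkout --detach" stands in for what "git submodule update" does to the submodule checkout.

```shell
#!/bin/sh
# Sketch: recover a commit that was made on a detached HEAD.
set -e

# Throwaway identity so the commits work on a machine with no git config (assumed values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)
cd "$demo"
git init -q .
echo one > file.txt
git add file.txt
git commit -q -m "pinned commit"
pinned=$(git rev-parse HEAD)

# Submodules are checked out like this: HEAD names a commit, not a branch.
git checkout -q --detach "$pinned"

# The accidental commit on top of the detached HEAD.
echo two >> file.txt
git commit -q -a -m "work done on detached HEAD"
lost=$(git rev-parse HEAD)

# Moving HEAD back to the pinned commit leaves the new commit on no branch.
git checkout -q --detach "$pinned"

# HEAD@{1} is where HEAD pointed before that last move, so branch from it.
git branch rescued "HEAD@{1}"
rescued=$(git rev-parse rescued)
echo "recovered $rescued on branch 'rescued'"
```

In a real submodule the same recovery is run inside the submodule directory. Creating a branch there before committing avoids the problem in the first place.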
- Requires Python.
- Very capable tool; allows submodules to be specified by branch or by hash.
- Most projects use its XML configuration, but it seems to support git submodules as well, with fewer pitfalls than using git submodules natively.
- Tied to Gerrit (code review tool) for upload, i.e. plain "git push" is not supported.
- Written in Ruby.
- Pull script: http://jrz.cbnco.com/git/?p=toastix/toastix.git;a=blob;f=update-proj.sh
- Push script: http://jrz.cbnco.com/git/?p=toastix/toastix.git;a=blob;f=push.sh
- Sample config: http://jrz.cbnco.com/git/?p=toastix/toastix.git;a=blob;f=projects.list
- Half-baked pair of shell scripts.
- Uses a flat file for configuration.
- Requires a different config file for committers and non-committers.