At the request of ihrhove I’ve decided to talk a little bit about using git and LaTeX together. I currently have two private git repositories: one for the Finnish paper and the other for all of my thesis work. I’ve talked previously about the Finnish paper, so here I’ll give a brief overview of how I use git with my thesis. Keep in mind that I don’t share the thesis repository with anyone, because my supervisors don’t use git and don’t edit the documents I work on directly: two print out draft papers and write on them, and the third (who has used CVS/SVN in the past) uses Foxit to annotate PDFs directly and sends them back to me.
To start with (and possibly to end with, if you’re easily convinced): LaTeX is just code. So to me there’s no reason why you can’t use for LaTeX any service you’d normally use for code. Everything that is used directly in a paper comes under version control with git.
Each paper in my thesis repository has its own folder. Within that folder there is a LaTeX subfolder, where I keep everything needed for the writing of the paper, and an R or MATLAB folder depending on what program I’m using to do the modelling (and all the code goes into the repository). Within the LaTeX folder I have a whole bunch of .tex files and a folder where I store the images to be included in the paper.
One of my favourite commands in LaTeX is \input. Every section in a paper has its own LaTeX source file. I find that this helps me navigate my work when I’m writing, especially when making corrections. Each file gets worked on separately and I save frequently. If I’m finished dealing with a section or I’m heading off for a break I will save everything and commit the current changes with a note about which section I’ve been focussing on. I picked up this \input-based way of writing in my Honours degree when I got sick of having screen after screen of text. If I want to omit a section in a draft I can just comment out the \input line. Reorganising sections, and maybe even subsections, becomes a matter of swapping two or three lines of LaTeX rather than copying and pasting giant blocks of text.
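As a rough sketch of what such a top-level file might look like (the file names and the images folder name here are hypothetical, not taken from my actual repository):

```latex
% main.tex: hypothetical top-level file for one paper.
% Each \input pulls in a separate .tex file from the same folder.
\documentclass{article}
\usepackage{graphicx}
\graphicspath{{images/}}   % assumed name for the image folder

\begin{document}

\input{introduction}       % reads introduction.tex
\input{methods}            % reads methods.tex
% \input{results}          % commented out: omitted from this draft
\input{discussion}         % swap lines to reorder sections

\end{document}
```

Each of the section files contains only the body text for that section (starting with its own \section line), so a commit that touches methods.tex alone shows up as a change to just that file.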
I’m a sucker for vector graphics so I will use PDF graphs and pdflatex wherever I can. Occasionally I succumb to using PGF/TikZ for a while, but I usually have to generate so many different styles of plots that I don’t bother. So anyway, PDF graphics. These are really quite small and can be stored in git with no trouble at all. I know git’s not much use for diffing and revising binary files (although PDF and EPS files are quite different), but I find it useful to be able to overwrite my graphs and still have the older versions available by reverting to a previous commit, rather than making endless folders called “oldgraphics”.
The root of my thesis repository has a folder called “Bibliography” which is where a monolithic bibtex file called “allpapers.bib” is stored. Because I will cite the same references across multiple papers I find the idea of having separate bibliography databases a bit silly. I use JabRef to edit this, by the way. All my \bibliography commands point to ../../Bibliography/allpapers.bib. I’ve even got a template for papers with that line in it so that I don’t even have to think about how I do my referencing.
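For illustration, the relevant lines in a paper’s main file might look something like this. It is a sketch only: the bibliography style is an assumption, and the relative path assumes the main file sits two levels below the repository root.

```latex
% Near the end of a paper's main .tex file, just before \end{document}.
% The path is relative to the folder the paper is compiled in.
\bibliographystyle{plain}                     % assumed style; use whatever the journal wants
\bibliography{../../Bibliography/allpapers}   % BibTeX adds the .bib extension itself
```

The .bib extension is normally left off inside \bibliography, since BibTeX appends it when looking for the database.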
With regard to the Finnish paper, this compartmentalisation reduces the risk of conflicts even further. Committing changes to one section at a time means the commit messages are often quite descriptive without having to be long. The mixture of a few lines of changes and a brief summary means it’s easy to see what’s happened in the changelog.
I also use git to keep track of side projects that have popped up during my thesis. Coworkers will often come to me with a question about some data analysis or if I can write a script to make a certain repetitive task as automatic as possible. Each coworker gets a subfolder within a /Side Projects/ folder and within those there are folders for each little project. If I worked in a group where use of git was widespread I would consider making a separate project for each person and inviting them as a collaborator.
I kind of wish that QUT had a git server (the school of IT had a subversion server but I really dislike SVN after discovering git) and that scientists were encouraged to use R/MATLAB/SAS for their statistics and modelling instead of Excel. I think it’d be a great way to foster collaboration and have people be able to work on a project, make changes and share their code with their coworkers without sending code and draft papers around via email. Actually, a private git server without the account-level limitations that GitHub imposes would be an invaluable tool, especially if you could just open up your repositories to the QUT community to show what you’re doing and provide colleagues with usable code for statistical analysis, image manipulation tools, etc. And if someone within the university came across your work and liked it, you would potentially have another paper to work on within the uni.