This document discusses version control, including what it is, why it is useful, and how it works. Version control allows tracking changes made to files over time, recovering previous versions of files, and managing files changed by different users to avoid conflicts. It gives examples of how version control could help with reproducing figures for a paper review, maintaining old versions of scripts, safely trying out changes to code, and enabling traceable collaboration. Although some version control concepts can seem complex, the document notes that version control can be kept very simple for individual use.
2. What is version control
Annotated log of changes
Backup system
Collaboration tool
For any file type, but text files work best
Code!
3. A definition
Version control
A tool for managing changes to a set of files
Each set of changes creates a new revision of the files
Allows users to recover old revisions reliably
Helps manage conflicting changes made by different users
From the Software Carpentry website
4. Use case I: reviewer #3
Paper submitted
After a couple of months Reviewer 3 writes:
Please generate figure 3 with a higher resolution
5. Use case I: reviewer #3
Your reply to the reviewer
We have continued to work on the code that generated the figures for the original paper, and couldn't recreate the exact code used.
Our new code generates a graph which slightly alters the interpretation.
6. Use case I: reviewer #3
How would version control have helped?
Turn back the clock to the code used
Rerun analysis
Recreate exact figure
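The three steps above could look roughly like the sketch below. It assumes git as the version control tool (the slides do not prescribe one), and the file name `make_figure3.sh`, its contents, the tag name `paper-submitted`, and the commit messages are all hypothetical placeholders.

```shell
set -e
# Work in a throwaway directory so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
# Commits need an identity; these values are placeholders.
git config user.name "Example Author"
git config user.email "author@example.org"

# The (hypothetical) figure script as it was when the paper was submitted.
echo 'resolution=72' > make_figure3.sh
git add make_figure3.sh
git commit -q -m "Figure 3 as submitted"
# Mark this revision with a name that is easy to find later.
git tag paper-submitted

# Work continues after submission and the script changes.
echo 'resolution=72; new_filter=on' > make_figure3.sh
git commit -q -am "Try a new filtering step"

# Months later: turn back the clock to the code used for the paper.
git checkout -q paper-submitted
submitted_code=$(cat make_figure3.sh)
echo "$submitted_code"
```

From this checked-out state, the analysis can be rerun with only the resolution changed, so the figure matches the one the reviewer saw.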
8. Use case II
From my own work:
$ cd scripts
$ ls
blat_parse4.pl old_versions snps_flanks_2_fastq.pl
$ ls old_versions/
blat_parse2.pl blat_parse_attemp1.pl blat_parse.pl.bak
blat_parse.pl
blat_parse3_backup.pl
blat_parse3.pl
9. Use case II
How would version control have helped?
Older versions hidden but still accessible
Annotated history of all changes available
Bonus:
Allows for safely trying out changes
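As a rough illustration of these points, the sketch below keeps a single `blat_parse.pl` instead of numbered copies. It assumes git as the tool; the file contents and commit messages are invented for the example.

```shell
set -e
# Work in a throwaway directory so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
# Commits need an identity; these values are placeholders.
git config user.name "Example Author"
git config user.email "author@example.org"

# Two successive versions of one file, rather than blat_parse.pl and blat_parse2.pl.
echo '# parser, version 1' > blat_parse.pl
git add blat_parse.pl
git commit -q -m "First working parser"
echo '# parser, version 2' > blat_parse.pl
git commit -q -am "Handle multiple hits per query"

# Annotated history of the file, newest change first.
history=$(git log --format=%s -- blat_parse.pl)
echo "$history"

# The older version is no longer in the folder but is still accessible.
old_version=$(git show HEAD~1:blat_parse.pl)
echo "$old_version"

# Safely try something out, then discard the uncommitted experiment.
echo '# risky experiment' > blat_parse.pl
git checkout -q -- blat_parse.pl
current_version=$(cat blat_parse.pl)
echo "$current_version"
```

The folder only ever contains the current `blat_parse.pl`; the commit messages take over the role that suffixes such as `_backup` and `.bak` played in the directory listing.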
10. Annotated log of changes
http://starlogs.net/#lexnederbragt/denovo-assembly-tutorial
11. Use case III: collaboration
Example: Wikipedia
http://en.wikipedia.org/wiki/Version_control
Fully traceable history of all contributions
12. Use case III: collaboration
Example: Google Docs
Fully traceable history of all contributions
13. Use case IV: how you work
Using version control:
makes me change code in small steps
makes me log (annotate) my changes
makes me feel safe to change code
makes it easier to try out things
14. But, this is way too complex!
merge
tag
checkout
rebase
branch
pull request
conflict
fetch
pull
push
diff
log
status
https://www.atlassian.com/git/workflows
15. But, this is way too complex!
Can keep it very simple
Just one piece of code, only you work on it
A few scripts in one folder, made available online
A large code base with multiple contributors
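For the simplest case, one piece of code that only you work on, a day-to-day loop might need no more than the handful of commands sketched here. The sketch assumes git; the script name `analysis.sh`, its contents, and the commit messages are hypothetical.

```shell
set -e
# Work in a throwaway directory so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
# Commits need an identity; these values are placeholders.
git config user.name "Example Author"
git config user.email "author@example.org"

# Record the first version of a (hypothetical) script.
echo 'echo "hello"' > analysis.sh
git add analysis.sh
git commit -q -m "Add analysis script"

# Make a small change, then look at it before recording it.
echo 'echo "hello, world"' > analysis.sh
git status --short          # lists analysis.sh as modified
git --no-pager diff         # shows the old and new line
git commit -q -am "Greet the whole world"

# One line per recorded change.
git --no-pager log --oneline
n_commits=$(git rev-list --count HEAD)
```

None of merge, rebase, branch, or pull request appears in this loop; those only become relevant once there are several lines of work or several contributors.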