Backing up a whole bunch of photos and videos can be a difficult task. Keeping those backups consistent is even more complicated. Besides that, you don’t want to keep your backups in a single place: spreading them out mitigates the impact of a single point of data loss.
I’ve been using git-annex for some years, not as a pro user but in an “it just works” fashion. Meanwhile I heard of ownCloud and liked it because you can access your data via the web, a mobile client or whatever. Combined with a VPN (I prefer OpenVPN), it gives you a secure way of accessing your data remotely from anywhere.
This post is about the link between git-annex and ownCloud: use ownCloud as your “frontend” tool for accessing the data while letting git-annex do the “backend” (aka backup) job. While this might sound like a pretty easy task, it has some peculiarities that need to be taken into consideration.
Backup setup
blockdiag code
blockdiag {
    R [label="Raspberry Pi (central repo)", color="lightblue", width=192];
    S [label="External Server (git + data)", width=192];
    C [label="Cloud"];
    L [label="Laptop"];
    M [label="Mobile Clients"];
    G [label="GitHub (only git)", width=192];
    L -> R [label="plain"];
    M -> R;
    R -> S [label="enc"];
    R -> G;
    S -> C [label="enc"];
}
There is one central repo on my Raspberry Pi, to which my HDD is attached. Usually I push stuff to the HDD using rsync from my laptop or ownCloud from my mobile clients. Afterwards the encrypted repo and the data itself are pushed to an external server. The data is encrypted using my GPG key. From the server I could then replicate the repo and its data to a cloud provider like AWS, Dropbox or whatever.
Additionally one could push the git repo to GitHub in encrypted form, without the data itself. It will then only contain the git information (symlinks) but none of the annexed data.
Using ownCloud with git-annex
blockdiag code
blockdiag {
    C [label="ownCloud client", width=192];
    S [label="Server"];
    Cloud [label="Cloud"];
    group {
        label = "frontend";
        O [label="ownCloud"];
    }
    group {
        label = "backend";
        G [label="git-annex"];
    }
    C -> O [folded];
    G -> S;
    S -> Cloud;
}
ownCloud acts as the front-end and can be used from any ownCloud client. The data itself is managed by git-annex, which acts as the back-end. You can access the data using ownCloud, but there are some restrictions:
- already annexed data can’t be deleted
- you can add files/folders and delete them only if these weren’t added to the git-annex repo yet
That means newly added data that has not been committed to the repo yet can be deleted by the client that added it. Data that is already annexed can’t be deleted through ownCloud; you’ll have to work with git-annex to do that.
Encryption
One of the important constraints before pushing my backups into the cloud was security. In order to be able to encrypt my stuff before pushing it into the cloud, I had to
- generate a GPG key
- create a special remote (see below) which offers encryption
GPG
Generating a GPG key was the easiest step. Afterwards I had to make sure that my key was available to git-annex for a specific period of time. Here are my configuration files:
and then the GPG agent configuration:
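A GPG agent configuration along the following lines keeps a passphrase cached for a while. This is only a sketch: `default-cache-ttl` and `max-cache-ttl` are standard gpg-agent options (values in seconds), but the concrete values and the helper name `write_gpg_agent_conf` are assumptions for illustration.

```shell
# Hypothetical helper: write a minimal gpg-agent.conf into the given GnuPG directory.
# The TTL values (in seconds) are illustrative assumptions.
write_gpg_agent_conf() {
    gnupg_dir=$1
    mkdir -p "$gnupg_dir" && chmod 700 "$gnupg_dir" || return 1
    cat > "$gnupg_dir/gpg-agent.conf" <<'EOF'
# Keep a passphrase cached for one hour after its last use ...
default-cache-ttl 3600
# ... but never longer than eight hours in total.
max-cache-ttl 28800
EOF
}

# Usage: write_gpg_agent_conf "$HOME/.gnupg" && gpgconf --reload gpg-agent
```

Note that the helper overwrites an existing `gpg-agent.conf`, so merge the two options by hand if you already have one.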
Keychain
Then I found keychain, which helps you manage your SSH and GPG keys in a secure manner. Adding this to your bashrc/zshrc/whatever
will cache your GPG and SSH keys for a specific period of time.
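A keychain call for a shell startup file might look like the sketch below. It assumes keychain 2.8.x, where `--agents` selects which agents to manage; the function name `load_keys`, the key file name and the GPG key ID are placeholders.

```shell
# Hypothetical helper for ~/.bashrc or ~/.zshrc: load the given keys into the agents.
# Assumes keychain 2.8.x, where --agents selects which agents keychain manages.
load_keys() {
    # keychain prints shell code that exports the agent variables; eval applies it.
    eval "$(keychain --eval --quiet --agents ssh,gpg "$@")"
}

# Usage (key file name and GPG key ID are placeholders):
#   load_keys id_rsa 0xDEADBEEF
```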
git-annex reference
Create git repo
Create git annex repo
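Both steps, creating the git repo and then enabling git-annex in it, can be sketched as one small helper. `git init` and `git annex init` are the real commands; the function name, directory and description are made up for illustration.

```shell
# Hypothetical helper: create a git repository and turn it into a git-annex repository.
init_annex_repo() {
    repo_dir=$1
    description=$2
    git init "$repo_dir" || return 1
    # The description is free text that identifies this repository in git-annex output.
    git -C "$repo_dir" annex init "$description"
}

# Usage (directory and description are placeholders):
#   init_annex_repo "$HOME/photos" "raspberry pi central repo"
```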
Add remotes
Create bare git repo on the server
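One way to do this, assuming you have SSH access to the server, is sketched below. The helper name, the remote name, the host and the path are placeholders, and the path is assumed to contain no spaces.

```shell
# Hypothetical helper: create a bare git repository on a remote host over SSH
# and register it as a git remote in the current repository.
add_bare_remote() {
    remote_name=$1
    remote_host=$2
    remote_path=$3
    ssh "$remote_host" git init --bare "$remote_path" || return 1
    git remote add "$remote_name" "${remote_host}:${remote_path}"
}

# Usage (all three arguments are placeholders):
#   add_bare_remote server user@example.org /srv/git/photos.git
```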
Add special remote (ssh+rsync)
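An encrypted rsync special remote can be created with `git annex initremote`. The sketch below assumes the `hybrid` encryption scheme; git-annex also offers other schemes such as `shared` and `pubkey`, and which one fits is your call. The helper name, the rsync URL and the key ID are placeholders.

```shell
# Hypothetical helper: create an rsync special remote whose content is GPG-encrypted.
# encryption=hybrid is an assumption; pick the scheme that fits your setup.
add_encrypted_rsync_remote() {
    remote_name=$1
    rsync_url=$2
    key_id=$3
    git annex initremote "$remote_name" type=rsync rsyncurl="$rsync_url" \
        encryption=hybrid keyid="$key_id"
}

# Usage (URL and key ID are placeholders):
#   add_encrypted_rsync_remote oc-encrypted user@example.org:/srv/annex/oc-encrypted 0xDEADBEEF
```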
Synchronize data
Sync only git repository to ssh remote
Sync git repo + content to ssh remote
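The difference between the two sync modes comes down to the `--content` flag of `git annex sync`. A minimal sketch, with hypothetical function names and the remote name passed as an argument:

```shell
# Sync only the git side (branches and git-annex metadata) with the given remote.
sync_metadata() {
    git annex sync "$1"
}

# Sync the git side and also transfer the annexed file contents.
sync_with_content() {
    git annex sync --content "$1"
}

# Usage (remote names are placeholders):
#   sync_metadata server
#   sync_with_content oc-encrypted
```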
If you have a look at the repo oc-encrypted on the external server, you’ll see only encrypted stuff:
Troubleshooting
Find broken symlinks
or
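A portable way to list broken symlinks with plain `find` is sketched below; the function name is made up, and the `.git` directory is skipped on purpose. Keep in mind that in a git-annex repo a broken symlink often just means that the file's content is not present locally.

```shell
# Hypothetical helper: list symlinks below a directory whose target does not exist.
find_broken_symlinks() {
    # Skip .git entirely; "test -e" follows the link and fails if the target is missing.
    find "$1" -name .git -prune -o -type l ! -exec test -e {} \; -print
}

# Usage: find_broken_symlinks .
```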
Garbage collection
Delete unused (annexed) data
Now you can drop the unused data:
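Put together, the two garbage collection steps might look like this sketch. `git annex unused` and `git annex dropunused` are the real commands; the wrapper function is hypothetical, and dropping `all` instead of selected numbers is an assumption.

```shell
# Hypothetical helper: list unused annexed data, then drop all of it.
drop_unused() {
    # "unused" prints a numbered list of keys that no branch or tag refers to.
    git annex unused || return 1
    # "dropunused" takes those numbers, ranges such as 1-10, or "all".
    git annex dropunused all
}
```

Review the output of `git annex unused` first if you are not sure that everything listed can go.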
Delete unused remote
First you’ll have to mark the repo as dead:
Then you’ll have to forget the dead repo:
And finally you can remove the remote using git:
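The three steps can be sketched as one helper. The function name is hypothetical and the remote name is a placeholder. As far as I know, `git annex forget` only rewrites the git-annex branch when called with `--force`, so treat that flag as an assumption to verify against your git-annex version.

```shell
# Hypothetical helper: retire a remote for good.
remove_dead_remote() {
    remote_name=$1
    # 1. Tell git-annex that the repository is gone and will not come back.
    git annex dead "$remote_name" || return 1
    # 2. Prune the dead repository from the git-annex branch history.
    git annex forget --drop-dead --force || return 1
    # 3. Remove the remote from the git configuration.
    git remote remove "$remote_name"
}

# Usage (remote name is a placeholder):
#   remove_dead_remote old-server
```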