Someone on Fedora People, I regret I can’t recall who, proposed a scheme to back up user data a while back. I thought it was a brilliant idea and started sketching out a solution, but it seems one of those pesky Ubuntu people beat me to it before I even left the drawing board.
HUbackup: screenshots and presentation, plus a bazaar entry in Ubuntu’s utterly confusing Launchpad thingy
Anyways, working code is always nice to have. I hope this fulfills the requester’s needs; if not, then at least we have a base to work from.
philtrick said,
October 26, 2006 at 19:59
Have you seen:
http://andrewprice.me.uk/projects/pybackpack/
Uses rdiff-backup, and can backup over ssh.
Interface is nice as well, but might be a bit complex for a first-time user.
David Nielsen said,
October 26, 2006 at 20:55
A nice program, but nevertheless way off the mark. It’s targeted at people who have a storage server to back up personal data to.
The request was more like:
1) detect backup devices (CD-RW, DVD-RW, etc.)
2) ask user which device he’d like to back up to
3) create and burn backup
Step 3 requires a bit of work. Take my setup, for example: I have 800 GB of storage in my desktop machine, 700 GB of which is dedicated to /home. If that’s more than 50% full I can’t do a backup the easy way (create isos, burn them, erase them). In fact nobody can back up without available space. So we need to expand it to:
3.1) check for available space
3.2) if free space exceeds the total amount of data to be backed up
3.2.1) create isos
3.2.1.1) burn and verify
3.2.1.2) erase
3.2.2) otherwise, if free space exceeds the amount required for one iso
3.2.2.1) create iso
3.2.2.2) burn and verify
3.2.2.3) erase iso and proceed from 3.2.2.1 till there’s no longer any data left
3.2.3) if free space does not accommodate the required iso size, complain (options could be: smaller medium if suitable, or create more free space)
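The decision in steps 3.1 to 3.2.3 could be sketched roughly like this in Python. It is only an illustration: the names `Plan`, `choose_plan`, `discs_needed` and `free_space` are made up for this example, sizes are in bytes, and ISO filesystem overhead and compression are ignored.

```python
import math
import shutil
from enum import Enum


class Plan(Enum):
    """Hypothetical labels for the three outcomes in the list above."""
    ALL_AT_ONCE = "create every iso, then burn, verify and erase them"
    ONE_AT_A_TIME = "create, burn, verify and erase one iso per pass"
    NOT_ENOUGH_SPACE = "complain: smaller medium or more free space"


def free_space(path="/"):
    """Step 3.1: free bytes on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free


def choose_plan(free_bytes, data_bytes, medium_bytes):
    """Steps 3.2.1 to 3.2.3: pick a strategy from the available scratch space."""
    # A single iso never needs to be larger than the data that is left.
    iso_bytes = min(data_bytes, medium_bytes)
    if free_bytes >= data_bytes:
        return Plan.ALL_AT_ONCE
    if free_bytes >= iso_bytes:
        return Plan.ONE_AT_A_TIME
    return Plan.NOT_ENOUGH_SPACE


def discs_needed(data_bytes, medium_bytes):
    """How many discs the backup spans, assuming no overhead or compression."""
    return math.ceil(data_bytes / medium_bytes)
```

With made-up numbers such as 400 GB of data, 300 GB free and 4.7 GB media, `choose_plan` lands on the one-iso-at-a-time branch, which is the case the expanded step 3 was written for.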
Also, it would be nice to somehow allow for incremental backups; I haven’t quite figured out the best way for that to work GUI-wise.
Another question that made me wonder: say the user starts a backup with compression enabled. On my dual-core 4400+ AMD64 a full backup of /home with 7zip compression takes the better part of a day (now that I have so much space available it might take a week at full compression rate, who knows), so any data I create between the backup starting and finishing is potentially lost. How do we avoid such issues and still allow the user to use his machine? I imagine we could enable a file watcher (inotify/gamin/beagle maybe) when the backup starts to track changes, and then once the backup finishes, compress the changed files and append them to the backup container, and so on and so forth, with the risk of an infinite loop happening. This would also require a blacklist of files not to track. Overall, not a desirable solution.
I can’t really think of anything more elegant that doesn’t require API additions, locking, etc to the platform at large.. hell I don’t even know if it’s a realistic problem on average setups.. overengineering is in my blood.
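For what it’s worth, the catch-up idea can be sketched without a file watcher at all. The Python below swaps inotify/gamin for a plain modification-time scan, which is an assumption of this example rather than what was proposed; `changed_since`, `catch_up`, the `append_to_backup` callback and the glob-style blacklist are all hypothetical names. The pass limit is there to cut off the infinite loop.

```python
import fnmatch
import os
import time


def changed_since(root, start_time, blacklist=()):
    """Yield files under `root` modified at or after `start_time`.

    `blacklist` holds glob patterns (relative to `root`) for files not to track.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            if any(fnmatch.fnmatch(rel, pattern) for pattern in blacklist):
                continue
            try:
                mtime = os.stat(path).st_mtime
            except OSError:
                continue  # file vanished between listing and stat
            if mtime >= start_time:
                yield path


def catch_up(root, start_time, append_to_backup, max_passes=3, blacklist=()):
    """Append changed files until a pass finds none, or give up.

    Returns True if a pass found nothing new, False if files kept changing.
    """
    since = start_time
    for _ in range(max_passes):
        pass_start = time.time()
        changed = list(changed_since(root, since, blacklist))
        if not changed:
            return True
        append_to_backup(changed)
        since = pass_start
    return False
```

A scan like this misses deletions and relies on mtimes being trustworthy, so it is a stopgap at best; a real watcher would catch more but brings back the platform questions raised above.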
philtrick said,
October 26, 2006 at 23:37
I see your point.
I think, though, that the situation you have is probably more than an average user has.
There is also the point that if you try to back up your home dir while logged in, and running apps such as evolution, the backup will not be complete, which can cause restore problems.
Also, I think that a backup program that creates 10 CDs full of data will quickly get tiresome, and the user will see backing up as a chore instead of a necessity. Maybe there could be some way of detecting when removable storage is plugged in, e.g. an iPod, so that it can create a backup file on that with essential settings.
With regard to modified files, why re-invent the wheel? rdiff-backup does all the tracking at backup, and could be run again at the end of the backup, to get the minimal amount of files that may have changed.
I see the biggest problem being the storage of the backups: while DVDs allow the storage of nearly 5GB of data, the amount of personal data people have is quickly approaching that.
For this it might be worth categorising the backups, such as work, PIM, audio files, video, downloads etc.
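As a rough illustration of the categorising idea, files could be sorted into buckets by guessed MIME type. This Python sketch uses the standard `mimetypes` module; the bucket names and the helpers `category_for` and `group_by_category` are invented for the example, and categories like work or PIM would need rules beyond the file extension.

```python
import mimetypes
from collections import defaultdict


def category_for(filename):
    """Map a file name to a coarse, hypothetical backup category."""
    mime, _encoding = mimetypes.guess_type(filename)
    if mime is None:
        return "other"  # unknown extension, nothing to go on
    major = mime.split("/")[0]
    if major in ("audio", "video", "image"):
        return major
    return "documents"


def group_by_category(filenames):
    """Group file names into {category: [names]} so each can be backed up separately."""
    groups = defaultdict(list)
    for name in filenames:
        groups[category_for(name)].append(name)
    return dict(groups)
```

Each bucket could then get its own backup set, so the bulky audio and video can go to separate discs from the small, frequently changing stuff.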
I still think pybackpack has the right core for this, it just needs to be simplified and have a bit more of an autopilot mode.
Phil
David Nielsen said,
October 27, 2006 at 01:56
Well.. I’ll have a look at rdiff-backup’s code for doing the tracking.
The compartmentalized backup seems like a good idea, but here’s what I would do: have three modes – full backup, full backup (no multimedia files) and incremental backup.
Incremental backup needs to be a bit smarter than your average brick though, I’ve seen this happen:
day 01: admin issues full backup to tape
day 07: admin issues incremental backup to…. the same tape
day 09: For reasons unknown to me, management demanded the backup server be RAID0, and guess what, today that blows up. Admin, confident that at least we’d only lose data written in the past 2 days (bad, but not a disaster), inserts backup tape and realises that the little people inside the computer didn’t just know what he wanted. A month’s worth of development data lost; estimated total loss registered close to a million USD.
A problem for which two solutions exist:
A) shoot users of the intelligence level just above “takes backup” and just below “but does it wrong”
B) verify that we aren’t overwriting previous data (if burning to RW media) and verify integrity for the previous backup
Neither one beats having only smart users.. but alas we can’t have everything we want.
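Option B might look something like the following Python sketch. It assumes, purely for illustration, that a backup is a plain directory tree (a mounted disc, say) and that a checksum manifest was recorded when the backup was made; `sha256_of`, `build_manifest`, `verify_backup` and `is_empty_target` are made-up helper names.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file path relative to `root` to its SHA-256 digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            manifest[os.path.relpath(path, root)] = sha256_of(path)
    return manifest


def verify_backup(root, manifest):
    """Return the relative paths that are missing or no longer match."""
    problems = []
    for rel, expected in manifest.items():
        path = os.path.join(root, rel)
        if not os.path.isfile(path) or sha256_of(path) != expected:
            problems.append(rel)
    return sorted(problems)


def is_empty_target(root):
    """True if the target holds no files, so writing cannot clobber a backup."""
    return not any(filenames for _dir, _subdirs, filenames in os.walk(root))
```

Before an incremental run, the tool would refuse to write to a target where `is_empty_target` is false, and would only trust the previous full backup if `verify_backup` comes back empty.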
sivan@ubuntu.com said,
November 18, 2006 at 00:55
Hi David!
Being actually THE pesky Ubuntu dude that was ahead of you with respect to desktop backups, I couldn’t help but be overjoyed when I spotted this discussion, and I would very much like to discuss with you what you feel needs to be improved and how this should be solved. Accidentally or not, what you describe in comment (2) is exactly the approach I’ve taken. Oh, and maybe also to your joy, I have solved the incremental archive creation problem in a rather nice way IMHO, through mime type association for backup files and backup metadata configuration files. Let’s talk to see how we can make a difference!
Sivan