Amazon S3 backup just saved my butt
May 15th, 2009 at 10:23 pm
My LVM volume
As I sit here waiting for dd_rescue to try to salvage what’s left of my 500GB hard drive, and wondering what will become of the 1.75 TB LVM volume that is now toast, I count my lucky stars that I decided a month ago to finally fix that new backup system that I was thinking about – had I not done that, 5000+ photos, including from our wedding, would’ve been down the drain. I feel like I just finished backing up everything yesterday, and it’s already saved me from disaster.
Let me take a step back. For years, I’ve used Logical Volume Manager to manage the volume that I store my movies and TV shows on. The advantage of LVM is that it abstracts away the physical drives – to my apps, it’s just one large hard drive, but I can use the admin tools to add and remove drives without reformatting.
That’s all well and good, but the big caveat to running drives like this is that if any one of them fails, the whole volume is gone. I knew this going into it, so I made sure that any data I really cared about had another copy somewhere.
So originally, I had a setup which ended up rsync’ing to this website, which is hosted on Linode, who I very much recommend by the way – I’ve used it for years and support is fantastic and the site is pretty powerful. That worked fine until I bought my wife a DSLR, and her great but giant photos caused my 4GB of diskspace to vanish. So for a long time, I just gave up on backup, and since the photos folder is so gigantic, the only place I could put it is on the previously mentioned storage volume and symlink it to where it should be. Bad developer, no Twinkie!
Why backup using Amazon S3
Amazon S3 as a user-friendly backup solution sucks – it’s extremely developer-centric, and it’s about as friendly as the tram drivers in Berlin, which afaik are some of the least friendly/helpful people on the planet.
However, here’s why it’s great – it’s cheap, and it has no limits. All of the free file storage / backup services top out at ~4 GB or so, and the paid services start out at $50/yr for more than enough space for us. I’ve backed up 20GB on S3 so far, and I’ve incurred about $3.50 in charges. For periodic automated backup, it’s hard to beat S3 as a backend.
If you’re a programmer, just think of Amazon S3 as basically a giant hash table (or Dictionary<string, byte[]> for you .NET people). First, you set up a bucket, which is an instance of the table – you won’t have more than a few of these, and probably need only one for backup. These bucket names are shared among everyone, so you have to make them unique – I just prefix my username to it since I don’t intend to share the files out over the web. Since S3 has no concept of folders, the convention is to just encode the path inside the key (the key can have slashes).
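To make the “path inside the key” convention concrete, here’s a minimal shell sketch. The function name s3_key_for and the example paths are made up for illustration, and this isn’t necessarily the exact key layout s3sync produces (that can depend on things like trailing slashes on the source folder); it only shows how a folder hierarchy flattens into plain string keys.

```shell
#!/bin/sh
# Hypothetical helper (not part of s3sync): work out the key a local file
# would get if we store it under a key prefix, keeping the folder structure.
s3_key_for() {
    root=${1%/}            # local root folder, one trailing slash removed
    prefix=$2              # key prefix inside the bucket, e.g. "documents"
    file=$3                # full path of a file somewhere under $root
    rel=${file#"$root"/}   # strip the root, leaving the relative path
    printf '%s/%s\n' "$prefix" "$rel"
}

# The "folders" are nothing more than slashes inside one flat string key.
s3_key_for /home/me/Documents documents /home/me/Documents/taxes/2008/return.pdf
# prints: documents/taxes/2008/return.pdf
```

The bucket never learns that “taxes” or “2008” are directories; tools like Cockpit just split the key on slashes to draw a tree for you.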
The great news is that an anonymous coder has done all of the grunt work for you to make the backup scenario work, via a tool called s3sync. Basically, it’s rsync to S3 – you create an initial key to put a root folder under, then run s3sync to copy it over. Since s3sync only copies over the files that are new or have changed, it saves bandwidth (and by extension, cash). Here’s how to run your first sync:
export AWS_ACCESS_KEY_ID="FILLMEINHERE"
export AWS_SECRET_ACCESS_KEY="OMGSEKRITACCESSKEY"
tar -xzvf s3sync*.tar.gz
cd s3sync
chmod +x *.rb # Just to make sure
./s3cmd.rb createbucket yourusername-backup # Only need to do this 1x!
# -(r)ecursive, -(v)erbose, --delete (remove old S3 files that no longer exist locally)
# Sync your Documents folder to S3
./s3sync.rb -r -v --delete "$HOME/Documents" "yourusername-backup:documents"
Seeing what’s on your S3 account
While you’re setting this up and making test runs, it’s pretty useful to be able to see what’s currently in your S3 account. To do this, there’s a great Java applet called Cockpit by James Murty.

Runs in-browser, great for management and verifying the backup worked
Make it Automatic
Now that we know how to do one sync, making it automatic is the most important part – if you have to remember to do it, you’re bound to forget. I put it into a script:
#!/bin/sh
### Make sure to fill in the blanks here!
export AWS_ACCESS_KEY_ID="FILLMEINHERE"
export AWS_SECRET_ACCESS_KEY="OMGSEKRITACCESSKEY"
export BUCKET_ID="paulbetts-backup"
export BACKUP_PATH="/storage"
export BACKUP_KEY="website"
echo "**** Backup start ***"
date
/root/s3sync/s3sync.rb -r -v --delete "$BACKUP_PATH" "$BUCKET_ID:$BACKUP_KEY"
And then, put that script into a cron job, so that it runs at 4am every morning:
0 4 * * * /root/storage_backup >> /var/log/s3backup.log 2>&1
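One thing to watch with a nightly cron job is that a big first sync can easily take longer than a day, and then two runs end up fighting each other. Here’s a rough sketch of one way to guard against that, using a lock directory. The names run_once and LOCK_DIR are my own invention for this example (they’re not part of s3sync or cron), and a stale lock left behind by a killed run would need to be removed by hand.

```shell
#!/bin/sh
# Sketch only: run a backup command, but skip it if a previous run
# is still going. LOCK_DIR and run_once are hypothetical names.
LOCK_DIR="${LOCK_DIR:-/tmp/s3backup.lock}"

run_once() {
    # mkdir is atomic: it fails if the lock directory already exists
    if ! mkdir "$LOCK_DIR" 2>/dev/null; then
        echo "previous backup still running, skipping" >&2
        return 1
    fi
    "$@"                    # run whatever command was passed in
    status=$?
    rmdir "$LOCK_DIR"       # release the lock when the command ends
    echo "backup finished with exit status $status"
    return $status
}

# Example use inside the backup script:
# run_once /root/s3sync/s3sync.rb -r -v --delete "$BACKUP_PATH" "$BUCKET_ID:$BACKUP_KEY"
```

The exit status also ends up in the log, which makes it a lot easier to spot a night where the sync quietly failed.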
What if I’m using Windows
If you’re using Windows, this approach is going to be an order of magnitude more annoying, due to the difficulties that Ruby has with the Windows filesystem (backslashes, ACLs, etc) – while you might be able to get it to work, I can’t recommend it. However, one of the developers from Cloudberry Labs emailed me about some tools for Windows centered around S3 backup that look pretty promising, especially the potential for easy automation via their PowerShell Snapin.
Live Mesh – good UI changes how you use software
May 15th, 2009 at 9:05 pm
This article was supposed to be longer, but I hit “Publish” too fast. Mea culpa.
Because once again, I’ve turned into a giant Microsoft shill now that I work there, I thought I’d try Live Mesh, Microsoft’s new file sharing/storage service; especially since they now have a Mac Beta. After using it for a few weeks, I can now definitely say that it’s a pretty well-written piece of software, and that’s not a compliment I give out lightly.
Good UI sucks to write
Good UI execution is hard for two big reasons. It’s hard because you’ll build a UI based on how the underlying code is set up, you’ll sit down to use it, and then you’ll find it completely unusable – programmers are bad at design because their entire mindset is to build it right the first time. Contrast this to Industrial / Product Designers, who from the very beginning of school are taught, “come up with 50 ideas for the same project and turn them in”. Making good UI is a process that requires a lot of iteration and experimentation, and a willingness to come up with 25 ideas and throw away 23 of them.
In design, iteration is made to be very cheap – you’ve got a pencil and a piece of paper, and you draw it all out; coming up with those 50 ideas doesn’t involve tertiary work, 100% of your efforts are moving your ideas forward. With making UI, it’s not so easy – even with tools like Expression Blend trying to make UI work easier, it’s still a ton of effort to create all of the little interactions that make a UI great. Writing a UI is difficult, and writing a great UI is difficulty multiplied.
So why is Live Mesh’s UI so good? No modal dialogs!
One reason that I like Live Mesh is, unlike a lot of other software products (mostly ones written by Microsoft), I’ve never seen Live Mesh pop up a message box forcing me to answer some question when it wasn’t prompted by me. That last part is important, because it touches on the concept of user intent. When I click on “Send/Receive mail” and it needs my password, it’s okay to pop up a dialog to ask it, because I asked the program to do something. When apps pop a dialog on their own, imagine someone suddenly walking up to you in the middle of a conversation with someone else – impolite at best.
I promise, I’m going to write actual technical articles here
May 15th, 2009 at 1:21 am
One of the cool things about StackOverflow is…
March 1st, 2009 at 6:19 pm
…that sometimes you run into some semi-famous CS people on there.

Good thing the author of the book I forgot still remembered that he wrote it.
gnome-format made Phoronix!
February 6th, 2009 at 3:03 pm
Once I started at Microsoft, I had to abandon an app of mine called gnome-format; apparently, one of the writers at Phoronix picked up on my program and wrote a review of it! Even though the code is dead, it’s still cool to see that it was a pretty positive review. If you want to see the last version of the code (i.e. the one the reviewer was working on), check out:
However, while the review is great, the version that he meant to review was the one written by Felix Kaser, who has started from scratch on a new version of the code in Vala. So, check out his version too, and help him to get it into GNOME where it belongs!
A public service: toolbox-free Planet GNOME
February 1st, 2009 at 3:41 pm
I don’t know why I didn’t think of this sooner – after a few minutes of using the highly underrated Yahoo! Pipes tool, I’ve created a very useful feed – an RSS feed of Planet GNOME, but with the annoying people filtered out.
So far, it’s just Karl Lattimer, because that guy annoys the fuck out of me. Give me suggestions on who else is annoying and I might add them to the list as well, or clone my Pipe and make your own Toolbox-free Planet.
This Wordpress Theme
January 26th, 2009 at 11:52 pm
In keeping with the Creative Commons license (and spirit) of this website, as well as to respect the original author’s request, you can download the theme to this website which is based on Lucian Marin’s Journalist theme here:
The big differences are that H2 looks different than H3, some font changes and spacing issue fixes (Helvetica, *swoon*), making the title at the top a different color, and there are some PHP hacks to remove the footer element from every item so that the main page looks cleaner.
Smartcard Readers, Windows 7, and VMWare Fusion
January 26th, 2009 at 11:27 pm
At work, I’m on the pilot for one of the new features in Windows 7, DirectAccess. You know how Outlook just magically works, whether you’re connecting inside your LAN or outside it, even though they use different servers/protocols/whatever? Imagine that, only with every app – it’s totally transparent VPN, and except for a few apps that can’t grok IPv6, it works great.

I’m a Smartcard reader, and I’m about to throw some salt in your game
One of the caveats at least at Microsoft is, you have to use a Smartcard reader, like the one pictured above. Unfortunately for me, these appear to be some kind of broken under VMWare Fusion 2.0; trying the advice in the thread unfortunately got me nowhere.
Here’s the workaround, though it’s super-annoying because it basically makes suspending the VM worthless.
- Power off the VM completely
- Plug in the Smartcard reader, with the card inserted
- Hit the Power button on the VM
- Before the machine boots completely, click on the USB icon at the bottom right, and select the “Connect Omnicard USB Reader” menu item that pops up
The Smartcard reader will appear to be dead until right before LogonUI spins up (i.e. right when you see the blue “underwater” screen where you type in your Smartcard PIN). It should work from then on, as long as you never detach the reader.
Like I said, super annoying; if anyone’s got any better ideas, I’m definitely willing to hear them; I suspect this has something to do with the new Smartcard support that VMWare added in their Workstation product – maybe there’s a way to disable this via a VMX config option?
Essentials 2008
December 31st, 2008 at 10:00 am
Continuing from last year’s edition, here’s the software that I use on a day-to-day basis. Because of my traitorous switch to Mac, this list looks quite a bit different than it did earlier. As before, this is cribbing from Mark Pilgrim’s series – his 2008 edition is also full of good recommendations.
- Mac OS X 10.5 – after using a Mac as my primary machine for over a year now, I’m on the fence as to whether I’d ever go back to Linux or not. On one hand, the UI design and software support are great, but there’s just enough proprietary bullshit to make me reconsider. For the time being though, it’s the best there is, because my other favorite OS, Ubuntu, and Linux in general seem to be on a steady decline to crap city. If anyone takes offense to this assertion, I’ll be glad to write up a detailed response as to why it’s a mess.
- F-Spot, delicious, Vim – still continue to kick ass. Because of the wedding, I ended up effectively doubling my picture library, mostly with U’s 2MB pictures from her SLR camera – F-Spot handles it like a champ.
- Git and GitHub – Git is so mind-blowingly useful to anyone who is a developer, power-user, or anyone who works with text files, that I can’t possibly leave it off this list. This program continues to help me out just about every time I program; at work, I use it to manage multiple in-flight hotfixes to certain train-wreck components who shall remain nameless (but not linkless), as well as whenever I have to do any large change to Windows source code. At home, GitHub is a great way to manage my developer tools as well as my personal projects.
Seriously, if you’re any sort of programmer whatsoever, learn Git.
- Quicksilver – it’s hard to describe what QS actually is, the term “app launcher” betrays its real utility, but it’s by-far one of the best reasons to use Mac OS X. Basically, it’s a GUI version of a fast command-line interface, one that learns which commands you use most often and shortens the number of keystrokes you need to use them. Taking some time to learn everything that QS can do pays off quite a bit for your productivity.
- Firefox – continues to be the browser of choice, with its fantastic plugins (the “It’s All Text!” plugin being one of my favorites; it lets me use GVim to type emails or this blog entry, for example). Great developer tools like Web Developer Toolbar and Firebug make it way better than Safari for most things. Speed and platform-integration are two of the things I do miss though…
- Live Mesh – file synchronization that just works. Works great with both Windows and Mac, and its remote desktop feature, while somewhat anemic, is beautifully simple to use. If you don’t have a backup solution (and if you don’t, you will lose your stuff – storing everything on a USB stick does not count), this is a fantastic way to do it with almost zero work.
- VMware Fusion – solid virtualization software for Mac, great integration with the guest without the evil hacks that Parallels uses (trust me as a Windows developer when I say this, you are much safer with VMware than with Parallels). These days however, I prefer more to use a dedicated VM running on another machine that I can remote desktop into rather than a local VM.
- Cygwin – without this, work would be way more painful. A Windows machine without Cygwin is nearly worthless to me.
- iTunes – …and I f’ing hate it. Please, someone write a music player for OS X that doesn’t epically suck. Amarok 2.0 and Songbird don’t count – Amarok went from the best music player on the planet without question, to a giant pile of gray crap. Trolltech systematically destroyed every decent KDE piece of software by releasing Qt4 and causing everyone to decide to make massive rewrites of their software, but I digress.
Anyways, some of the few things that iTunes got very, very right are how it remembers exactly where I am in a podcast and syncs it with the iPhone, its podcast support in general, and its great sync support with devices (letting me choose whether to auto/manual sync music, making automatic backups of my phone, etc). The iTunes store would also be a gigantic win if it wasn’t so DRM-encumbered, and I would spend way more money there, instead of at Amazon MP3, which is also a fantastic service.
- XBMC – I’ve been using this on the XBox for years, and now that it’s a 100% cross-platform app, it’s even more awesome. Playing back video on your TV with this is fantastic; I stream movies and TV from my desktop machine over wireless and it works near-flawlessly, and it understands just about any format. Setting up a cheap box with XBMC on it is the best way to get your music and movies onto your TV, hands-down. There are also versions for hacked AppleTVs, which, to me, turn an AppleTV from “complete trash” to “very compelling”. If the AppleTV had decent audio/video outputs, it would’ve been my new media box.
Stuff I don’t use anymore
- Tomboy – only ran on Linux until recently, and is local-only. The original developer also annoys me by coming up with fantastic ideas then abandoning them (I could also point that right towards myself, but anyways…)
- sshfs – I still use this occasionally when I have to traverse firewalls, but for everything local, Samba is faster and a bit less of a pain on OS X
- Unison – since I do much less work on my desktop than I used to, having two-way sync isn’t as useful to me; I just rsync from laptop -> desktop.
- Ubuntu – too many things broken on MacBook Pros, most the fault of Apple’s strange hardware, but it doesn’t change the fact that it’s broken.
Troll commenters, please continue
November 18th, 2008 at 10:59 pm
One of the greatest things about this blog just happened again today – the troll commenter who randomly finds me via Google, then writes something idiotic, usually with bad Youtube’ish grammar. Now you may be confused – usually these comments make authors feel lousy, and question why they should be trying to help people who just come back with insults. Not me. I love every minute of it.
Why?
Because I’ve found the best thing to do to flame comments – use my Admin powers to edit them to be of the polar opposite opinion. I’m a “dumbass American?” Now I’m a “totally awesome guy”. Apologies, praise, and all sorts of positive feedback on my blog entries show up when the trolls come out. The best part is, the original commenter becomes absolutely fucking furious and posts way more negative comments, which I promptly edit again to say even nicer things about me. There’s no way I can lose here, it’s great!
Now, for those of you thinking that I’m an anti-free-speech asshole, I always am willing to leave comments up that disagree with me – I’ll never change that. Blogs without dissenting comments lose their value, as a means of generating discussion on a topic. If you’re intent on leaving nonsense on my blog however, I will almost certainly be enjoying myself that day….
