Archive for the ‘Linux’ Category
Amazon S3 backup just saved my butt
My LVM volume
As I sit here waiting for dd_rescue to try to salvage what’s left of my 500GB hard drive, and wondering what will become of the 1.75 TB LVM volume that is now toast, I count my lucky stars that I decided a month ago to finally fix that new backup system that I was thinking about – had I not done that, 5000+ photos, including from our wedding, would’ve been down the drain. I feel like I just finished backing up everything yesterday, and it’s already saved me from disaster.
Let me take a step back. For years, I’ve used Logical Volume Manager to manage the volume that I store my movies and TV shows on. The advantage of LVM is that it abstracts away the physical drives – to my apps, it’s just one large hard drive, but I can use the admin tools to add and remove drives without formatting.
That’s all well and good, but the big caveat to running drives like this is that if any one of them fails, the whole volume is gone. I knew this going into it, so I made sure that any data I really cared about had another copy somewhere.
So originally, I had a setup which ended up rsync’ing to this website, which is hosted on Linode, which I very much recommend by the way – I’ve used it for years, support is fantastic, and the site is pretty powerful. That worked fine until I bought my wife a DSLR, and her great but giant photos caused my 4GB of disk space to vanish. So for a long time, I just gave up on backup, and since the photos folder is so gigantic, the only place I could put it was on the previously mentioned storage volume, symlinked to where it should be. Bad developer, no Twinkie!
Why backup using Amazon S3
Amazon S3 as a user-friendly backup solution sucks – it’s extremely developer-centric, and it’s about as friendly as the tram drivers in Berlin, which afaik are some of the least friendly/helpful people on the planet.
However, here’s why it’s great – it’s cheap, and it has no limits. All of the free file storage / backup services top out at ~4 GB or so, and the paid services start out at $50/yr for more than enough space for us. I’ve backed up 20GB on S3 so far, and I’ve incurred about $3.50 in charges. For periodic automated backup, it’s hard to beat S3 as a backend.
If you’re a programmer, just think of Amazon S3 as basically a giant hash table (or Dictionary<string, byte[]> for you .NET people). First, you set up a bucket, which is an instance of the table – you won’t have more than a few of these, and probably need only one for backup. These bucket names are shared among everyone, so you have to make them unique – I just prefix my username to it since I don’t intend to share the files out over the web. Since S3 has no concept of folders, the convention is to just encode the path inside the key (the key can have slashes).
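To make the “path inside the key” convention concrete, here’s a minimal sketch in plain shell. The s3_key_for helper is hypothetical (it’s just an illustration, not how s3sync actually builds its keys), and the folder and prefix names are made up:

```shell
# Hypothetical helper: turn a local file path into an S3-style key.
#   $1 = local backup root, $2 = key prefix in the bucket, $3 = file under the root
s3_key_for() {
    root=${1%/}              # drop a trailing slash from the root, if any
    prefix=$2
    file=$3
    rel=${file#"$root"/}     # path of the file relative to the root
    printf '%s/%s\n' "$prefix" "$rel"
}

# The "folders" only exist as slashes inside the key
s3_key_for "/home/me/Documents" "documents" "/home/me/Documents/taxes/2008.pdf"
# prints documents/taxes/2008.pdf
```

As far as S3 is concerned, that whole string is just one key in the hash table; the slashes only mean something to the tools that display it.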
The great news is that an anonymous coder has done all of the grunt work for you to make the backup scenario work, via a tool called s3sync. Basically, it’s rsync to S3 – you create an initial key to put a root folder under, then run s3sync to copy it over. Since s3sync only copies over the files that are new or have changed, it saves bandwidth (and by extension, cash). Here’s how to run your first sync:
export AWS_ACCESS_KEY_ID="FILLMEINHERE"
export AWS_SECRET_ACCESS_KEY="OMGSEKRITACCESSKEY"
tar -xzvf s3sync*.tar.gz
cd s3sync
chmod +x *.rb # Just to make sure
./s3cmd.rb createbucket yourusername-backup # Only need to do this 1x!
# -r (recursive), -v (verbose), --delete (remove old S3 files that no longer exist locally)
# Sync your Documents folder to S3
./s3sync.rb -r -v --delete "$HOME/Documents" "yourusername-backup:documents"
Seeing what’s on your S3 account
While you’re setting this up and making test runs, it’s pretty useful to be able to see what’s currently in your S3 account. To do this, there’s a great Java applet called Cockpit by James Murty.

Runs in-browser, great for management and verifying the backup worked
Make it Automatic
Now that we know how to do one sync, making it automatic is the most important part – if you have to remember to do it, you’re bound to forget. I put it into a script:
#!/bin/sh
### Make sure to fill in the blanks here!
export AWS_ACCESS_KEY_ID="FILLMEINHERE"
export AWS_SECRET_ACCESS_KEY="OMGSEKRITACCESSKEY"
export BUCKET_ID="paulbetts-backup"
export BACKUP_PATH="/storage"
export BACKUP_KEY="website"
echo "**** Backup start ****"
date
/root/s3sync/s3sync.rb -r -v --delete "$BACKUP_PATH" "$BUCKET_ID:$BACKUP_KEY"
And then, put that script into a cron job, so that it runs at 4am every morning:
0 4 * * * /root/storage_backup >> /var/log/s3backup.log 2>&1
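One thing to watch with a nightly job: if a sync of a big volume ever takes longer than a day, cron will happily start a second copy on top of the first. Here’s a rough sketch of one way to guard against that, using a lock directory. The run_locked function and the /tmp/s3backup.lock path are my own assumptions, not something s3sync provides:

```shell
#!/bin/sh
# Hypothetical wrapper: skip this run if the previous backup is still going.
LOCK_DIR="${LOCK_DIR:-/tmp/s3backup.lock}"   # assumed lock location

run_locked() {
    # mkdir is atomic: it fails if the lock directory already exists
    if ! mkdir "$LOCK_DIR" 2>/dev/null; then
        echo "backup already running, skipping this run"
        return 1
    fi
    "$@"                     # run the real backup command
    status=$?
    rmdir "$LOCK_DIR"        # release the lock
    return "$status"
}

# Usage from the cron script: run_locked /root/storage_backup
```

The catch with this approach is that a crash between mkdir and rmdir leaves a stale lock behind, so you’d have to remove the directory by hand before the next run.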
What if I’m using Windows
If you’re using Windows, this approach is going to be an order of magnitude more annoying, due to the difficulties that Ruby has with the Windows filesystem (backslashes, ACLs, etc) – while you might be able to get it to work, I can’t recommend it. However, one of the developers from Cloudberry Labs emailed me about some tools for Windows centered around S3 backup that look pretty promising, especially the potential for easy automation via their PowerShell Snapin.
gnome-format made Phoronix!
Once I started at Microsoft, I had to abandon an app of mine called gnome-format; apparently, one of the writers at Phoronix picked up on my program and wrote a review on it! Even though the code is dead, it’s still cool to see that it was a pretty positive review. If you want to see the last version of the code (i.e. the one the reviewer was working on), check out:
However, while the review is great, the version that he meant to review was the one written by Felix Kaser, who has started from scratch on a new version of the code in Vala. So, check out his version too, and help him to get it into GNOME where it belongs!
A public service: toolbox-free Planet GNOME
I don’t know why I didn’t think of this sooner – after a few minutes of using the highly underrated Yahoo! Pipes tool, I’ve created a very useful feed – an RSS feed of Planet GNOME, but with the annoying people filtered out.
So far, it’s just Karl Lattimer, because that guy annoys the crap out of me. Give me suggestions on who else is annoying and I might add them to the list as well, or clone my Pipe and make your own Toolbox-free Planet.
Essentials 2008
Continuing from last year’s edition, here’s the software that I use on a day-to-day basis. Because of my traitorous switch to Mac, this list looks quite a bit different than it did earlier. As before, this is cribbing from Mark Pilgrim’s series – his 2008 edition is also full of good recommendations.
- Mac OS X 10.5 – after using a Mac as my primary machine for over a year now, I’m on the fence as to whether I’d ever go back to Linux or not. On one hand, the UI design and software support are great, but there’s just enough proprietary bullshit to make me reconsider. For the time being though, it’s the best there is, because my other favorite OS, Ubuntu, and Linux in general seem to be on a steady decline to crap city. If anyone takes offense to this assertion, I’ll be glad to write up a detailed response as to why it’s a mess.
- F-Spot, delicious, Vim – still continue to kick ass. Because of the wedding, I ended up effectively doubling my picture library, mostly with U’s 2MB pictures from her SLR camera – F-Spot handles it like a champ.
- Git and GitHub – Git is so mind-blowingly useful to anyone who is a developer, power-user, or anyone who works with text files, that I can’t possibly leave it off this list. This program continues to help me out just about every time I program; at work, I use it to manage multiple in-flight hotfixes to certain train-wreck components who shall remain nameless (but not linkless), as well as whenever I have to do any large change to Windows source code. At home, GitHub is a great way to manage my developer tools as well as my personal projects.
Seriously, if you’re any sort of programmer whatsoever, learn Git.
- Quicksilver – it’s hard to describe what QS actually is, the term “app launcher” betrays its real utility, but it’s by-far one of the best reasons to use Mac OS X. Basically, it’s a GUI version of a fast command-line interface, one that learns which commands you use most often and shortens the number of keystrokes you need to use them. Taking some time to learn everything that QS can do pays off quite a bit for your productivity.
- Firefox – continues to be the browser of choice, with its fantastic plugins (the “It’s All Text!” plugin being one of my favorites, since it lets me use GVim to type Emails or this blog entry for example). Great developer tools like Web Developer Toolbar and Firebug make it way better than Safari for most things. Speed and platform-integration are two of the things I do miss though…
- Live Mesh – file synchronization that just works. Works great with both Windows and Mac, and its remote desktop feature, while somewhat anemic, is beautifully simple to use. If you don’t have a backup solution (and if you don’t, you will lose your stuff – storing everything on a USB stick does not count), this is a fantastic way to do it with almost zero work.
- VMware Fusion – solid virtualization software for Mac, great integration with the guest without the evil hacks that Parallels uses (trust me as a Windows developer when I say this, you are much safer with VMware than with Parallels). These days however, I prefer more to use a dedicated VM running on another machine that I can remote desktop into rather than a local VM.
- Cygwin – without this, work would be way more painful. A Windows machine without Cygwin is nearly worthless to me.
- iTunes – …and I f’ing hate it. Please, someone write a music player for OS X that doesn’t epically suck. Amarok 2.0 and Songbird don’t count – Amarok went from the best music player on the planet without question, to a giant pile of gray crap. Trolltech systematically destroyed every decent KDE piece of software by releasing Qt4 and causing everyone to decide to make massive rewrites of their software, but I digress.
Anyways, some of the few things that iTunes got very, very right are how it remembers exactly where I am in a podcast and syncs it with the iPhone, its podcast support in general, and its great sync support with devices (letting me choose whether to auto/manual sync music, making automatic backups of my phone, etc). The iTunes store would also be a gigantic win if it wasn’t so DRM-encumbered, and I would spend way more money there, instead of at Amazon MP3, which is also a fantastic service.
- XBMC – I’ve been using this on the XBox for years, and now that it’s a 100% cross-platform app, it’s even more awesome. Playing back video on your TV with this is fantastic; I stream movies and TV from my desktop machine over wireless and it works near-flawlessly, and it understands just about any format. Setting up a cheap box with XBMC on it is the best way to get your music and movies onto your TV, hands-down. There are also versions for hacked AppleTVs, which for me turn an AppleTV from “complete trash” into “very compelling”. If the AppleTV had decent audio/video outputs, it would’ve been my new media box.
Stuff I don’t use anymore
- Tomboy – only ran on Linux until recently, and is local-only. The original developer also annoys me by coming up with fantastic ideas then abandoning them (I could also point that right towards myself, but anyways…)
- sshfs – I still use this occasionally when I have to traverse firewalls, but for everything local, Samba is faster and a bit less of a pain on OS X
- Unison – since I do much less work on my desktop than I used to, having two-way sync isn’t as useful to me; I just rsync from laptop -> desktop.
- Ubuntu – too many things broken on MacBook Pros, mostly the fault of Apple’s strange hardware, but it doesn’t change the fact that it’s broken.
Yikes! for nerds: how to get the code
I ran out of time yesterday, but as promised, here’s how to build and run Yikes!
Quick Start
git clone git://github.com/xpaulbettsx/yikes.git
cd yikes
rake && rake ffmpeg
ruby lib/main.rb -l /path/to/videos -t /path/to/ipod/output -r 1800 -b
Getting the code
By far, the best way to get the code is via Git; this lets you view the entire commit history, as well as send me changes. If you don’t have Git, you can download precompiled source code trees for Linux or Mac OS X 10.5. The Git clone URL via Github is git://github.com/xpaulbettsx/yikes.git
Building (“Huh? Building? On Ruby?”)
(If you downloaded the precompiled version, skip this part!) Even though the application is in Ruby, we need to build ffmpeg and its associated libraries from source, so you need to have the Xcode tools installed, and you probably need MacPorts as well. While building this takes forever, it’s fairly easy; it’s the same rake && rake ffmpeg step shown in the Quick Start above.
Running the app
Right now, you have to run Yikes! from the command line, but the syntax is pretty easy. Here’s a sample:
# The long version
ruby lib/main.rb --library /path/to/videos --target /path/to/ipod/output --rate 1800 --background
# or if you want the short version
ruby lib/main.rb -l /path/to/videos -t /path/to/ipod/output -r 1800 -b
# If you want to run it on the sample files for development, there’s an easier way
rake run
Yikes! It’s your videos on your iPod!
For a while, I’ve been working on a project that is pretty cool, and I’m finally getting near the “first 90% done” software development mark; now I’ve got the 2nd 90% to get it to production-quality, and the 3rd 90% will make it actually good. Here’s the screenshot:

Yes, the UI is rough-draft, I’ve got to go to town on it in CSSEdit
What’s it do?
In its simple mode, Yikes! will take a directory and convert all the movies to iPod/iPhone format (H.264 MPEG-4’s, so compatible with most players), and it will skip files it’s already converted. This isn’t too far off from what you could do with Handbrake and some clever bash scripts.
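To give an idea of the “clever bash scripts” version of the skip-what’s-done logic, here’s a minimal sketch. This is not Yikes! code; pending_videos is a hypothetical helper, and it assumes the converted file keeps the source’s base name with an .mp4 extension:

```shell
# Hypothetical helper: list the videos in $1 that don't yet have a
# converted .mp4 counterpart in $2.
pending_videos() {
    src=$1
    out=$2
    for f in "$src"/*; do
        [ -f "$f" ] || continue              # skip subfolders and empty globs
        name=$(basename "$f")
        target="$out/${name%.*}.mp4"         # movie.avi -> movie.mp4
        [ -e "$target" ] || printf '%s\n' "$f"
    done
}

# Usage: feed each pending file to whatever converter you like, e.g.
#   pending_videos /path/to/videos /path/to/ipod/output | while read -r f; do ...; done
```

Checking only for the existence of the target is the simplest possible rule; it won’t notice a source file that changed after it was converted.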
However, you can also run the program in background mode, and this is where it gets really useful. You give the program a folder of videos, and a place to put the iPod videos, and it will start a web site that you can go to on another computer, where you can see the converted videos, download them, or (and here’s the clever part), add it to iTunes as a video podcast, which will copy all the videos to your iPod automagically.
Where’s the code?
Github! http://github.com/xpaulbettsx/yikes
Update: Changed URL from earlier, merged webif-ramaze into master
Later today once I’m back at home I’ll put up a “how to get/build the code”, as it’s a little tricky. I’m working on official releases for Mac and Linux, and a Windows port is in the future; while I haven’t been coding towards it, I also have made sure to not choose anything that’s completely impossible for Win32.
Thoughts? Ideas? Comments? Want to help?
Since I’m always busy with work, it’s taken me quite a while to get to this point, and I’m definitely open to accepting contributions and making this a real open-source project; so far, I’ve set up Github and a bug tracker (but no mailing list, forums, documentation, etc). If you’re not handy with coding, websites, or art/design, I would even just appreciate suggestions or ideas for cool features. My Email address is paul at paulbetts dot org, let me know!
Everyone change your SSH keys, hooray
Your keys are teh pwn3d
I’m sure everyone who uses SSH has heard by now, but you need to change your SSH keys if you are using Debian/Ubuntu (or took a key from said OS like I did). If you’re thinking that it’s not a big deal, you’re gonna get put in the hurt locker – the only source of entropy in those keys is the PID of the process that created them. That means there are only 32768 possible keys; it takes a hacker ~20 mins to break into any server he wants.
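For a sense of scale, here’s the back-of-the-envelope arithmetic behind that claim, as a rough sketch. It assumes an attacker simply tries every possible key once, one after another, and uses the figures quoted above:

```shell
keys=32768                   # one possible key per PID
seconds=$((20 * 60))         # the ~20 minutes quoted above
rate=$((keys / seconds))     # integer division
echo "roughly $rate key attempts per second covers the whole key space"
```

In other words, nobody needs anything clever; a plain loop over a precomputed list of keys would do.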
If all of your machines are Debian-based, the best thing for you to do is to just delete all the entries in ~/.ssh/authorized_keys until you can regenerate them and patch all of your systems.
In miscellaneous news
- We’re finishing up the beta for my super-cool project at work today – so far my clever attempt to sneak Ruby/Python through the backdoor at WinSE is turning out wonderfully, mwa ha ha.
- Summer finally seems like it’s here, it’s nice to be done with the crappy weather. I need to find a cool bike so that I can start riding to work. Days like this make me wish I had a dog to walk, that’d be nice too
Getting SSH to connect through a SOCKS proxy
At work, one of the things that has never worked for me is being able to use SSH from my laptop. While Cocoa apps will get redirected properly, command-line apps try to directly connect and fail. I’ve been trying for months to fix it, but could never find any solution that worked. The problem is compounded by the fact that the tool to fix it is not very searchable, and doesn’t compile on OS X.
However, I now have the solution:
- Download connect-proxy.tar.bz2 from my website (which is a patched, compiled version of the Debian package), and unpack the archive
- If you’re on OS X, just copy “connect-proxy” to your /usr/local/bin, otherwise run “make clean && make”, then copy the file
- Go to your “~/.ssh” folder and create a file called config. Insert the following lines:
Host *
ProxyCommand connect-proxy -R both -4 -S proxy.url:1080 %h %p
And now, you’re good to go! But what do those flags do? Here’s a description of some useful flags:
- “-R both” – try a DNS lookup directly, and if it fails, try asking the proxy server
- “-4/-5” – SOCKS4/SOCKS5
- “-S/-H hostname:port” – SOCKS / HTTP-based proxy
- “-d” – Causes connect-proxy to spew out debugging information
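If you’d rather not route every host through the proxy, you can also pass the same setting for a single connection with ssh’s -o option. A small sketch follows; the proxy_command helper is hypothetical, it only glues together the flags described above, and proxy.url:1080 is the same placeholder as before:

```shell
# Hypothetical helper: build the ProxyCommand string from the flags above.
#   $1 = proxy host:port, $2 = SOCKS version (4 or 5)
proxy_command() {
    printf 'connect-proxy -R both -%s -S %s %%h %%p\n' "$2" "$1"
}

proxy_command proxy.url:1080 5
# prints connect-proxy -R both -5 -S proxy.url:1080 %h %p

# One-off use, without touching ~/.ssh/config:
#   ssh -o "ProxyCommand=$(proxy_command proxy.url:1080 5)" user@some.host
```

ssh substitutes %h and %p with the target host and port itself, which is why they’re left as literal text in the string.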
vcachefs – a compelling solution to a remote iTunes library
Per my idea in my previous post, I decided to write my own FUSE filesystem; following the tradition of obscure names that end in ‘fs’ for filesystems, I am calling it vcachefs. Here’s the git repository at Github. I mention more in the README, but the goal is to be able to add network files to iTunes and other media apps’ libraries without the poor performance and random lockups that usually happen when you try to do this directly.
I’m still at the proof-of-concept stage right now, but you can see my plan in the TODO file. As of today, you can create a mirror of ‘/etc’ in some other folder – not too exciting, but now that I’ve got the base stuff working, I can put the caching stuff on top of it.
After I get the app working correctly at the command-line, I may consider making it a full “Mac-like” application (with a pretty icon and everything!) – I think a lot of people want to do this kind of thing, because requiring all of your library to be on the local computer is asinine.
For those who aren’t familiar, FUSE is a kernel interface that lets you quickly write your own virtual filesystems with a minimum amount of work, and without writing kernel code. While this approach isn’t very fast, it’s super-easy to get working, and there’s almost no bookkeeping stuff necessary. This has led people to write very strange yet useful “filesystems”, such as FlickrFS or WikipediaFS (which show Flickr collections / Wikipedia articles in your filesystem).
It’s pretty nice to come back to some old-school C hacking; while cool languages like Ruby and C# have a lot of fancy features, there’s nothing like busting out a copy of Vim and going to town.
Moving git repositories to SVN/TFS
Git is great!
At my job, I’ve been working on a project for several months now. Since I needed a cheap, easy VCS, and I wanted to learn the next cool thing, I decided to use Git to shuffle files between computers and keep a history of commits. It turns out, Git is really, really awesome – beats the pants off of the customized Perforce we use at work. The individual pieces of our project are starting to come together, so one of my coworkers started the arduous project of setting up Team Foundation Server. I was interested because of its integration with VS (especially being able to see the changes annotated in the code in the gutter – very cool), as well as the “Trac-like” stuff that I’ve never bothered with in my personal projects.
The plan from 10,000 feet
Unfortunately, getting my Git history into TFS hasn’t been the easiest thing, but it can be done. If you’re not familiar with TFS as a VCS, it’s pretty similar to Perforce (surprise!), but without a solid command-line client. Fortunately, via Scott Hanselman’s Blog, I found SVNBridge. Basically, you give it a TFS server, and it’ll pretend to be a local SVN server. This program is under really heavy development (aka “isn’t finished yet”), so you’re gonna run into some problems; run it under VS 2005/2008 debugger to see what goes wrong, it’s usually some dumb TFS rule.
Since Git has a really great tool called git-svn which allows us to pull and push to Subversion repositories, we’ll use this hacky structure to push our Git repository to a new blank TFS repository. Note that this method works for a regular Subversion repository too, just use your SVN URL instead of SVNBridge’s fake URL. We do some trickiness later because git-svn dcommit doesn’t seem to work as advertised, so we have to specifically commit each revision using git-svn set-tree (not manually, but it’s still hacky).
The guts – copying an example repo from Git to Subversion
Start up SVNBridge (under the debugger!), and point it to your TFS repository. Make sure this repository is blank (or at least doesn’t have any of your project files in it), or else TFS will have a fit. However, if the repo is completely empty, git-svn will have a fit, so make sure to commit some blank file to the TFS/SVN repository first. git-svn needs this to base its merge on.
In this case, our Subversion “repository” is mapped using SVNBridge to http://localhost:8081, and our git repo is at http://git.example.com. Now here’s the command-line magic – make sure to run this in a Bash (or zsh/ksh/whatever) shell, not under cmd – git doesn’t like non-sh shells.
mkdir tmpwc && cd tmpwc
git-svn clone http://localhost:8081/SomeProject .
# Merge in our source Git repo
git pull http://git.example.com
# Grab all of the commits and write them to a file, from first to last
# If you want to, you can manually edit this file first
git log --pretty=oneline --reverse > commits
# Commit them all, one by one (the first field of each line is the commit ID)
cut -d ' ' -f 1 commits | xargs -L 1 git-svn set-tree
If you’re lucky, that worked. If not, keep hacking at it and you’ll eventually get it to do your bidding.
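If the last pipeline looks like line noise, here’s what the cut step does on its own, as a small sketch. The commit_ids function and the two sample lines are made up for illustration; real IDs are 40-character hashes:

```shell
# Hypothetical helper: read "oneline" log output on stdin and print
# only the first space-separated field, i.e. the commit ID.
commit_ids() {
    cut -d ' ' -f 1
}

printf '%s\n' 'a1b2c3 Initial import' 'd4e5f6 Fix the build' | commit_ids
# prints a1b2c3, then d4e5f6, one per line
```

Each of those IDs then gets handed to git-svn set-tree one at a time, oldest first, which is why the log was written with the reverse option.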