s3putsecurefolder

Linux, Open Source, Tech, Web 4 Comments

Edit: this script is deprecated in favour of a rewritten version 2.

I use Amazon S3 to host large media files which I want cheap scalable bandwidth on, and for expandable offsite storage of important backups. I used to have some simple incremental tar scripts to do my offsite backups, but since I moved to Bacula, I’ve just established an alternative schedule and file set definition for my offsite backups, the critical subset of data I couldn’t possibly stand to lose (like company documents). Since I was refreshing all my procedures and tarring the Bacula volumes no longer made any sense, I rewrote my script for putting the resulting backup data on S3.

The prerequisite in all cases is s3cmd, which is pretty mature now and available on most distros (“apt-get install s3cmd” and you’re done on Ubuntu). s3cmd actually has a ‘sync’ command, but firstly that tries to sync in both directions, which I don’t want (I know in theory it should never overwrite any local version so long as I don’t update the remote copies from somewhere else, but I’m paranoid when it comes to my backups and prefer to be explicit), and secondly it obviously has to connect to S3 to determine the sync status, whereas I always know whether I need to upload new files just from my local environment (and S3 charges per request – not much, but it’s not zero and it’s the principle of the thing). So, I decided not to use the ‘sync’ command, and just determine locally what new files I needed to ‘put’ on the server.

Secondly, encryption is a must, since some of the data is sensitive and I don’t want to trust anyone else with it. I used to manually GPG my tarballs before uploading them, but I noticed that s3cmd supports an encryption option too. It just uses GPG anyway, just in symmetric form rather than asymmetric like my version did (translation: you use the same passphrase for encryption and decryption; a little less secure than using generated public/private keys but still ok so long as you pick a good passphrase and look after it). The default symmetric algorithm in gpg is CAST5 which seems pretty good, although you can change it if you want by editing your s3cmd config file. So, I decided to give it a try – after you configure s3cmd to use encryption, it actually automatically decrypts too when you pull the data back (symmetric key, remember) – being distrustful, I pulled the data back from S3 in a different environment and examined it, and it was indeed complete gibberish, but decipherable with the passphrase. Good stuff.

So, here’s my little script which will upload the contents of a folder to S3 – just the contents that have been added or updated since the last sync of that folder – and will encrypt them by default. I just run this on a cron schedule now and it seems to work fine. License is MIT, use at your own risk, no warranty is given that it won’t destroy every file on your machine or eat your children. Usage is like this:

s3putsecurefolder /my/source/folder my.s3.bucket

Edit: it was brought to my attention that Amazon have made it easier to create pseudo-folder structures in S3 buckets since I last tried to do it (I swear it used to throw out keys with forward slashes in them, I had to mangle my names last time I did this), so I’ve updated the script to allow nested folders too.
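
For anyone curious how the “work out locally what’s new, then put it” idea might look, here is a minimal sketch. It is not the actual s3putsecurefolder script: the marker file name (.s3put_last_sync), the function names and the S3PUT_CMD override are all assumptions made up for illustration. The only s3cmd feature it relies on is ‘put’ with the --encrypt option.

```shell
#!/bin/bash
# Sketch only, NOT the original s3putsecurefolder script.
# MARKER_NAME is a hypothetical file whose timestamp records the last upload run.
MARKER_NAME=.s3put_last_sync

# Print files under folder $1 modified since the marker was last stamped
# (or every file, if there is no marker yet).
list_changed_files() {
	local src="${1%/}"
	if [ -e "$src/$MARKER_NAME" ]
	then
		find "$src" -type f -newer "$src/$MARKER_NAME" ! -name "$MARKER_NAME*"
	else
		find "$src" -type f ! -name "$MARKER_NAME*"
	fi
}

# Upload changed files from folder $1 to bucket $2, encrypted by s3cmd.
# S3PUT_CMD is a hypothetical override so the logic can be tried without s3cmd.
put_changed_files() {
	local src="${1%/}" bucket="$2" f
	local uploader="${S3PUT_CMD:-s3cmd}"
	# Stamp the start time first, so files changed mid-run get retried next time
	touch "$src/$MARKER_NAME.new"
	list_changed_files "$src" | while IFS= read -r f
	do
		# The path relative to the source folder becomes the key,
		# which keeps nested folders intact in the bucket
		"$uploader" put --encrypt "$f" "s3://$bucket/${f#"$src"/}" || exit 1
	done || return 1
	# Only move the marker forward once every upload has succeeded
	mv "$src/$MARKER_NAME.new" "$src/$MARKER_NAME"
}
```

With those functions loaded, a call such as put_changed_files /my/source/folder my.s3.bucket would do the upload, and setting S3PUT_CMD=echo first gives a dry run that just prints the s3cmd commands. File names containing newlines would trip it up, which is one of several reasons to treat it as a sketch.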

YouTube putting the bullet in IE6’s head?

Internet, Tech, Web 9 Comments

Oh, please let this come to pass soon. TechCrunch reports that YouTube is due to drop support for IE6 ‘soon’, pointing users at Chrome (obviously), IE8 and Firefox 3.5. Finally, one of the worst pieces of software ever to pollute the Internet with its presence is getting taken out to the barn with a double-barrelled shotgun, and not a moment too soon.

Sure, Digg already said they might do this, but YouTube is far more significant; if YouTube stop supporting IE6, then in practice it means I can too :) Bye bye IE6, please do let the door hit your ass on the way out, preferably hard enough to fracture your pelvis. It’s the least you deserve for making life hell for hundreds of thousands of web site maintainers over the last few years.

Browsers just aren’t sexy anymore

Tech, Web 4 Comments

I’ve been running Firefox 3.5 and Internet Explorer 8 on this machine for a little while now. Both are worthy upgrades to their line, addressing their previous shortcomings quite nicely – Firefox is now faster and more importantly leaner on memory use, and Internet Explorer seems to have mostly shaken off the dull, bare bones feel that it’s had in the past, and is definitely faster and more standards compliant.

I actually feel I could use any of Firefox, IE, Safari, Chrome or Opera now and be fairly happy. I’m sticking with Firefox, because the addons I use still keep it ahead of the alternatives as a user experience for me – the reason that I find it better is, I think, that a vibrant community inherently produces enough breadth that I can always find things which make a substantive improvement to the way I personally want to use the browser. No matter how many snazzy features a single team decides to put in a browser, they’re never going to hit the mark with everyone, and I find that only a small percentage of the in-the-box feature points of IE8, Safari or Chrome are of any real interest to me.  That’s why just a speed bump and memory optimisation was all I really needed from Firefox 3.5; I make my own recipe of must-have features from the community instead.

But still, the days of ‘browser X sucks compared to browser Y’ seem to be mostly over for the moment, as competition has levelled the playing field to the extent that it’s mostly personal preference on the small things that remain.  That’s a huge improvement from Microsoft particularly, who deserved their reputation for producing terrible browsers in the past, but who I think have now earned the right to shake off that reputation. As much as it’s a difficult adjustment to make, IE is no longer a bad browser. It’s just another decent browser that is missing my Firefox addons ;)

tengrandisburiedhere.com

Comedy, Internet, Web 19 Comments

Oh, this is so ripe for satire I really can’t believe Microsoft didn’t see it coming. Or, perhaps they did and just ran with it anyway, for funsies. It appears Microsoft’s Australian website is encouraging people to switch to IE8 by offering an online treasure hunt, where a series of clues will lead you to a site identifying the location of the $10k (AU$ presumably), which can only be viewed with IE8. They gleefully point out:

“But you’ll never find it with old Firefox. So get rid of it, or get lost.”

So, let’s stack up the issues here:

  • Microsoft has resorted to offering a monetary incentive to encourage people to use its free browser. Is that an admission that based on just the merits of the product itself, IE8 probably wouldn’t be the user’s first choice? I’d guess that people who actually choose their browser (rather than accepting what they get preinstalled) are not that likely to pick IE8.
  • ‘old’ Firefox? Last I checked, IE predated Firefox by some years, and the latter has a new version coming out in mere weeks. Resorting to empty name-calling now? Dear me.
  • Websites that only work in IE? Wow, welcome back to 1999 guys. ActiveX, Outlook Express bindings – ah, the memories. The horrible, eye-watering memories.

A Mozilla dev has already fired back a response, but really I don’t think it needed one. I think the fact that this promotion exists at all, and the tone which it takes, speaks volumes about how much the browser landscape has changed in recent years.

Opera Unite – another step in the right direction?

Internet, Tech, Web 3 Comments

I’ve harped on many times about how I think centrally controlled services like Facebook are the antithesis of what the Internet was supposed to be about – a distributed, decentralised place with authority controlled at the leaves by those with most interest in maintaining it, rather than some corporate hub holding all the cards.

Well, it seems like a small bunch of companies are starting to latch on to this idea too, a welcome respite from the huge number of ventures that just want to be the new singular nexus of your internet life. Google Wave certainly ‘gets it’, if the reality reflects the stated vision where the open-source software can be run anywhere, not just on Google’s servers. And Opera Unite is making the right kind of noises for me too, even if right now the service is embryonic.

In essence, it’s a semantically richer, more secure version of BitTorrent – the ability to share files, photos and media within interfaces dedicated to that purpose, serve web pages, and chat, but by making direct connections with your peers rather than going through a centralised hosting service. Opera Unite provides the software to perform the hosting from your own devices, and provides the discovery and network trust systems to allow people to hook up.

There are lots of issues with this approach of course – such as whether you trust the hosting software not to punch holes in your local security, whether you really want to have the bandwidth issues of self-hosting, what happens when your machine is off, etc. Right now, I don’t think it’s that workable as a replacement for centralised systems, but that’s not the point – the point is that the principle of entrusting all your unencrypted data to a single online entity is eventually not going to be good enough anymore, and we need to be developing alternative approaches. If the future is truly in the cloud, we need far more than what the cloud offers right now – which is to say services that while user-friendly, require you to give up far more control over your data than is feasible for anything remotely important. Sure, you’re happy to put photos on Facebook, and Twitter about all those things that you don’t mind the world knowing, but that’s a very specific, non-critical subset of the data we all increasingly need to hold. Would you be happy to scan your bank statements and put them on Facebook, even if you set them to private? Of course not – but if the cloud is to realise its potential, these are the kinds of harder applications we need to try to address.

I’m not saying Opera Unite addresses that – not even close. But the fact that people are exploring alternative approaches to the 100% centralised model is a positive sign to me. We need to start tackling how we use entirely public transport & repository systems (ie the cloud) to securely store and exchange important and sensitive data, and I say that’s impossible to address with an entirely centralised model, because a centralised model focusses control in too few hands. Encryption gives us the ability to store and transport secure information in plain sight, but it’s traditionally a very tricky thing to make easy to use for the general public, particularly when multiple parties and ‘controlled’ sharing is required. Thus, one approach is to focus on securing the transport instead (which is easier, and why SSL is ubiquitous) and lock down access to the leaves more tightly. Opera Unite is an experiment in the leaf model and may well inform the process, leading to more innovation in this area down the road.

Xbox Support survey – #fail

Comedy, Web 2 Comments

Now that I’ve had my 360 fixed & returned, I’m being pestered with requests to fill in a survey about my experience. Ignored the first two, since I was neither ecstatic nor furious about my support experience, so it would make a particularly tedious ‘ok I guess’ response. But, they’re insistent with their damnable reminder emails, so I tried to do it.

I got right to the end screen, and then got this:

Microsoft OLE DB Provider for ODBC Drivers error '80040e31'

[Microsoft][ODBC SQL Server Driver]Timeout expired

/CSatCalSrvy/LibProcess.inc, line 216

Heh. So, where do I fill in a survey about how badly the survey system worked? ;)

[Edit] Ah well, 4th attempt worked at least. And to be fair, they farm this out to a 3rd party research firm anyway. But, still funny.

Google Wave – email finally RIP?

Internet, Open Source, Tech, Web 12 Comments

Many people have declared email to be dead in the past, and they’ve all been wrong. The typical play has been from instant messenger advocates, and most recently from Facebook. But, while these options have been a valid all-encompassing solution for teenagers and students, I haven’t met a single serious modern IT user whose life isn’t still driven primarily by email. There’s a reason that Outlook and Exchange are such consistent cash cows for Microsoft, and so many business people own Blackberrys. IM, Facebook & Twitter may represent certain facets of your online existence, but if push really came to shove, and you were only allowed to use one electronic service, I bet you every gadget I own that almost everyone will opt to keep their email over anything else.

I certainly could not operate without my email, but after watching the demo of Google Wave, I saw for the first time something that could genuinely be better, without leaving me with a gaping functionality hole in the name of ‘progress’. In retrospect, the idea seems incredibly obvious, but I’m sure the implementation was tricky.

In essence, Google Wave is basically a fusion of email and IM functionality. You still compose emails, reply to them, and include people in the threads, but the whole functionality set can also operate basically in real-time, just like an IM client. Whether something is instantly transmitted interactively depends on whether the person you’re sending it to is online, and some preferences of your own. There were some nice geeky demo things like instant translation via a bot too, but the most important thing to me was how holistic it was. One of the major problems I have is connected with using a mixture of IM and email communication, particularly with clients but also friends. I might remember that I had a conversation with X about Y, and want to go refresh my memory about it, but I don’t remember whether we talked about it on email, or on IM (and whether it was Skype, GTalk, MSN, and which machine I was on at the time). Looking up this information is a pain, because my email is one island of information, and my IM conversations are many separate islands. Being able to search across the whole thing in one swoop, from any PC/device, and see all the conversations, both deferred email-like and instant, all in their original threaded contexts, would be absolutely fantastic. It would give me value in my working life, instantly, right now.

They could go even further, by supporting VOIP calls through it too, and have the option to use voice recognition to transcribe it (or record it as well), or even just to log that the conversation happened and allow me to add a few manual notes to it. I’d imagine this would be early on the list of extensions, by Google themselves or by external developers, since they’re encouraging people to use the API. And I can imagine that linking all this up with Google Maps, Google Calendar, Google Docs etc will have a multiplicative effect.

So I’m pretty excited about Wave. It’s the first collaboration tool I’ve seen that could genuinely replace my email (and IM), although I’d then have to tackle the very real question of whether I really want to give Google control of such critical parts of my electronic life.

Bing – decent Google clone, but where’s the snazzy?

Tech, Web 8 Comments

I, like many people, viewed the Bing marketing video last week, which promised not to create a search engine, but to create a ‘decision engine’ – if you winced at this blatant attempt at the ‘game changing switcheroo’, congratulations, you can join me on the ‘jaded technology observer’ bench. Despite my distaste at having to swim through the murky waters of marketing blurb in this video, the demonstration looked pretty nice – showing how the ‘decision engine’ picks flight details, product reviews and other things out of your search terms and provides context-sensitive recategorisations, such as by price and specifications if digital cameras are what you were searching for. I could definitely see a use for that sort of thing, provided the ‘decision’ part of the engine didn’t get too zealous, start finding unintended patterns, and obscure the regular search results.

Now that I’ve had a few minutes to play with the real thing today, I needn’t have worried. Because it seems that no matter what I type into the search engine, it just gives me regular search results – even if I try to recreate some of the examples in the demo from last week, like searching for digital cameras. Is the ‘decision’ part of the engine just turned off right now? Because as it stands, what I see is a pretty competent Google rip-off, which technically is no mean feat, but the world doesn’t need another Google. We already have one.

Maybe someone will flip a switch and it’ll suddenly start doing the magical things the marketing video talked about.

Edit: ok, I see that the filtering and review features are in the Shopping section, but Google has these in its own Shopping section too. Someone tell me, exactly what is Bing doing that Google hasn’t been for the last several years? Am I just blind to the genius of the ‘decision engine’?

Who cares about #fixreplies?

Internet, Web 2 Comments

So, the intertubes are awash today with people venting their spleens about Twitter’s decision to stop sending replies by people you do follow, but to people you don’t follow, to your main Twitter feed. Previously you had the option either way, and now some people are getting their panties in a bunch about it.

There are two things to say about this issue:

  1. Personally, I don’t want to see all the random replies to other people I don’t follow. I already deliberately only follow a small number of people, because frankly I don’t have time to sift through a huge list of tweets every day. I have absolutely no idea how anyone copes with following more than about 10 people who tweet regularly, and still gets something done in the day, never mind seeing all the secondary replies. Am I just inefficient at processing large numbers of posts, or do I just have a staggeringly lower level of patience than the average Twitter user? The way I have things right now, I read every one of the posts from people I follow, because I consider them interesting, and that takes little time. I couldn’t do that if I was following 100 people and their replies to other contacts too, so I’d either have to lie (ie stick to etiquette and follow them, but then filter out most of what they say on the client), or just spend all day reading Twitter. So personally, this seems a sensible choice – you can always use the Twitter web if you really have nothing better to do but surf Twitter, or browse your friends’ ‘following’ lists if you’re desperate to mine the system for new contacts.
  2. Twitter is free. If you paid nothing for a service, you are entitled to offer your constructive feedback which the providers may choose to listen to, but you are not entitled to have a major tantrum about it. As Matt Asay suggests, if you care about the service that much, then you should probably be paying for it – and God knows, Twitter needs a business model other than the typical Web 2.0 “Attract viewers …….  profit!” fantasy right now. On the whole, the Internet needs a slap to wake up its users from the bloated sense of entitlement they’ve developed over the years, fueled by a huge number of startups that delude people into thinking they can expect everything for nothing. 100% free models don’t work (yes, I know, I’m an open source advocate, but that doesn’t mean I believe that you can give everything away) – they are a complementary aspect, or a stop-gap until you can develop a real model or persuade some sucker to acquire you before the hype train grinds to a halt. Eventually, these cycles of pretending that you can get premium service for free will end, and everyone will have to face up to the reality that ‘freeloaders’ have a place (building momentum, awareness etc), but ultimately they’re at the bottom of the food chain. Plankton are vital to the oceanic ecosystem, but no-one asks them for their opinion. ;)

Wordpress upgrade script

Web No Comments

Keeping web software up to date is a pain, but failing to do it can have serious consequences. Some bits of software are easier to keep up to date than others, but one thing I never like doing is using web-based upgraders. They may be convenient, but for a start they require that you give the web server far more file permissions than any sane person would want to during the upgrade process, and besides, any kind of ‘black box’ upgrade makes me nervous.

Wordpress is a fairly easy one to update, but even so requires some manual fiddling if like me you shy away from fully web-based upgraders. So, I wrote a simple script to automate it, including backing up previous files and the database in case anything goes horribly wrong. I’ve tested this in 3 environments so far (my local test server and 2 separate live sites) and it upgraded WordPress installs running both 2.5.x and 2.7.x to the latest version without any problems. Certainly saved me time once I’d written it and removed the element of random human error (replacing it with predictable automated error ;) ). I figured someone else might find it useful.

Disclaimers: this script is presented AS-IS and I take no responsibility for any effects of using it, use entirely at your own risk. In particular, it only backs up the core Wordpress tables and assumes that you used the default ‘wp_’ table prefix, so if you have plugins which need extra tables, or used a different prefix, you will need to alter the script if you want a valid database backup. It’s also only applicable to Linux servers, although that’s the most common setup for Wordpress anyway.

#!/bin/bash

# Wordpress upgrader, run from the root of your Wordpress install like this:
# /path/to/wpupgrade.sh /path/to/wordpress-x-x-x.zip
# (tar.bz2 / tar.gz archives also supported)

if [[ "$1" == "" ]]
then
	echo Required: Wordpress archive parameter
	exit 1
fi

if [ ! -e "$1" ]
then
	echo "Archive $1 does not exist"
	exit 1
fi

if [ ! -e wp-config.php ]
then
	echo This script must be run from the Wordpress root directory
	exit 1
fi

# Back up the database (rm -f so a missing earlier backup is not an error)
echo Backing up DB...
rm -f wordpress_db_backup.sql.bz2
read -r -p "Database: " db
read -r -p "Username: " user

# -p with no value makes mysqldump prompt for the password
mysqldump -u"$user" -p "$db" wp_comments wp_links wp_options wp_postmeta wp_posts wp_term_relationships wp_term_taxonomy wp_terms wp_usermeta wp_users > wordpress_db_backup.sql || exit 1
bzip2 wordpress_db_backup.sql

# Back up old files
echo Backing up files...
rm -f wordpress_file_backup.tar.bz2

tar -cjf wordpress_file_backup.tar.bz2 wp-includes wp-admin wp-content wp-*.php index.php xmlrpc.php || exit 1

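# Unpack the new release into a scratch directory
rm -rf tmpwordpress
mkdir tmpwordpress

case "$1" in
	*.zip) unzip "$1" -d tmpwordpress/ || exit 1
	;;
	*.tar.bz2) tar -xvjf "$1" -C tmpwordpress/ || exit 1
	;;
	*.tar.gz) tar -xvzf "$1" -C tmpwordpress/ || exit 1
	;;
	*) echo "Unsupported archive type: $1"
	exit 1
	;;
esac

# Stop before deleting anything if the archive did not unpack as expected
if [ ! -d tmpwordpress/wordpress/wp-includes ]
then
	echo "Archive $1 does not look like a Wordpress release"
	exit 1
fi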

rm -rf wp-includes
cp -R tmpwordpress/wordpress/wp-includes ./

rm -rf wp-admin
cp -R tmpwordpress/wordpress/wp-admin ./

cp -f tmpwordpress/wordpress/*.php ./
cp -f tmpwordpress/wordpress/*.html ./
cp -f tmpwordpress/wordpress/*.txt ./ 

rm -rf tmpwordpress

echo Files updated, now go into Wordpress admin to finish the upgrade
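
On the table prefix caveat in the disclaimer: one possible way round it is to read the prefix from wp-config.php rather than hardcoding ‘wp_’. The sketch below is not part of the script above, and the function names get_table_prefix and core_tables are invented for illustration. It assumes the usual single-line $table_prefix = '...'; assignment in wp-config.php, and it still only covers the core tables, not any that plugins add.

```shell
#!/bin/bash
# Sketch only: hypothetical helpers, not part of the upgrade script above.

# Print the value of $table_prefix from a wp-config.php
# (path in $1, defaulting to the current directory).
get_table_prefix() {
	# Matches lines like: $table_prefix  = 'wp_';
	sed -n "s/^[[:space:]]*[$]table_prefix[[:space:]]*=[[:space:]]*['\"]\([^'\"]*\)['\"].*/\1/p" "${1:-wp-config.php}" | head -n 1
}

# Print the core table names from the script above with prefix $1 applied.
core_tables() {
	local prefix="$1" t out=""
	for t in comments links options postmeta posts term_relationships term_taxonomy terms usermeta users
	do
		out="$out$prefix$t "
	done
	# Drop the trailing space
	echo "${out% }"
}

# Possible use in the backup step (left commented out here):
# prefix=$(get_table_prefix wp-config.php)
# mysqldump -u"$user" -p "$db" $(core_tables "$prefix") > wordpress_db_backup.sql
```

The $(core_tables ...) call in the commented usage is deliberately left unquoted so that the shell splits it into separate table names for mysqldump.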

As an aside, Wordpress 2.7.1 has a bug in it that can sometimes cause saving posts to fail with a message about htmlspecialchars_decode (such as when using the syntax highlighter in the above code). If you hit it too, the fix is in this bug report.