Posts Tagged ‘amazon’

Britain’s Got Talent and The Cabinet Office – Ubuntu bedfellows in the public cloud.

// June 6th, 2011 // Comments Off // Uncategorized

Last week saw the culmination of one of the UK’s most popular TV shows – Britain’s Got Talent. The way in which this show has, over five series, captured the attention of the British public is quite incredible, with the majority of popular media outlets dedicating significant space to the contestants, the judges, rumours about the format and speculation about who would win.

Such coverage and excitement means that Britain’s Got Talent drives audience and voter engagement to levels that politicians must dream about. Of course there are many ways that the show makes sure it gets our attention, not least of which is having hours of live coverage on prime time television, but the talented team behind the show are also using many techniques to encourage deeper engagement for a modern audience.

Take, for example, the Buzz Off game, which lets viewers play along while watching the show, using a mobile or web-based application to ‘buzz’ the acts they don’t like. The buzzes are stored, and a running total per act is shown on the website, so that the audience goes from being a passive viewer to an active participant in the show. The Buzz Off game is developed by Livetalkback for the Britain’s Got Talent team, and Malcolm Box, CTO of Livetalkback, recently explained to a group of London big data enthusiasts some of the challenges in building and designing an application that must scale to almost Facebook-like proportions for a short period of time. The full presentation is below, but for convenience some of the key points are:

  • The volume of traffic handled by the Buzz application during a two-hour live show is equivalent to 130 billion requests per month – excluding Google, this would make the application roughly the second-largest website in the world, behind Facebook.
  • To manage this scale, the application is based on Ubuntu Server, MySQL and Cassandra, all hosted in the Amazon Public Cloud.
  • The service uses hundreds of instances that must be brought online very quickly as additional capacity is required and then released as the load declines after the show.
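For a sense of what the monthly-equivalent figure above implies, a quick back-of-envelope calculation (assuming a 30-day month) converts it into a sustained request rate:

```python
# Back-of-envelope: what "130 billion requests per month" means as a
# sustained rate, assuming a 30-day month.
SECONDS_PER_MONTH = 30 * 24 * 3600
monthly_equivalent = 130e9

sustained_rps = monthly_equivalent / SECONDS_PER_MONTH
print("Sustained rate: about %.0f requests/second" % sustained_rps)
```

That works out to roughly 50,000 requests per second for the duration of the show – a rate the application only has to sustain for two hours, which is exactly why elastic cloud capacity fits so well.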

Malcolm and the team at Livetalkback have done an incredible job to put this together in a short space of time and have it work reliably throughout this year’s programme. A cloud-based approach made perfect sense for an application with such specific scaling requirements, and it was vital that the application scaled not only technically but financially as well. This is where Ubuntu on Amazon really proved its worth – customers pay for the resources they use and there are no license fees or royalties to worry about when bringing up new instances. It is exactly the kind of efficient audience engagement that, once again, Government departments must be in awe of.

Which brings us onto the Cabinet Office. The UK Government is looking for ways to provide cost-effective online systems that drive audience engagement. Recently there have been signs of progress through the Alpha.gov.uk project led by Martha Lane Fox. Alpha.gov.uk is a prototype site that demonstrates how digital services could be delivered more effectively and simply to users through the use of open, agile and cheaper digital technologies. It is only a prototype at the moment, but it is significant in that it has been quickly put together and delivers exactly what it is supposed to do in a cost-effective way. So how did they do it? They decided on a similar architecture to Livetalkback’s – open source software based on Ubuntu Server in a public cloud. Full details of the technology used are at:

http://blog.alpha.gov.uk/colophon

British tax payers will take heart from the knowledge that someone in the Cabinet Office is looking at this and, hopefully, wondering why more Government services can’t be delivered like this. When it comes to engaging an audience and encouraging interaction in a cost-effective way, Britain’s Got Talent and the Cabinet Office now have more in common than you’d think.

Cloud Portal one-click launch

// May 11th, 2011 // Comments Off // Uncategorized

Announcing two nice little additions to the cloud portal AMI tool that should make everyone's cloud life a little easier:

  • Amazon now allows passing parameters to the AWS console so you can choose the region and AMI ID to launch. This is now integrated into the cloud portal: you search for the AMI, click the link, and it's pre-selected, ready to be launched in the AWS console. Here's a screenshot:

[Screenshot: cloud portal one-click launch]


  • As that image also shows, you can now search for "cluster" to get Ubuntu cloud cluster-ready images!
Want a feature added? Have any thoughts or comments? Let me know, leave me a word.
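Under the hood, the one-click launch boils down to building a console link from a region and an AMI ID. Here is a hypothetical sketch; the query and fragment parameter names are an assumption for illustration, not Amazon's documented format:

```python
# Hypothetical sketch of building a pre-selected console launch link.
# The "region" query parameter and "launchAmi" fragment are assumptions
# for illustration; check the portal/console for the real format.
def launch_link(region, ami_id):
    return ("https://console.aws.amazon.com/ec2/home"
            "?region=%s#launchAmi=%s" % (region, ami_id))

print(launch_link("us-east-1", "ami-XXXXXXXX"))
```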

My Experience With the EC2 Judgment Day Outage

// April 21st, 2011 // Comments Off // Uncategorized

Amazon designs availability zones so that it is extremely unlikely that a single failure will take out multiple zones at once. Unfortunately, whatever happened last night seemed to cause problems in all us-east-1 zones for one of my accounts.

Of the 20+ full time EC2 instances that I am responsible for (including my startup and my personal servers) only one instance was affected. As it turns out it was the one that hosts Alestic.com along with some other services that are important to me personally.

Here are some steps I took in response to the outage:

  1. My instance was responding to pings, but no other services were responding. I tried a reboot through the EC2 API, but that did not seem to take effect, as the instance never stopped responding to pings.

  2. I then tried to stop/start the instance as this is an easy way to move an EBS boot instance to new hardware. Unfortunately, the instance stayed in a “stopping” state (for what turned out to be at least 7 hours).

  3. After the stop did not take effect in 20 minutes, I posted on the EC2 forum and noticed that a slew of others were having similar issues.

  4. I was regularly checking the AWS status page and within a short time, Amazon posted that they were having issues.

  5. With an instance wedged, my next move was to bring up a new replacement instance. This is fairly easy for me with complete documented steps and automated scripts that I have used in the past in similar situations.

    Unfortunately, my attempts to bring up new EBS boot instances failed in all four availability zones in us-east-1. In three of the zones the new instances went from pending to terminated. In the fourth zone, I got the insufficient capacity error for the c1.medium size.

    At this point, I realized that though my disaster recovery strategy for this particular server did span multiple availability zones, it did not extend beyond a single EC2 region. My databases are backed up outside of EC2, so I won’t lose key data, but there are other files that I would have to recreate in order for all systems to operate smoothly.

  6. I decided to trust Amazon to bring things back faster than I could rebuild everything in a new region and went to sleep for a few hours.

  7. Shortly after I woke up, I was able to start a new instance in a different availability zone. To move the services from the wedged instance to the new instance, I:

    a. Created snapshots of the EBS volumes on the wedged instance.

    b. Created new volumes in the new availability zone based on those snapshots.

    c. Stopped the new EBS boot instance.

    d. Detached the EBS boot volume from the new instance.

    e. Attached the new EBS volumes (created from the snapshots) to the new instance.

    f. Started the new EBS boot instance.

  8. Sure ‘nuff. The new instance looked and ran exactly like the old, wedged instance and I was able to assign my Elastic IP Address to the new instance and move on with my day.
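The volume swap in step 7 can be sketched as one call sequence against an EC2 connection object. The method names below echo the boto library of the era, but treat the exact signatures as assumptions; real code must also poll until each snapshot and volume becomes available, and handle device mappings, so this is an illustration rather than a drop-in script:

```python
# Sketch of steps 7a-7f: snapshot the wedged instance's volumes,
# recreate them in the new availability zone, and swap them onto the
# stopped replacement instance. `conn` is assumed to expose boto-style
# EC2 methods; exact signatures are an assumption for illustration.
def swap_volumes(conn, wedged_volume_ids, new_instance_id, new_zone):
    snap_ids = [conn.create_snapshot(v) for v in wedged_volume_ids]   # 7a
    vol_ids = [conn.create_volume(new_zone, snapshot=s)               # 7b
               for s in snap_ids]
    conn.stop_instances([new_instance_id])                            # 7c
    conn.detach_volume(instance_id=new_instance_id)                   # 7d
    for v in vol_ids:                                                 # 7e
        conn.attach_volume(v, new_instance_id)
    conn.start_instances([new_instance_id])                           # 7f
    return vol_ids
```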

Notes

  • Since AWS does not post their status updates to Twitter (that I’ve found), I follow ylastic. Ylastic sends out updates on Twitter as soon as Amazon updates the status page. In fact, Ylastic often sends out problem alerts before Amazon updates their status page!

  • Though I got many errors in trying to run API commands, I kept trying and some requests got through so that I was able to accomplish my recovery well before Amazon announced that things were getting better.

  • Judgment Day = 2011-04-21 (in some timelines)

Lessons learned

I’m happy with AWS and my current strategies except for one weakness: I need to keep backups of everything required to bring up my servers in a different EC2 region, not just a different availability zone. This would also work for moving to a different provider if that became necessary.

At my company, Campus Explorer, we have implemented this higher level of disaster recovery and have even tested it with a fire drill. We were able to recreate our entire service on a different hosting provider using only off-AWS backups in about 24 hours of work. Fortunately for my sleep last night, none of our company services were affected in today’s EC2 outage.

Original article: http://alestic.com/2011/04/ec2-outage


Secure, flexible and scalable Amazon EC2 instance preseeding

// January 3rd, 2011 // 9 Comments » // Uncategorized

I'd like to introduce Joe. He is a good looking, experienced sysadmin and, like all good sysadmins, he has more stuff to do than time to do it.

Joe wants to get up and running on Amazon EC2 with a Wordpress installation, and chooses to do so with a pre-configured appliance. These are the steps Joe performs:

  • Joe logs into his favourite Amazon EC2 console, specifies a Wordpress appliance, and other configurations.
  • Clicks launch.
  • Once the instance is running, he logs in using his SSH public key and changes the root password (it was set randomly on firstboot, right?).
  • He then proceeds to change the MySQL root password as well (also set randomly on firstboot, hopefully!). Joe knows how to do this as he's an experienced sysadmin, do you?
  • Finally, Joe logs into Wordpress using the default admin password (he noted the default password in the release notes before launching), resets the password and specifies his own email for the account.

While performing the above, Joe was holding his breath and working as fast as he could because he was previously hit by a botnet looking for random systems using default passwords and was compromised. Luckily this time he came out unscathed.

Does this sound familiar? Well, it should because that's how it's mostly being done.

You might be thinking to yourself "but I used the TurnKey Hub to set the root password for my instances, which also set the database password". True, that has been a feature of the Hub from day one, but with the release of TurnKey 11.0 and the end to default passwords, we've extended the Hub to support preseeding as well.

The idea behind this was not only to make cloud deployments more secure, but to make it much easier. We wanted to simplify the process for Joe from the above to this:

  • Joe logs into the Hub, selects Wordpress and preseeds the configuration.
  • Clicks launch.

The above is not a mock-up of a future implementation, it's live on the Hub.
So how does it work? Read on...
 

Brainstorming a solution

The problem in preseeding an instance is sending the information (securely) to the instance.
So how do you do it?
 

Idea #1: pass it through Amazon EC2 user-data?

If you know a little about Amazon EC2 you'll know that when launching an instance you can specify user-data which is accessible from the instance via Amazon's API.
 
But wait, do you really want to store authentication credentials in user-data? 
 
You could, but user-data never expires, and any process on the instance that can open a network socket can read it. So you'd probably want to firewall off the Amazon API as soon as instance initialization no longer requires it. But maybe the user of the instance needs access to the Amazon API? Crippling the service by design isn't a good solution in my honest opinion.
 

Idea #2: store it in the Hub's database, and let the server query the API

So, instead of sending authentication credentials via user-data, why not send a unique identifier (e.g., SERVER_ID), so the instance can use the Hub's API to pull the credentials?
 
Well, you could, but that would mean the Hub service needs to store the instance's configuration, passwords and all, in its database and delete it when it's no longer needed. Storing an item in a database for just one use is inelegant. But it's a natural solution if you only have a database, as I discussed in a previous blog post, "when all you have is a hammer, everything looks like a nail".
 
In my opinion, it ultimately comes down to separation of concerns. For this type of pattern, the most natural solution would be some sort of messaging service. The Hub publishes a message to a queue, which the instance consumes.
 

Idea #3: pass it as messages using the Advanced Message Queuing Protocol (AMQP)

So what's wrong with messaging? Nothing really, so long as you take care when designing the system for confidentiality and integrity - we don't want others eavesdropping on messages, or sending spoofed messages.
 
Messages that fail a CRC or cannot be decrypted successfully should be discarded, and removed from the queue so as not to block it.
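That discard rule can be sketched with a stdlib CRC-32 check; real Hub messages are also symmetrically encrypted, which is omitted here for brevity:

```python
import struct
import zlib

def pack(payload):
    # Prefix the payload with its CRC-32 so the consumer can verify it.
    return struct.pack("!I", zlib.crc32(payload) & 0xffffffff) + payload

def consume(message):
    # Verify the CRC; on mismatch return None so the caller can ack and
    # drop the message instead of leaving it to block the queue.
    (crc,) = struct.unpack("!I", message[:4])
    payload = message[4:]
    if crc != (zlib.crc32(payload) & 0xffffffff):
        return None  # discard
    return payload
```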
 

Designing infrastructure that is secure, scalable and extendible

The solution we came up with is designed to be secure, scalable and extendible. Eventually it will support other cloud hosting providers, as well as provide bi-directional secure communication for future Hub-based services still under development.
 
The solution uses each of the brainstormed solutions above for what they were designed for, and no more.
 
Data Flow Diagram (DFD) explained:
  1. The user specifies preseeding data.
  2. The Hub tells Amazon EC2 to launch the instance with user-data which includes the SERVER_ID.
  3. The Hub creates a direct message exchange and queue for the server, which is configured to only receive messages sent from the Hub.
  4. The Hub publishes symmetrically encrypted messages (incl. a CRC) to the server queue with preseeding data that only the server can decrypt.
  5. The instance pulls user-data from the Amazon EC2 API (SERVER_ID).
  6. The Instance registers itself with the Hub via an SSL secured API using the SERVER_ID, which responds back with the server subkey and messaging secret. Note that this can only be done once for security.
    • subkey: A one way hash generated from the user's APIKEY. It is unique for each server registered in the Hub, and is used as part of the exchange and queue naming structure.
    • secret: A secure unique hash used for message encryption and decryption.
  7. The instance consumes messages from the queue. Messages are decrypted and passed to the callback for processing (a preseeding message appends its arg=value pairs to inithooks.conf).
  8. During inithooks execution, inithooks.conf is sourced so preseeding will happen. Once inithooks.conf is no longer needed, it is deleted.
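As an illustration of the subkey described in step 6 (a one-way hash generated from the user's APIKEY, unique per registered server), here is a minimal sketch with hashlib; the actual construction the Hub uses may well differ:

```python
import hashlib

def derive_subkey(apikey, server_id):
    # One-way hash of the user's APIKEY, mixed with the server's ID so
    # each registered server gets a unique, non-reversible subkey.
    # (Illustrative construction, not the Hub's actual scheme.)
    return hashlib.sha1(("%s:%s" % (apikey, server_id)).encode()).hexdigest()
```

Because the hash is one-way, a compromised server leaks only its own subkey, never the APIKEY itself, and two servers under the same account can never impersonate each other's queues.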

In addition to authentication related preseeding, TKLBAM is also preseeded with the HUB_APIKEY and is initialized, so performing the initial backup is as easy as executing tklbam-backup or using the TKLBAM webmin module.

As always, the client side code that implements the above is open source, and can be found in the following projects: hubclient, tklamq, inithooks, as well as the above mentioned blog posts.

Take the Hub for a spin and let us know what you think.


Peakscale – Keep a Small Surface – Webapp Isolation

// November 16th, 2010 // Comments Off // Uncategorized

"A fast way to achieve this setup is to use Amazon's SQS with its rich permission schemes and the new multiple credential support. Each host gets entirely different credentials and Amazon's operations and security teams sit between them."

Free Ubuntu Server for a year at Amazon

// October 22nd, 2010 // Comments Off // Uncategorized

Yes, you can get your very own free Ubuntu server in the clouds for one full year! The folks at Amazon have just announced:

"Beginning November 1, new AWS customers will be able to run a free Amazon EC2 Micro Instance for a year, while also leveraging a new free usage tier for Amazon S3, Amazon Elastic Block Store, Amazon Elastic Load Balancing, and AWS data transfer. AWS’s free usage tier can be used for anything you want to run in the cloud: launch new applications, test existing applications in the cloud, or simply gain hands-on experience with AWS"

You can find the details of the offer at: http://aws.amazon.com/free/

Ubuntu, arguably the most popular Amazon guest image, is also available to you for free today! Get the list of official Ubuntu images created by Ubuntu's very own server team at:

Maverick http://uec-images.ubuntu.com/server/maverick/current/
Lucid http://uec-images.ubuntu.com/server/lucid/current/

For the free offer, you will want to launch a t1.micro instance, and it seems you will want to wait until November 1st to register your account (credit card needed). This is great news! If you ever wanted to play with Ubuntu Server in the cloud, now is the best time to get started!

Update: The terms page mentions "Only accounts created after October 20, 2010 are eligible for the Offer"

Update 2: Please read Scott's comment below on why you will be charged $0.50/mo if you run the standard Ubuntu image (or Amazon's own AMIs)

TKLBAM: a new kind of smart backup/restore system that just works

// September 8th, 2010 // Comments Off // Uncategorized

Drum roll please...

Today, I'm proud to officially unveil TKLBAM (AKA TurnKey Linux Backup and Migration): the easiest, most powerful system-level backup anyone has ever seen. Skeptical? I would be too. But if you read all the way through you'll see I'm not exaggerating and I have the screencast to prove it. Aha!

This was the missing piece of the puzzle that has been holding up the Ubuntu Lucid based release batch. You'll soon understand why and hopefully agree it was worth the wait.

We set out to design the ideal backup system

Imagine the ideal backup system. That's what we did.

Pain free

A fully automated backup and restore system with no pain. That you wouldn't need to configure. That just magically knows what to backup and, just as importantly, what NOT to backup, to create super efficient, encrypted backups of changes to files, databases, package management state, even users and groups.

Migrate anywhere

An automated backup/restore system so powerful it would double as a migration mechanism to move or copy fully working systems anywhere in minutes instead of hours or days of error-prone, frustrating manual labor.

It would be so easy that, shockingly enough, you would actually test your backups. No more excuses. As frequently as you know you should, avoiding unpleasant surprises at the worst possible time.

One turn-key tool, simple and generic enough that you could just as easily use it to migrate a system:

  • from Ubuntu Hardy to Ubuntu Lucid (get it now?)
  • from a local deployment, to a cloud server
  • from a cloud server to any VPS
  • from a virtual machine to bare metal
  • from Ubuntu to Debian
  • from 32-bit to 64-bit

System smart

Of course, you can't do that with a conventional backup. It's too dumb. You need a vertically integrated backup that has system level awareness. That knows, for example, which configuration files you changed and which you didn't touch since installation. That can leverage the package management system to get appropriate versions of system binaries from package repositories instead of wasting backup space.

This backup tool would be smart enough to protect you from all the small paper-cuts that conspire to make restoring an ad-hoc backup such a nightmare. It would transparently handle technical stuff you'd rather not think about like fixing ownership and permission issues in the restored filesystem after merging users and groups from the backed up system.

Ninja secure, dummy proof

It would be a tool you could trust to always encrypt your data. But it would still allow you to choose how much convenience you're willing to trade off for security.

If data stealing ninjas keep you up at night, you could enable strong cryptographic passphrase protection for your encryption key that includes special countermeasures against dictionary attacks. But since your backup's worst enemy is probably staring you in the mirror, it would need to allow you to create an escrow key to store in a safe place in case you ever forget your super-duper passphrase.
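Those countermeasures against dictionary attacks generally amount to key stretching: salted, deliberately iterated key derivation, so each guessed passphrase costs thousands of hash operations instead of one. A generic sketch with PBKDF2 (illustrative parameters, not TKLBAM's actual scheme):

```python
import hashlib
import os

def stretch_passphrase(passphrase, salt=None, rounds=100000):
    # Salted, iterated key derivation: a dictionary attacker must pay
    # `rounds` hash operations per guess instead of one.
    # (Generic PBKDF2 illustration, not TKLBAM's actual construction.)
    if salt is None:
        salt = os.urandom(16)
    key = hashlib.pbkdf2_hmac("sha256", passphrase.encode(), salt, rounds)
    return salt, key
```

The random salt also defeats precomputed rainbow tables, since the same passphrase yields a different key under every salt.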

On the other hand, nobody wants excessive security measures forced down their throats when they don't need them and in that case, the ideal tool would be designed to optimize for convenience. Your data would still be encrypted, but the key management stuff would happen transparently.

Ultra data durability

By default, your AES encrypted backup volumes would be uploaded to inexpensive, ultra-durable cloud storage designed to provide 99.999999999% durability. To put 11 nines of reliability in perspective, if you stored 10,000 backup volumes you could expect to lose a single volume once every 10 million years.
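The 10-million-year figure follows directly from the durability number:

```python
# 11 nines of annual durability, applied to 10,000 stored volumes.
annual_loss_probability = 1 - 0.99999999999   # 1e-11 per volume per year
volumes = 10000

expected_losses_per_year = volumes * annual_loss_probability  # 1e-7
years_between_losses = 1 / expected_losses_per_year
print("One expected volume loss every ~%.0f years" % years_between_losses)
```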

For maximum network performance, you would be routed automatically to the cloud storage datacenter closest to you.

Open source goodness

Naturally, the ideal backup system would be open source. You don't have to care about free software ideology to appreciate the advantages. As far as I'm concerned any code running on my servers doing something as critical as encrypted backups should be available for peer review and modification. No proprietary secret sauce. No pacts with a cloudy devil that expects you to give away your freedom, nay worse, your data, in exchange for a little bit of vendor-lock-in-flavored convenience.

Tall order huh?

All of this and more is what we set out to accomplish with TKLBAM. But this is not our wild eyed vision for a future backup system. We took our ideal and we made it work. In fact, we've been experimenting with increasingly sophisticated prototypes for a few months now, privately eating our own dog food, working out the kinks. This stuff is complex so there may be a few rough spots left, but the foundation should be stable by now.

Seeing is believing: a simple usage example

We have two installations of TurnKey Drupal6:

  1. Alpha, a virtual machine on my local laptop. I've been using it to develop the TurnKey Linux web site.
  2. Beta, an EC2 instance I just launched from the TurnKey Hub.

In the new TurnKey Linux 11.0 appliances, TKLBAM comes pre-installed. With older versions you'll need to install it first:

apt-get update
apt-get install tklbam webmin-tklbam

You'll also need to link TKLBAM to your TurnKey Hub account by providing the API-KEY. You can do that via the new Webmin module, or on the command line:

tklbam-init QPINK3GD7HHT3A

I now log into Alpha's command line as root (e.g., via the console, SSH or web shell) and do the following:

tklbam-backup

It's that simple. Unless you want to change defaults, no arguments or additional configuration required.

When the backup is done a new backup record will show up in my Hub account:

To restore I log into Beta and do this:

tklbam-restore 1

That's it! To see it in action watch the video below or better yet log into your TurnKey Hub account and try it for yourself.

Quick screencast (2 minutes)

Best viewed full-screen. Having problems with playback? Try the YouTube version.

The screencast shows TKLBAM command line usage, but users who dislike the command line can now do everything from the comfort of their web browser, thanks to the new Webmin module.

Getting started

TKLBAM's front-end interface is provided by the TurnKey Hub, an Amazon-powered cloud backup and server deployment web service currently in private beta.

If you don't have a Hub account already, request an invitation. We'll do our best to grant them as fast as we can scale capacity on a first come, first served basis. Update: currently we're doing ok in terms of capacity so we're granting invitation requests within the hour.

To get started log into your Hub account and follow the basic usage instructions. For more detail, see the documentation.

Feel free to ask any questions in the comments below. But you'll probably want to check with the FAQ first to see if they've already been answered.

Upcoming features

  • PostgreSQL support: PostgreSQL support is in development but currently only MySQL is supported. That means TKLBAM doesn't yet work on the three PostgreSQL based TurnKey appliances (PostgreSQL, LAPP, and OpenBravo).
  • Built-in integration: TKLBAM will be included by default in all future versions of TurnKey appliances. In the future when you launch a cloud server from the Hub it will be ready for action immediately. No installation or initialization necessary.
  • Webmin integration: we realize not everyone is comfortable with the command line, so we're going to look into developing a custom webmin module for TKLBAM. Update: we've added the new TKLBAM webmin module to the 11.0 RC images based on Lucid. In older images, the webmin-tklbam package can also be installed via the package manager.

Special salute to the TurnKey community

First, many thanks to the brave souls who tested TKLBAM and provided feedback even before we officially announced it. Remember, with enough eyeballs all bugs are shallow, so if you come across anything else, don't rely on someone else to report it. Speak up!

Also, as usual during a development cycle we haven't been able to spend as much time on the community forums as we'd like. Many thanks to everyone who helped keep the community alive and kicking in our relative absence.

Remember, if the TurnKey community has helped you, try to pay it forward when you can by helping others.

Finally, I'd like to give extra special thanks to three key individuals that have gone above and beyond in their contributions to the community.

By alphabetical order:

  • Adrian Moya: for developing appliances that rival some of our best work.
  • Basil Kurian: for storming through appliance development at a rate I can barely keep up with.
  • JedMeister: for continuing to lead as our most helpful and tireless community member for nearly a year and a half now. This guy is a frigging one man support army.

Also special thanks to Bob Marley, the legend who's been inspiring us as of late to keep jamming till the sun was shining. :)

Final thoughts

TKLBAM is a major milestone for TurnKey. We're very excited to finally unveil it to the world. It's actually been a not-so-secret part of our vision from the start. A chance to show how TurnKey can innovate beyond just bundling off the shelf components.

With TKLBAM out of the way we can now focus on pushing out the next release batch of Lucid based appliances. Thanks to the amazing work done by our star TKLPatch developers, we'll be able to significantly expand our library so by the next release we'll be showcasing even more of the world's best open source software. Stir It Up!

TKLBAM: a new kind of smart backup/restore system that just works

// September 8th, 2010 // Comments Off // Uncategorized

Drum roll please...

Today, I'm proud to officially unveil TKLBAM (AKA TurnKey Linux Backup and Migration): the easiest, most powerful system-level backup anyone has ever seen. Skeptical? I would be too. But if you read all the way through, you'll see I'm not exaggerating, and I have the screencast to prove it. Aha!

This was the missing piece of the puzzle that has been holding up the Ubuntu Lucid based release batch. You'll soon understand why and hopefully agree it was worth the wait.

We set out to design the ideal backup system

Imagine the ideal backup system. That's what we did.

Pain free

A fully automated backup and restore system with no pain. One you wouldn't need to configure. One that just magically knows what to back up and, just as importantly, what NOT to back up, to create super-efficient, encrypted backups of changes to files, databases, package management state, even users and groups.

Migrate anywhere

An automated backup/restore system so powerful it would double as a migration mechanism to move or copy fully working systems anywhere in minutes, instead of hours or days of error-prone, frustrating manual labor.

It would be so easy you would, shockingly enough, actually test your backups. No more excuses. You'd test as frequently as you know you should, avoiding unpleasant surprises at the worst possible time.

One turn-key tool, simple and generic enough that you could just as easily use it to migrate a system:

  • from Ubuntu Hardy to Ubuntu Lucid (get it now?)
  • from a local deployment, to a cloud server
  • from a cloud server to any VPS
  • from a virtual machine to bare metal
  • from Ubuntu to Debian
  • from 32-bit to 64-bit

System smart

Of course, you can't do that with a conventional backup. It's too dumb. You need a vertically integrated backup that has system level awareness. That knows, for example, which configuration files you changed and which you didn't touch since installation. That can leverage the package management system to get appropriate versions of system binaries from package repositories instead of wasting backup space.
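To make that idea concrete, here's a minimal toy sketch (illustrative only, not TKLBAM's actual internals) of how a tool can tell modified configuration files from pristine ones by checking them against a checksum manifest recorded at install time. On Debian/Ubuntu, dpkg keeps similar conffile checksums:

```shell
# Illustrative sketch: separate changed config files from pristine ones
# using a checksum manifest recorded at "install" time. Paths and file
# names are made up for the demo; TKLBAM's real logic is more involved.
mkdir -p /tmp/demo/etc
echo "setting = default" > /tmp/demo/etc/app.conf
echo "port = 80"         > /tmp/demo/etc/web.conf
md5sum /tmp/demo/etc/*.conf > /tmp/demo/manifest   # record pristine state
echo "setting = custom" > /tmp/demo/etc/app.conf   # user edits one file
# Anything failing the manifest check needs backing up; the rest could be
# reinstalled from package repositories instead of wasting backup space.
md5sum --quiet -c /tmp/demo/manifest 2>/dev/null | sed 's/: FAILED/ (changed)/'
```

Only the edited `app.conf` shows up as changed; the untouched `web.conf` passes the check and needn't be stored.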

This backup tool would be smart enough to protect you from all the small paper-cuts that conspire to make restoring an ad-hoc backup such a nightmare. It would transparently handle technical stuff you'd rather not think about like fixing ownership and permission issues in the restored filesystem after merging users and groups from the backed up system.
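As a toy illustration of one such paper-cut (uid values here are made up; TKLBAM handles this transparently): if a user had uid 1001 on the backed-up system but got uid 2001 on the restore target, restored files would carry a stale numeric owner and need a remap pass:

```shell
# Toy illustration of the ownership fix-up after merging users/groups.
# OLD_UID/NEW_UID are hypothetical values; a real fix-up runs as root and
# performs the chown instead of just printing it.
OLD_UID=1001
NEW_UID=2001
mkdir -p /tmp/restore
touch /tmp/restore/uploads.db
# Print (rather than perform) the remap so this sketch runs unprivileged:
find /tmp/restore -uid "$OLD_UID" -exec echo chown -h "$NEW_UID" {} \;
echo "remap check complete"
```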

Ninja secure, dummy proof

It would be a tool you could trust to always encrypt your data. But it would still allow you to choose how much convenience you're willing to trade off for security.

If data-stealing ninjas keep you up at night, you could enable strong cryptographic passphrase protection for your encryption key, including special countermeasures against dictionary attacks. But since your backup's worst enemy is probably staring back at you from the mirror, it would also need to let you create an escrow key to store in a safe place in case you ever forget your super-duper passphrase.
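The anti-dictionary-attack idea can be illustrated with stock OpenSSL, which supports a deliberately slow PBKDF2 key derivation: each guess costs the attacker many iterations too. This is only an analogy, not TKLBAM's actual key scheme, and the passphrase and iteration count are arbitrary examples:

```shell
# Illustration only: deriving the encryption key through many PBKDF2
# iterations makes every dictionary-attack guess proportionally expensive.
echo "backup payload" > /tmp/plain.txt
openssl enc -aes-256-cbc -salt -pbkdf2 -iter 200000 \
    -pass pass:correct-horse-battery-staple \
    -in /tmp/plain.txt -out /tmp/plain.enc
openssl enc -d -aes-256-cbc -pbkdf2 -iter 200000 \
    -pass pass:correct-horse-battery-staple \
    -in /tmp/plain.enc -out /tmp/plain.dec
cmp -s /tmp/plain.txt /tmp/plain.dec && echo "round-trip ok"
```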

On the other hand, nobody wants excessive security measures forced down their throats when they don't need them and in that case, the ideal tool would be designed to optimize for convenience. Your data would still be encrypted, but the key management stuff would happen transparently.

Ultra data durability

By default, your AES-encrypted backup volumes would be uploaded to inexpensive, ultra-durable cloud storage designed to provide 99.999999999% durability. To put 11 nines of durability in perspective, if you stored 10,000 backup volumes you could expect to lose a single volume once every 10 million years.
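The arithmetic behind that claim is easy to check: eleven nines means an expected annual loss of 1e-11 per volume, so 10,000 volumes lose about 1e-7 volumes per year, i.e. one volume per ten million years:

```shell
# Sanity-check the 11-nines durability math:
# expected annual losses = volumes * (1 - 0.99999999999) = 10000 * 1e-11
awk 'BEGIN {
    loss_per_year = 10000 * 1e-11          # expected volumes lost per year
    printf "years per expected loss: %.0f\n", 1 / loss_per_year
}'
```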

For maximum network performance, you would be routed automatically to the cloud storage datacenter closest to you.

Open source goodness

Naturally, the ideal backup system would be open source. You don't have to care about free software ideology to appreciate the advantages. As far as I'm concerned any code running on my servers doing something as critical as encrypted backups should be available for peer review and modification. No proprietary secret sauce. No pacts with a cloudy devil that expects you to give away your freedom, nay worse, your data, in exchange for a little bit of vendor-lock-in-flavored convenience.

Tall order huh?

All of this and more is what we set out to accomplish with TKLBAM. But this is not our wild-eyed vision for a future backup system. We took our ideal and we made it work. In fact, we've been experimenting with increasingly sophisticated prototypes for a few months now, privately eating our own dog food and working out the kinks. This stuff is complex, so there may be a few rough spots left, but the foundation should be stable by now.

Seeing is believing: a simple usage example

We have two installations of TurnKey Drupal6:

  1. Alpha, a virtual machine on my local laptop. I've been using it to develop the TurnKey Linux web site.
  2. Beta, an EC2 instance I just launched from the TurnKey Hub.

In the new TurnKey Linux 11.0 appliances, TKLBAM comes pre-installed. With older versions you'll need to install it first:

apt-get update
apt-get install tklbam webmin-tklbam

You'll also need to link TKLBAM to your TurnKey Hub account by providing your API key. You can do that via the new Webmin module, or on the command line:

tklbam-init QPINK3GD7HHT3A

I now log into Alpha's command line as root (e.g., via the console, SSH or web shell) and do the following:

tklbam-backup

It's that simple. Unless you want to change the defaults, no arguments or additional configuration are required.
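Once the defaults are in place, you could also schedule the backup to run unattended, for example with a cron entry like the one below. The schedule and file location are illustrative, not an official recommendation; check the TKLBAM documentation for the supported mechanism:

```shell
# Illustrative only: run tklbam-backup nightly at 02:30 via a cron.d entry.
# Written to /tmp here so the sketch runs unprivileged; on a real system
# it would go in /etc/cron.d/tklbam-backup.
cat > /tmp/tklbam-cron <<'EOF'
# m h dom mon dow user command
30 2 * * * root /usr/bin/tklbam-backup
EOF
cat /tmp/tklbam-cron
```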

When the backup is done, a new backup record will show up in my Hub account:

To restore I log into Beta and do this:

tklbam-restore 1

That's it! To see it in action watch the video below or better yet log into your TurnKey Hub account and try it for yourself.

Quick screencast (2 minutes)

Best viewed full-screen. Having problems with playback? Try the YouTube version.

The screencast shows TKLBAM command line usage, but users who dislike the command line can now do everything from the comfort of their web browser, thanks to the new Webmin module.

Getting started

TKLBAM's front-end interface is provided by the TurnKey Hub, an Amazon-powered cloud backup and server deployment web service currently in private beta.

If you don't have a Hub account already, request an invitation. We'll do our best to grant them as fast as we can scale capacity, on a first come, first served basis. Update: we're currently doing OK on capacity, so we're granting invitation requests within the hour.

To get started log into your Hub account and follow the basic usage instructions. For more detail, see the documentation.

Feel free to ask any questions in the comments below, but you'll probably want to check the FAQ first to see if they've already been answered.

Upcoming features

  • PostgreSQL support: PostgreSQL support is in development, but currently only MySQL is supported. That means TKLBAM doesn't yet work on the three PostgreSQL-based TurnKey appliances (PostgreSQL, LAPP, and OpenBravo).
  • Built-in integration: TKLBAM will be included by default in all future versions of TurnKey appliances. In the future when you launch a cloud server from the Hub it will be ready for action immediately. No installation or initialization necessary.
  • Webmin integration: we realize not everyone is comfortable with the command line, so we're going to look into developing a custom Webmin module for TKLBAM. Update: we've added the new TKLBAM Webmin module to the 11.0 RC images based on Lucid. In older images, the webmin-tklbam package can also be installed via the package manager.

Special salute to the TurnKey community

First, many thanks to the brave souls who tested TKLBAM and provided feedback even before we officially announced it. Remember, with enough eyeballs all bugs are shallow, so if you come across anything else, don't rely on someone else to report it. Speak up!

Also, as usual during a development cycle we haven't been able to spend as much time on the community forums as we'd like. Many thanks to everyone who helped keep the community alive and kicking in our relative absence.

Remember, if the TurnKey community has helped you, try to pay it forward when you can by helping others.

Finally, I'd like to give extra special thanks to three key individuals who have gone above and beyond in their contributions to the community.

By alphabetical order:

  • Adrian Moya: for developing appliances that rival some of our best work.
  • Basil Kurian: for storming through appliance development at a rate I can barely keep up with.
  • JedMeister: for continuing to lead as our most helpful and tireless community member for nearly a year and a half now. This guy is a frigging one-man support army.

Also, special thanks to Bob Marley, the legend who's been inspiring us of late to keep jamming till the sun is shining. :)

Final thoughts

TKLBAM is a major milestone for TurnKey. We're very excited to finally unveil it to the world. It's actually been a not-so-secret part of our vision from the start: a chance to show how TurnKey can innovate beyond just bundling off-the-shelf components.

With TKLBAM out of the way, we can now focus on pushing out the next release batch of Lucid-based appliances. Thanks to the amazing work done by our star TKLPatch developers, we'll be able to significantly expand our library, so by the next release we'll be showcasing even more of the world's best open source software. Stir It Up!