So what is Ensemble anyway?

Posted by clint // June 3rd, 2011 // Uncategorized

Have you heard of Ensemble? Are you excited about Cloud/Service Orchestration? What? OK, you’re not alone if you’re scratching your head.

Ensemble is an implementation of a new idea that has been taking shape over the last couple of years. Ever since Amazon hooked up a remote API to thousands of machines to provide access to their virtual infrastructure (and called it macaroni? err.. AWS), people have been dreaming up ways to take advantage of what is basically a robotic “NOC guy”. No longer do you have to pre-rack servers or call your vendor frantically to get servers sent next-day to your colo. Right?

Naturally, the system administrators who would normally be in charge of racking servers applied their existing tools to the job, with mixed success. Config management is really good at modelling identical hosts. But with virtual hosts instantly available, this left those thinking at a higher level wanting more. Chef in particular implemented a nice set of tools and functionality to allow this high level “service” definition with their knife tools and simple Ruby API.

But how easy are Chef’s cookbooks to share and use without modification? How easy are they to integrate together? Puppet has modules that are also capable of similar functionality, and the recent integration of MCollective, plus Puppet Faces, has certainly added a lot of the same things Chef had to support this kind of application modelling. But again, the modules seem to require a lot of convention, assumption, and tweaking to become useful.

It’s my opinion that this is very much like the way tarballs+autoconf became the de-facto standard for distributing free software. It was *so much* better than writing a Makefile by hand, and it achieved an enormous amount of portability, so developers adopted it rapidly. In fact, it is still the dominant way to distribute portable open source applications.

But at some point, the limitations of this became clear. There was a need for something more concise that could distribute both the source and the binaries built for a platform. There was some limited early success with tarballs built by convention. But then, enter RPM and dpkg. These included ways to express facts about software, like its dependencies, architecture, and the revisions made to it to work on the target platform. This allowed distributors of software to more easily maintain their systems, and enabled users to manage the software in their environments.

At that point, some smart guy figured out that we should be able to download and automatically configure all of the software needed for one application to work properly, just from its packaging information. For me, apt-get was the first experience of this, though FreeBSD ports authors may disagree there. Either way, this made it very easy for admins and users to install software without spending hours in the 7 levels of dependency hell.

In many ways, Service Orchestration is a way of bringing the benefits of packaging to the cloud. It should allow us to build out our cloud in a sane way, taking advantage of the knowledge that has been gained by others. For the bits that we need to finely tune, it should step aside and allow that without compromising the system.

Ensemble is an implementation of this idea, and Principia is a collection of “Formulas” for Ensemble. They are tightly coupled to Ubuntu, as they are in many ways meant to be the dpkg and apt-get for Ubuntu in the cloud.

It’s pretty easy to try out Ensemble and Principia on Ubuntu. Right now you’ll need an EC2 account with an access key set up, though we’re working on making this work with just your local machine for rapid development.

It’s been pointed out to me that the version of principia-tools that was available at the time of this writing didn’t include /usr/share/principia-tools/tests. I’ve uploaded a fixed version to the ensemble PPA, so if you tried these instructions and failed, please try updating principia-tools. If that fails, you can get the tests with bzr branch lp:principia-tools.


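# Add the ensemble PPA and install the Principia tools
sudo add-apt-repository ppa:ensemble/ppa
sudo apt-get update
sudo apt-get install principia-tools
# EC2 credentials (placeholders; substitute your own)
export AWS_SECRET_ACCESS_KEY=xxxxxxxxxxxxxxxx
export AWS_ACCESS_KEY_ID=0123456789ABCDEF
# Start the environment's bootstrap node in EC2
ensemble bootstrap
# Fetch all formulas into a local directory
principia getall /some/path/for/formulas
# Deploy the mediawiki demo from those formulas
/usr/share/principia-tools/tests/mediawiki.sh /some/path/for/formulas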

What does this give you? Well, it should give you a 7 node mediawiki cluster of t1.micro’s in the us-east-1 region of EC2. I just ran it and now I have this:

machines:
  0: {dns-name: ec2-50-19-158-109.compute-1.amazonaws.com, instance-id: i-215dd84f}
  1: {dns-name: ec2-50-17-16-228.compute-1.amazonaws.com, instance-id: i-8d58dde3}
  2: {dns-name: ec2-72-44-49-114.compute-1.amazonaws.com, instance-id: i-9558ddfb}
  3: {dns-name: ec2-50-19-47-106.compute-1.amazonaws.com, instance-id: i-6d5bde03}
  4: {dns-name: ec2-174-129-132-248.compute-1.amazonaws.com, instance-id: i-7f5bde11}
  5: {dns-name: ec2-50-19-152-136.compute-1.amazonaws.com, instance-id: i-755bde1b}
  6: {dns-name: '', instance-id: i-4b5bde25}
services:
  demo-wiki:
    formula: local:mediawiki-62
    relations: {cache: wiki-cache, db: wiki-db, website: wiki-balancer}
    units:
      demo-wiki/0:
        machine: 2
        relations: {}
        state: null
      demo-wiki/1:
        machine: 6
        relations: {}
        state: null
  wiki-balancer:
    formula: local:haproxy-13
    relations: {reverseproxy: demo-wiki}
    units:
      wiki-balancer/0:
        machine: 4
        relations: {}
        state: null
  wiki-cache:
    formula: local:memcached-10
    relations: {cache: demo-wiki}
    units:
      wiki-cache/0:
        machine: 3
        relations: {}
        state: null
      wiki-cache/1:
        machine: 5
        relations: {}
        state: null
  wiki-db:
    formula: local:mysql-93
    relations: {db: demo-wiki}
    units:
      wiki-db/0:
        machine: 1
        relations: {}
        state: null
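
Each unit’s ‘machine’ number in this output is a key into the ‘machines’ section at the top. Here is a minimal Python sketch of that lookup. It assumes the status output has already been loaded into a plain dictionary (hand-copied and trimmed below), and the helper name unit_hostnames is hypothetical, not part of Ensemble.

```python
# Trimmed, hand-copied version of the status output above.
# Assumption: in practice you would parse the YAML that
# 'ensemble status' prints instead of typing it in.
status = {
    "machines": {
        2: {"dns-name": "ec2-72-44-49-114.compute-1.amazonaws.com",
            "instance-id": "i-9558ddfb"},
        4: {"dns-name": "ec2-174-129-132-248.compute-1.amazonaws.com",
            "instance-id": "i-7f5bde11"},
        6: {"dns-name": "", "instance-id": "i-4b5bde25"},
    },
    "services": {
        "demo-wiki": {"units": {"demo-wiki/0": {"machine": 2},
                                "demo-wiki/1": {"machine": 6}}},
        "wiki-balancer": {"units": {"wiki-balancer/0": {"machine": 4}}},
    },
}


def unit_hostnames(status, service):
    """Map each unit of a service to the dns-name of its machine."""
    machines = status["machines"]
    units = status["services"][service]["units"]
    return {name: machines[unit["machine"]]["dns-name"]
            for name, unit in units.items()}


print(unit_hostnames(status, "wiki-balancer"))
```

An empty dns-name, as for machine 6 here, simply means the instance had not been assigned an address yet when the status was taken.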

At the top you see the machines that ensemble spun up in EC2 in the ‘machines’ section. The numbers there correspond to the ‘machine: #’ in the service/units definitions below. If you look through, you’ll see above that wiki-balancer is machine 4, which has a hostname of ec2-174-129-132-248.compute-1.amazonaws.com. If you go to that hostname, once all relations are up (I like to use ‘watch ensemble status’ to see when this happens), you should see a working mediawiki. But not just a working mediawiki, a scalable one. If you want to pour on the traffic, spin up 3 more demo-wiki’s to handle the app server load:


ensemble add-unit demo-wiki
ensemble add-unit demo-wiki
ensemble add-unit demo-wiki

These will of course take a minute or two to spin up. Once they’re ready they’ll show up in the status output:

services:
  demo-wiki:
    formula: local:mediawiki-62
    relations: {cache: wiki-cache, db: wiki-db, website: wiki-balancer}
    units:
      demo-wiki/0:
        machine: 2
        relations:
          cache: {state: up}
          db: {state: up}
          website: {state: up}
        state: started
      demo-wiki/1:
        machine: 6
        relations:
          cache: {state: up}
          db: {state: up}
          website: {state: up}
        state: started
      demo-wiki/2:
        machine: 7
        relations:
          cache: {state: up}
          db: {state: up}
          website: {state: up}
        state: started
      demo-wiki/3:
        machine: 8
        relations:
          cache: {state: up}
          db: {state: up}
          website: {state: up}
        state: started
      demo-wiki/4:
        machine: 9
        relations:
          cache: {state: up}
          db: {state: up}
          website: {state: up}
        state: started
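
If you would rather script the “is it ready yet?” check than eyeball the status output, something like the following Python sketch could work. It assumes the service section of the status has been loaded into a dictionary shaped like the output above. The function name service_ready is hypothetical, and treating “started, with every declared relation up” as ready is my own reading of the output, not an Ensemble definition.

```python
def service_ready(service):
    """Return True once every unit is started with all relations up."""
    expected = service["relations"]  # relation names declared on the service
    for unit in service["units"].values():
        if unit["state"] != "started":
            return False
        for name in expected:
            # A relation missing from the unit counts as "not up yet".
            if unit["relations"].get(name, {}).get("state") != "up":
                return False
    return True


relations = {"cache": "wiki-cache", "db": "wiki-db", "website": "wiki-balancer"}
all_up = {name: {"state": "up"} for name in relations}

# Just after deployment: the unit exists but nothing has started.
booting = {"relations": relations,
           "units": {"demo-wiki/0": {"machine": 2, "relations": {},
                                     "state": None}}}
# Later: the same unit, started, with every relation up.
running = {"relations": relations,
           "units": {"demo-wiki/0": {"machine": 2, "relations": all_up,
                                     "state": "started"}}}

print(service_ready(booting), service_ready(running))  # False True
```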

How about a little test then? After I got to this point, I logged in as WikiSysop (change the password folks! it’s change-me) and imported the Wikipedia exports for “Ubuntu” and “EC2”. After that I used harvestman to spider the site and then saved all the urls in a file, urls.txt. Alright! Now let’s fire up *siege* from a machine outside the cluster, but in the same availability zone / security group (so at least we’re only dealing with EC2’s latency and not my net connection), and see if we can take this cluster down!


$ siege -i -c 5 -f urls.txt
...
Transactions: 563 hits
Availability: 100.00 %
Elapsed time: 95.58 secs
Data transferred: 2.64 MB
Response time: 0.35 secs
Transaction rate: 5.89 trans/sec
Throughput: 0.03 MB/sec
Concurrency: 2.04
Successful transactions: 544
Failed transactions: 0
Longest transaction: 13.54
Shortest transaction: 0.00

This is, btw, the best run I got out of t1.micro’s. Sometimes it would get quite ugly:


Transactions: 892 hits
Availability: 99.55 %
Elapsed time: 221.69 secs
Data transferred: 3.64 MB
Response time: 0.61 secs
Transaction rate: 4.02 trans/sec
Throughput: 0.02 MB/sec
Concurrency: 2.45
Successful transactions: 849
Failed transactions: 4
Longest transaction: 27.41
Shortest transaction: 0.00
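
When comparing several runs it helps to turn these summaries into numbers. The Python sketch below parses the “Label: value unit” lines and recomputes the transaction rate as hits divided by elapsed time, which matches what siege reported in both runs above. The function name parse_siege_summary is hypothetical, and the parsing assumes only the simple line layout shown here.

```python
def parse_siege_summary(text):
    """Parse 'Label: value unit' lines into a {label: float} dict."""
    stats = {}
    for line in text.strip().splitlines():
        label, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            stats[label.strip()] = float(parts[0])
    return stats


best = parse_siege_summary("""
Transactions: 563 hits
Elapsed time: 95.58 secs
Transaction rate: 5.89 trans/sec
Failed transactions: 0
""")
ugly = parse_siege_summary("""
Transactions: 892 hits
Elapsed time: 221.69 secs
Transaction rate: 4.02 trans/sec
Failed transactions: 4
""")

for run in (best, ugly):
    # Recompute the rate from the raw counts and compare with siege's figure.
    rate = run["Transactions"] / run["Elapsed time"]
    print(round(rate, 2), run["Transaction rate"])
```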

Let’s try the whole thing over with m1.small. First I edit ~/.ensemble/environments.yaml and add an override for the default-instance-type:


ensemble: environments

environments:
  sample:
    type: ec2
    default-instance-type: m1.small
    control-bucket: ensemble-12345678901234567890
    admin-secret: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Then I re-run the whole test:


Transactions: 290 hits
Availability: 98.98 %
Elapsed time: 81.79 secs
Data transferred: 0.78 MB
Response time: 0.53 secs
Transaction rate: 3.55 trans/sec
Throughput: 0.01 MB/sec
Concurrency: 1.89
Successful transactions: 277
Failed transactions: 3
Longest transaction: 1.50
Shortest transaction: 0.00

Oops! I forgot to add my 3 extra nodes. Note that these two m1.smalls are already almost keeping up. Now as I add these, I keep siege running. It’s pretty cool to watch the response times drop as nodes come online to carry some of the load.

Now with 5 m1.small’s:


Transactions: 273 hits
Availability: 100.00 %
Elapsed time: 54.27 secs
Data transferred: 0.99 MB
Response time: 0.47 secs
Transaction rate: 5.03 trans/sec
Throughput: 0.02 MB/sec
Concurrency: 2.38
Successful transactions: 260
Failed transactions: 0
Longest transaction: 19.92
Shortest transaction: 0.00

And with concurrency raised from 5 to 10:


Transactions: 327 hits
Availability: 100.00 %
Elapsed time: 42.20 secs
Data transferred: 1.30 MB
Response time: 0.66 secs
Transaction rate: 7.75 trans/sec
Throughput: 0.03 MB/sec
Concurrency: 5.12
Successful transactions: 318
Failed transactions: 0
Longest transaction: 25.51
Shortest transaction: 0.00

And now if we add 2 more, for a total of 7 nodes, concurrency of 10 gets even better:


Transactions: 531 hits
Availability: 100.00 %
Elapsed time: 53.37 secs
Data transferred: 1.75 MB
Response time: 0.44 secs
Transaction rate: 9.95 trans/sec
Throughput: 0.03 MB/sec
Concurrency: 4.35
Successful transactions: 507
Failed transactions: 0
Longest transaction: 15.49
Shortest transaction: 0.00

And with 2 more (total of 9 units in demo-wiki serving the app):


Transactions: 354 hits
Availability: 100.00 %
Elapsed time: 34.41 secs
Data transferred: 1.23 MB
Response time: 0.41 secs
Transaction rate: 10.29 trans/sec
Throughput: 0.04 MB/sec
Concurrency: 4.22
Successful transactions: 337
Failed transactions: 0
Longest transaction: 11.45
Shortest transaction: 0.00

Anyway, this isn’t a Mediawiki benchmark. This is to show you how easy it is to scale up and down in response to load with Ensemble. We all know that scaling out works; these graphs show it nicely:

[Graphs: Response Time; Transactions per Second]

Notice how the transactions/second went up all the time, but the response time went up drastically with the jump in concurrency. This is where you need to have the ability to scale quickly, and where, if you can live with the other limitations of EC2 or any other IaaS provider, the cloud should actually win you business, since better response time means more happy users.
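
For anyone who prefers numbers to graphs, here is a small Python sketch that tabulates the m1.small runs and works out how much the response time moved. The figures are copied from the siege summaries above, and the helper percent_change is just an illustration.

```python
# (demo-wiki units, siege concurrency, transactions/sec, response time in secs),
# copied from the m1.small runs above.
runs = [
    (2, 5, 3.55, 0.53),
    (5, 5, 5.03, 0.47),
    (5, 10, 7.75, 0.66),
    (7, 10, 9.95, 0.44),
    (9, 10, 10.29, 0.41),
]


def percent_change(old, new):
    """Relative change from old to new, in percent, to one decimal."""
    return round(100.0 * (new - old) / old, 1)


print("units  conc  trans/sec  resp(s)")
for units, conc, rate, resp in runs:
    print(f"{units:>5}  {conc:>4}  {rate:>9.2f}  {resp:>7.2f}")

# Response time when concurrency doubles on 5 units (0.47 -> 0.66) ...
jump = percent_change(runs[1][3], runs[2][3])
# ... and after growing from 5 to 9 units at concurrency 10 (0.66 -> 0.41).
recovery = percent_change(runs[2][3], runs[4][3])
print(jump, recovery)
```

On these numbers the jump in concurrency costs roughly 40% in response time, and the four extra units more than win it back.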

Now that my siege is over, I can safely remove the unnecessary units one by one with ‘ensemble remove-unit demo-wiki/9’, etc. etc. There’s still a lot of room for sugar to be added. We could say “ensemble resize-service demo-wiki 5” and it might just pick 5 to keep and remove the rest, or add 3 to fulfill the request. There are also a ton of other ideas just bubbling up that are really exciting.
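
To give a feel for what such a resize command might do, here is a rough Python sketch. To be clear, resize-service does not exist; the function resize_plan and its policy of removing the highest-numbered units first are assumptions made up for illustration. It only produces the add-unit and remove-unit commands shown earlier in this post and does not run anything.

```python
def resize_plan(service, unit_names, target):
    """Return the ensemble commands needed to reach `target` units."""
    if target < 0:
        raise ValueError("target must not be negative")
    current = len(unit_names)
    if target >= current:
        # Growing (or already at size): one add-unit per missing unit.
        return [f"ensemble add-unit {service}"] * (target - current)
    # Shrinking: keep the lowest-numbered units, remove the rest.
    # (Arbitrary policy for this sketch; a real tool might pick differently.)
    by_number = sorted(unit_names, key=lambda name: int(name.rsplit("/", 1)[1]))
    return [f"ensemble remove-unit {name}"
            for name in reversed(by_number[target:])]


two_units = ["demo-wiki/0", "demo-wiki/1"]
nine_units = [f"demo-wiki/{n}" for n in range(9)]

print(resize_plan("demo-wiki", two_units, 5))   # three add-unit commands
print(resize_plan("demo-wiki", nine_units, 5))  # removes units 8, 7, 6, 5
```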

Come say hi and hack on ensemble with us in Freenode, #ubuntu-ensemble and on the mailing list.


16 Responses to “So what is Ensemble anyway?”

  1. Todd Rosner says:

    sweet!

  2. Someone Special says:

    Am I the only one who doesn’t understand what the hell this article and this product are supposed to do? None of this made any sense to me, and I’m a developer! Is this meant to apply to or affect end-users in any way whatsoever, or only people running Ubuntu Server as a web application provider or something? I really have no idea…

  3. kim0 says:

    Hi, It shouldn’t affect end-users really, Ensemble is a way to deploy applications (web apps are typical) to the cloud (and soon plain hardware) the cool thing is that the deployment formulas are abstracted and generic such that you can collaborate with the community on making them better, and use them any way you like. So assuming you have no idea how to deploy a drupal website, with caching, you just ask Ensemble to deploy that system, and in a minute it’s running. When you want to scale it, you just “add-unit” to it to make it 10 application servers for instance instead of one. It’s really quite simple but powerful. Let me know if you have further questions

  4. Howard Hoyt says:

    Excellent obfuscation of subject !

  5. kim0 says:

    Hi Howard, Did you find the article confusing? I’m very interested to know more? which part was most confusing and how do you think it could be better written
    Thanks for the feedback

  6. Troy Ready says:

    For what it’s worth, I thought this was an amazing article. Just enough technical detail to give background on how it’s all happening, and then finishing off by showing how simple it all is when it comes together.

    I don’t really deploy much web stuff, but as a sys-admin I’m really impressed with how Ensemble is shaping up.

  7. kim0 says:

    Hey Troy,
    Glad you liked the article and more importantly Ensemble. Do you think you could start experimenting with it for some of your own work? It doesn’t have to do web-apps only you know. If you’re interested to start experimenting, I’m interested to start helping you :) Let me know your thoughts

  8. touristguy87 says:

    I thought the article made a lot of sense, I just would like to see a step by step implementation, actual shell commands…somewhere near the middle the author began to wave their hands while talking. Which means that either you do know what they are talking about at that point, or you don’t and you have to take their word on what would happen if you actually figured out how to do what they are describing, and the cloud ops actually work as described. the odds are that by the time you figure out how to actually implement this all on a working machine the “cloud ops” will have changed and this will be obsolete.

    so yeah a step by step shell script that’s locked down and well-maintained would be nice.
    issues with passwords & security aside.
    perhaps that could be done in an abstract way with the output displayed …”run demo” and it’s all done for you with each command displayed with output

  9. touristguy87 says:

    oh one other thing
    what happens if you do the equivalent of apt-get autoremove…and large holes are blown in your Ubuntu install, necessitating a reinstall just to get the application installer to work, because the system can’t translate the addresses…I did this just yesterday, trying to get libusb removed from a U9.1 install so I could do a fresh install of it (apt-get remove libusb and it removed everything that relied on libusb, and so on), I couldn’t reinstall anything because it couldn’t access the software repositories…now I have a system that won’t do anything but run memtest on bootup. When that finishes I have the option to reboot.

    I take it that Ensemble doesn’t have this problem?
    if the idea is to remove the dependency problem by making the application install & configure process “smart”, is it smart enough not to destroy itself to the point where apps can no longer be un/installed and configured?

  10. touristguy87 says:

    especially in this case since now you’re “smartening” the entire process from hardware installation through server initialization & configuration & clustering…meaning that that much more happens with every click of the return key

  11. kim0 says:

    Hi touristguy87, So you’d like to see the step by step shell commands used to spin up that demo, here they are
    http://bazaar.launchpad.net/~ensemble-composers/principia-tools/trunk/view/head:/tests/mediawiki.sh

    To answer your second comment, yes leaving the instance management to ensemble should be smart enough not to destroy itself. This smartness really is within the formula code, there will be varying qualities of formulas, however the official ones should indeed never destroy a running system, as long as there’s no manual intervention. Hope that helps

  12. Hi touristguy. I’m not sure where the hand waving comes in. There are actually step by step instructions for repeating everything in the blog post right in the middle of the blog post.

    If you’d like to read the formulas.. after the step where you do ‘principia getall /some/path/to/formulas’, just go into /some/path/to/formulas and check out the formulas. They’re all completely open source, and most are just a few bash and/or python scripts.

    Re not being able to destroy things… of course you’ll be able to screw things up. The tool needs to be powerful, and with power comes responsibility. If you remove libusb and anything else that depends on it, apt will tell you everything its going to do before it goes forward. Likewise, that should be how ensemble works.

  13. Huri says:

    Wow, this sounds fantastic! I hope it’ll be able to work with the Rackspace Cloud too.
    Awesome work!

  14. kim0 says:

    Hi Huri, Glad you like it, indeed rackspace support is planned, and hardware support as well through orchestra project :) interested to start writing a formula ? ping me and I can get you started :)

  15. mike schueler says:

    So this seems like a CloudFormation clone with nicer syntax.

    Principia is a new CM based around shell scripts?

  16. kim0 says:

    Hey Mike, the analogy is not exactly accurate though. Ensemble is not only about deploying multiple machines in a “template” style. Ensemble is a living and breathing system, it’s more about deploying and managing an application during its lifecycle. With ensemble, you write hooks that continuously react to events, inform other nodes of needed changes, which in-turn react to events and so on. Principia is not based on shell scripts, rather it is language agnostic. You can write hooks in any programming language. Ensemble provides command line tools that you can execute from any script (in any language) to exchange information and trigger events across machines.

    One really begins to appreciate the design once you’ve written your first formula. I would encourage you to start playing with it, to get a real feel. Jump in to #ubuntu-ensemble on freenode irc and let us know your thoughts. Cheers