Posts Tagged ‘ensemble’

Ensemble: the Service Orchestration framework for hard core DevOps

// August 19th, 2011

I've seen Ensemble evolve from a series of design-level conversations (Brussels May 2010), through a year of fast-paced Canonical-style development, and participated in Ensemble sprints (Cape Town March 2011, and Dublin June 2011).  I observed Ensemble at first as an outsider, then provided feedback as a stakeholder, and have now contributed code to Ensemble as a developer and authored Formulas.


Think about bzr or git circa 2004/2005, or apt circa 1998/1999, or even dpkg circa 1993/1994...  That's where we are today with Ensemble circa 2011. 

Ensemble is a radical, outside-of-the-box approach to a problem that the Cloud ecosystem is just starting to grok: Service Orchestration.  I'm quite confident that in a few years, we're going to look back at 2011 and the work we're doing with Ensemble and Ubuntu and see a clear inflection point in the efficiency of workload management in The Cloud.

From my perspective as the leader of Canonical's Systems Integration Team, Ensemble is now the most important tool in our software tool belt when building complex cloud solutions.

Period.

Juan, Marc, Brian, and I are using Ensemble to build modern solutions around new service deployments to the cloud.  We have already contributed many formulas to Ensemble's collection, and continue to do so every day.

There are a number of novel ideas and unique approaches in Ensemble.  You can deep dive into the technical details here.  For me, there's one broad concept in Ensemble that just rocks my world...  Ensemble deals in individual service units, with the ability to replicate, associate, and scale those units quite dynamically.  Service units in practice are cloud instances (or, if you're using Orchestra + Ensemble, bare metal systems!).  Service units are federated together to deliver a (perhaps large and complicated) user-facing service.

Okay, that's a lot of words, and at a very high level.  Let me try to break that down into something a bit more digestible...

I've been around Red Hat and Debian packaging for many years now.  Debian packaging is particularly amazing at defining prerequisite packages and pre- and post-installation procedures, and it's just phenomenal at rolling upgrades.  I've worked with hundreds (thousands?) of packages at this point, including some mind-bogglingly complex ones!

It's truly impressive how much can be accomplished within traditional Debian packaging.  But it has its limits.  These limits really start to bare their teeth when you need to install packages on multiple separate systems, and then federate those services together.  It's one thing if you need to install a web application on a single, local system:  depend on Apache, depend on MySQL, install, configure, restart the services...

sudo apt-get install your-web-app

...

Profit!

That's great.  But what if you need to install MySQL on two different nodes, set them up in a replicating configuration, install your web app and Apache on a third node, and put a caching reverse proxy on a fourth?  Oh, and maybe you want to do that a few times over.  And then scale them out.  Ummmm.....

sudo apt-get errrrrrr....yeah, not gonna work :-(

But these are exactly the types of problems that Ensemble solves!  And quite elegantly, in fact.

Once you've written your Formula, you'd simply:

ensemble bootstrap

ensemble deploy your-web-app
...
Profit!
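
To make that a bit more concrete, the multi-node scenario above (replicated MySQL, a web app behind Apache, a caching reverse proxy in front, and the ability to scale out) would look something like the following.  The exact commands depend on your Ensemble version, and the formula names here are purely illustrative:

ensemble bootstrap
ensemble deploy mysql
ensemble add-unit mysql
ensemble deploy your-web-app
ensemble add-relation your-web-app mysql
ensemble deploy caching-proxy
ensemble add-relation caching-proxy your-web-app

The idea is that the replication, federation, and scaling logic lives in the formulas' hooks, so adding another unit or another relation is a one-liner.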

Stay tuned here and I'll actually show some real Ensemble examples in a series of upcoming posts.  I'll also write a bit about how Ensemble and Orchestra work together.

In the meantime, get primed on the Ensemble design and usage details here, and definitely check out some of Juan's awesome Ensemble how-to posts!

After that, grab the nearest terminal and come help out!

We are quite literally at the edge of something amazing here, and we welcome your contributions!  All of Ensemble and our Formula Repository are entirely free software, building on years of best-practice open source development on Ubuntu at Canonical.  Drop into the #ubuntu-ensemble channel on irc.freenode.net, introduce yourself, and catch one of the earliest waves of something big.  Really, really big.

:-Dustin

HPCC with Ubuntu Server and Ensemble

// August 19th, 2011

** This is an updated post reflecting the project's new name: what was formerly known as Ensemble is now known as Juju. **

Let's start this post with a bit of background on the technologies that I'll be using:
  • What is Ubuntu?
    • Ubuntu is a fast, secure and easy-to-use operating system used by millions of people around the world.  
    • Secure, fast and powerful, Ubuntu Server is transforming IT environments worldwide. Realise the full potential of your infrastructure with a reliable, easy-to-integrate technology platform.
  • What is Juju?
    • Juju is a next generation service orchestration framework. It has been likened to APT for the cloud. With juju, different authors are able to create service charms independently, and make those services coordinate their communication through a simple protocol. Users can then take the product of different authors and very comfortably deploy those services in an environment. The result is multiple machines and components transparently collaborating towards providing the requested service.
  • What is HPCC?
    • HPCC (High Performance Computing Cluster) is a massive parallel-processing computing platform that solves Big Data problems. The platform is now Open Source!
Now that we are all caught up, let's delve right into it.  I will be discussing the details of my newly created hpcc juju charm.

The hpcc charm has been one of the trickiest ones to date to get working properly, so I want to take some time to explain some of the challenges that I encountered.

hpcc seems to use SSH keys for authentication and a single XML file to hold its configuration.  All nodes that are part of the cluster should have identical keys and an identical XML configuration file.

  • The SSH keys are pretty easy to generate (there is even a script that will do it all for you at /opt/HPCCSystems/sbin/keygen.sh).  You can just run: ssh-keygen -f path_where_to_save_keys/id_rsa -N "" -q
  • The configuration file environment.xml is a lot trickier to configure, so I use Cheetah templates to turn this enormous file into a template.
    • According to their website:
      • Cheetah is an open source template engine and code generation tool, written in Python. It can be used standalone or combined with other tools and frameworks. Web development is its principle use, but Cheetah is very flexible and is also being used to generate C++ game code, Java, sql, form emails and even Python code.
With Cheetah, I can create self-contained templates that can be rendered into their intended file just by calling Cheetah.  This is because we can embed Python code inside the template itself, making the template (environment.tmpl in our case) more or less a Python program that generates a fully functional environment.xml file ready for hpcc to use.

Another very important reason to use a template engine is the ability to create identical configuration files on each node without having to pass them around.  In other words, each node can create its own configuration file and, since all nodes are using the same methods and data to create the file, they will all be exactly the same.
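
For illustration, here is roughly how a hook could render the template into the file hpcc expects, using Cheetah's Python API.  The paths follow the charm layout described in this post, but treat this as a minimal sketch rather than the charm's actual hook code:

from Cheetah.Template import Template

# All of the data gathering happens inside environment.tmpl itself,
# via the embedded Python shown below; rendering is just this:
rendered = str(Template(file='templates/environment.tmpl'))

with open('/etc/HPCCSystems/environment.xml', 'w') as f:
    f.write(rendered)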

The hpcc configuration file is huge, so I'll just talk about some of the interesting bits of it here:
#import random
#import subprocess
#set $rel_structure = { $subprocess.check_output(['facter', 'install_time']).strip() : { 'name' : $subprocess.check_output(['hostname', '-f']).strip(), 'netAddress' : $subprocess.check_output(['facter','ipaddress']).strip(), 'uuid' : $subprocess.check_output(['facter','uuid']).strip()  } }
#for $member in $subprocess.check_output(['relation-list']).strip().split():
   #set $rel_structure[$subprocess.check_output(['relation-get','install_time', $member]).strip()] = { 'name' : $subprocess.check_output(['relation-get','name', $member]).strip(), 'netAddress' : $subprocess.check_output(['relation-get','netAddress', $member]).strip(), 'uuid' : $subprocess.check_output(['relation-get', 'uuid', $member]).strip() }
#end for
#set $nodes = []
#for $index in $sorted($rel_structure.keys()):
   $nodes.append(($rel_structure[$index]['netAddress'], $rel_structure[$index]['name']))
#end for
The above piece of code is what I am currently using to populate a list with the FQDN and address of each cluster member, sorted by install time.  This puts the "master" of the cluster at the top of the list, which will become useful when populating certain parts of the configuration file.
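
If you don't speak Cheetah, here is roughly the same logic written as plain Python; the facter and relation-get/relation-list calls are the same commands the template shells out to:

import subprocess

def run(*cmd):
    # Run a command and return its stripped stdout.
    return subprocess.check_output(cmd).strip()

# Start with this node's own details, keyed by install time...
rel_structure = {
    run('facter', 'install_time'): {
        'name': run('hostname', '-f'),
        'netAddress': run('facter', 'ipaddress'),
        'uuid': run('facter', 'uuid'),
    }
}

# ...then add every peer announced over the relation.
for member in run('relation-list').split():
    rel_structure[run('relation-get', 'install_time', member)] = {
        'name': run('relation-get', 'name', member),
        'netAddress': run('relation-get', 'netAddress', member),
        'uuid': run('relation-get', 'uuid', member),
    }

# Sorting by install time puts the oldest unit (the "master") first.
nodes = [(rel_structure[key]['netAddress'], rel_structure[key]['name'])
         for key in sorted(rel_structure)]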


As we can see from the code above, the main piece of information that we use in this template is the node list.  Here is a sample of how we use it in the environment.tmpl template file:
 #for $netAddress, $name in $nodes:
  <Computer computerType="linuxmachine"
            domain="localdomain"
            name="$name"
            netAddress="$netAddress"/>
#end for
I encourage you to download the charm here and examine the environment.tmpl file in the templates directory.

Here is the complete environment.tmpl file...  It's obviously not small, and you could just download the charm and read the file at your leisure, but I wanted to give you an idea of the size and complexity of hpcc's configuration file.

====

#import random
#import subprocess
#set $rel_structure = { $subprocess.check_output(['facter', 'install_time']).strip() : { 'name' : $subprocess.check_output(['hostname', '-f']).strip(), 'netAddress' : $subprocess.check_output(['facter','ipaddress']).strip(), 'uuid' : $subprocess.check_output(['facter','uuid']).strip()  } }
#for $member in $subprocess.check_output(['relation-list']).strip().split():
   #set $rel_structure[$subprocess.check_output(['relation-get','install_time', $member]).strip()] = { 'name' : $subprocess.check_output(['relation-get','name', $member]).strip(), 'netAddress' : $subprocess.check_output(['relation-get','netAddress', $member]).strip(), 'uuid' : $subprocess.check_output(['relation-get', 'uuid', $member]).strip() }
#end for
#set $nodes = []
#for $index in $sorted($rel_structure.keys()):
   $nodes.append(($rel_structure[$index]['netAddress'], $rel_structure[$index]['name']))
#end for
<?xml version="1.0" encoding="UTF-8"?>
<!-- Edited with ConfigMgr on ip 71.204.190.179 on 2011-08-16T00:39:16 -->
<Environment>
 <EnvSettings>
  <blockname>HPCCSystems</blockname>
  <configs>/etc/HPCCSystems</configs>
  <environment>environment.xml</environment>
  <group>hpcc</group>
  <home>/home</home>
  <interface>eth0</interface>
  <lock>/var/lock/HPCCSystems</lock>
  <log>/var/log/HPCCSystems</log>
  <path>/opt/HPCCSystems</path>
  <pid>/var/run/HPCCSystems</pid>
  <runtime>/var/lib/HPCCSystems</runtime>
  <sourcedir>/etc/HPCCSystems/source</sourcedir>
  <user>hpcc</user>
 </EnvSettings>
 <Hardware>
 #for $netAddress, $name in $nodes:
  <Computer computerType="linuxmachine"
            domain="localdomain"
            name="$name"
            netAddress="$netAddress"/>
#end for
  <ComputerType computerType="linuxmachine"
                manufacturer="unknown"
                name="linuxmachine"
                opSys="linux"/>
  <Domain name="localdomain" password="" username=""/>
  <Switch name="Switch"/>
 </Hardware>
 <Programs>
  <Build name="community_3.0.4" url="/opt/HPCCSystems">
   <BuildSet installSet="deploy_map.xml"
             name="dafilesrv"
             path="componentfiles/dafilesrv"
             processName="DafilesrvProcess"
             schema="dafilesrv.xsd"/>
   <BuildSet installSet="deploy_map.xml"
             name="dali"
             path="componentfiles/dali"
             processName="DaliServerProcess"
             schema="dali.xsd"/>
   <BuildSet installSet="deploy_map.xml"
             name="dfuplus"
             path="componentfiles/dfuplus"
             processName="DfuplusProcess"
             schema="dfuplus.xsd"/>
   <BuildSet installSet="deploy_map.xml"
             name="dfuserver"
             path="componentfiles/dfuserver"
             processName="DfuServerProcess"
             schema="dfuserver.xsd"/>
   <BuildSet deployable="no"
             installSet="deploy_map.xml"
             name="DropZone"
             path="componentfiles/DropZone"
             processName="DropZone"
             schema="dropzone.xsd"/>
   <BuildSet installSet="deploy_map.xml"
             name="eclagent"
             path="componentfiles/eclagent"
             processName="EclAgentProcess"
             schema="eclagent_config.xsd"/>
   <BuildSet installSet="deploy_map.xml" name="eclminus" path="componentfiles/eclminus"/>
   <BuildSet installSet="deploy_map.xml"
             name="eclplus"
             path="componentfiles/eclplus"
             processName="EclPlusProcess"
             schema="eclplus.xsd"/>
   <BuildSet installSet="eclccserver_deploy_map.xml"
             name="eclccserver"
             path="componentfiles/configxml"
             processName="EclCCServerProcess"
             schema="eclccserver.xsd"/>
   <BuildSet installSet="eclscheduler_deploy_map.xml"
             name="eclscheduler"
             path="componentfiles/configxml"
             processName="EclSchedulerProcess"
             schema="eclscheduler.xsd"/>
   <BuildSet installSet="deploy_map.xml"
             name="esp"
             path="componentfiles/esp"
             processName="EspProcess"
             schema="esp.xsd"/>
   <BuildSet deployable="no"
             installSet="deploy_map.xml"
             name="espsmc"
             path="componentfiles/espsmc"
             processName="EspService"
             schema="espsmcservice.xsd">
    <Properties defaultPort="8010"
                defaultResourcesBasedn="ou=SMC,ou=EspServices,ou=ecl"
                defaultSecurePort="18010"
                type="WsSMC">
     <Authenticate access="Read"
                   description="Root access to SMC service"
                   path="/"
                   required="Read"
                   resource="SmcAccess"/>
     <AuthenticateFeature description="Access to SMC service"
                          path="SmcAccess"
                          resource="SmcAccess"
                          service="ws_smc"/>
     <AuthenticateFeature description="Access to thor queues"
                          path="ThorQueueAccess"
                          resource="ThorQueueAccess"
                          service="ws_smc"/>
     <AuthenticateFeature description="Access to super computer environment"
                          path="ConfigAccess"
                          resource="ConfigAccess"
                          service="ws_config"/>
     <AuthenticateFeature description="Access to DFU"
                          path="DfuAccess"
                          resource="DfuAccess"
                          service="ws_dfu"/>
     <AuthenticateFeature description="Access to DFU XRef"
                          path="DfuXrefAccess"
                          resource="DfuXrefAccess"
                          service="ws_dfuxref"/>
     <AuthenticateFeature description="Access to machine information"
                          path="MachineInfoAccess"
                          resource="MachineInfoAccess"
                          service="ws_machine"/>
     <AuthenticateFeature description="Access to SNMP metrics information"
                          path="MetricsAccess"
                          resource="MetricsAccess"
                          service="ws_machine"/>
     <AuthenticateFeature description="Access to remote execution"
                          path="ExecuteAccess"
                          resource="ExecuteAccess"
                          service="ws_machine"/>
     <AuthenticateFeature description="Access to DFU workunits"
                          path="DfuWorkunitsAccess"
                          resource="DfuWorkunitsAccess"
                          service="ws_fs"/>
     <AuthenticateFeature description="Access to DFU exceptions"
                          path="DfuExceptionsAccess"
                          resource="DfuExceptions"
                          service="ws_fs"/>
     <AuthenticateFeature description="Access to spraying files"
                          path="FileSprayAccess"
                          resource="FileSprayAccess"
                          service="ws_fs"/>
     <AuthenticateFeature description="Access to despraying of files"
                          path="FileDesprayAccess"
                          resource="FileDesprayAccess"
                          service="ws_fs"/>
     <AuthenticateFeature description="Access to dkcing of key files"
                          path="FileDkcAccess"
                          resource="FileDkcAccess"
                          service="ws_fs"/>
     <AuthenticateFeature description="Access to files in dropzone"
                          path="FileIOAccess"
                          resource="FileIOAccess"
                          service="ws_fileio"/>
     <AuthenticateFeature description="Access to WS ECL service"
                          path="WsEclAccess"
                          resource="WsEclAccess"
                          service="ws_ecl"/>
     <AuthenticateFeature description="Access to Roxie queries and files"
                          path="RoxieQueryAccess"
                          resource="RoxieQueryAccess"
                          service="ws_roxiequery"/>
     <AuthenticateFeature description="Access to cluster topology"
                          path="ClusterTopologyAccess"
                          resource="ClusterTopologyAccess"
                          service="ws_topology"/>
     <AuthenticateFeature description="Access to own workunits"
                          path="OwnWorkunitsAccess"
                          resource="OwnWorkunitsAccess"
                          service="ws_workunits"/>
     <AuthenticateFeature description="Access to others&apos; workunits"
                          path="OthersWorkunitsAccess"
                          resource="OthersWorkunitsAccess"
                          service="ws_workunits"/>
     <AuthenticateFeature description="Access to ECL direct service"
                          path="EclDirectAccess"
                          resource="EclDirectAccess"
                          service="ecldirect"/>
     <ProcessFilters>
      <Platform name="Windows">
       <ProcessFilter name="any">
        <Process name="dafilesrv"/>
       </ProcessFilter>
       <ProcessFilter name="AttrServerProcess">
        <Process name="attrserver"/>
       </ProcessFilter>
       <ProcessFilter name="DaliProcess">
        <Process name="daserver"/>
       </ProcessFilter>
       <ProcessFilter multipleInstances="true" name="DfuServerProcess">
        <Process name="dfuserver"/>
       </ProcessFilter>
       <ProcessFilter multipleInstances="true" name="EclCCServerProcess">
        <Process name="eclccserver"/>
       </ProcessFilter>
       <ProcessFilter multipleInstances="true" name="EspProcess">
        <Process name="esp"/>
        <Process name="dafilesrv" remove="true"/>
       </ProcessFilter>
       <ProcessFilter name="FTSlaveProcess">
        <Process name="ftslave"/>
       </ProcessFilter>
       <ProcessFilter name="RoxieServerProcess">
        <Process name="ccd"/>
       </ProcessFilter>
       <ProcessFilter name="RoxieSlaveProcess">
        <Process name="ccd"/>
       </ProcessFilter>
       <ProcessFilter name="SchedulerProcess">
        <Process name="scheduler"/>
       </ProcessFilter>
       <ProcessFilter name="ThorMasterProcess">
        <Process name="thormaster"/>
       </ProcessFilter>
       <ProcessFilter name="ThorSlaveProcess">
        <Process name="thorslave"/>
       </ProcessFilter>
       <ProcessFilter name="SashaServerProcess">
        <Process name="saserver"/>
       </ProcessFilter>
      </Platform>
      <Platform name="Linux">
       <ProcessFilter name="any">
        <Process name="dafilesrv"/>
       </ProcessFilter>
       <ProcessFilter name="AttrServerProcess">
        <Process name="attrserver"/>
       </ProcessFilter>
       <ProcessFilter name="DaliProcess">
        <Process name="daserver"/>
       </ProcessFilter>
       <ProcessFilter multipleInstances="true" name="DfuServerProcess">
        <Process name="."/>
       </ProcessFilter>
       <ProcessFilter multipleInstances="true" name="EclCCServerProcess">
        <Process name="."/>
       </ProcessFilter>
       <ProcessFilter multipleInstances="true" name="EspProcess">
        <Process name="."/>
        <Process name="dafilesrv" remove="true"/>
       </ProcessFilter>
       <ProcessFilter name="FTSlaveProcess">
        <Process name="ftslave"/>
       </ProcessFilter>
       <ProcessFilter name="GenesisServerProcess">
        <Process name="httpd"/>
        <Process name="atftpd"/>
        <Process name="dhcpd"/>
       </ProcessFilter>
       <ProcessFilter name="RoxieServerProcess">
        <Process name="ccd"/>
       </ProcessFilter>
       <ProcessFilter name="RoxieSlaveProcess">
        <Process name="ccd"/>
       </ProcessFilter>
       <ProcessFilter name="SchedulerProcess">
        <Process name="scheduler"/>
       </ProcessFilter>
       <ProcessFilter name="ThorMasterProcess">
        <Process name="thormaster"/>
       </ProcessFilter>
       <ProcessFilter name="ThorSlaveProcess">
        <Process name="thorslave"/>
       </ProcessFilter>
       <ProcessFilter name="SashaServerProcess">
        <Process name="saserver"/>
       </ProcessFilter>
      </Platform>
     </ProcessFilters>
    </Properties>
   </BuildSet>
   <BuildSet installSet="deploy_map.xml"
             name="ftslave"
             path="componentfiles/ftslave"
             processName="FTSlaveProcess"
             schema="ftslave_linux.xsd"/>
   <BuildSet installSet="deploy_map.xml"
             name="hqltest"
             path="componentfiles/hqltest"
             processName="HqlTestProcess"/>
   <BuildSet deployable="no"
             installSet="deploy_map.xml"
             name="ldapServer"
             path="componentfiles/ldapServer"
             processName="LDAPServerProcess"
             schema="ldapserver.xsd"/>
   <BuildSet deployable="no"
             installSet="auditlib_deploy_map.xml"
             name="plugins_auditlib"
             path="componentfiles/configxml"
             processName="PluginProcess"
             schema="plugin.xsd"/>
   <BuildSet deployable="no"
             installSet="debugservices_deploy_map.xml"
             name="plugins_debugservices"
             path="componentfiles/configxml"
             processName="PluginProcess"
             schema="plugin.xsd"/>
   <BuildSet deployable="no"
             installSet="fileservices_deploy_map.xml"
             name="plugins_fileservices"
             path="componentfiles/configxml"
             processName="PluginProcess"
             schema="plugin.xsd"/>
   <BuildSet deployable="no"
             installSet="logging_deploy_map.xml"
             name="plugins_logging"
             path="componentfiles/configxml"
             processName="PluginProcess"
             schema="plugin.xsd"/>
   <BuildSet deployable="no"
             installSet="parselib_deploy_map.xml"
             name="plugins_parselib"
             path="componentfiles/configxml"
             processName="PluginProcess"
             schema="plugin.xsd"/>
   <BuildSet deployable="no"
             installSet="stringlib_deploy_map.xml"
             name="plugins_stringlib"
             path="componentfiles/configxml"
             processName="PluginProcess"
             schema="plugin.xsd"/>
   <BuildSet deployable="no"
             installSet="unicodelib_deploy_map.xml"
             name="plugins_unicodelib"
             path="componentfiles/configxml"
             processName="PluginProcess"
             schema="plugin.xsd"/>
   <BuildSet deployable="no"
             installSet="workunitservices_deploy_map.xml"
             name="plugins_workunitservices"
             path="componentfiles/configxml"
             processName="PluginProcess"
             schema="plugin.xsd"/>
   <BuildSet installSet="roxie_deploy_map.xml"
             name="roxie"
             path="componentfiles/configxml"
             processName="RoxieCluster"
             schema="roxie.xsd"/>
   <BuildSet installSet="deploy_map.xml" name="roxieconfig" path="componentfiles/roxieconfig"/>
   <BuildSet installSet="deploy_map.xml"
             name="sasha"
             path="componentfiles/sasha"
             processName="SashaServerProcess"
             schema="sasha.xsd"/>
   <BuildSet deployable="no"
             installSet="deploy_map.xml"
             name="SiteCertificate"
             path="componentfiles/SiteCertificate"
             processName="SiteCertificate"
             schema="SiteCertificate.xsd"/>
   <BuildSet installSet="deploy_map.xml"
             name="soapplus"
             path="componentfiles/soapplus"
             processName="SoapPlusProcess"
             schema="soapplus.xsd"/>
   <BuildSet installSet="deploy_map.xml"
             name="thor"
             path="componentfiles/thor"
             processName="ThorCluster"
             schema="thor.xsd"/>
   <BuildSet deployable="no"
             installSet="deploy_map.xml"
             name="topology"
             path="componentfiles/topology"
             processName="Topology"
             schema="topology.xsd"/>
   <BuildSet deployable="no"
             installSet="deploy_map.xml"
             name="ws_ecl"
             path="componentfiles/ws_ecl"
             processName="EspService"
             schema="esp_service_wsecl2.xsd">
    <Properties bindingType="ws_eclSoapBinding"
                defaultPort="8002"
                defaultResourcesBasedn="ou=WsEcl,ou=EspServices,ou=ecl"
                defaultSecurePort="18002"
                plugin="ws_ecl"
                type="ws_ecl">
     <Authenticate access="Read"
                   description="Root access to WS ECL service"
                   path="/"
                   required="Read"
                   resource="WsEclAccess"/>
     <AuthenticateFeature description="Access to WS ECL service"
                          path="WsEclAccess"
                          resource="WsEclAccess"
                          service="ws_ecl"/>
    </Properties>
   </BuildSet>
   <BuildSet deployable="no"
             installSet="deploy_map.xml"
             name="ecldirect"
             path="componentfiles/ecldirect"
             processName="EspService"
             schema="esp_service_ecldirect.xsd">
    <Properties bindingType="EclDirectSoapBinding"
                defaultPort="8008"
                defaultResourcesBasedn="ou=EclDirectAccess,ou=EspServices,ou=ecl"
                defaultSecurePort="18008"
                plugin="ecldirect"
                type="ecldirect">
     <Authenticate access="Read"
                   description="Root access to ECL Direct service"
                   path="/"
                   required="Read"
                   resource="EclDirectAccess"/>
     <AuthenticateFeature description="Access to ECL Direct service"
                          path="EclDirectAccess"
                          resource="EclDirectAccess"
                          service="ecldirect"/>
    </Properties>
   </BuildSet>
  </Build>
 </Programs>
 <Software>
  <DafilesrvProcess build="community_3.0.4"
                    buildSet="dafilesrv"
                    description="DaFileSrv process"
                    name="mydafilesrv"
                    version="1">
#for $netAddress, $name in $nodes:                    
   <Instance computer="$name"
             directory="/var/lib/HPCCSystems/mydafilesrv"
             name="s_$name"
             netAddress="$netAddress"/>
#end for             
  </DafilesrvProcess>
  <DaliServerProcess build="community_3.0.4"
                     buildSet="dali"
                     environment="/etc/HPCCSystems/environment.xml"
                     name="mydali"
                     recoverFromIncErrors="true">                     
   <Instance computer="$nodes[0][1]"
             directory="/var/lib/HPCCSystems/mydali"
             name="s_$nodes[0][1]"
             netAddress="$nodes[0][0]"
             port="7070"/>
  </DaliServerProcess>
  <DfuServerProcess build="community_3.0.4"
                    buildSet="dfuserver"
                    daliServers="mydali"
                    description="DFU Server"
                    monitorinterval="900"
                    monitorqueue="dfuserver_monitor_queue"
                    name="mydfuserver"
                    queue="dfuserver_queue"
                    transferBufferSize="65536">
#for $netAddress, $name in $nodes:                    
   <Instance computer="$name"
             directory="/var/lib/HPCCSystems/mydfuserver"
             name="s_$name"
             netAddress="$netAddress"/>
#end for    
#raw        
   <SSH SSHidentityfile="$HOME/.ssh/id_rsa"
#end raw
        SSHpassword=""
        SSHretries="3"
        SSHtimeout="0"
        SSHusername="hpcc"/>
  </DfuServerProcess>
  <Directories name="HPCCSystems">
   <Category dir="/var/log/[NAME]/[INST]" name="log"/>
   <Category dir="/var/lib/[NAME]/[INST]" name="run"/>
   <Category dir="/etc/[NAME]/[INST]" name="conf"/>
   <Category dir="/var/lib/[NAME]/[INST]/temp" name="temp"/>
   <Category dir="/var/lib/[NAME]/hpcc-data/[COMPONENT]" name="data"/>
   <Category dir="/var/lib/[NAME]/hpcc-data2/[COMPONENT]" name="data2"/>
   <Category dir="/var/lib/[NAME]/hpcc-data3/[COMPONENT]" name="data3"/>
   <Category dir="/var/lib/[NAME]/hpcc-mirror/[COMPONENT]" name="mirror"/>
   <Category dir="/var/lib/[NAME]/queries/[INST]" name="query"/>
   <Category dir="/var/lock/[NAME]/[INST]" name="lock"/>
  </Directories>
  <DropZone build="community_3.0.4"
            buildSet="DropZone"
            computer="$nodes[0][0]"
            description="DropZone process"
            directory="/var/lib/HPCCSystems/dropzone"
            name="mydropzone"/>
  <EclAgentProcess allowedPipePrograms="*"
                   build="community_3.0.4"
                   buildSet="eclagent"
                   daliServers="mydali"
                   description="EclAgent process"
                   name="myeclagent"
                   pluginDirectory="/opt/HPCCSystems/plugins/"
                   thorConnectTimeout="600"
                   traceLevel="0"
                   wuQueueName="myeclagent_queue">
#for $netAddress, $name in $nodes:                    
   <Instance computer="$name"
             directory="/var/lib/HPCCSystems/myeclagent"
             name="s_$name"
             netAddress="$netAddress"/>
#end for  
  </EclAgentProcess>
  <EclCCServerProcess build="community_3.0.4"
                      buildSet="eclccserver"
                      daliServers="mydali"
                      description="EclCCServer process"
                      enableSysLog="true"
                      maxCompileThreads="4"
                      name="myeclccserver"
                      traceLevel="1">
#for $netAddress, $name in $nodes:                    
   <Instance computer="$name"
             directory="/var/lib/HPCCSystems/myeclccserver"
             name="s_$name"
             netAddress="$netAddress"/>
#end for
  </EclCCServerProcess>
  <EclSchedulerProcess build="community_3.0.4"
                       buildSet="eclscheduler"
                       daliServers="mydali"
                       description="EclScheduler process"
                       name="myeclscheduler">
#for $netAddress, $name in $nodes:                    
   <Instance computer="$name"
             directory="/var/lib/HPCCSystems/myeclscheduler"
             name="s_$name"
             netAddress="$netAddress"/>
#end for
  </EclSchedulerProcess>
  <EspProcess build="community_3.0.4"
              buildSet="esp"
              componentfilesDir="/opt/HPCCSystems/componentfiles"
              daliServers="mydali"
              description="ESP server"
              enableSEHMapping="true"
              formOptionsAccess="false"
              httpConfigAccess="true"
              logLevel="1"
              logRequests="false"
              logResponses="false"
              maxBacklogQueueSize="200"
              maxConcurrentThreads="0"
              maxRequestEntityLength="8000000"
              name="myesp"
              perfReportDelay="60"
              portalurl="http://hpccsystems.com/download">
   <Authentication ldapAuthMethod="kerberos"
                   ldapConnections="10"
                   ldapServer=""
                   method="none"/>
   <EspBinding defaultForPort="true"
               defaultServiceVersion=""
               name="myespsmc"
               port="8010"
               protocol="http"
               resourcesBasedn="ou=SMC,ou=EspServices,ou=ecl"
               service="EclWatch"
               workunitsBasedn="ou=workunits,ou=ecl"
               wsdlServiceAddress="">
    <Authenticate access="Read"
                  description="Root access to SMC service"
                  path="/"
                  required="Read"
                  resource="SmcAccess"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to SMC service"
                         path="SmcAccess"
                         resource="SmcAccess"
                         service="ws_smc"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to thor queues"
                         path="ThorQueueAccess"
                         resource="ThorQueueAccess"
                         service="ws_smc"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to super computer environment"
                         path="ConfigAccess"
                         resource="ConfigAccess"
                         service="ws_config"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to DFU"
                         path="DfuAccess"
                         resource="DfuAccess"
                         service="ws_dfu"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to DFU XRef"
                         path="DfuXrefAccess"
                         resource="DfuXrefAccess"
                         service="ws_dfuxref"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to machine information"
                         path="MachineInfoAccess"
                         resource="MachineInfoAccess"
                         service="ws_machine"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to SNMP metrics information"
                         path="MetricsAccess"
                         resource="MetricsAccess"
                         service="ws_machine"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to remote execution"
                         path="ExecuteAccess"
                         resource="ExecuteAccess"
                         service="ws_machine"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to DFU workunits"
                         path="DfuWorkunitsAccess"
                         resource="DfuWorkunitsAccess"
                         service="ws_fs"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to DFU exceptions"
                         path="DfuExceptionsAccess"
                         resource="DfuExceptions"
                         service="ws_fs"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to spraying files"
                         path="FileSprayAccess"
                         resource="FileSprayAccess"
                         service="ws_fs"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to despraying of files"
                         path="FileDesprayAccess"
                         resource="FileDesprayAccess"
                         service="ws_fs"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to dkcing of key files"
                         path="FileDkcAccess"
                         resource="FileDkcAccess"
                         service="ws_fs"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to files in dropzone"
                         path="FileIOAccess"
                         resource="FileIOAccess"
                         service="ws_fileio"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to WS ECL service"
                         path="WsEclAccess"
                         resource="WsEclAccess"
                         service="ws_ecl"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to Roxie queries and files"
                         path="RoxieQueryAccess"
                         resource="RoxieQueryAccess"
                         service="ws_roxiequery"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to cluster topology"
                         path="ClusterTopologyAccess"
                         resource="ClusterTopologyAccess"
                         service="ws_topology"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to own workunits"
                         path="OwnWorkunitsAccess"
                         resource="OwnWorkunitsAccess"
                         service="ws_workunits"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to others&apos; workunits"
                         path="OthersWorkunitsAccess"
                         resource="OthersWorkunitsAccess"
                         service="ws_workunits"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to ECL direct service"
                         path="EclDirectAccess"
                         resource="EclDirectAccess"
                         service="ecldirect"/>
   </EspBinding>
   <EspBinding defaultForPort="true"
               defaultServiceVersion=""
               name="myws_ecl"
               port="8002"
               protocol="http"
               resourcesBasedn="ou=WsEcl,ou=EspServices,ou=ecl"
               service="myws_ecl"
               workunitsBasedn="ou=workunits,ou=ecl"
               wsdlServiceAddress="">
    <Authenticate access="Read"
                  description="Root access to WS ECL service"
                  path="/"
                  required="Read"
                  resource="WsEclAccess"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to WS ECL service"
                         path="WsEclAccess"
                         resource="WsEclAccess"
                         service="ws_ecl"/>
   </EspBinding>
   <EspBinding defaultForPort="true"
               defaultServiceVersion=""
               name="myecldirect"
               port="8008"
               protocol="http"
               resourcesBasedn="ou=EclDirectAccess,ou=EspServices,ou=ecl"
               service="myecldirect"
               workunitsBasedn="ou=workunits,ou=ecl"
               wsdlServiceAddress="">
    <Authenticate access="Read"
                  description="Root access to ECL Direct service"
                  path="/"
                  required="Read"
                  resource="EclDirectAccess"/>
    <AuthenticateFeature authenticate="Yes"
                         description="Access to ECL Direct service"
                         path="EclDirectAccess"
                         resource="EclDirectAccess"
                         service="ecldirect"/>
   </EspBinding>
   <HTTPS acceptSelfSigned="true"
          CA_Certificates_Path="ca.pem"
          certificateFileName="certificate.cer"
          city=""
          country="US"
          daysValid="365"
          enableVerification="false"
          organization="Customer of HPCCSystems"
          organizationalUnit=""
          passphrase=""
          privateKeyFileName="privatekey.cer"
          regenerateCredentials="false"
          requireAddressMatch="false"
          state=""
          trustedPeers="anyone"/>
   #for $netAddress, $name in $nodes:                    
   <Instance computer="$name"
             directory="/var/lib/HPCCSystems/myesp"
             FQDN=""
             name="s_$name"
             netAddress="$netAddress"/>
   #end for
   <ProtocolX authTimeout="3"
              defaultTimeout="21"
              idleTimeout="3600"
              maxTimeout="36"
              minTimeout="30"
              threadCount="2"/>
  </EspProcess>
  <EspService allowNewRoxieOnDemandQuery="false"
              AWUsCacheTimeout="15"
              build="community_3.0.4"
              buildSet="espsmc"
              description="ESP services for SMC"
              disableUppercaseTranslation="false"
              eclServer="myeclccserver"
              enableSystemUseRewrite="false"
              excludePartitions="/,/dev*,/sys,/usr,/proc/*"
              monitorDaliFileServer="false"
              name="EclWatch"
              pluginsPath="/opt/HPCCSystems/plugins"
              syntaxCheckQueue=""
              viewTimeout="1000"
              warnIfCpuLoadOver="95"
              warnIfFreeMemoryUnder="5"
              warnIfFreeStorageUnder="5">
   <Properties defaultPort="8010"
               defaultResourcesBasedn="ou=SMC,ou=EspServices,ou=ecl"
               defaultSecurePort="18010"
               type="WsSMC">
    <Authenticate access="Read"
                  description="Root access to SMC service"
                  path="/"
                  required="Read"
                  resource="SmcAccess"/>
    <AuthenticateFeature description="Access to SMC service"
                         path="SmcAccess"
                         resource="SmcAccess"
                         service="ws_smc"/>
    <AuthenticateFeature description="Access to thor queues"
                         path="ThorQueueAccess"
                         resource="ThorQueueAccess"
                         service="ws_smc"/>
    <AuthenticateFeature description="Access to super computer environment"
                         path="ConfigAccess"
                         resource="ConfigAccess"
                         service="ws_config"/>
    <AuthenticateFeature description="Access to DFU"
                         path="DfuAccess"
                         resource="DfuAccess"
                         service="ws_dfu"/>
    <AuthenticateFeature description="Access to DFU XRef"
                         path="DfuXrefAccess"
                         resource="DfuXrefAccess"
                         service="ws_dfuxref"/>
    <AuthenticateFeature description="Access to machine information"
                         path="MachineInfoAccess"
                         resource="MachineInfoAccess"
                         service="ws_machine"/>
    <AuthenticateFeature description="Access to SNMP metrics information"
                         path="MetricsAccess"
                         resource="MetricsAccess"
                         service="ws_machine"/>
    <AuthenticateFeature description="Access to remote execution"
                         path="ExecuteAccess"
                         resource="ExecuteAccess"
                         service="ws_machine"/>
    <AuthenticateFeature description="Access to DFU workunits"
                         path="DfuWorkunitsAccess"
                         resource="DfuWorkunitsAccess"
                         service="ws_fs"/>
    <AuthenticateFeature description="Access to DFU exceptions"
                         path="DfuExceptionsAccess"
                         resource="DfuExceptions"
                         service="ws_fs"/>
    <AuthenticateFeature description="Access to spraying files"
                         path="FileSprayAccess"
                         resource="FileSprayAccess"
                         service="ws_fs"/>
    <AuthenticateFeature description="Access to despraying of files"
                         path="FileDesprayAccess"
                         resource="FileDesprayAccess"
                         service="ws_fs"/>
    <AuthenticateFeature description="Access to dkcing of key files"
                         path="FileDkcAccess"
                         resource="FileDkcAccess"
                         service="ws_fs"/>
    <AuthenticateFeature description="Access to files in dropzone"
                         path="FileIOAccess"
                         resource="FileIOAccess"
                         service="ws_fileio"/>
    <AuthenticateFeature description="Access to WS ECL service"
                         path="WsEclAccess"
                         resource="WsEclAccess"
                         service="ws_ecl"/>
    <AuthenticateFeature description="Access to Roxie queries and files"
                         path="RoxieQueryAccess"
                         resource="RoxieQueryAccess"
                         service="ws_roxiequery"/>
    <AuthenticateFeature description="Access to cluster topology"
                         path="ClusterTopologyAccess"
                         resource="ClusterTopologyAccess"
                         service="ws_topology"/>
    <AuthenticateFeature description="Access to own workunits"
                         path="OwnWorkunitsAccess"
                         resource="OwnWorkunitsAccess"
                         service="ws_workunits"/>
    <AuthenticateFeature description="Access to others&apos; workunits"
                         path="OthersWorkunitsAccess"
                         resource="OthersWorkunitsAccess"
                         service="ws_workunits"/>
    <AuthenticateFeature description="Access to ECL direct service"
                         path="EclDirectAccess"
                         resource="EclDirectAccess"
                         service="ecldirect"/>
    <ProcessFilters>
     <Platform name="Windows">
      <ProcessFilter name="any">
       <Process name="dafilesrv"/>
      </ProcessFilter>
      <ProcessFilter name="AttrServerProcess">
       <Process name="attrserver"/>
      </ProcessFilter>
      <ProcessFilter name="DaliProcess">
       <Process name="daserver"/>
      </ProcessFilter>
      <ProcessFilter multipleInstances="true" name="DfuServerProcess">
       <Process name="dfuserver"/>
      </ProcessFilter>
      <ProcessFilter multipleInstances="true" name="EclCCServerProcess">
       <Process name="eclccserver"/>
      </ProcessFilter>
      <ProcessFilter multipleInstances="true" name="EspProcess">
       <Process name="esp"/>
       <Process name="dafilesrv" remove="true"/>
      </ProcessFilter>
      <ProcessFilter name="FTSlaveProcess">
       <Process name="ftslave"/>
      </ProcessFilter>
      <ProcessFilter name="RoxieServerProcess">
       <Process name="ccd"/>
      </ProcessFilter>
      <ProcessFilter name="RoxieSlaveProcess">
       <Process name="ccd"/>
      </ProcessFilter>
      <ProcessFilter name="SchedulerProcess">
       <Process name="scheduler"/>
      </ProcessFilter>
      <ProcessFilter name="ThorMasterProcess">
       <Process name="thormaster"/>
      </ProcessFilter>
      <ProcessFilter name="ThorSlaveProcess">
       <Process name="thorslave"/>
      </ProcessFilter>
      <ProcessFilter name="SashaServerProcess">
       <Process name="saserver"/>
      </ProcessFilter>
     </Platform>
     <Platform name="Linux">
      <ProcessFilter name="any">
       <Process name="dafilesrv"/>
      </ProcessFilter>
      <ProcessFilter name="AttrServerProcess">
       <Process name="attrserver"/>
      </ProcessFilter>
      <ProcessFilter name="DaliProcess">
       <Process name="daserver"/>
      </ProcessFilter>
      <ProcessFilter multipleInstances="true" name="DfuServerProcess">
       <Process name="."/>
      </ProcessFilter>
      <ProcessFilter multipleInstances="true" name="EclCCServerProcess">
       <Process name="."/>
      </ProcessFilter>
      <ProcessFilter multipleInstances="true" name="EspProcess">
       <Process name="."/>
       <Process name="dafilesrv" remove="true"/>
      </ProcessFilter>
      <ProcessFilter name="FTSlaveProcess">
       <Process name="ftslave"/>
      </ProcessFilter>
      <ProcessFilter name="GenesisServerProcess">
       <Process name="httpd"/>
       <Process name="atftpd"/>
       <Process name="dhcpd"/>
      </ProcessFilter>
      <ProcessFilter name="RoxieServerProcess">
       <Process name="ccd"/>
      </ProcessFilter>
      <ProcessFilter name="RoxieSlaveProcess">
       <Process name="ccd"/>
      </ProcessFilter>
      <ProcessFilter name="SchedulerProcess">
       <Process name="scheduler"/>
      </ProcessFilter>
      <ProcessFilter name="ThorMasterProcess">
       <Process name="thormaster"/>
      </ProcessFilter>
      <ProcessFilter name="ThorSlaveProcess">
       <Process name="thorslave"/>
      </ProcessFilter>
      <ProcessFilter name="SashaServerProcess">
       <Process name="saserver"/>
      </ProcessFilter>
     </Platform>
    </ProcessFilters>
   </Properties>
  </EspService>
  <EspService build="community_3.0.4"
              buildSet="ws_ecl"
              description="WS ECL Service"
              name="myws_ecl">
   <Properties bindingType="ws_eclSoapBinding"
               defaultPort="8002"
               defaultResourcesBasedn="ou=WsEcl,ou=EspServices,ou=ecl"
               defaultSecurePort="18002"
               plugin="ws_ecl"
               type="ws_ecl">
    <Authenticate access="Read"
                  description="Root access to WS ECL service"
                  path="/"
                  required="Read"
                  resource="WsEclAccess"/>
    <AuthenticateFeature description="Access to WS ECL service"
                         path="WsEclAccess"
                         resource="WsEclAccess"
                         service="ws_ecl"/>
   </Properties>
  </EspService>
  <EspService build="community_3.0.4"
              buildSet="ecldirect"
              clusterName="hthor"
              description="ESP service for running raw ECL queries"
              name="myecldirect">
   <Properties bindingType="EclDirectSoapBinding"
               defaultPort="8008"
               defaultResourcesBasedn="ou=EclDirectAccess,ou=EspServices,ou=ecl"
               defaultSecurePort="18008"
               plugin="ecldirect"
               type="ecldirect">
    <Authenticate access="Read"
                  description="Root access to ECL Direct service"
                  path="/"
                  required="Read"
                  resource="EclDirectAccess"/>
    <AuthenticateFeature description="Access to ECL Direct service"
                         path="EclDirectAccess"
                         resource="EclDirectAccess"
                         service="ecldirect"/>
   </Properties>
  </EspService>
  <FTSlaveProcess build="community_3.0.4"
                  buildSet="ftslave"
                  description="FTSlave process"
                  name="myftslave"
                  version="1">
#for $netAddress, $name in $nodes:                    
   <Instance computer="$name"
             directory="/var/lib/HPCCSystems/myftslave"
             name="s_$name"
             netAddress="$netAddress"
             program="/opt/HPCCSystems/bin/ftslave"/>
#end for
  </FTSlaveProcess>
  <PluginProcess build="community_3.0.4"
                 buildSet="plugins_auditlib"
                 description="plugin process"
                 name="myplugins_auditlib"/>
  <PluginProcess build="community_3.0.4"
                 buildSet="plugins_debugservices"
                 description="plugin process"
                 name="myplugins_debugservices"/>
  <PluginProcess build="community_3.0.4"
                 buildSet="plugins_fileservices"
                 description="plugin process"
                 name="myplugins_fileservices"/>
  <PluginProcess build="community_3.0.4"
                 buildSet="plugins_logging"
                 description="plugin process"
                 name="myplugins_logging"/>
  <PluginProcess build="community_3.0.4"
                 buildSet="plugins_parselib"
                 description="plugin process"
                 name="myplugins_parselib"/>
  <PluginProcess build="community_3.0.4"
                 buildSet="plugins_stringlib"
                 description="plugin process"
                 name="myplugins_stringlib"/>
  <PluginProcess build="community_3.0.4"
                 buildSet="plugins_unicodelib"
                 description="plugin process"
                 name="myplugins_unicodelib"/>
  <PluginProcess build="community_3.0.4"
                 buildSet="plugins_workunitservices"
                 description="plugin process"
                 name="myplugins_workunitservices"/>
  <RoxieCluster allowRoxieOnDemand="false"
                baseDataDir="/var/lib/HPCCSystems/hpcc-data/roxie"
                blindLogging="false"
                blobCacheMem="0"
                build="community_3.0.4"
                buildSet="roxie"
                callbackRetries="3"
                callbackTimeout="500"
                channel0onPrimary="true"
                checkCompleted="true"
                checkFileDate="true"
                checkingHeap="0"
                checkPrimaries="true"
                checkState="false"
                checkVersion="true"
                clusterWidth="$len($nodes)"
                copyResources="true"
                crcResources="false"
                cyclicOffset="1"
                dafilesrvLookupTimeout="10000"
                daliServers="mydali"
                debugPermitted="true"
                defaultConcatPreload="0"
                defaultFetchPreload="0"
                defaultFullKeyedJoinPreload="0"
                defaultHighPriorityTimeLimit="0"
                defaultHighPriorityTimeWarning="5000"
                defaultKeyedJoinPreload="0"
                defaultLowPriorityTimeLimit="0"
                defaultLowPriorityTimeWarning="0"
                defaultMemoryLimit="0"
                defaultParallelJoinPreload="0"
                defaultPrefetchProjectPreload="10"
                defaultSLAPriorityTimeLimit="0"
                defaultSLAPriorityTimeWarning="5000"
                defaultStripLeadingWhitespace="1"
                deleteUnneededFiles="false"
                description="Roxie cluster"
                directory="/var/lib/HPCCSystems/myroxie"
                diskReadBufferSize="65536"
                diskReadStable="true"
                doIbytiDelay="true"
                enableForceKeyDiffCopy="false"
                enableHeartBeat="true"
                enableKeyDiff="true"
                enableSNMP="true"
                enableSysLog="true"
                fastLaneQueue="true"
                fieldTranslationEnabled="false"
                flushJHtreeCacheOnOOM="true"
                forceStdLog="false"
                highTimeout="2000"
                ignoreMissingFiles="false"
                indexReadChunkSize="60000"
                indexReadStable="true"
                initIbytiDelay="100"
                jumboFrames="false"
                keyedJoinFlowLimit="1000"
                keyedJoinStable="true"
                lazyOpen="false"
                leafCacheMem="50"
                linuxYield="true"
                localFilesExpire="-1"
                localSlave="false"
                logFullQueries="false"
                logQueueDrop="32"
                logQueueLen="512"
                lowTimeout="10000"
                maxBlockSize="10000000"
                maxLocalFilesOpen="4000"
                maxLockAttempts="5"
                maxRemoteFilesOpen="1000"
                memoryStatsInterval="60"
                memTraceLevel="1"
                memTraceSizeLimit="0"
                minFreeDiskSpace="1073741824"
                minIbytiDelay="0"
                minLocalFilesOpen="2000"
                minRemoteFilesOpen="500"
                miscDebugTraceLevel="0"
                monitorDaliFileServer="false"
                multicastBase="239.1.1.1"
                multicastLast="239.1.254.254"
                name="myroxie"
                nodeCacheMem="100"
                nodeCachePreload="false"
                numChannels="$len($nodes)"
                numDataCopies="2"
                parallelAggregate="0"
                perChannelFlowLimit="10"
                pingInterval="60"
                pluginsPath="/opt/HPCCSystems/plugins"
                preabortIndexReadsThreshold="100"
                preabortKeyedJoinsThreshold="100"
                preferredSubnet=""
                preferredSubnetMask=""
                remoteFilesExpire="3600000"
                resolveFilesInPackage="false"
                roxieMulticastEnabled="true"
                serverSideCacheSize="0"
                serverThreads="30"
                simpleLocalKeyedJoins="true"
                siteCertificate=""
                slaTimeout="2000"
                slaveConfig="cyclic redundancy"
                slaveThreads="30"
                smartSteppingChunkRows="100"
                soapTraceLevel="1"
                socketCheckInterval="5000"
#raw                
                SSHidentityfile="$HOME/.ssh/id_rsa"
#end raw                
                SSHpassword=""
                SSHretries="3"
                SSHtimeout="0"
                SSHusername="hpcc"
                statsExpiryTime="3600"
                syncCluster="false"
                systemMonitorInterval="60000"
                totalMemoryLimit="1073741824"
                traceLevel="1"
                trapTooManyActiveQueries="true"
                udpFlowSocketsSize="131071"
                udpInlineCollation="false"
                udpInlineCollationPacketLimit="50"
                udpLocalWriteSocketSize="131071"
                udpMaxRetryTimedoutReqs="0"
                udpMaxSlotsPerClient="2147483647"
                udpMulticastBufferSize="131071"
                udpOutQsPriority="0"
                udpQueueSize="100"
                udpRequestToSendTimeout="5"
                udpResendEnabled="true"
                udpRetryBusySenders="0"
                udpSendCompletedInData="false"
                udpSendQueueSize="50"
                udpSnifferEnabled="true"
                udpTraceLevel="1"
                useHardLink="false"
                useLogQueue="true"
                useMemoryMappedIndexes="false"
                useRemoteResources="true"
                useTreeCopy="false">
   <RoxieFarmProcess dataDirectory="/var/lib/HPCCSystems/hpcc-data/roxie"
                     listenQueue="200"
                     name="farm1"
                     numThreads="30"
                     port="9876"
                     requestArrayThreads="5">
#for $netAddress, $name in $nodes:                    
    <RoxieServerProcess computer="$name" name="farm1_$name"/>
#end for
   </RoxieFarmProcess>
#for $netAddress, $name in $nodes:                    
   <RoxieServerProcess computer="$name"
                       dataDirectory="/var/lib/HPCCSystems/hpcc-data/roxie"
                       listenQueue="200"
                       name="farm1_s_$name"
                       netAddress="$netAddress"
                       numThreads="30"
                       port="9876"
                       requestArrayThreads="5"/>
#end for
#for $netAddress, $name in $nodes:                    
   <RoxieSlave computer="$name" name="s_$name">
    <RoxieChannel dataDirectory="/var/lib/HPCCSystems/hpcc-data/roxie" number="$random.randint(1,$len($nodes))"/>
    <RoxieChannel dataDirectory="/var/lib/HPCCSystems/hpcc-data2/roxie" number="$random.randint(1,$len($nodes))"/>
   </RoxieSlave>
#end for
#set $channel = 1
#for $netAddress, $name in $nodes:                    
   <RoxieSlaveProcess channel="$channel"
                      computer="$name"
                      dataDirectory="/var/lib/HPCCSystems/hpcc-data/roxie"
                      name="s_$name"
                      netAddress="$netAddress"/>
#set $channel = $channel + 1                      
#end for
  </RoxieCluster>
  <SashaServerProcess autoRestartInterval="0"
                      build="community_3.0.4"
                      buildSet="sasha"
                      cachedWUat="* * * * *"
                      cachedWUinterval="24"
                      cachedWUlimit="100"
                      coalesceAt="* * * * *"
                      coalesceInterval="1"
                      dafsmonAt="* * * * *"
                      dafsmonInterval="0"
                      dafsmonList="*"
                      daliServers="mydali"
                      description="Sasha Server process"
                      DFUrecoveryAt="* * * * *"
                      DFUrecoveryCutoff="4"
                      DFUrecoveryInterval="12"
                      DFUrecoveryLimit="20"
                      DFUWUat="* * * * *"
                      DFUWUcutoff="14"
                      DFUWUduration="0"
                      DFUWUinterval="24"
                      DFUWUlimit="1000"
                      DFUWUthrottle="0"
                      ExpiryAt="* 3 * * *"
                      ExpiryInterval="24"
                      keepResultFiles="false"
                      LDSroot="LDS"
                      logDir="."
                      minDeltaSize="50000"
                      name="mysasha"
                      recoverDeltaErrors="false"
                      thorQMonInterval="1"
                      thorQMonQueues="*"
                      thorQMonSwitchMinTime="0"
                      WUat="* * * * *"
                      WUbackup="0"
                      WUcutoff="8"
                      WUduration="0"
                      WUinterval="6"
                      WUlimit="1000"
                      WUretryinterval="7"
                      WUthrottle="0"
                      xrefAt="* 2 * * *"
                      xrefCutoff="1"
                      xrefEclWatchProvider="true"
                      xrefInterval="0"
                      xrefList="*">
#for $netAddress, $name in $nodes:                    
   <Instance computer="$name"
             directory="/var/lib/HPCCSystems/mysasha"
             name="s_$name"
             netAddress="$netAddress"
             port="8877"/>
#end for             
  </SashaServerProcess>
  <ThorCluster autoCopyBackup="false"
               build="community_3.0.4"
               buildSet="thor"
               computer="$nodes[0][1]"
               daliServers="mydali"
               description="Thor process"
               localThor="false"
               monitorDaliFileServer="true"
               multiSlaves="false"
               name="mythor"
               pluginsPath="/opt/HPCCSystems/plugins/"
               replicateAsync="true"
               replicateOutputs="true"
               slaves="8"
               watchdogEnabled="true"
               watchdogProgressEnabled="true">
   <Debug/>
#raw   
   <SSH SSHidentityfile="$HOME/.ssh/id_rsa"
#end raw   
        SSHpassword=""
        SSHretries="3"
        SSHtimeout="0"
        SSHusername="hpcc"/>
   <Storage/>
   <SwapNode/>
   <ThorMasterProcess computer="$nodes[0][1]" name="m_$nodes[0][1]"/>
#if $len($nodes) > 1:
#for $netAddress, $name in $nodes[1:]:                    
   <ThorSlaveProcess computer="$name" name="s_$name"/>
#end for   
#end if
   <Topology>
    <Node process="m_$nodes[0][1]">
#if $len($nodes) > 1:
#for $netAddress, $name in $nodes[1:]:                    
     <Node process="s_$name"/>
#end for   
#end if
    </Node>
   </Topology>
  </ThorCluster>
  <Topology build="community_3.0.4" buildSet="topology" name="topology">
   <Cluster name="hthor" prefix="hthor">
    <EclAgentProcess process="myeclagent"/>
    <EclCCServerProcess process="myeclccserver"/>
    <EclSchedulerProcess process="myeclscheduler"/>
   </Cluster>
   <Cluster name="thor" prefix="thor">
    <EclAgentProcess process="myeclagent"/>
    <EclCCServerProcess process="myeclccserver"/>
    <EclSchedulerProcess process="myeclscheduler"/>
    <ThorCluster process="mythor"/>
   </Cluster>
   <Cluster name="roxie" prefix="roxie">
    <EclAgentProcess process="myeclagent"/>
    <EclCCServerProcess process="myeclccserver"/>
    <EclSchedulerProcess process="myeclscheduler"/>
    <RoxieCluster process="myroxie"/>
   </Cluster>
  </Topology>
 </Software>
</Environment>
====


Even just scrolling past the file takes a while!!  This behemoth of a file was tamed thanks to cheetah; I highly encourage you to read up on it.

This charm may require some changes to your environment.yaml file in ~/.juju, as hpcc will only run on 64-bit instances.  Make sure that your juju environment has been properly shut down before you edit this file ( juju destroy-environment ).  Here is my environment.yaml file with the important part to check:
juju: environments

environments:
  sample:
    type: ec2
    access-key: ( removed ... get your own :) )
    secret-key: ( removed ... get your own :) )
    control-bucket: juju-fbb790f292e14a0394353bb4b63a3403
    admin-secret: 604d18a77fd24e3f91e1df398fcbe9f2
The emphasized parts are the important ones.  You can just copy them from here and paste them into your ~/.juju/environment.yaml file.
Now, let's take a look at the charm starting with the metadata.yaml file:
name: hpcc
revision: 1
summary: HPCC (High Performance Computing Cluster)
description: |
  HPCC (High Performance Computing Cluster) is a massive 
  parallel-processing computing platform that solves Big Data problems.
provides:
  hpcc:
    interface: hpcc
requires:
  hpcc-thor:
    interface: hpcc-thor
  hpcc-roxie:
    interface: hpcc-roxie
peers:
  hpcc-cluster:
    interface: hpcc-cluster

There are various provides and requires interfaces in this metadata.yaml file, but for now only the peers interface is being used.  I'll work on the other ones as the charm matures.


Let's look at the hpcc-cluster interface.  More specifically the hpcc-cluster-relation-changed hook where the new configuration is created:
#!/bin/bash
CWD=$(dirname $0)
cheetah fill --oext=xml --odir=/etc/HPCCSystems/ ${CWD}/../templates/environment.tmpl
service hpcc-init restart
It's pretty simple, isn't it?  Since the "heavy lifting" is done by the self-contained cheetah template, we don't have much to do here but generate the configuration file and restart hpcc.
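One thing the hook itself doesn't show is where the template's $nodes list comes from.  Purely as a hedged sketch (the key names are assumptions and the real charm may gather this differently), the peer relation's hook tools are what you'd reach for to collect that information:

#!/bin/bash
# Hypothetical sketch only: walk the hpcc-cluster peer relation and print
# each member's address and name, assuming the -joined hook did something
# like: relation-set hostname=`hostname -f` ip=<address>
for member in `relation-list`
do
   NAME=`relation-get hostname ${member}`
   ADDR=`relation-get ip ${member}`
   echo "${ADDR} ${NAME}"
done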

The other files in this charm are pretty self-explanatory and simple, so I am leaving the details of them as an exercise to the reader.

All of the complexity in hpcc has been distilled into the following commands:

  • juju bootstrap
  • bzr branch lp:~negronjl/+junk/hpcc
  • juju deploy --repository . hpcc
    • wait a minute or two
  • juju status
    • you should see something similar to this:

negronjl@negronjl-laptop:~/src/juju/charms$ juju status
2011-08-18 16:00:54,413 INFO Connecting to environment.


machines: 
  0: {dns-name: ec2-184-73-109-244.compute-1.amazonaws.com, instance-id: i-6d61460c} 
  1: {dns-name: ec2-50-16-60-94.compute-1.amazonaws.com, instance-id: i-d5694eb4}


services:
  hpcc:
    charm: local:hpcc-1
    relations: {hpcc-cluster: hpcc}
    units:
      hpcc/0:
        machine: 1
        relations: {}
        state: null


2011-08-18 16:00:58,374 INFO 'status' command finished successfully
negronjl@negronjl-laptop:~/src/juju/charms$
The above commands will give you a single node.

You can access the web interface of your node by pointing your browser to http://<FQDN>:8010, where FQDN is the fully qualified domain name or public IP address of your hpcc instance.  On the left side there should be a menu; explore the items in the Topology section.  The Target Clusters section should look something like this:


To experience the true power of hpcc, you should probably throw some more nodes at it.  Let's do just that with:


  • juju add-unit hpcc 
    • do this as many times as you feel comfortable   
    • Each command will give you a new node in the cluster
    • wait a minute or two and you should see something similar to this:

negronjl@negronjl-laptop:~$ juju status
2011-08-18 16:25:55,739 INFO Connecting to environment.
machines:
  0: {dns-name: ec2-184-73-109-244.compute-1.amazonaws.com, instance-id: i-6d61460c}
  1: {dns-name: ec2-50-16-60-94.compute-1.amazonaws.com, instance-id: i-d5694eb4}
  2: {dns-name: ec2-50-19-181-98.compute-1.amazonaws.com, instance-id: i-a5795ec4}
  3: {dns-name: ec2-184-72-147-67.compute-1.amazonaws.com, instance-id: i-25446344}
services:
  hpcc:
    charm: local:hpcc-1
    relations: {hpcc-cluster: hpcc}
    units:
      hpcc/0:
        machine: 1
        relations:
          hpcc-cluster: {state: up}
        state: started
      hpcc/1:
        machine: 2
        relations:
          hpcc-cluster: {state: up}
        state: started
      hpcc/2:
        machine: 3
        relations:
          hpcc-cluster: {state: up}
        state: started
2011-08-18 16:26:01,837 INFO 'status' command finished successfully
Notice how we now have more hpcc nodes :)  Here is what the web interface could look like:


Again....we have more nodes :)

Now that we have a working cluster, let's try it.  We'll first do the mandatory Hello World in ECL.  It looks something like this (hello.ecl):
Output('Hello world');
 We have to compile our hello.ecl so we can use it.  We do that by logging into one of the nodes ( I used juju ssh 1 to log on to the first/master node ) and typing the following:
eclcc hello.ecl -o
We run the file just like we would any other binary:
./hello
... and the output is:
ubuntu@ip-10-111-19-210:~$ ./hello
Hello world
ubuntu@ip-10-111-19-210:~$ 
There are far more interesting examples in the Learning ECL Documentation here.  I highly encourage you to go and read about it.

That's it for now.  Feedback is always welcome of course, so let me know how I'm doing.

-Juan

Ensemble security and firewall enhancements

// August 16th, 2011 // 3 Comments » // Uncategorized

A lot of work is going into making Ensemble more secure and enterprise ready. As part of that, all deployed services are now firewalled by default: for a formula-deployed service to be publicly accessible, the formula author has to specify which ports are open and when, and the operator has to signal that the service should be exposed. All formulas that expose ports should use open-port (and optionally close-port) diligently. Here's what you need to know.

Updating formulas for the new expose functionality

This is the only change necessary for the WordPress/MySQL example, in example/wordpress/hooks/db-relation-changed:

  # Make it publicly visible, once the wordpress service is exposed
  open-port 80/tcp

It is important that formulas open ports only when ready. So in the WordPress example, you wouldn’t want to do this port opening until Apache has been successfully configured and restarted. Otherwise, there’s a chance that users might see “It works!” before the desired page is available.
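In hook terms, getting that ordering right is straightforward. Here's a minimal sketch (my own illustration, assuming an Apache-based formula; this is not taken from the actual WordPress hooks):

  # ... write the vhost and application config first ...
  if service apache2 restart; then
      open-port 80/tcp   # only advertise the port once Apache is serving the real content
  fi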

Firewall changes also are a two-step process. The hooks for a service unit need to open ports (and they can also close ports), but the Ensemble administrator must also expose the service. For the WordPress example, you can expose it any time after the service has been deployed with the following:

  ensemble expose wordpress

Just expose the services you're interested in exposing, which can be done as soon as the service is deployed. Again, it's the formula author's responsibility to ensure that port opening is done at the right time.
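For reference, the end-to-end flow for the WordPress/MySQL example looks roughly like this (a sketch that assumes both example formulas live in your local repository):

  ensemble bootstrap
  ensemble deploy --repository . mysql
  ensemble deploy --repository . wordpress
  ensemble add-relation wordpress mysql
  ensemble expose wordpress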

The service can be subsequently unexposed with

  ensemble unexpose wordpress

You can see if a service is exposed with ensemble status. This would result in output similar to the following:

$ ensemble status
2011-08-09 17:59:29,704 INFO Connecting to environment.
machines:
  0: {dns-name: ec2-50-18-5-80.us-west-1.compute.amazonaws.com, instance-id: i-1119e556}
  1: {dns-name: ec2-50-18-73-159.us-west-1.compute.amazonaws.com, instance-id: i-531ae614}
  2: {dns-name: ec2-50-18-139-254.us-west-1.compute.amazonaws.com, instance-id: i-671ae620}
services:
  mysql:
    formula: local:mysql-11
    relations: {db: wordpress}
    units:
      mysql/0:
        machine: 1
        relations:
          db: {state: up}
        state: started
  wordpress:
    exposed: true
    formula: local:wordpress-30
    relations: {db: mysql}
    units:
      wordpress/0:
        machine: 2
        open-ports: [80/tcp]
        relations:
          db: {state: up}
        state: started
2011-08-09 17:59:36,031 INFO 'status' command finished successfully

This work is only part of the effort to ensure Ensemble uses secure mechanisms in its operations. Recent work also made sure all state information shared between cloud nodes is properly access controlled to avoid leaking any confidential data. Ensemble is rapidly progressing, and now is a great time to start playing with the technology and writing your own formulas!

Interested? Join the friendly Ensemble community at #ubuntu-ensemble on IRC (freenode), drop in, say hi, and grab me (kim0) with any questions.

MongoDB replicaset on Amazon ec2 with Ensemble

// August 15th, 2011 // 3 Comments » // Uncategorized

MongoDB is such a great piece of open-source technology. It supports some very interesting features such as sharding and replica sets. I have seen demos of MongoDB where the speaker happily calls creating the replica-set cluster a "one hour thing"! I decided to sprinkle some Ensemble magic on this problem; using Juan's formulas, it basically becomes a "10 second thing"! Spinning up a Mongo replica-set cluster could not be easier! Check this video out


If you can’t see the embedded video, here’s a direct link http://youtu.be/CVgMA6Hi7rw

Yep that’s how simple it is! If you want to create more read-slaves, you only need to ask Ensemble to do it for you:

$ ensemble add-unit mongodb

If you’re interested to learn more about exactly how this “magic” works, check out this indepth guide dissecting how the Mongo Ensemble formulas exactly works by “Juan Negron” the formula author.

So was this useful? Will you be deploying your next mongodb servers with Ensemble?
Leave me a comment, let me know your thoughts! Also let me know what you’d like to see deployed next with Ensemble. Be sure to drop in to #ubuntu-ensemble on freenode irc and say hi

Easy cassandra deployments with Ubuntu Server and Ensemble

// August 12th, 2011 // 2 Comments » // Uncategorized

A very popular database used by many companies and projects these days seems to be Cassandra.
From their website: 
The Apache Cassandra Project develops a highly scalable second-generation distributed database, bringing together Dynamo's fully distributed design and Bigtable's ColumnFamily-based data model.
I am by no means an expert on Cassandra, but I have done some medium-size deployments on Amazon's cloud, so I wanted to translate my knowledge of Cassandra "rings" into an Ensemble formula that can use its peers interface to expand and contract the ring as needed.

For the impatient, the cassandra ensemble formula is here

For the rest of us, let's move on to some details about the formula:

  • It should be simple ( ensemble deploy cassandra ... nothing more than that )
  • It should work stand-alone
  • It should be expandable via peers interfaces
    • grow the cluster/ring via ensemble add-unit cassandra
  • Make use of the Cassandra default configuration as much as possible.
  • Extract common variables from the configuration file(s) into the formula so they can be changed in the future.
The steps to install Cassandra can be distilled down to:

  • add repositories
  • install dependency packages
  • install cassandra
  • modify the configuration 
    • /etc/cassandra/cassandra-env.sh 
    • /etc/cassandra/cassandra.yaml
Now that we know the design goals and have an idea of what's needed to get Cassandra up and running, let's delve into the formula.


metadata.yaml

ensemble: formula
name: cassandra
revision: 1
summary: distributed storage system for structured data
description: |
  Cassandra is a distributed (peer-to-peer) system for the management and
  storage of structured data.
provides:
  database:
    interface: cassandra
  jmx:
    interface: cassandra
peers:
  cluster:
    interface: cassandra-cluster

hooks/install
#!/bin/bash

set -ux

export LANG=en_US.UTF-8

# Install utility packages
DEBIAN_FRONTEND=noninteractive apt-get -y install python-software-properties

# Add facter and facter-plugins repository
echo deb http://ppa.launchpad.net/facter-plugins/ppa/ubuntu oneiric main  >> /etc/apt/sources.list.d/facter-plugins-ppa-oneiric.list
echo deb-src http://ppa.launchpad.net/facter-plugins/ppa/ubuntu oneiric main  >> /etc/apt/sources.list.d/facter-plugins-ppa-oneiric.list
apt-key adv --keyserver keyserver.ubuntu.com --recv-keys B696B50DD8914A9290A4923D6383E098F7D4BE4B

#apt-add-repository ppa:facter-plugins/ppa

# Install the repositories
echo "deb http://www.apache.org/dist/cassandra/debian unstable main" > /etc/apt/sources.list.d/cassandra.list
echo "deb-src http://www.apache.org/dist/cassandra/debian unstable main" >> /etc/apt/sources.list.d/cassandra.list

# Add the key
apt-key adv --keyserver keyserver.ubuntu.com --recv-keys F758CE318D77295D

# Update the repositories
apt-get update

# Install the package
DEBIAN_FRONTEND=noninteractive apt-get install -y openjdk-6-jre-headless jsvc libcommons-daemon-java adduser libjna-java facter facter-customfacts-plugin
cd /tmp
curl -O http://people.canonical.com/~negronjl/cassandra_0.8.3_all.deb
DEBIAN_FRONTEND=noninteractive dpkg -i /tmp/cassandra_0.8.3_all.deb

HOSTNAME=`hostname -f`
IP=`facter ipaddress`
CWD=`dirname $0`
DEFAULT_JMX_PORT=7199
DEFAULT_CLUSTER_PORT=7000
DEFAULT_CLIENT_PORT=9160
DEFAULT_CLUSTER_NAME="Test Cluster"

# Open the necessary ports
if [ -x /usr/bin/open-port ]; then
   open-port ${DEFAULT_JMX_PORT}/TCP
   open-port ${DEFAULT_CLUSTER_PORT}/TCP
   open-port ${DEFAULT_CLIENT_PORT}/TCP
fi

# Persist the data for future use
fact-add cassandra_hostname ${HOSTNAME}
fact-add cassandra_ip ${IP}
fact-add cassandra_default_jmx_port ${DEFAULT_JMX_PORT}
fact-add cassandra_default_cluster_port ${DEFAULT_CLUSTER_PORT}
fact-add cassandra_default_client_port ${DEFAULT_CLIENT_PORT}
fact-add cassandra_default_cluster_name ${DEFAULT_CLUSTER_NAME}

# Update the cassandra environment with the appropriate JMX port
sed -i -e "s/^JMX_PORT=.*/JMX_PORT=\"${DEFAULT_JMX_PORT}\"/" /etc/cassandra/cassandra-env.sh

# Construct the cassandra.yaml file from the appropriate information above
sed -i -e "s/^cluster_name:.*/cluster_name: \'${DEFAULT_CLUSTER_NAME}\'/" \
       -e "s/\- seeds:.*/\- seeds: \"${IP}\"/" \
       -e "s/^storage_port:.*/storage_port: ${DEFAULT_CLUSTER_PORT}/" \
       -e "s/^listen_address:.*/listen_address: ${IP}/" \
       -e "s/^rpc_address:.*/rpc_address: ${IP}/" \
       -e "s/^rpc_port:.*/rpc_port: ${DEFAULT_CLIENT_PORT}/" \
        /etc/cassandra/cassandra.yaml

service cassandra status && service cassandra restart || service cassandra start

Now we should have enough of a formula to deploy a single Cassandra node.  The other hooks in the formula are:
  • jmx-relation-joined ( mainly to advertise our jmx interface )
  • database-relation-joined ( mainly to advertise our database interface )
  • cluster-relation-joined ( persists some values that need to be available to all nodes in the ring )
  • cluster-relation-changed ( we use the data persisted by cluster-relation-joined to reconfigure Cassandra so it shares data with the other nodes and forms a ring )
The most interesting hook of the ones above is cluster-relation-changed, so I'll show it here:

hooks/cluster-relation-changed
#!/bin/bash

set -x

CWD=`dirname $0`

for node in `relation-list`
do
   HOSTNAME=`relation-get hostname ${node}`
   IP=`relation-get ip ${node}`   # ask for this member's ip explicitly, not just the triggering unit's
   DEFAULT_JMX_PORT=`relation-get jmx_port ${node}`
   DEFAULT_CLUSTER_PORT=`relation-get cluster_port ${node}`
   DEFAULT_CLIENT_PORT=`relation-get client_port ${node}`
   [ -z "${TMP_SEEDS}" ] && TMP_SEEDS=${IP} || TMP_SEEDS="${TMP_SEEDS},${IP}"
done

sed -i -e "s/\- seeds:.*/\- seeds: \"${TMP_SEEDS}\"/" /etc/cassandra/cassandra.yaml

service cassandra status && service cassandra restart || service cassandra start

echo $ENSEMBLE_REMOTE_UNIT modified its settings
echo Relation settings:
relation-get
echo Relation members:
relation-list

Inspection of the other hooks is left as an exercise to the reader :)
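That said, it's worth sketching what cluster-relation-joined likely has to publish for the -changed hook above to work. The keys below are inferred from what cluster-relation-changed reads back, so treat this as a hedged reconstruction rather than the actual hook:

#!/bin/bash
# Hedged sketch of hooks/cluster-relation-joined: publish the facts that were
# persisted at install time so peers can read them back with relation-get.
set -ux

relation-set hostname=`facter cassandra_hostname` \
             ip=`facter cassandra_ip` \
             jmx_port=`facter cassandra_default_jmx_port` \
             cluster_port=`facter cassandra_default_cluster_port` \
             client_port=`facter cassandra_default_client_port`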

Deploying Cassandra

I'll assume that you have followed Ensemble's Getting Started Documentation and have Ensemble properly configured and ready to go.

bzr branch the Cassandra formula ( bzr branch lp:~negronjl/+junk/cassandra )
ensemble bootstrap ( wait a few minutes while the environment is set up )
negronjl@negronjl-laptop:~/src/ensemble/formulas$ ensemble bootstrap
2011-08-11 20:45:26,976 INFO Bootstrapping environment 'sample' (type: ec2)...
2011-08-11 20:45:37,804 INFO 'bootstrap' command finished successfully
ensemble status ( to ensure that the environment is up )
negronjl@negronjl-laptop:~/src/ensemble/formulas$ ensemble status
2011-08-11 20:47:57,196 INFO Connecting to environment.
machines:
  0: {dns-name: ec2-50-16-150-73.compute-1.amazonaws.com, instance-id: i-57642336}
services: {}
2011-08-11 20:48:02,029 INFO 'status' command finished successfully
ensemble deploy --repository . cassandra ( to deploy the Cassandra formula )

negronjl@negronjl-laptop:~/src/ensemble/formulas$ ensemble deploy --repository . cassandra
2011-08-11 20:48:41,251 INFO Connecting to environment.
2011-08-11 20:48:48,659 INFO Formula deployed as service: 'cassandra'
2011-08-11 20:48:48,662 INFO 'deploy' command finished successfully
ensemble status ( to ensure Cassandra deployed properly )

negronjl@negronjl-laptop:~/src/ensemble/formulas$ ensemble status
2011-08-11 20:49:25,623 INFO Connecting to environment.
machines:
  0: {dns-name: ec2-50-16-150-73.compute-1.amazonaws.com, instance-id: i-57642336}
  1: {dns-name: ec2-50-19-73-31.compute-1.amazonaws.com, instance-id: i-5f62253e}
services:
  cassandra:
    formula: local:cassandra-1
    relations: {cluster: cassandra}
    units:
      cassandra/0:
        machine: 1
        relations: {}
        state: null <---- NOT READY
2011-08-11 20:49:36,141 INFO 'status' command finished successfully


negronjl@negronjl-laptop:~/src/ensemble/formulas$ ensemble status
2011-08-11 21:02:36,264 INFO Connecting to environment.
machines:
  0: {dns-name: ec2-50-16-150-73.compute-1.amazonaws.com, instance-id: i-57642336}
  1: {dns-name: ec2-50-19-73-31.compute-1.amazonaws.com, instance-id: i-5f62253e}
services:
  cassandra:
    formula: local:cassandra-1
    relations: {cluster: cassandra}
    units:
      cassandra/0:
        machine: 1
        relations:
          cluster: {state: up}
        state: started  <---- NOW IT IS READY
2011-08-11 21:02:42,506 INFO 'status' command finished successfully
ensemble ssh 1 ( this will ssh into the Cassandra machine )

Once in the Cassandra machine, verify the status of it by typing:
  • nodetool -h `hostname -f` ring

ubuntu@ip-10-245-211-95:~$ nodetool -h `hostname -f` ring
Address         DC          Rack        Status State   Load            Owns    Token                                      
10.245.211.95   datacenter1 rack1       Up     Normal  6.55 KB         100.00% 124681228764612737621872162332718392045  
Back on your machine ( not the Cassandra one ), type the following to add more Cassandra nodes:
  • ensemble add-unit cassandra ( repeat as many times as you want )
negronjl@negronjl-laptop:~/src/ensemble/formulas$ ensemble status
2011-08-11 21:11:40,367 INFO Connecting to environment.
machines:
  0: {dns-name: ec2-50-16-150-73.compute-1.amazonaws.com, instance-id: i-57642336}
  1: {dns-name: ec2-50-19-73-31.compute-1.amazonaws.com, instance-id: i-5f62253e}
  2: {dns-name: ec2-50-17-90-104.compute-1.amazonaws.com, instance-id: i-e1703780}
  3: {dns-name: ec2-174-129-128-232.compute-1.amazonaws.com, instance-id: i-f1703790}
services:
  cassandra:
    formula: local:cassandra-1
    relations: {cluster: cassandra}
    units:
      cassandra/0:
        machine: 1
        relations:
          cluster: {state: up}
        state: started
      cassandra/1:
        machine: 2
        relations:
          cluster: {state: up}
        state: started
      cassandra/2:
        machine: 3
        relations:
          cluster: {state: up}
        state: started
2011-08-11 21:11:54,132 INFO 'status' command finished successfully
After the new nodes have been properly deployed ( you can see the status of the deployment by running ensemble status ), log back on the Cassandra node ( ensemble ssh 1 ) and type:
  • nodetool -h `hostname -f` ring ( to see that the new nodes are being added to the ring )
ubuntu@ip-10-245-211-95:~$ nodetool -h `hostname -f` ring
Address         DC          Rack        Status State   Load            Owns    Token
                                                                               124681228764612737621872162332718392045
10.38.33.97     datacenter1 rack1       Up     Normal  11.06 KB        69.21%  72298506053176682474361069083301352072
10.99.45.243    datacenter1 rack1       Up     Normal  15.34 KB        9.26%   88046943828017032654712668424156081726
10.245.211.95   datacenter1 rack1       Up     Normal  11.06 KB        21.53%  124681228764612737621872162332718392045
ubuntu@ip-10-245-211-95:~$
As you can see, once you create a formula on Ensemble, it's pretty easy to share and use.

If you have feedback about this ( or any other formula ), I would love to hear from you.
Drop me a line.

-Juan


Easy cassandra deployments with Ubuntu Server and Juju

// August 12th, 2011 // Comments Off // Uncategorized

** This is an updated post reflecting the new name of the project formerly known as Ensemble, now known as Juju **

A very popular database used by many companies and projects these days seems to be Cassandra.
From their website: 
The Apache Cassandra Project develops a highly scalable second-generation distributed database, bringing together Dynamo's fully distributed design and Bigtable's ColumnFamily-based data model.
I am by no means an expert on Cassandra, but I have done some medium-size deployments on Amazon's cloud, so I wanted to translate my knowledge of Cassandra "rings" into a Juju charm that can use its peers interface to expand and contract the ring as needed.

For the impatient, the cassandra juju charm is here

For the rest of us, let's move on to some details about the charm:

  • It should be simple ( juju deploy cassandra ... nothing more than that )
  • It should work stand-alone
  • It should be expandable via peers interfaces
    • grow the cluster/ring via juju add-unit cassandra
  • Make use of the Cassandra default configuration as much as possible.
  • Extract common variables from the configuration file(s) into the charm so they can be changed in the future.
The steps to install Cassandra can be distilled down to:

  • add repositories
  • install dependency packages
  • install cassandra
  • modify the configuration 
    • /etc/cassandra/cassandra-env.sh 
    • /etc/cassandra/cassandra.yaml
Now that we know the design goals and have an idea of what's needed to get Cassandra up and running, let's delve into the charm.


metadata.yaml

name: cassandra
revision: 1
summary: distributed storage system for structured data
description: |
  Cassandra is a distributed (peer-to-peer) system for the management and
  storage of structured data.
provides:
  database:
    interface: cassandra
  jmx:
    interface: cassandra
peers:
  cluster:
    interface: cassandra-cluster

hooks/install
#!/bin/bash

set -ux

export LANG=en_US.UTF-8

# Install utility packages
DEBIAN_FRONTEND=noninteractive apt-get -y install python-software-properties

# Add facter and facter-plugins repository
echo deb http://ppa.launchpad.net/facter-plugins/ppa/ubuntu oneiric main  >> /etc/apt/sources.list.d/facter-plugins-ppa-oneiric.list
echo deb-src http://ppa.launchpad.net/facter-plugins/ppa/ubuntu oneiric main  >> /etc/apt/sources.list.d/facter-plugins-ppa-oneiric.list
apt-key adv --keyserver keyserver.ubuntu.com --recv-keys B696B50DD8914A9290A4923D6383E098F7D4BE4B

#apt-add-repository ppa:facter-plugins/ppa

# Install the repositories
echo "deb http://www.apache.org/dist/cassandra/debian unstable main" > /etc/apt/sources.list.d/cassandra.list
echo "deb-src http://www.apache.org/dist/cassandra/debian unstable main" >> /etc/apt/sources.list.d/cassandra.list

# Add the key
apt-key adv --keyserver keyserver.ubuntu.com --recv-keys F758CE318D77295D

# Update the repositories
apt-get update

# Install the package
DEBIAN_FRONTEND=noninteractive apt-get install -y openjdk-6-jre-headless jsvc libcommons-daemon-java adduser libjna-java facter facter-customfacts-plugin
cd /tmp
curl -O http://people.canonical.com/~negronjl/cassandra_0.8.3_all.deb
DEBIAN_FRONTEND=noninteractive dpkg -i /tmp/cassandra_0.8.3_all.deb

HOSTNAME=`hostname -f`
IP=`facter ipaddress`
CWD=`dirname $0`
DEFAULT_JMX_PORT=7199
DEFAULT_CLUSTER_PORT=7000
DEFAULT_CLIENT_PORT=9160
DEFAULT_CLUSTER_NAME="Test Cluster"

# Open the necessary ports
if [ -x /usr/bin/open-port ]; then
   open-port ${DEFAULT_JMX_PORT}/TCP
   open-port ${DEFAULT_CLUSTER_PORT}/TCP
   open-port ${DEFAULT_CLIENT_PORT}/TCP
fi

# Persist the data for future use
fact-add cassandra_hostname ${HOSTNAME}
fact-add cassandra_ip ${IP}
fact-add cassandra_default_jmx_port ${DEFAULT_JMX_PORT}
fact-add cassandra_default_cluster_port ${DEFAULT_CLUSTER_PORT}
fact-add cassandra_default_client_port ${DEFAULT_CLIENT_PORT}
fact-add cassandra_default_cluster_name ${DEFAULT_CLUSTER_NAME}

# Update the cassandra environment with the appropriate JMX port
sed -i -e "s/^JMX_PORT=.*/JMX_PORT=\"${DEFAULT_JMX_PORT}\"/" /etc/cassandra/cassandra-env.sh

# Construct the cassandra.yaml file from the appropriate information above
sed -i -e "s/^cluster_name:.*/cluster_name: \'${DEFAULT_CLUSTER_NAME}\'/" \
       -e "s/\- seeds:.*/\- seeds: \"${IP}\"/" \
       -e "s/^storage_port:.*/storage_port: ${DEFAULT_CLUSTER_PORT}/" \
       -e "s/^listen_address:.*/listen_address: ${IP}/" \
       -e "s/^rpc_address:.*/rpc_address: ${IP}/" \
       -e "s/^rpc_port:.*/rpc_port: ${DEFAULT_CLIENT_PORT}/" \
        /etc/cassandra/cassandra.yaml

service cassandra status && service cassandra restart || service cassandra start

Now we should have enough of a charm to deploy a single Cassandra node.  The other hooks in the charm are:
  • jmx-relation-joined ( mainly to advertise our jmx interface )
  • database-relation-joined ( mainly to advertise our database interface )
  • cluster-relation-joined ( persists some values that need to be available to all nodes in the ring )
  • cluster-relation-changed ( we use the data persisted by cluster-relation-joined to reconfigure Cassandra so it shares data with the other nodes and forms a ring )
The most interesting hook of the ones above is cluster-relation-changed, so I'll show it here:

hooks/cluster-relation-changed
#!/bin/bash

set -x

CWD=`dirname $0`

for node in `relation-list`
do
   HOSTNAME=`relation-get hostname ${node}`
   IP=`relation-get ip ${node}`   # ask for this member's ip explicitly, not just the triggering unit's
   DEFAULT_JMX_PORT=`relation-get jmx_port ${node}`
   DEFAULT_CLUSTER_PORT=`relation-get cluster_port ${node}`
   DEFAULT_CLIENT_PORT=`relation-get client_port ${node}`
   [ -z "${TMP_SEEDS}" ] && TMP_SEEDS=${IP} || TMP_SEEDS="${TMP_SEEDS},${IP}"
done

sed -i -e "s/\- seeds:.*/\- seeds: \"${TMP_SEEDS}\"/" /etc/cassandra/cassandra.yaml

service cassandra status && service cassandra restart || service cassandra start

echo $JUJU_REMOTE_UNIT modified its settings
echo Relation settings:
relation-get
echo Relation members:
relation-list

Inspection of the other hooks is left as an exercise to the reader :)
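As one small example of that exercise, database-relation-joined really only needs to advertise where clients can reach us. Here's a hedged sketch, inferred from the facts stored by the install hook (my reconstruction, not the charm's actual hook):

#!/bin/bash
# Hedged sketch of hooks/database-relation-joined: advertise connection
# details to whatever charm relates to our "database" interface.
set -ux

relation-set hostname=`facter cassandra_hostname` \
             ip=`facter cassandra_ip` \
             port=`facter cassandra_default_client_port`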

Deploying Cassandra

I'll assume that you have followed Juju's Getting Started Documentation and have Juju properly configured and ready to go.

bzr branch the Cassandra charm ( bzr branch lp:~negronjl/+junk/cassandra )
juju bootstrap ( wait a few minutes while the environment is set up )
negronjl@negronjl-laptop:~/src/juju/charms$ juju bootstrap
2011-08-11 20:45:26,976 INFO Bootstrapping environment 'sample' (type: ec2)...
2011-08-11 20:45:37,804 INFO 'bootstrap' command finished successfully
juju status ( to ensure that the environment is up )
negronjl@negronjl-laptop:~/src/juju/charms$ juju status
2011-08-11 20:47:57,196 INFO Connecting to environment.
machines:
  0: {dns-name: ec2-50-16-150-73.compute-1.amazonaws.com, instance-id: i-57642336}
services: {}
2011-08-11 20:48:02,029 INFO 'status' command finished successfully
juju deploy --repository . cassandra ( to deploy the Cassandra charm )

negronjl@negronjl-laptop:~/src/juju/charms$ juju deploy --repository . cassandra
2011-08-11 20:48:41,251 INFO Connecting to environment.
2011-08-11 20:48:48,659 INFO Charm deployed as service: 'cassandra'
2011-08-11 20:48:48,662 INFO 'deploy' command finished successfully
juju status ( to ensure Cassandra deployed properly )

negronjl@negronjl-laptop:~/src/juju/charms$ juju status
2011-08-11 20:49:25,623 INFO Connecting to environment.
machines:
  0: {dns-name: ec2-50-16-150-73.compute-1.amazonaws.com, instance-id: i-57642336}
  1: {dns-name: ec2-50-19-73-31.compute-1.amazonaws.com, instance-id: i-5f62253e}
services:
  cassandra:
    charm: local:cassandra-1
    relations: {cluster: cassandra}
    units:
      cassandra/0:
        machine: 1
        relations: {}
        state: null <---- NOT READY
2011-08-11 20:49:36,141 INFO 'status' command finished successfully


negronjl@negronjl-laptop:~/src/juju/charms$ juju status
2011-08-11 21:02:36,264 INFO Connecting to environment.
machines:
  0: {dns-name: ec2-50-16-150-73.compute-1.amazonaws.com, instance-id: i-57642336}
  1: {dns-name: ec2-50-19-73-31.compute-1.amazonaws.com, instance-id: i-5f62253e}
services:
  cassandra:
    charm: local:cassandra-1
    relations: {cluster: cassandra}
    units:
      cassandra/0:
        machine: 1
        relations:
          cluster: {state: up}
        state: started  <---- NOW IT IS READY
2011-08-11 21:02:42,506 INFO 'status' command finished successfully
juju ssh 1 ( this will ssh into the Cassandra machine )

Once in the Cassandra machine, verify the status of it by typing:
  • nodetool -h `hostname -f` ring

ubuntu@ip-10-245-211-95:~$ nodetool -h `hostname -f` ring
Address         DC          Rack        Status State   Load            Owns    Token                                      
10.245.211.95   datacenter1 rack1       Up     Normal  6.55 KB         100.00% 124681228764612737621872162332718392045  
Back on your machine ( not the Cassandra one ), type the following to add more Cassandra nodes:
  • juju add-unit cassandra ( repeat as many times as you want )
negronjl@negronjl-laptop:~/src/juju/charms$ juju status
2011-08-11 21:11:40,367 INFO Connecting to environment.
machines:
  0: {dns-name: ec2-50-16-150-73.compute-1.amazonaws.com, instance-id: i-57642336}
  1: {dns-name: ec2-50-19-73-31.compute-1.amazonaws.com, instance-id: i-5f62253e}
  2: {dns-name: ec2-50-17-90-104.compute-1.amazonaws.com, instance-id: i-e1703780}
  3: {dns-name: ec2-174-129-128-232.compute-1.amazonaws.com, instance-id: i-f1703790}
services:
  cassandra:
    charm: local:cassandra-1
    relations: {cluster: cassandra}
    units:
      cassandra/0:
        machine: 1
        relations:
          cluster: {state: up}
        state: started
      cassandra/1:
        machine: 2
        relations:
          cluster: {state: up}
        state: started
      cassandra/2:
        machine: 3
        relations:
          cluster: {state: up}
        state: started
2011-08-11 21:11:54,132 INFO 'status' command finished successfully
After the new nodes have been properly deployed ( you can see the status of the deployment by running juju status ), log back on the Cassandra node ( juju ssh 1 ) and type:
  • nodetool -h `hostname -f` ring ( to see that the new nodes are being added to the ring )
ubuntu@ip-10-245-211-95:~$ nodetool -h `hostname -f` ring
Address         DC          Rack        Status State   Load            Owns    Token
                                                                               124681228764612737621872162332718392045
10.38.33.97     datacenter1 rack1       Up     Normal  11.06 KB        69.21%  72298506053176682474361069083301352072
10.99.45.243    datacenter1 rack1       Up     Normal  15.34 KB        9.26%   88046943828017032654712668424156081726
10.245.211.95   datacenter1 rack1       Up     Normal  11.06 KB        21.53%  124681228764612737621872162332718392045
ubuntu@ip-10-245-211-95:~$
As you can see, once you create a charm on Juju, it's pretty easy to share and use.

If you have feedback about this ( or any other charm ), I would love to hear from you.
Drop me a line.

-Juan

MongoDB Replica Sets with Ubuntu Server and Juju

// August 11th, 2011 // Comments Off // Uncategorized

** This is an updated post reflecting the new name of the project formerly known as Ensemble, now known as Juju **

I have always liked MongoDB and, more recently, Juju, so it was only a matter of time until I came up with a MongoDB charm for Juju.

Here are some of the goals I set out to accomplish when I started working on this charm:
  • stand alone deployment. 
  • replica sets.  More information about replica sets can be found here.
  • master and server relationships
  • Don't try to solve all deployment scenarios; just concentrate on the above ones for now.
Let's start with the stand-alone deployment first, and we'll add the other functionality a bit later.

Before we go into creating the directories and files, I should probably mention Charm Tools.  Charm Tools is ( as the name implies ) a set of tools that facilitates the creation of charms for juju.

You can get charm-tools on most supported releases of Ubuntu from the Juju PPA:
sudo add-apt-repository ppa:juju/pkgs
sudo apt-get update
sudo apt-get install charm-tools
After installing charm-tools, go to the directory where you will be creating your charms and type the following to get started:
  • charm create mongodb
The above command will look in your cache for a package called mongodb and create a "skeleton" structure, with the metadata.yaml, hooks and descriptions already filled in for you, in a directory called ( you guessed it ) mongodb.

The structure should look something like this:
mongodb:
mongodb/metadata.yaml
mongodb/hooks
mongodb/hooks/install
mongodb/hooks/start
mongodb/hooks/stop
mongodb/hooks/relation-name-relation-joined
mongodb/hooks/relation-name-relation-departed
mongodb/hooks/relation-name-relation-changed
mongodb/hooks/relation-name-relation-broken
metadata.yaml
At this point in the development, the metadata.yaml file should look very similar to this:

name: mongodb
revision: 1
summary: An object/document-oriented database (metapackage)
description: |
  MongoDB is a high-performance, open source, schema-free document-
  oriented  data store that's easy to deploy, manage and use. It's
  network accessible, written in C++ and offers the following features :
  * Collection oriented storage - easy storage of object-     style data
  * Full index support, including on inner objects   * Query profiling
  * Replication and fail-over support   * Efficient storage of binary
  data including large     objects (e.g. videos)   * Auto-sharding for
  cloud-level scalability (Q209) High performance, scalability, and
  reasonable depth of functionality are the goals for the project.  This
  is a metapackage that depends on all the mongodb parts.
provides:
  relation-name:
    interface: interface-name
requires:
  relation-name:
    interface: interface-name
peers:
  relation-name:
    interface: interface-name

For our purposes, let's change the emphasized lines to the following:

provides:
  database:
    interface: mongodb
peers:
  replica-set:
    interface: mongodb-replica-set

The peers section will be used when we start working with replica sets, so let's just ignore it for now.

provides: is the way we "announce" what our particular charm ...well ... provides.  In this case we provide a database interface by the name of mongodb.


Not much else to do with the metadata.yaml file, as charm create did the brunt of the work here for us.

hooks/install
charm create also took care of providing us with a basic install script based on the mongodb package already available in Ubuntu.  It should look very similar to this:



#!/bin/bash
# Here do anything needed to install the service
# i.e. apt-get install -y foo  or  bzr branch http://myserver/mycode /srv/webroot


apt-get install -y mongodb

After some trial and error and some debugging, here is what I came up with:




#!/bin/bash
# Here do anything needed to install the service
# i.e. apt-get install -y foo  or  bzr branch http://myserver/mycode /srv/webroot


set -ux


#################################################################################
# Install some utility packages needed for installation
#################################################################################
rm -f /etc/apt/sources.list.d/facter-plugins-ppa-oneiric.list
echo deb http://ppa.launchpad.net/facter-plugins/ppa/ubuntu oneiric main  >> /etc/apt/sources.list.d/facter-plugins-ppa-oneiric.list
echo deb-src http://ppa.launchpad.net/facter-plugins/ppa/ubuntu oneiric main  >> /etc/apt/sources.list.d/facter-plugins-ppa-oneiric.list
apt-key adv --keyserver keyserver.ubuntu.com --recv-keys B696B50DD8914A9290A4923D6383E098F7D4BE4B


#apt-add-repository ppa:facter-plugins/ppa
apt-get update
DEBIAN_FRONTEND=noninteractive apt-get -y install facter facter-customfacts-plugin


##################################################################################
# Set some variables that we'll need for later
##################################################################################


DEFAULT_REPLSET_NAME="myset"
HOSTNAME=`hostname -f`
EPOCH=`date +%s`
fact-add replset-name ${DEFAULT_REPLSET_NAME}
fact-add install-time ${EPOCH}




##################################################################################
# Install mongodb
##################################################################################


DEBIAN_FRONTEND=noninteractive apt-get install -y mongodb




##################################################################################
# Change the default mongodb configuration: listen on all interfaces and reflect that we are a master
##################################################################################


sed -e "s/#master = true/master = true/" -e "s/bind_ip/#bind_ip/" -i /etc/mongodb.conf




##################################################################################
# Reconfigure the upstart script to include the replica-set option.
# We'll need this so, when we add nodes, they can all talk to each other.
# Replica sets can only talk to each other if they all belong to the same
# set.  In our case, we have defaulted to "myset".
##################################################################################
sed -i -e "s/ -- / -- --replSet ${DEFAULT_REPLSET_NAME} /" /etc/init/mongodb.conf




##################################################################################
# stop then start ( *** not restart **** )  mongodb so we can finish the configuration
##################################################################################
service mongodb stop
# There is a bug in the upstart script that leaves a lock file orphaned.... Let's wipe that file out
rm -f /var/lib/mongodb/mongod.lock
service mongodb start




##################################################################################
# Register the port
##################################################################################
[ -x /usr/bin/open-port ] && open-port 27017/TCP

I have tried to comment the install script so you have an idea of what's going on ...
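If you want to convince yourself the install hook did its job, a couple of quick checks on the deployed unit go a long way (my own sanity checks, not part of the charm):

# Run these on the unit, e.g. after juju ssh:
grep -- "--replSet myset" /etc/init/mongodb.conf   # the replica-set option was injected
status mongodb                                     # upstart reports the daemon as running
netstat -ltn | grep 27017                          # mongod is listening on the default port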


hooks/start
This is the script that Juju will call to start mongodb.  Here is what mine looks like:



#!/bin/bash
# Here put anything that is needed to start the service.
# Note that currently this is run directly after install
# i.e. 'service apache2 start'


service mongodb status && service mongodb restart || service mongodb start


It's simple enough.

hooks/stop

#!/bin/bash
# This will be run when the service is being torn down, allowing you to disable
# it in various ways..
# For example, if your web app uses a text file to signal to the load balancer
# that it is live... you could remove it and sleep for a bit to allow the load
# balancer to stop sending traffic.
# rm /srv/webroot/server-live.txt && sleep 30


service mongodb stop
rm -f /var/lib/mongodb/mongod.lock

This is the script that Juju calls when it needs to stop a service.

hooks/relation-name-relation-[joined|changed|broken|departed]
These files are templates for the relationships ( provides, requires, peers, etc. ) declared in the metadata.yaml file.  Here is a look at the ones that I have for mongodb:
  • Per the metadata.yaml, we need to define the following relationships:
    • database
    • replica-set
Based on that information, here are the files that I created for this charm:
database-relation-joined
#!/bin/bash
# This must be renamed to the name of the relation. The goal here is to
# affect any change needed by relationships being formed
# This script should be idempotent.

set -ux

relation-set hostname=`hostname -f` replset=`facter replset-name`

echo $JUJU_REMOTE_UNIT joined
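On the other side of that relation, a client charm's hook would read these values back. Here's a hypothetical consumer (the charm and file names are made up purely for illustration):

#!/bin/bash
# e.g. hooks/database-relation-changed in some web-app charm that requires
# the mongodb interface: read what the mongodb unit advertised above.
set -ux

MONGO_HOST=`relation-get hostname`
MONGO_REPLSET=`relation-get replset`

# Settings may not be available on the first invocation; just wait for the
# next -changed event in that case.
[ -z "${MONGO_HOST}" ] && exit 0

echo "mongodb is reachable at ${MONGO_HOST} (replica set: ${MONGO_REPLSET})"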

replica-set-relation-joined
#!/bin/bash
# This must be renamed to the name of the relation. The goal here is to
# affect any change needed by relationships being formed
# This script should be idempotent.

set -ux

relation-set hostname=`hostname -f` replset=`facter replset-name` install-time=`facter install-time`

echo $JUJU_REMOTE_UNIT joined

replica-set-relation-changed
#!/bin/bash
# This must be renamed to the name of the relation. The goal here is to
# affect any change needed by relationships being formed, modified, or broken
# This script should be idempotent.

##################################################################################
# Set debugging information
##################################################################################
set -ux

##################################################################################
# Set some global variables
##################################################################################
MY_HOSTNAME=`hostname -f`
MY_REPLSET=`facter replset-name`
MY_INSTALL_TIME=`facter install-time`

MASTER_HOSTNAME=${MY_HOSTNAME}
MASTER_REPLSET=${MY_REPLSET}
MASTER_INSTALL_TIME=${MY_INSTALL_TIME}

echo "My hosntmae: ${MY_HOSTNAME}"
echo "My ReplSet: ${MY_REPLSET}"
echo "My install time: ${MY_INSTALL_TIME}"

##################################################################################
# Here we need to find out which is the first node ( we record the install time ).
# The one with the lowest install time is the master.
# Initialize the master node.
# Add the other nodes to the master's replica set.
##################################################################################
# Find the master ( lowest install time )
for MEMBER in `relation-list`
do
   HOSTNAME=`relation-get hostname ${MEMBER}`
   REPLSET=`relation-get replset ${MEMBER}`
   INSTALL_TIME=`relation-get install-time ${MEMBER}`
   [ ${INSTALL_TIME} -lt ${MASTER_INSTALL_TIME} ] && MASTER_INSTALL_TIME=${INSTALL_TIME}
done

echo "Master install-time: ${MASTER_INSTALL_TIME}"

# We should now have the lowest member of this relationship.  Let's get all of the information about it.
for MEMBER in `relation-list`
do
   HOSTNAME=`relation-get hostname ${MEMBER}`
   REPLSET=`relation-get replset ${MEMBER}`
   INSTALL_TIME=`relation-get install-time ${MEMBER}`
   if [ ${INSTALL_TIME} -eq ${MASTER_INSTALL_TIME} ]; then
      MASTER_HOSTNAME=${HOSTNAME}
      MASTER_REPLSET=${REPLSET}
   fi
done

echo "Master Hostname: ${MASTER_HOSTNAME}"
echo "Master ReplSet: ${MASTER_REPLSET}"
echo "Master install time: ${MASTER_INSTALL_TIME}"

# We should now have all the information about the master node.
# If the node has already been initialized, it will just inform you
# about it with no other consequence.
if [ ${MASTER_INSTALL_TIME} -eq ${MY_INSTALL_TIME} ]; then
   mongo --eval "rs.initiate()"
else
   mongo --host ${MASTER_HOSTNAME} --eval "rs.initiate()"
fi

# Now we need to add the rest of nodes to the replica set
for MEMBER in `relation-list`
do
   HOSTNAME=`relation-get hostname ${MEMBER}`
   REPLSET=`relation-get replset ${MEMBER}`
   INSTALL_TIME=`relation-get install-time ${MEMBER}`
   if [ ${MASTER_INSTALL_TIME} -ne ${INSTALL_TIME} ]; then
      if [ ${INSTALL_TIME} -eq ${MY_INSTALL_TIME} ]; then
         mongo --eval "rs.add(\""${HOSTNAME}"\")"
      else
         mongo --host ${MASTER_HOSTNAME} --eval "rs.add(\""${HOSTNAME}"\")"
      fi
   fi
done

echo $JUJU_REMOTE_UNIT modified its settings
echo Relation settings:
relation-get
echo Relation members:
relation-list

You can delete the relation-name* files now that you have created the real ones needed for this charm.
In case typing all of this is not to your liking, the charm can be found here.

You can now deploy this charm as follows:
  • juju bootstrap # bootstraps the system
  • juju deploy --repository . mongodb # deploys mongodb
  • juju status # to see what's going on 
The above commands satisfy one of our design goals, standalone deployment.  Let's check out the replica sets.  Type this:
  • juju add-unit mongodb
And that's all that is needed to add a new mongodb node that will automatically create a replica set with the existing node.  You can continue to "add-unit" to add more nodes to the replica set.  Notice that all of the configuration is taken care of by the replica-set-relation-joined and replica-set-relation-changed hook scripts that we wrote above.
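To confirm that the replica set actually formed, you can hop onto one of the units and ask mongo directly (a quick manual check, not part of the charm):

juju ssh 1                                  # or whichever machine hosts a mongodb unit
mongo --eval "printjson(rs.status())"       # should list the "myset" members and their states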

The beauty of this charm is that the user doesn't really have to know exactly what is needed to get a replica set cluster up and running.  Juju charms are self-contained and idempotent.  This means portability.






MongoDB Replica Sets with Ubuntu Server and Ensemble

// August 11th, 2011 // 2 Comments » // Uncategorized

I have always liked MongoDB and, more recently, Ensemble, so it was only a matter of time until I came up with a MongoDB formula for Ensemble.

Here are some of the goals I set out to accomplish when I started working on this formula:
  • stand alone deployment. 
  • replica sets.  More information about replica sets here.
  • master and server relationships
  • Don't try to solve all deployment scenarios; just concentrate on the above ones for now.
Let's start with the stand-alone deployment first, and we'll add the other functionality a bit later.

Before we go into creating the directories and files, I should probably mention Principia Tools.  Principia Tools is ( as the name implies ) a set of tools that facilitates the creation of formulas for ensemble.

You can get principia-tools on most supported releases of Ubuntu from the Ensemble PPA:
sudo add-apt-repository ppa:ensemble/ppa
sudo apt-get update
sudo apt-get install principia-tools
After installing principia-tools, go to the directory where you will be creating your formulas and type the following to get started:
  • principia formulate mongodb
The above command will look in your cache for a package called mongodb and create a "skeleton" structure, with the metadata.yaml, hooks and descriptions already filled in for you, in a directory called ( you guessed it ) mongodb.

The structure should look something like this:
mongodb:
mongodb/metadata.yaml
mongodb/hooks
mongodb/hooks/install
mongodb/hooks/start
mongodb/hooks/stop
mongodb/hooks/relation-name-relation-joined
mongodb/hooks/relation-name-relation-departed
mongodb/hooks/relation-name-relation-changed
mongodb/hooks/relation-name-relation-broken
metadata.yaml
At this point in the development, the metadata.yaml file should look very similar to this:

ensemble: formula
name: mongodb
revision: 1
summary: An object/document-oriented database (metapackage)
description: |
  MongoDB is a high-performance, open source, schema-free document-
  oriented  data store that's easy to deploy, manage and use. It's
  network accessible, written in C++ and offers the following features :
  * Collection oriented storage - easy storage of object-     style data
  * Full index support, including on inner objects   * Query profiling
  * Replication and fail-over support   * Efficient storage of binary
  data including large     objects (e.g. videos)   * Auto-sharding for
  cloud-level scalability (Q209) High performance, scalability, and
  reasonable depth of functionality are the goals for the project.  This
  is a metapackage that depends on all the mongodb parts.
provides:
  relation-name:
    interface: interface-name
requires:
  relation-name:
    interface: interface-name
peers:
  relation-name:
    interface: interface-name

For our purposes, let's replace the relation sections ( provides, requires and peers ) with the following:

provides:
  database:
    interface: mongodb
peers:
  replica-set:
    interface: mongodb-replica-set

The peers section will be used when we start working with replica sets, so let's just ignore it for now.

provides: is the way we "announce" what our particular formula ...well... provides.  In this case, we provide a database interface by the name of mongodb.
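
To give an idea of how that interface gets used later, a hypothetical client formula that requires the mongodb interface could be wired up like this ( my-web-app is a made-up name, purely for illustration ):

ensemble deploy --repository . my-web-app
ensemble add-relation my-web-app mongodb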


Not much else to do with the metadata.yaml file, as principia formulate did the brunt of the work here for us.

hooks/install
principia formulate also took care of providing us with a basic install script based on the mongodb package already available in Ubuntu.  It should look very similar to this:



#!/bin/bash
# Here do anything needed to install the service
# i.e. apt-get install -y foo  or  bzr branch http://myserver/mycode /srv/webroot


apt-get install -y mongodb

After some trial and error and some debugging, here is what I came up with:




#!/bin/bash
# Here do anything needed to install the service
# i.e. apt-get install -y foo  or  bzr branch http://myserver/mycode /srv/webroot


set -ux


#################################################################################
# Install some utility packages needed for installation
#################################################################################
rm -f /etc/apt/sources.list.d/facter-plugins-ppa-oneiric.list
echo deb http://ppa.launchpad.net/facter-plugins/ppa/ubuntu oneiric main  >> /etc/apt/sources.list.d/facter-plugins-ppa-oneiric.list
echo deb-src http://ppa.launchpad.net/facter-plugins/ppa/ubuntu oneiric main  >> /etc/apt/sources.list.d/facter-plugins-ppa-oneiric.list
apt-key adv --keyserver keyserver.ubuntu.com --recv-keys B696B50DD8914A9290A4923D6383E098F7D4BE4B


#apt-add-repository ppa:facter-plugins/ppa
apt-get update
DEBIAN_FRONTEND=noninteractive apt-get -y install facter facter-customfacts-plugin


##################################################################################
# Set some variables that we'll need for later
##################################################################################


DEFAULT_REPLSET_NAME="myset"
HOSTNAME=`hostname -f`
EPOCH=`date +%s`
fact-add replset-name ${DEFAULT_REPLSET_NAME}
fact-add install-time ${EPOCH}




##################################################################################
# Install mongodb
##################################################################################


DEBIAN_FRONTEND=noninteractive apt-get install -y mongodb




##################################################################################
# Change the default mongodb configuration to reflect that we are a master and listen on all interfaces
##################################################################################


sed -e "s/#master = true/master = true/" -e "s/bind_ip/#bind_ip/" -i /etc/mongodb.conf




##################################################################################
# Reconfigure the upstart script to include the replica-set option.
# We'll need this so, when we add nodes, they can all talk to each other.
# Replica sets can only talk to each other if they all belong to the same
# set.  In our case, we have defaulted to "myset".
##################################################################################
sed -i -e "s/ -- / -- --replSet ${DEFAULT_REPLSET_NAME} /" /etc/init/mongodb.conf




##################################################################################
# stop then start ( *** not restart *** ) mongodb so we can finish the configuration
##################################################################################
service mongodb stop
# There is a bug in the upstart script that leaves a lock file orphaned.... Let's wipe that file out
rm -f /var/lib/mongodb/mongod.lock
service mongodb start




##################################################################################
# Register the port
##################################################################################
[ -x /usr/bin/open-port ] && open-port 27017/TCP

I have tried to comment the install script so you have an idea of what's going on ...
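
If you want to sanity check what the install hook actually did on a unit, here is a minimal sketch, assuming you can get a shell on the instance ( the paths match the sed edits above ):

grep "^master = true" /etc/mongodb.conf     # master mode enabled
grep -- "--replSet" /etc/init/mongodb.conf  # upstart job carries the replica set name
facter replset-name                         # custom fact recorded with fact-add
facter install-time                         # epoch recorded at install time
service mongodb status                      # the daemon should be running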


hooks/start
This is the script that Ensemble will call to start mongodb.  Here is what mine looks like:



#!/bin/bash
# Here put anything that is needed to start the service.
# Note that currently this is run directly after install
# i.e. 'service apache2 start'


service mongodb status && service mongodb restart || service mongodb start


It's simple enough.

hooks/stop

#!/bin/bash
# This will be run when the service is being torn down, allowing you to disable
# it in various ways..
# For example, if your web app uses a text file to signal to the load balancer
# that it is live... you could remove it and sleep for a bit to allow the load
# balancer to stop sending traffic.
# rm /srv/webroot/server-live.txt && sleep 30


service mongodb stop
rm -f /var/lib/mongodb/mongod.lock

This is the script that Ensemble calls when it needs to stop a service.

hooks/relation-name-relation-[joined|changed|broken|departed]
These files are templates for the relationships ( provides, requires, peers, etc. ) declared in the metadata.yaml file.  Here is a look at the ones that I have for mongodb:
  • Per the metadata.yaml, we need to define the following relationships:
    • database
    • replica-set
Based on that information, here are the files that I created for this formula:
database-relation-joined
#!/bin/bash
# This must be renamed to the name of the relation. The goal here is to
# affect any change needed by relationships being formed
# This script should be idempotent.

set -ux

relation-set hostname=`hostname -f` replset=`facter replset-name`

echo $ENSEMBLE_REMOTE_UNIT joined
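
On the other side of that database relation, a consuming formula would read these settings with relation-get in its own hook.  Here is a minimal sketch of a hypothetical client's database-relation-changed hook ( not part of this formula ):

#!/bin/bash
# Hypothetical client hook: database-relation-changed
set -ux

# In a relation hook, relation-get with no unit argument defaults to the remote unit
MONGO_HOST=`relation-get hostname`
MONGO_REPLSET=`relation-get replset`

# Settings may not be available on the very first invocation;
# exit quietly and wait for the next -changed event in that case
[ -z "${MONGO_HOST}" ] && exit 0

echo "Would point the application at ${MONGO_HOST} ( replica set ${MONGO_REPLSET} )"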

replica-set-relation-joined
#!/bin/bash
# This must be renamed to the name of the relation. The goal here is to
# affect any change needed by relationships being formed
# This script should be idempotent.

set -ux

relation-set hostname=`hostname -f` replset=`facter replset-name` install-time=`facter install-time`

echo $ENSEMBLE_REMOTE_UNIT joined

replica-set-relation-changed
#!/bin/bash
# This must be renamed to the name of the relation. The goal here is to
# affect any change needed by relationships being formed, modified, or broken
# This script should be idempotent.

##################################################################################
# Set debugging information
##################################################################################
set -ux

##################################################################################
# Set some global variables
##################################################################################
MY_HOSTNAME=`hostname -f`
MY_REPLSET=`facter replset-name`
MY_INSTALL_TIME=`facter install-time`

MASTER_HOSTNAME=${MY_HOSTNAME}
MASTER_REPLSET=${MY_REPLSET}
MASTER_INSTALL_TIME=${MY_INSTALL_TIME}

echo "My hosntmae: ${MY_HOSTNAME}"
echo "My ReplSet: ${MY_REPLSET}"
echo "My install time: ${MY_INSTALL_TIME}"

##################################################################################
# Here we need to find out which is the first node ( we record the install time ).
# The one with the lowest install time is the master.
# Initialize the master node.
# Add the other nodes to the master's replica set.
##################################################################################
# Find the master ( lowest install time )
for MEMBER in `relation-list`
do
   HOSTNAME=`relation-get hostname ${MEMBER}`
   REPLSET=`relation-get replset ${MEMBER}`
   INSTALL_TIME=`relation-get install-time ${MEMBER}`
   [ ${INSTALL_TIME} -lt ${MASTER_INSTALL_TIME} ] && MASTER_INSTALL_TIME=${INSTALL_TIME}
done

echo "Master install-time: ${MASTER_INSTALL_TIME}"

# We should now have the lowest install time among the members.  Let's find the member it belongs to and collect its information.
for MEMBER in `relation-list`
do
   HOSTNAME=`relation-get hostname ${MEMBER}`
   REPLSET=`relation-get replset ${MEMBER}`
   INSTALL_TIME=`relation-get install-time ${MEMBER}`
   if [ ${INSTALL_TIME} -eq ${MASTER_INSTALL_TIME} ]; then
      MASTER_HOSTNAME=${HOSTNAME}
      MASTER_REPLSET=${REPLSET}
   fi
done

echo "Master Hostname: ${MASTER_HOSTNAME}"
echo "Master ReplSet: ${MASTER_REPLSET}"
echo "Master install time: ${MASTER_INSTALL_TIME}"

# We should now have all the information about the master node.
# If the node has already been initialized, it will just inform you
# about it with no other consequence.
if [ ${MASTER_INSTALL_TIME} -eq ${MY_INSTALL_TIME} ]; then
   mongo --eval "rs.initiate()"
else
   mongo --host ${MASTER_HOSTNAME} --eval "rs.initiate()"
fi

# Now we need to add the rest of nodes to the replica set
for MEMBER in `relation-list`
do
   HOSTNAME=`relation-get hostname ${MEMBER}`
   REPLSET=`relation-get replset ${MEMBER}`
   INSTALL_TIME=`relation-get install-time ${MEMBER}`
   if [ ${MASTER_INSTALL_TIME} -ne ${INSTALL_TIME} ]; then
      if [ ${INSTALL_TIME} -eq ${MY_INSTALL_TIME} ]; then
         mongo --eval "rs.add(\""${HOSTNAME}"\")"
      else
         mongo --host ${MASTER_HOSTNAME} --eval "rs.add(\""${HOSTNAME}"\")"
      fi
   fi
done

echo $ENSEMBLE_REMOTE_UNIT modified its settings
echo Relation settings:
relation-get
echo Relation members:
relation-list

You can delete the relation-name* files now that you have created the real ones needed for this formula.
In case typing all of this is not to your liking, the formula can be found here.

You can now deploy this formula as follows:
  • ensemble bootstrap # bootstraps the system
  • ensemble deploy --repository . mongodb # deploys mongodb
  • ensemble status # to see what's going on 
The above commands satisfy one of our design goals, standalone deployment.  Let's check out the replica sets.  Type this:
  • ensemble add-unit mongodb
And that's all that is needed to add a new mongodb node that will automatically create a replica set with the existing node.  You can continue to "add-unit" to add more nodes to the replica set.  Notice that all of the configuration is taken care of by the replica-set-relation-joined and replica-set-relation-changed hook scripts that we wrote above.
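To convince yourself that the replica set really formed, one rough check ( assuming you can reach the address of a unit reported by ensemble status; you may need to expose the service or run this from within the environment ) is to ask the mongo shell directly:

mongo --host UNIT_ADDRESS --eval "printjson(rs.status())"    # replace UNIT_ADDRESS with a unit's address
mongo --host UNIT_ADDRESS --eval "printjson(rs.isMaster())"  # shows which node is currently primary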

The beauty of this formula is that the user doesn't really have to know exactly what is needed to get a replica set cluster up and running.  Ensemble formulas are self-contained and idempotent.  This means portability.







Hadoop cluster with Ubuntu server and Ensemble

// August 8th, 2011 // 7 Comments » // Uncategorized

A while back I started experimenting with Ensemble and was intrigued by the notion of services instead of machines.

A bit of background on Ensemble from their website:

  • Ensemble is a next generation service orchestration framework. It has been likened to APT for the cloud. With Ensemble, different authors are able to create service formulas independently, and make those services coordinate their communication through a simple protocol. Users can then take the product of different authors and very comfortably deploy those services in an environment. The result is multiple machines and components transparently collaborating towards providing the requested service.

I come from a DevOps background and know first-hand the trials and tribulations of deploying production services, webapps, etc.  One that's particularly "thorny" is hadoop.

To deploy a hadoop cluster, we would need to download the dependencies ( java, etc. ), download hadoop, configure it and deploy it.  This process differs somewhat depending on the type of node that you're deploying ( i.e. namenode, job-tracker, etc. ).  It is a multi-step process that requires too much human intervention, and it is difficult to automate and reproduce.  Imagine a 10, 20 or 50 node cluster deployed this way.  It can get frustrating quickly, and it is prone to mistakes.

With this experience in mind ( and a lot of reading ), I set out to deploy a hadoop cluster using an Ensemble formula.

First things first, let's install Ensemble.  Follow the Getting Started documentation on the Ensemble site here.

According to the Ensemble documentation, we just need to follow some file naming conventions for what they call "hooks" ( executable scripts in your language of choice that perform certain actions ).  These "hooks" control the installation, relationships, start, stop, etc. of your formula.  We also need to summarize the description of the formula in a file called metadata.yaml.  The metadata.yaml file describes the formula, its interfaces, and what it requires and provides, among other things.  More on this file later when I show you the ones for hadoop-master and hadoop-slave.

Armed with a bit of knowledge and a desire for simplicity, I decided to split the hadoop cluster in two:

  • hadoop-master (namenode and jobtracker )
  • hadoop-slave ( datanode and tasktracker )
I know this is not an all-encompassing list, but it will take care of a good portion of deployments, and the Ensemble formulas are easy enough to modify that you can work your changes into them.

One of my colleagues, Brian Thomason, did a lot of packaging for these formulas, so my job is now easier.  The configuration for the packages has been distilled down to three questions:

  1. namenode ( leave blank if you are the namenode )
  2. jobtracker ( leave blank if you are the jobtracker )
  3. hdfs data directory ( leave blank to use the default: /var/lib/hadoop-0.20/dfs/data )
Due to the magic of Ubuntu packaging, we can even "preseed" the answers to those questions to avoid being asked about them ( which would stop the otherwise automatic process ).  We'll use the utility debconf-set-selections for this.  Here is a piece of the code that I use to preseed the values in my formula:
  • echo debconf hadoop/namenode string ${NAMENODE}| /usr/bin/debconf-set-selections
  • echo debconf hadoop/jobtracker string ${JOBTRACKER}| /usr/bin/debconf-set-selections
  • echo debconf hadoop/hdfsdatadir string ${HDFSDATADIR}| /usr/bin/debconf-set-selections
The variable names should be self-explanatory.
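
If you want to double check that the preseeding took, a quick sketch ( debconf-get-selections lives in the debconf-utils package; the hadoop/* question names are the ones used above ):

sudo apt-get install -y debconf-utils
sudo debconf-get-selections | grep hadoop/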

Thanks to Brian's work, I now just have to install the packages ( hadoop-0.20-namenode and hadoop-0.20-jobtracker ).  Let's put all of this together into an Ensemble formula.

  • Create a directory for the hadoop-master formula ( mkdir hadoop-master )
  • Make a directory for the hooks of this formula ( mkdir hadoop-master/hooks )
  • Let's start with the always needed metadata.yaml file ( hadoop-master/metadata.yaml ):
ensemble: formula
name: hadoop-master
revision: 1
summary: Master Node for Hadoop
description: |
  The Hadoop Distributed Filesystem (HDFS) requires one unique server, the
  namenode, which manages the block locations of files on the
  filesystem.  The jobtracker is a central service which is responsible
  for managing the tasktracker services running on all nodes in a
  Hadoop Cluster.  The jobtracker allocates work to the tasktracker
  nearest to the data with an available work slot.
provides:
  hadoop-master:
    interface: hadoop-master

  • Every Ensemble formula has an install script ( in our case: hadoop-master/hooks/install ).  This is an executable file in your language of choice that Ensemble will run when it's time to install your formula.  Anything and everything that needs to happen for your formula to install needs to be inside of that file.  Let's take a look at the install script of hadoop-master:
#!/bin/bash
# Here do anything needed to install the service
# i.e. apt-get install -y foo  or  bzr branch http://myserver/mycode /srv/webroot


##################################################################################
# Set debugging
##################################################################################
set -ux
ensemble-log "install script"


##################################################################################
# Add the repositories
##################################################################################
export TERM=linux
# Add the Hadoop PPA
ensemble-log "Adding ppa"
apt-add-repository ppa:canonical-sig/thirdparty
ensemble-log "updating cache"
apt-get update


##################################################################################
# Determine our address ( we use the fully qualified hostname )
##################################################################################
ensemble-log "calculating ip"
IP_ADDRESS=`hostname -f`
ensemble-log "Private IP: ${IP_ADDRESS}"


##################################################################################
# Preseed our Namenode, Jobtracker and HDFS Data directory
##################################################################################
NAMENODE="${IP_ADDRESS}"
JOBTRACKER="${IP_ADDRESS}"
HDFSDATADIR="/var/lib/hadoop-0.20/dfs/data"
ensemble-log "Namenode: ${NAMENODE}"
ensemble-log "Jobtracker: ${JOBTRACKER}"
ensemble-log "HDFS Dir: ${HDFSDATADIR}"

echo debconf hadoop/namenode string ${NAMENODE}| /usr/bin/debconf-set-selections
echo debconf hadoop/jobtracker string ${JOBTRACKER}| /usr/bin/debconf-set-selections
echo debconf hadoop/hdfsdatadir string ${HDFSDATADIR}| /usr/bin/debconf-set-selections


##################################################################################
# Install the packages
##################################################################################
ensemble-log "installing packages"
apt-get install -y hadoop-0.20-namenode
apt-get install -y hadoop-0.20-jobtracker


##################################################################################
# Open the necessary ports
##################################################################################
if [ -x /usr/bin/open-port ];then
   open-port 50010/TCP
   open-port 50020/TCP
   open-port 50030/TCP
   open-port 50105/TCP
   open-port 54310/TCP
   open-port 54311/TCP
   open-port 50060/TCP
   open-port 50070/TCP
   open-port 50075/TCP
   open-port 50090/TCP
fi


  • There are a few other files that we need to create ( start and stop ) to complete the hadoop-master formula.  Let's see those files:
    • start
#!/bin/bash
# Here put anything that is needed to start the service.
# Note that currently this is run directly after install
# i.e. 'service apache2 start'

set -x
service hadoop-0.20-namenode status && service hadoop-0.20-namenode restart || service hadoop-0.20-namenode start
service hadoop-0.20-jobtracker status && service hadoop-0.20-jobtracker restart || service hadoop-0.20-jobtracker start

    • stop
#!/bin/bash
# This will be run when the service is being torn down, allowing you to disable
# it in various ways..
# For example, if your web app uses a text file to signal to the load balancer
# that it is live... you could remove it and sleep for a bit to allow the load
# balancer to stop sending traffic.
# rm /srv/webroot/server-live.txt && sleep 30

set -x
ensemble-log "stop script"
service hadoop-0.20-namenode stop
service hadoop-0.20-jobtracker stop

Let's go back to the metadata.yaml file and examine it in more detail:

ensemble: formula
name: hadoop-master
revision: 1
summary: Master Node for Hadoop
description: |
  The Hadoop Distributed Filesystem (HDFS) requires one unique server, the
  namenode, which manages the block locations of files on the
  filesystem.  The jobtracker is a central service which is responsible
  for managing the tasktracker services running on all nodes in a
  Hadoop Cluster.  The jobtracker allocates work to the tasktracker
  nearest to the data with an available work slot.
provides:
  hadoop-master:
    interface: hadoop-master

The provides section tells Ensemble that this formula provides an interface named hadoop-master that can be used in relationships with other formulas ( in our case we'll be using it to connect the hadoop-master with the hadoop-slave formula that we'll be writing a bit later ).  For this relationship to work, we need to let Ensemble know what to do ( more detailed information about relationships in formulas can be found here ).

Per the Ensemble documentation, we need to name our relationship hook hadoop-master-relation-joined, and it should also be an executable script in your language of choice.  Let's see what that file looks like:

#!/bin/sh
# This must be renamed to the name of the relation. The goal here is to
# affect any change needed by relationships being formed
# This script should be idempotent.

set -x

ensemble-log "joined script started"

# Determine our address ( we use the fully qualified hostname )
IP_ADDRESS=`hostname -f`

# Preseed our Namenode, Jobtracker and HDFS Data directory
NAMENODE="${IP_ADDRESS}"
JOBTRACKER="${IP_ADDRESS}"
HDFSDATADIR="/var/lib/hadoop-0.20/dfs/data"

relation-set namenode="${NAMENODE}" jobtracker="${JOBTRACKER}" hdfsdatadir="${HDFSDATADIR}"

echo $ENSEMBLE_REMOTE_UNIT joined

Your formula directory should now look something like this:
hadoop-master
hadoop-master/metadata.yaml
hadoop-master/hooks/install
hadoop-master/hooks/start
hadoop-master/hooks/stop
hadoop-master/hooks/hadoop-master-relation-joined
This formula should now be complete...  It's not too exciting yet, as it doesn't have its hadoop-slave counterpart, but it is a complete formula.

The latest version of the hadoop-master formula can be found here if you want to get it.

The hadoop-slave formula is almost the same as the hadoop-master formula, with some exceptions.  Those I'll leave as an exercise for the reader.

The hadoop-slave formula can be found here if you want to get it.
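
For a rough idea of what changes on the slave side: its metadata.yaml declares a requires: section for the hadoop-master interface instead of provides:, it installs the datanode and tasktracker packages, and its relation hook reads the master's settings instead of publishing them.  Below is a minimal sketch of what a hadoop-master-relation-changed hook on the slave might look like; it is not the published formula, and the package names ( hadoop-0.20-datanode and hadoop-0.20-tasktracker ) are assumptions based on the master's packaging:

#!/bin/sh
# Hypothetical sketch of hooks/hadoop-master-relation-changed for hadoop-slave.
# Assumes the slave's install hook already added the same PPA as the master.
set -x

# Read the settings published by the master in its -joined hook
NAMENODE=`relation-get namenode`
JOBTRACKER=`relation-get jobtracker`
HDFSDATADIR=`relation-get hdfsdatadir`

# The settings may not be there yet on the first run; wait for the next event
[ -z "${NAMENODE}" ] && exit 0

# Preseed the same debconf questions, this time pointing at the master
echo debconf hadoop/namenode string ${NAMENODE} | /usr/bin/debconf-set-selections
echo debconf hadoop/jobtracker string ${JOBTRACKER} | /usr/bin/debconf-set-selections
echo debconf hadoop/hdfsdatadir string ${HDFSDATADIR} | /usr/bin/debconf-set-selections

# Assumed slave packages -- adjust to whatever the PPA actually ships
apt-get install -y hadoop-0.20-datanode hadoop-0.20-tasktracker

service hadoop-0.20-datanode restart || service hadoop-0.20-datanode start
service hadoop-0.20-tasktracker restart || service hadoop-0.20-tasktracker start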

Once you have both formulas ( hadoop-master and hadoop-slave ), you can easily deploy your cluster by typing:

  • ensemble bootstrap   # ( creates/bootstraps the ensemble environment)
  • ensemble deploy --repository . hadoop-master # ( deploys hadoop-master )
  • ensemble deploy --repository . hadoop-slave # ( deploys hadoop-slave )
  • ensemble add-relation hadoop-slave hadoop-master # ( connects the hadoop-slave to the hadoop-master )
As you can see, once you have the formulas written and tested, deploying the cluster is really a matter of a few commands.  The above example gives you one hadoop-master ( namenode, jobtracker ) and one hadoop-slave ( datanode, tasktracker ).
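A rough way to confirm that the datanode actually registered with the namenode, assuming you can get a shell on the hadoop-master unit, is to ask HDFS for a report ( run it as the user that owns the namenode daemon ):

hadoop dfsadmin -report   # lists registered datanodes and their capacity
hadoop fs -ls /           # quick check that HDFS is answering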

To add another node to this existing hadoop cluster, we run:

  • ensemble add-unit hadoop-slave # ( this adds one more slave )
Run the above command multiple times to continue to add hadoop-slave nodes to your cluster.

Ensemble allows you to catalog the steps needed to get your service or application installed, configured and running properly.  Once that knowledge has been captured in an Ensemble formula, it can be re-used by you or others without much knowledge of what's needed to get the application or service running.

In the DevOps world, this re-usability can save time, effort and money by providing self-contained formulas that deliver a service or application.


Ensemble meets Hadoop on the cloud

// August 8th, 2011 // 8 Comments » // Uncategorized

Hadoop

So you wanted to play with hadoop to crunch on some big-data problems, except that, well, getting a hadoop cluster up and running is not exactly a one minute thing!  Let me show you how to make it “a one minute thing” using Ensemble!  Ensemble now has formulas for creating hadoop master and slave nodes, thanks to the great work of Juan Negron, so spinning up a hadoop cluster could not be easier!  Check this video out:


If you can’t see the embedded video, here’s a direct link http://youtu.be/e8IKkWJj7bA

Yep, that’s how simple it is!  If you want to scale out the cluster, you only need to ask Ensemble to do it for you:
$ ensemble add-unit hadoop-slave

If you’re interested in learning more about exactly how this “magic” works, check out this in-depth guide dissecting how the hadoop Ensemble formulas work, written by none other than Juan Negron, the formula author.

So is this easier than configuring a hadoop cluster manually?  Leave me a comment and let me know your thoughts!  Also let me know what you’d like to see deployed next with Ensemble.  Be sure to drop in to #ubuntu-ensemble on freenode IRC and say hi!