Provisioning tools are the foundation of the DevOps movement. But you don't need to compare all the provisioning tools in some kind of March Madness bracket system. If you have only a handful of machines with the same OS, you can probably get by with bash. Let's see what it can do. But first, a little history on some provisioning tools.

Masterless with Salt and Puppet

If you know anything about automated provisioning tools, you probably know that masterless is the way to go. Yet none of the common tools (except for Ansible) were designed for masterless operation from the start. Salt-SSH has always been treated as a second-class citizen, and puppet-masterless is just an undocumented pattern you may or may not adopt; there's nothing official.

A lot of the time you're on a small team, and most of the other members feel like choosing one of these tools is such a critical decision that nobody wants to take the first step and use *any* of them for fear of making a bad choice. Fear not, bash is never a wrong choice ;)

Truth be told, all of these provisioning tools put forward some good ideas about structure. The downside is that the learning curve for each system is very steep, and getting genuinely good at your chosen system is steeper still. I have committed one fix to Salt-SSH, so I feel like I know what I'm talking about.

Salt has the concept of states and pillars. States are recipes that keep your server in a certain state of configuration. If you want apache running, it makes sure apache is in the running state, whether it has to install it and start it, restart it, or do nothing. The pillars are the folders and files that tie specific values to each server.

The best example of this is probably user accounts. You can have a state that says "make user accounts active, ensure they have SSH keys, and make their default VIM configuration like so". The *pillars* say which user accounts, which passwords and which SSH keys go with which machines.

Puppet has a similar separation between actions and values. Puppet's value system is called "Hiera". It is relatively new, and because it wasn't core to the system for a long time, you'll find a lot of documentation referencing both the old and new ways of separating that information.

See? These systems are so complicated that it takes me half a blog post just to set up the reason to use Bash. Anyway, on to bash land.

The magic of SSH

Did you know you can send a command to ssh to run instead of just opening a shell? Yeah. Well, when you think about it, you can actually pipe an entire program to an interpreter on the other end. Let me show you.


echo "export" | ssh localhost /bin/bash

You can use any language you want...


echo "php -m" | ssh localhost /bin/bash
#this should show you a list of modules if you have PHP CLI installed

You're only limited by what you can pipe to SSH.
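
For example, assuming Python 3 is installed on the remote host, you can pipe a script straight to its interpreter the same way:


echo 'import sys; print(sys.version)' | ssh localhost python3
#prints the version of the remote host's Python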

All these demos work best when you have your SSH keys properly set up on the remote hosts and password-less sudo configured for the user account you're using.
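
If you haven't done that yet, here's a minimal sketch (the myuser account and your-server host are placeholders; swap in your own):


#copy your public key to the remote host (run ssh-keygen first if you don't have one)
ssh-copy-id myuser@your-server

#then on the remote host, run `sudo visudo` and add a line like this
#to grant password-less sudo to that account
myuser ALL=(ALL) NOPASSWD: ALL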

Starting some real code

Let's make sure that our NGINX workers are always 1 per CPU core.


PROG='
CPUNUM=$(sudo cat /proc/cpuinfo  | grep "processor" | wc -l)
sudo sed -i -e "s/worker_processes.*;/worker_processes $CPUNUM;/" /etc/nginx/nginx.conf
sudo nginx -s reload
'
echo "$PROG" | ssh localhost /bin/bash

What we're doing is dumping all the CPU info into grep, keeping only the lines that contain the word "processor", and then counting those lines. It's a roundabout way to get the number of CPUs. The $(...) command substitution captures the output of that pipeline, so the CPUNUM variable ends up holding the core count. Next we run a simple sed expression against /etc/nginx/nginx.conf and send nginx a reload signal.
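
To confirm the substitution actually landed, you can read the directive back with the same trick (a quick check against the same localhost target as above):


echo "grep worker_processes /etc/nginx/nginx.conf" | ssh localhost /bin/bash
#should print something like: worker_processes 4;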

Let's add some structure

This is all very well and clever up to this point, but how does this become useful for managing a "handful of servers" like I said? Well, let's start by factoring out the PROG and executing this against any number of servers.


.
├── brovision.sh
├── tasks
│   └── nginx
│       └── run.sh
└── node_vars
    ├── localhost.sh
    └── raspbian.sh

Let's assume we have 2 hosts: our localhost and a Raspberry Pi machine running Raspbian. Our nginx task gets renamed to "run.sh" under a tasks/nginx set of folders. Our main bash script that sends programs to hosts will be called brovision.sh (a mishmash of bash and provision, not a typo).


NODE=$1
if [ -z "$NODE" ]
then
    echo "didn't pass node on command line."
    exit 2;
fi

if [ -f "node_vars/$NODE.sh" ]
then
    echo "sourcing $NODE settings from node_vars/$NODE.sh"
    source "node_vars/$NODE.sh"
fi

if [ ${#TASKS[@]} -eq 0 ]
then
    echo "didn't set TASKS array in node_vars file: declare -a TASKS=(nginx php python)"
    exit 2;
fi

for task in "${TASKS[@]}"
do
CMD="$CMD
$(<tasks/$task/run.sh)
"
done

echo "ssh to $NODE on port $PORT as $USER"

echo "$CMD" | ssh -p "$PORT" "$USER@$HOST" sudo /bin/bash
#let's sudo right here so we don't have to do it 100 times in each task file

Now, in our node_vars files we just need to set USER, HOST, PORT, and the TASKS array:


#node_vars/localhost.sh
USER=myuser
HOST=localhost
PORT=22
declare -a TASKS=(nginx)

#node_vars/raspbian.sh
USER=pi
HOST=192.168.1.25
PORT=22
declare -a TASKS=(nginx virtual-env)

The tasks/nginx/run.sh can be beefed up a little (remember we took out individual sudos for one GIANT sudo at the ssh level):


export DEBIAN_FRONTEND="noninteractive"
apt-get update
apt-get install -y nginx
CPUNUM=$(cat /proc/cpuinfo  | grep "processor" | wc -l)
sed -i -e "s/worker_processes.*;/worker_processes $CPUNUM;/" /etc/nginx/nginx.conf
nginx -s reload

We can now run an arbitrary set of commands on any of our hosts with:


./brovision.sh [host-id]

Managing Multiple Clients

If you're like me, you probably want to keep passwords and settings separate for different clients, but share the same recipes and setup routines across all of them. This can be accomplished easily by adding two more levels of directories above node_vars.


.
├── brovision.sh
├── tasks
│   ├── mysql-backup-s3
│   │   └── run.sh
│   └── nginx
│       └── run.sh
└── clients
    ├── CLIENT_A #separate git repo
    │   └── node_vars
    │       ├── vincent.sh
    │       └── gene.sh
    └── CLIENT_B #separate git repo
        └── node_vars
            ├── global.accounts.server.sh
            ├── db-prod.sh
            └── db-dev.sh
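
brovision.sh needs a small tweak to know which client's node_vars folder to read. Here's a minimal sketch, assuming we pass the client directory name as a second argument (the CLIENT and VARS_DIR names are my own additions for illustration):


NODE=$1
CLIENT=$2

#fall back to the plain node_vars folder when no client is given
VARS_DIR="node_vars"
if [ ! -z "$CLIENT" ]
then
    VARS_DIR="clients/$CLIENT/node_vars"
fi

if [ -f "$VARS_DIR/$NODE.sh" ]
then
    echo "sourcing $NODE settings from $VARS_DIR/$NODE.sh"
    source "$VARS_DIR/$NODE.sh"
fi

Invocation then looks like ./brovision.sh vincent CLIENT_A, and the rest of the script stays the same.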

Where it falls short

If you've copied and pasted along this far, you understand the basics of what provisioning tools do. Instead of sending simple bash scripts to your servers, the larger provisioning tools compile sophisticated Python or Ruby programs. This allows them to:

  • Work with mixed OSes (abstracting away the differences between yum, apt, and pacman)
  • Report failures
  • Skip already completed steps (you can apply new vhosts without always installing nginx; see the sketch after this list)
  • Automatically communicate with a master
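
You can fake the third point with hand-written guards in each task script, but the big tools give you that bookkeeping for free. A rough sketch for the nginx task, assuming a Debian-family host:


#only install nginx if the package isn't already present
if ! dpkg -s nginx > /dev/null 2>&1
then
    apt-get update
    apt-get install -y nginx
fi

#only reload nginx if the config actually changed
OLD_SUM=$(md5sum /etc/nginx/nginx.conf)
CPUNUM=$(cat /proc/cpuinfo | grep "processor" | wc -l)
sed -i -e "s/worker_processes.*;/worker_processes $CPUNUM;/" /etc/nginx/nginx.conf
NEW_SUM=$(md5sum /etc/nginx/nginx.conf)
if [ "$OLD_SUM" != "$NEW_SUM" ]
then
    nginx -s reload
fi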

I know I said masterless is the way to go, but this technique cannot scale up to massive data-center needs. If analysis paralysis is keeping you from picking a major provisioning tool, you can try introducing a project like this at work.

More!

If you want more of this simple provisioning without the steep learning curve, check out brovision on GitHub: https://github.com/markkimsal/brovision