Cassandra Tutorial 4: Using Packer and Ansible to create and manage EC2 Cassandra instances in AWS

Cloud DevOps: Using Packer, Ansible/SSH and AWS command line tools to create and manage EC2 Cassandra instances in AWS

This article is useful for developers and DevOps staff who want to create AWS AMI images and manage those EC2 instances with Ansible. Although this article is part of a series about setting up Cassandra images and doing DevOps with Cassandra clusters, the topics we cover apply to AWS DevOps in general - even if you don't use Cassandra at all.

The cassandra-image project has been using Vagrant and Ansible to set up a Cassandra Cluster for local testing. Now we are going to use Packer, Ansible and EC2. We will install systemd daemons to send OS logs and metrics to AWS CloudWatch.

Overview

This article covers the following:

Creating images (EC2 AMIs) with Packer
Using Packer from Ansible to provision an image (AWS AMI)
Installing systemd services that depend on other services and will auto-restart on failure
AWS command line tools to launch an EC2 instance
Setting up ansible to manage our EC2 instance (ansible uses ssh)
Setting up an ssh-agent and adding ssh identities (ssh-add)
Setting up ssh using ~/.ssh/config so we don't have to pass credentials around
Using ansible dynamic inventory with EC2
AWS command line tools to manage DNS entries with Route 53

If you are doing DevOps with AWS, Ansible dynamic inventory management with EC2 is awesome. Also mastering ssh config is a must. You should also master the AWS command line tools to automate common tasks.

This article covers how to use Packer to create images and how to run commands for provisioning with Ansible. The Packer script is used to create an image with Cassandra installed as a systemd service. With Packer, we also install metricsd, a tool to send OS metrics to AWS CloudWatch Metrics, and systemd-cloud-watch, a tool to send systemd journald logs to AWS CloudWatch Logs. Later the article shows how to set up ansible to talk to our EC2 instances. Then the article covers how to use ansible dynamic inventory management with EC2. A related topic, which is crucial for DevOps, is how to set up ssh config to simplify logging into remote machines: creating simple aliases, auto-passing id files, and auto-passing user id information.

Lastly, we use AWS Route 53 to set up a DNS name for our new instance, which we then use from ansible, so we never have to reconfigure our ansible config when we create a new AMI or EC2 instance. If you are using EC2 and you are not using ssh config files and ansible and you are doing DevOps, this article is a must.

This is part 4 of this series of articles on creating a Cassandra image and DevOps. You don't truly need those articles to follow this one, but they might provide a lot of context. If you are new to ansible, the last article on ansible would be good to at least skim. This one picks up and covers ansible more deeply with regard to AWS/EC2.

You can find the source for the first, second, third and this article at our Cloudurable Cassandra Image for Packer, EC2, Docker, AWS and Vagrant. In later articles, we will set up a working cluster in AWS using VPCs, Subnets, Availability Zones and Placement groups. We will continue to use ansible to manage our Cassandra cluster.

The source code for this article is in this branch on github.

Retrospective - Past Articles in this DevOps series

The first article in this series was about setting up a Cassandra cluster with Vagrant (it also appeared on DZone with some additional content as DZone Setting up a Cassandra Cluster with Vagrant). The second article in this series was about setting up SSL for a Cassandra cluster using Vagrant (which also appeared with more content as DZone Setting up a Cassandra Cluster with SSL). The third article in this series was about configuring and using Ansible (building on the first two articles). This article (the 4th) will cover applying the tools and techniques from the first three articles to produce an image (EC2 AMI to be precise) that we can deploy to AWS/EC2. To do this, we will use Packer, Ansible, and the AWS command line tools. The AWS command line tools are essential for doing DevOps with AWS.

Where do you go if you have a problem or get stuck?

We set up a google group for this project and set of articles. If you just can't get something to work or you are getting an error message, please report it here. Between the mailing list and the github issues, we can support you with quite a few questions and issues.

Packer EC2 support and Ansible

Packer is used to create machine and container images for multiple platforms from a single source configuration. We use Packer to create AWS EC2 AMIs (images) and Docker images. (We use Vagrant to set up dev images on VirtualBox.) Packer, like Vagrant, is from HashiCorp. Packer can use Ansible playbooks.

Cloudurable developers are big fans of HashiCorp. We love Consul, Vagrant, Packer, Atlas, and the rest.

Packer script to create Cassandra EC2 image

This code listing is our Packer script to create an EC2 AMI with Cassandra installed.

packer-ec2.json - Packer creation script for EC2 Cassandra instance

{
  "variables": {
    "aws_access_key": "",
    "aws_secret_key": "",
    "aws_region": "us-west-2",
    "aws_ami_image": "ami-d2c924b2",
    "aws_instance_type": "m4.large",
    "image_version" : "0.2.2"
  },
  "builders": [
    {
      "type": "amazon-ebs",
      "access_key": "{{user `aws_access_key`}}",
      "secret_key": "{{user `aws_secret_key`}}",
      "region": "{{user `aws_region`}}",
      "source_ami": "{{user `aws_ami_image`}}",
      "instance_type": "{{user `aws_instance_type`}}",
      "ssh_username": "centos",
      "ami_name": "cloudurable-cassandra-{{user `image_version`}}",
      "tags": {
        "Name": "cloudurable-cassandra-{{user `image_version`}}",
        "OS_Version": "LinuxCentOs7",
        "Release": "7",
        "Description": "CentOS 7 image for Cloudurable Cassandra image"
      },
      "user_data_file": "config/user-data.sh"
    }
  ],
  "provisioners": [
    {
      "type": "file",
      "source": "scripts",
      "destination": "/home/centos/"
    },
    {
      "type": "file",
      "source": "resources",
      "destination": "/home/centos/"
    },
    {
      "type": "shell",
      "scripts": [
        "scripts/000-ec2-provision.sh"
      ]
    },
    {
      "type": "ansible",
      "playbook_file": "playbooks/ssh-addkey.yml"
    }
  ]
}

Notice that we are using the Packer amazon-ebs builder to build an AMI based on the same setup as our local dev boxes.

Also, notice that we use a series of Packer provisioners. The Packer file provisioner can copy files or directories to a machine image. The Packer shell provisioner can run shell scripts. Lastly, the Packer ansible provisioner can run ansible playbooks. We covered what playbooks/ssh-addkey.yml does in the previous article, but in short it sets up the keys so we can use ansible with our Cassandra cluster nodes.
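
Before launching a build, it can help to check the template and to override a variable from the command line instead of editing the JSON. The sketch below wraps packer validate and packer build -var, both standard Packer commands. The function name build_cassandra_ami and its defaults are illustrative assumptions and are not part of the cassandra-image project.

```shell
#!/bin/bash
# Hypothetical wrapper (not part of the cassandra-image project):
# validate the template first, then build with an overridden image_version.
build_cassandra_ami() {
    local version=${1:-0.2.2}             # default mirrors the template's value
    local template=${2:-packer-ec2.json}

    # Fail fast on template errors before any EC2 instance is started.
    packer validate "$template" || return 1

    # -var overrides a value declared in the template's "variables" block.
    packer build -var "image_version=${version}" "$template"
}

# Usage (needs packer and AWS credentials):
#   build_cassandra_ami 0.2.3
```

Overriding image_version this way also changes the ami_name and the Name tag, since both are derived from that variable in the template.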

Bash provisioning

Before we started using ansible to do provisioning, we used bash scripts that get reused for packer/docker, packer/aws, and vagrant/virtual-box. The script 000-ec2-provision.sh invokes these provisioning scripts, which the first three articles covered to varying degrees (skim those articles or the source code if you are curious, but you don't need them to follow along). This way we can use the same provisioning scripts with Docker, VirtualBox, and AWS EC2.

scripts/000-ec2-provision.sh

#!/bin/bash
set -e

sudo cp -r /home/centos/resources/ /root/
sudo mv /home/centos/scripts/ /root/

echo RUNNING PROVISION
sudo /root/scripts/000-provision.sh
echo Building host file
sudo /root/scripts/002-hosts.sh
echo RUNNING TUNE OS
sudo /root/scripts/010-tune-os.sh
echo RUNNING INSTALL CASSANDRA
sudo /root/scripts/020-cassandra.sh
echo RUNNING INSTALL CASSANDRA CLOUD
sudo /root/scripts/030-cassandra-cloud.sh
echo RUNNING INSTALL CERTS
sudo /root/scripts/040-install-certs.sh
echo RUNNING SYSTEMD SETUP
sudo /root/scripts/050-systemd-setup.sh

sudo chown -R cassandra /opt/cassandra/
sudo chown -R cassandra /etc/cassandra/

We covered what each of those provisioning scripts does in the first three articles, but for those just joining us, they install packages, programs and configure stuff.

metricsd to send OS metrics to AWS

We are using metricsd to read OS metrics and send data to AWS CloudWatch Metrics. Metricsd gathers OS KPIs for AWS CloudWatch Metrics. We install this as a systemd process which depends on cassandra. We also install Cassandra as a systemd process.

We use systemd units quite a bit. We use systemd to start up Cloudurable Cassandra config scripts. We use systemd to start up Cassandra/Kafka, and to shut Cassandra/Kafka down nicely (this article does not cover Kafka at all). Since systemd is pervasive in all new mainstream Linux distributions, you can see that systemd is an important concept for DevOps.

Metricsd gets installed as a systemd service by our provisioning scripts.

Installing metricsd systemd from our provisioning scripts

cp ~/resources/etc/systemd/system/metricsd.service /etc/systemd/system/metricsd.service
cp ~/resources/etc/metricsd.conf /etc/metricsd.conf
systemctl enable metricsd
systemctl start  metricsd

We use systemctl enable to install metricsd to start up on system start. We then use systemctl start to start metricsd.

We could write a whole article on metricsd and AWS CloudWatch metrics, and perhaps we will. For more information about metricsd, please see the metricsd GitHub project.

The metricsd system unit depends on the Cassandra service. The unit file is as follows.

/etc/systemd/system/metricsd.service

[Unit]
Description=MetricsD OS Metrics
Requires=cassandra.service
After=cassandra.service

[Service]
ExecStart=/opt/cloudurable/bin/metricsd

WorkingDirectory=/opt/cloudurable
Restart=always
RestartSec=60
TimeoutStopSec=60
TimeoutStartSec=60


[Install]
WantedBy=multi-user.target

systemd-cloud-watch to send OS logs to AWS log aggregation

We are using systemd-cloud-watch to read OS logs from systemd/journald and send data to AWS CloudWatch Logs. The systemd-cloud-watch daemon reads journald logs and aggregates them to AWS CloudWatch Logs. Just like metricsd, we install systemd-cloud-watch as a systemd process which depends on cassandra. Remember that we also install Cassandra as a systemd process, which we will cover in a moment.

The systemd-cloud-watch daemon gets installed as a systemd service by our provisioning scripts.

Installing systemd-cloud-watch systemd service from our provisioning scripts

cp ~/resources/etc/systemd/system/systemd-cloud-watch.service /etc/systemd/system/systemd-cloud-watch.service
cp ~/resources/etc/systemd-cloud-watch.conf /etc/systemd-cloud-watch.conf
systemctl enable systemd-cloud-watch
systemctl start  systemd-cloud-watch

We use systemctl enable to install systemd-cloud-watch to start up when the system starts. We then use systemctl start to start systemd-cloud-watch.

The systemd-cloud-watch system unit depends on the Cassandra service. The unit file is as follows:

/etc/systemd/system/systemd-cloud-watch.service

[Unit]
Description=SystemD Cloud Watch Sends Journald logs to CloudWatch
Requires=cassandra.service
After=cassandra.service

[Service]
ExecStart=/opt/cloudurable/bin/systemd-cloud-watch /etc/systemd-cloud-watch.conf

WorkingDirectory=/opt/cloudurable
Restart=always
RestartSec=60
TimeoutStopSec=60
TimeoutStartSec=60


[Install]
WantedBy=multi-user.target

Note that to use metricsd and systemd-cloud-watch, we have to set up the right AWS IAM roles, and then associate that IAM instance role with our instances when we start them up.

The systemd-cloud-watch.conf is set up to use the AWS log group cassandra as follows:

systemd-cloud-watch.conf

log_priority=7
debug=true
log_group="cassandra"
batchSize=5

For this to work, we will have to create a log group called cassandra.

Creating an AWS CloudWatch log group

$ aws logs create-log-group  --log-group-name cassandra

To learn more about systemd-cloud-watch, please see the systemd-cloud-watch GitHub project.

Running Cassandra as a systemd service

If Cassandra stops for whatever reason, systemd can attempt to restart it. The systemd unit file can ensure that our Cassandra service stays running. The systemd-cloud-watch utility will be sure to log all restarts to AWS CloudWatch.

Here is the systemd unit file for Cassandra.

/etc/systemd/system/cassandra.service

[Unit]
Description=Cassandra Service

[Service]
Type=forking
PIDFile=/opt/cassandra/PID

ExecStartPre=-/sbin/swapoff -a
ExecStartPre=-/bin/chown -R cassandra /opt/cassandra
ExecStart=/opt/cassandra/bin/cassandra  -p /opt/cassandra/PID

WorkingDirectory=/opt/cassandra
Restart=always
RestartSec=60
TimeoutStopSec=60
TimeoutStartSec=60
User=cassandra

[Install]
WantedBy=multi-user.target

The above tells systemd to restart Cassandra one minute after it goes down. Since we aggregate OS logs to AWS CloudWatch, every time Cassandra goes down or is restarted by systemd we get log messages that we can use to create alerts and triggers in CloudWatch, which can then run AWS Lambdas that work with the rest of the AWS ecosystem. Critical bugs in queries, UDFs, or UDAs could cause Cassandra to go down. These could be sporadic and hard to track down. Log aggregation helps.

Using Packer to build our ec2 AMI

To build the AWS AMI, we use packer build as follows.

Building the AWS AMI

$ packer build packer-ec2.json

After the packer build completes, it will print out the name of the AMI image it created, e.g., ami-6db33abc.
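
Since that AMI id has to end up in AMI_CASSANDRA, one way to avoid copying it by hand is to pull it out of the build output. This is a minimal sketch, assuming the id of the new image is the last string of the form ami-<hex> that packer prints. The helper name extract_ami_id and the log file name are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the last AMI id (ami-<hex>) found on stdin.
# Assumption: the id of the newly created image is the last one in the output.
extract_ami_id() {
    grep -Eo 'ami-[0-9a-f]+' | tail -n 1
}

# Usage (needs packer and AWS credentials, so it is only shown as a comment):
#   ami_id=$(packer build packer-ec2.json | tee packer-build.log | extract_ami_id)
#   echo "export AMI_CASSANDRA=${ami_id}"
```

Taking the last match matters because the source AMI id can also appear earlier in the build output.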

Using AWS CLI to create our Cassandra EC2 instance

The AWS Command Line Interface is the Swiss army knife of utilities to manage your AWS services.

"With just one tool to download and configure, you can control multiple AWS services from the command line and automate them through scripts." --AWS CLI Docs

The AWS command line tool slices and dices from VPCs to running CloudFormations to backing up Cassandra snapshot files to S3. If you are working with AWS, you need the AWS CLI.

Automating EC2 image creation with AWS CLI

Starting up an EC2 instance with the right AMI id and IAM instance role, in the correct subnet, using the appropriate security groups, with the right AWS key-pair name can be tedious. We must automate this, as using the AWS console (GUI) is error prone (it requires too much human intervention).

Instead of using the AWS console, we use the aws command line. We create four scripts to automate creating and connecting to EC2 instances:

bin/ec2-env.sh - sets up common AWS references to subnets, security groups, key pairs
bin/create-ec2-instance.sh - uses aws command line to create an ec2 instance
bin/login-ec2-cassandra.sh - uses ssh to log into the Cassandra node we are testing
bin/get-IP-cassandra.sh - uses aws command line to get the public IP address of the cassandra instance

Note that to parse the JSON coming back from the aws command line, we use jq, a lightweight command-line JSON processor. To download and install jq, see the jq download documents.

bin/create-ec2-instance.sh Create an EC2 instance based on our new AMI from Packer

#!/bin/bash
set -e

source bin/ec2-env.sh

instance_id=$(aws ec2 run-instances --image-id "$AMI_CASSANDRA" --subnet-id  "$SUBNET_CLUSTER" \
 --instance-type m4.large --iam-instance-profile "Name=$IAM_PROFILE_CASSANDRA" \
 --associate-public-ip-address --security-group-ids "$VPC_SECURITY_GROUP" \
 --key-name "$KEY_NAME_CASSANDRA" | jq --raw-output .Instances[].InstanceId)

echo "${instance_id} is being created"

aws ec2 wait instance-exists --instance-ids "$instance_id"

aws ec2 create-tags --resources "${instance_id}" --tags Key=Name,Value="${EC2_INSTANCE_NAME}"

echo "${instance_id} was tagged waiting to login"

aws ec2 wait instance-status-ok --instance-ids "$instance_id"

bin/login-ec2-cassandra.sh

Notice we use the aws ec2 wait to ensure the instance is ready before we tag it and before we log into it.
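
The aws ec2 wait subcommands poll the EC2 API until the instance reaches the requested state or the command gives up. The same polling idea can be sketched in plain bash for conditions that aws ec2 wait does not cover, such as sshd accepting connections. The helper name wait_until and the values in the usage comment are assumptions for illustration only.

```shell
#!/bin/bash
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# usage: wait_until <max_attempts> <delay_seconds> <command> [args...]
wait_until() {
    local max_attempts=$1 delay=$2
    shift 2
    local attempt=1
    # Run the command; on failure either give up or sleep and try again.
    while ! "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 0
}

# Usage (needs a running instance; up to 30 tries, 10 seconds apart):
#   wait_until 30 10 ssh -o ConnectTimeout=5 -i "$PEM_FILE" \
#       "centos@$(bin/get-IP-cassandra.sh)" true
```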

All of the ids for the AWS resources we need to refer to are in bin/ec2-env.sh. Notice that all of our AWS/EC2 shell scripts load this env file with source bin/ec2-env.sh. The file is as follows:

bin/ec2-env.sh common AWS resources exposed as ENV Vars

#!/bin/bash
set -e

export AMI_CASSANDRA=ami-6db33abc
export VPC_SECURITY_GROUP=sg-a8653123

export SUBNET_CLUSTER=subnet-dc0f2123
export KEY_NAME_CASSANDRA=cloudurable-us-west-2
export PEM_FILE="${HOME}/.ssh/${KEY_NAME_CASSANDRA}.pem"
export IAM_PROFILE_CASSANDRA=IAM_PROFILE_CASSANDRA
export EC2_INSTANCE_NAME=cassandra-node

Note that we created an AWS key pair called cloudurable-us-west-2. You will need to create a VPC security group with ssh access. You should lock it down to only accept ssh connections from your IP. At this stage, you can use a default VPC, and for now use a public subnet. Replace the ids above with your subnet (SUBNET_CLUSTER), your key pair (KEY_NAME_CASSANDRA), your AMI (AMI_CASSANDRA), and your IAM instance role (IAM_PROFILE_CASSANDRA). The IAM instance role should have access to create logs and metrics for AWS CloudWatch.

The login script (login-ec2-cassandra.sh) uses ssh to log into the instance, but to know what IP to use, it uses get-IP-cassandra.sh.

bin/login-ec2-cassandra.sh Log into new EC2 instance using ssh

#!/bin/bash
set -e

source bin/ec2-env.sh

if [ ! -f "$PEM_FILE" ]; then
    echo "Put your key file $PEM_FILE in your .ssh directory."
    exit 1
fi
ssh -i  "$PEM_FILE"  centos@`bin/get-IP-cassandra.sh`

Ensure you create a key pair in AWS. Copy it to ~/.ssh and then run chmod 400 on the pem file. Note the above script uses bin/get-IP-cassandra.sh to get the IP address of the server as follows:

bin/get-IP-cassandra.sh Get public IP address of new EC2 instance using aws cmdline

#!/bin/bash
set -e

source bin/ec2-env.sh

aws ec2 describe-instances --filters  "Name=tag:Name,Values=${EC2_INSTANCE_NAME}" \
| jq --raw-output .Reservations[].Instances[].PublicIpAddress
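
A variant of the lookup above is sketched below. It uses the AWS CLI's built-in --query option (a JMESPath expression) with --output text instead of jq, and adds an instance-state-name filter so that recently terminated instances carrying the same Name tag, which no longer have a public IP, are skipped. The function name get_public_ip is hypothetical; the flags are standard AWS CLI options.

```shell
#!/bin/bash
# Hypothetical variant of bin/get-IP-cassandra.sh that does not need jq.
# usage: get_public_ip <value of the Name tag>
get_public_ip() {
    local name=$1
    # Only running instances are considered; --query selects the IP field
    # and --output text prints it without JSON quoting.
    aws ec2 describe-instances \
        --filters "Name=tag:Name,Values=${name}" \
                  "Name=instance-state-name,Values=running" \
        --query 'Reservations[].Instances[].PublicIpAddress' \
        --output text
}

# Usage (needs the AWS CLI and credentials):
#   get_public_ip "$EC2_INSTANCE_NAME"
```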

Running bin/create-ec2-instance.sh

Run bin/create-ec2-instance.sh from the project root as follows:

Running bin/create-ec2-instance.sh

$ bin/create-ec2-instance.sh

Let's show how to check to see if everything is up and running.

Interactive session showing everything running

$ pwd
~/github/cassandra-image
$ bin/create-ec2-instance.sh
i-013daca3d11137a8c is being created
i-013daca3d11137a8c was tagged waiting to login
The authenticity of host '54.202.110.114 (54.202.110.114)' can't be established.
ECDSA key fingerprint is SHA256:asdfasdfasdfasdfasdf.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '54.202.110.114' (ECDSA) to the list of known hosts.

[centos@ip-172-31-5-57 ~]$ systemctl status cassandra
● cassandra.service - Cassandra Service
   Loaded: loaded (/etc/systemd/system/cassandra.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2017-03-01 02:15:10 UTC; 14min ago
  Process: 456 ExecStart=/opt/cassandra/bin/cassandra -p /opt/cassandra/PID (code=exited, status=0/SUCCESS)
 Main PID: 5240 (java)
   CGroup: /system.slice/cassandra.service
           └─5240 java -Xloggc:/opt/cassandra/bin/../logs/gc.log -ea -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=1000003 -XX:+AlwaysPreTouch -XX:-UseBiasedLocking -XX:+U...

Mar 01 02:14:13 ip-172-31-22-103.us-west-2.compute.internal systemd[1]: Starting Cassandra Service...
Mar 01 02:15:10 ip-172-31-5-57 systemd[1]: Started Cassandra Service.

[centos@ip-172-31-5-57 ~]$ systemctl status metricds
Unit metricds.service could not be found.
[centos@ip-172-31-5-57 ~]$ systemctl status metricsd
● metricsd.service - MetricsD OS Metrics
   Loaded: loaded (/etc/systemd/system/metricsd.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2017-03-01 02:15:10 UTC; 14min ago
 Main PID: 5243 (metricsd)
   CGroup: /system.slice/metricsd.service
           └─5243 /opt/cloudurable/bin/metricsd

Mar 01 02:25:15 ip-172-31-5-57 metricsd[5243]: INFO     : [worker] - 2017/03/01 02:25:15 config.go:30: Loading config /etc/metricsd.conf
Mar 01 02:25:15 ip-172-31-5-57 metricsd[5243]: INFO     : [worker] - 2017/03/01 02:25:15 config.go:46: Loading log...


[centos@ip-172-31-5-57 ~]$ systemctl status systemd-cloud-watch
● systemd-cloud-watch.service - SystemD Cloud Watch Sends Journald logs to CloudWatch
   Loaded: loaded (/etc/systemd/system/systemd-cloud-watch.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2017-03-01 02:15:10 UTC; 15min ago
 Main PID: 5241 (systemd-cloud-w)
   CGroup: /system.slice/systemd-cloud-watch.service
           └─5241 /opt/cloudurable/bin/systemd-cloud-watch /etc/systemd-cloud-watch.conf

Mar 01 02:30:44 ip-172-31-5-57 systemd-cloud-watch[5241]: main INFO: 2017/03/01 02:30:44 workers.go:138: Read record &{i-013daca3d11137a8c 1488335194775 5241 0 0 systemd-cloud-w /opt/cloudurable/bin/systemd-cloud-watch /opt/cloudurable/bin...
...
Mar 01 02:30:44 ip-172-31-5-57 systemd-cloud-watch[5241]: main INFO: 2017/03/01 02:30:44 workers.go:138: Read record &{i-013daca3d11137a8c 1488335194776 5241 0 0 systemd-cloud-w /opt/cloudurable/bin/systemd-cloud-watch /opt...7f10a2c35de4098
Mar 01 02:30:44 ip-172-31-5-57 systemd-cloud-watch[5241]: repeater INFO: 2017/03/01 02:30:44 cloudwatch_journal_repeater.go:209: SENT SUCCESSFULLY
Mar 01 02:30:44 ip-172-31-5-57 systemd-cloud-watch[5241]: repeater

Notice that we use systemctl status systemd-cloud-watch, systemctl status cassandra, and systemctl status metricsd to ensure it is all working.

Ansible and EC2

Although we have base images, since Cassandra is stateful, we will want the ability to update the images in place.

The options for configuration and orchestration management are endless (Puppet, Chef, Boto, etc.). This article and Cloudurable use Ansible for many of these tasks. Ansible has an agentless architecture and works over ssh (secure shell), as we covered in our last article (Setting up Ansible for our Cassandra Cluster to do DevOps tasks). There are some very helpful Ansible/AWS integrations, which we will try to cover in future articles.

The Ansible framework allows DevOps staff to run commands against Amazon EC2 instances as soon as they are available. Ansible is very suitable for provisioning hosts in a Cassandra cluster as well as performing routine DevOps tasks like replacing a failed node, backing up a node, profiling Cassandra, performing a rolling upgrade and more.

Since Ansible relies on ssh, we should make sure that ssh is working for us.

Debugging possible ssh problems before we get started with ansible

Before you go about using ansible with AWS/EC2 to manage your Cassandra clusters, you have to make sure that you can, in fact, connect with ssh.

The first step in our journey is to get the IP of the EC2 instance that you just launched.

Another key tip for using ansible: if it can't connect, pass -vvvv so you can see why.

Let's get the IP of the new instance using get-IP-cassandra.sh, which we covered earlier.

Getting the IP

$ bin/get-IP-cassandra.sh
54.218.113.95

Now we can log in with the pem file associated with our AWS key-pair that we used to launch our Cassandra EC2 instance.

Let's see if we can log into the Cassandra EC2 instance with ssh.

Can I log in with the pem file?

ssh -i ~/.ssh/cloudurable-us-west-2.pem  centos@54.218.113.95

If you can do this, then your security group is set up properly. If you can't do this, make sure the VPC security group associated with the instance has port 22 open. (Limit logging into instances via SSH port 22 to only your IP address.)

In addition to the pem file that AWS creates, we have our private rsa key for the test cluster (which we covered in the last article). Recall that the rsa key is used with the ansible user (also described in the last article on ansible).

Let's see if we can log in with our RSA private key.

Can I log in with the key we generated for ansible?

ssh -i ~/.ssh/test_rsa  ansible@54.218.113.95

If you can log in with the pem but not the rsa key we created for the test cluster, then you may have a key mismatch. You could try to regenerate the keys with bin/setupkeys-cassandra-security.sh and then either copy them with scp or upload them with the ansible file/copy module or file/synchronize module.

Passing the key on each ansible command is tiresome. Let's use the ssh-agent (discussed in the last article) to add (ssh-add) our cluster key identity (~/.ssh/test_rsa) to all ssh commands that we use (including ansible).

Can I install the key and log in using ssh-agent?

$ ssh-agent bash
$ ssh-add ~/.ssh/test_rsa
$ ssh ansible@54.218.113.95

If you were able to log in with ssh by adding the key to the ssh-agent, then you are ready to use ansible. To test that you can connect via ansible, add these entries to the inventory.ini file in the project root (~/github/cassandra-image).

Setting up Ansible using the inventory.ini

We assume you have set up the cluster key as follows:

Setup cluster key for ansible

$ ssh-agent bash
$ ssh-add ~/.ssh/test_rsa

Recall that bin/setupkeys-cassandra-security.sh creates the RSA key and installs it under ~/.ssh/test_rsa. Then the provisioning scripts install the key correctly on the EC2 image (AMI).

Add this to inventory.ini for ansible

[aws-nodes]
54.218.113.95 ansible_user=ansible

The above tells ansible that this server 54.218.113.95 exists, and it is in the group aws-nodes, and that when we connect to it that we should use the user ansible. (Remember we looked up the IP of the Cassandra EC2 instance using bin/get-IP-cassandra.sh).
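
Because the public IP changes whenever a new instance is launched, the entry above can also be generated rather than typed. This is a minimal sketch; the helper name render_inventory and the output file name in the usage comment are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print an ansible inventory group for the given IPs.
# usage: render_inventory <group> <user> <ip> [ip...]
render_inventory() {
    local group=$1 user=$2
    shift 2
    printf '[%s]\n' "$group"
    local ip
    for ip in "$@"; do
        # One host per line, with the login user set as a host variable.
        printf '%s ansible_user=%s\n' "$ip" "$user"
    done
}

# Usage (needs the AWS CLI and the project's scripts):
#   render_inventory aws-nodes ansible $(bin/get-IP-cassandra.sh) > aws-inventory.ini
#   ansible -i aws-inventory.ini aws-nodes -m ping
```

Writing to a separate file keeps any groups already defined in inventory.ini untouched.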

Once that is set up, we can run the ansible ping module against our Cassandra EC2 instance as follows:

Run ansible ping modules against aws-nodes

$ ansible aws-nodes -m ping
54.218.113.95 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

Dynamic Ansible inventory

When doing Amazon Web Services EC2 DevOps, you could be managing several groups of servers. EC2 allows you to use placement groups, autoscale groups, security groups, and tags to organize and manage your instances. AWS EC2 is rich with meta-data about the instances.

If you are running Dev, QA, production or even green/blue deploys with CI and CD, you will be running many EC2 instances over time. Hosts can come and go in EC2. Because of the ephemeral nature of hosts in EC2, ansible allows you to use external scripts to manage ansible inventory lists. There is such an ansible inventory script for AWS EC2.

As you can imagine if you are doing DevOps, ansible AWS EC2 dynamic inventory is a must.

You can set up AWS via your ~/.aws/config and ~/.aws/credentials files, and if you installed the aws command line, then you likely have this setup or the requisite environment variables.

To use the dynamic inventory, pass the path to the script with Ansible's -i command line option.

Before we do that, we need to download the ec2 ansible inventory script and mark it executable.

Download the dynamic ansible inventory script as follows.

Download the ansible ec2 inventory script, make it executable

$ pwd
~/github/cassandra-image/

$ wget https://raw.githubusercontent.com/ansible/ansible/devel/contrib/inventory/ec2.py -O ansible-ec2/ec2.py

$ chmod +x ansible-ec2/ec2.py

$ wget https://raw.githubusercontent.com/ansible/ansible/devel/contrib/inventory/ec2.ini -O ansible-ec2/ec2.ini

After you download it, you can start using it.

Using a dynamic inventory

Now let's use the dynamic inventory ansible script.

Before we do that, let's add the pem associated with our AWS Key Pair to the ssh-agent as follows (if you have not done so already).

Add centos pem file (key pair)

$ ssh-add ~/.ssh/cloudurable-us-west-2.pem

Then we can ping the EC2 instance with the ansible ping module as follows:

Pass dynamic list to ansible use user centos

$ ansible -i ansible-ec2/ec2.py  tag_Name_cassandra_node  -u centos  -m ping
54.218.113.95 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

The -i option passes a script that will generate a JSON inventory list. You can even write a custom script that produces an inventory as long as that script uses the same JSON format. Above we are passing the script that we just downloaded. Remember that an ansible inventory list is just a list of servers that we are managing with ansible.
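
The contract for such a script is small: ansible runs it with --list and expects a JSON object that maps group names to their hosts, and runs it with --host <hostname> to get variables for a single host. When the --list output includes a _meta section with hostvars, ansible does not need the per-host calls. Below is a minimal sketch with one hard-coded host, reusing the IP and group from the inventory.ini example; the function names are hypothetical and a real script would look the hosts up instead.

```shell
#!/bin/bash
# Hypothetical static example of the dynamic inventory script contract.

# Print every group with its hosts and group variables as JSON.
inventory_list() {
    cat <<'EOF'
{
  "aws-nodes": {
    "hosts": ["54.218.113.95"],
    "vars": {"ansible_user": "ansible"}
  },
  "_meta": {"hostvars": {}}
}
EOF
}

# Dispatch on the two options ansible passes to inventory scripts.
inventory_main() {
    case "$1" in
        --list) inventory_list ;;
        --host) echo '{}' ;;   # no per-host variables in this sketch
        *) echo "usage: $0 --list | --host <hostname>" >&2; return 1 ;;
    esac
}

# Only dispatch when arguments were given, e.g. when ansible runs the script.
if [ "$#" -gt 0 ]; then
    inventory_main "$@"
fi
```

Saved as an executable file, a script like this can be passed to ansible with -i in the same way as ec2.py.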

Now we know we can use ansible with our AWS key pair and our AWS PEM file. But can we use it with our RSA key?

Using ansible with RSA Key and ansible user from last article

Please recall from the last article that we set up a user called ansible which used an RSA private key file that we created in the last article as well (~/.ssh/test_rsa).

We should be able to manage our EC2 instance with the ansible user using the RSA key. Let's try.

Add the ansible user's RSA key to the ssh-agent as follows.

Add ansible user's RSA private key file - test_rsa file

$ ssh-add ~/.ssh/test_rsa

Now we can reach the instance as the ansible user with ~/.ssh/test_rsa as our private key. Use ansible with the ansible user (-u ansible) and the RSA key we just installed.

Pass dynamic list to ansible use user ansible

$ ansible -i ansible-ec2/ec2.py  tag_Name_cassandra_node  -u ansible  -m ping
54.218.113.95 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

Often DevOps tasks require you to manage different machines: Ubuntu, CoreOS, CentOS, RedHat, Debian, and AmazonLinux. The various EC2 instances will have different users to log in. For example, CentOS has the user centos and Ubuntu has the user ubuntu (I have run into admin, root, etc.). It is a good idea to create a standard user like ansible (or devops or ops or admin) to run ansible commands against different flavors of Unix. Also, AWS PEM files / key pairs do not change once an instance is launched, and Cassandra instances tend to be less ephemeral (due to the statefulness of Cassandra and the potentially large amounts of data on a node) than some other EC2 instances. The ability to regenerate the RSA key periodically is important as you do not want the keys to get into the wrong hands.

The AWS inventory list command uses security groups, VPC ids, instance id, image type, EC2 tags, AZ, scaling groups, region and more to group EC2 instances to run ansible commands against, which is very flexible for DevOps operations.

Let's see a list of all of the aliases and ansible groups under which our one Cassandra EC2 instance is exposed.

Show all ansible groups that our Cassandra EC2 instance can be accessed by

$ ./ansible-ec2/ec2.py  | jq "keys"
[
  "_meta",
  "ami_6db4410e",             //by AMI
  "ec2",                      //All ec2 instances
  "i-754a8a4f693b58d1b",      //by instance id
  "key_cloudurable_us_west_2",//by key pair
  "security_group_allow_ssh", //by security group
  "tag_Name_cassandra_node",  //by EC2 tag
  "type_m4_large",            //by EC2 instance type
  "us-west-2",                //by Region
  "us-west-2c",               //by AZ Availability Zone
  "vpc_id_vpc_c78000a0"       //by VPC Virtual Private Cloud
]

You can use any of these ansible groups to ping a set of servers. Let's ping every server (we only have one) in the AWS us-west-2 region.

Ping all servers in the us-west-2 region

$ ansible -i ansible-ec2/ec2.py  us-west-2  -u ansible  -m ping
54.218.113.95 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

I don't know about you, but I don't like passing around the -i and -u options on every command. Let's see what we can do to remedy this.

Installing Dynamic Inventory as the default inventory

Another option besides using the -i is to copy the dynamic inventory script to /etc/ansible/ec2.py and chmod +x it. You will also need to copy the ec2.ini file to /etc/ansible/ec2.ini. Then we will be able to use ansible EC2 dynamic inventory without passing the -i (making it a lot easier to use).

Let's install the ansible dynamic inventory script and config as follows.

Installing ansible EC2 dynamic inventory script as the default

$ cd ~/github/cassandra-image

$ wget https://raw.githubusercontent.com/ansible/ansible/devel/contrib/inventory/ec2.py -O ansible-ec2/ec2.py

$ chmod +x ansible-ec2/ec2.py

$ sudo cp ansible-ec2/ec2.py /etc/ansible/ec2.py

$ wget https://raw.githubusercontent.com/ansible/ansible/devel/contrib/inventory/ec2.ini -O ansible-ec2/ec2.ini

$ sudo cp ansible-ec2/ec2.ini /etc/ansible/ec2.ini

You will also need to add the script (ANSIBLE_HOSTS) and the ini file (EC2_INI_PATH) to environment variables, which you can put in your `~/.bash_profile`.

Environment variables needed to make dynamic inventory work

export ANSIBLE_HOSTS=/etc/ansible/ec2.py
export EC2_INI_PATH=/etc/ansible/ec2.ini

Now when you use ansible, you will not have to specify -i every time.
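
If ansible still complains about the inventory, the usual culprits are a script that is not executable or an ini file that is not where the variable points. Here is a minimal sketch of a sanity check. The function name check_dynamic_inventory is hypothetical, and it only inspects the two paths; it does not contact AWS.

```shell
#!/bin/bash
# Hypothetical helper (not part of the cassandra-image project).
# Verifies that the dynamic inventory settings point at usable files.
# Arguments default to the ANSIBLE_HOSTS and EC2_INI_PATH variables.
check_dynamic_inventory() {
  local script="${1:-${ANSIBLE_HOSTS:-}}"
  local ini="${2:-${EC2_INI_PATH:-}}"
  local status=0

  if [ -z "$script" ] || [ ! -x "$script" ]; then
    echo "inventory script missing or not executable: ${script:-<unset>}"
    status=1
  fi
  if [ -z "$ini" ] || [ ! -r "$ini" ]; then
    echo "ini file missing or not readable: ${ini:-<unset>}"
    status=1
  fi
  if [ "$status" -eq 0 ]; then
    echo "dynamic inventory looks ready"
  fi
  return "$status"
}
```

After sourcing ~/.bash_profile, calling check_dynamic_inventory with no arguments checks the exported values.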

Let's try ansible using the dynamic inventory list without the -i.

Using dynamic inventory without -i

$ ansible   tag_Name_cassandra_node  -u ansible -m ping
54.218.113.95 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

Now that we got rid of -i to specify the ansible dynamic inventory list script, let's get rid of the -u to specify the user. At least let's try.

Again before we do that, let's see if we can use ssh without passing the user name.

Specifying default user via ~/.ssh/config

If you're like most developers doing DevOps, you have a half dozen remote servers (or these days, local virtual machines, EC2 instances, Docker containers) you might need to deal with.

Remembering all of those usernames, passwords, domain names, identity files, and command line options to ssh can be daunting. You want a way to simplify your life with an ssh config file.

Using ssh effectively is another one of those essential DevOps skills!

You can create an ssh config file that configures host names, user names, and private keys to connect to ssh. There are many custom ssh config options to configure ssh and make life easier.

We will show how to configure `~/.ssh/config` to make logging into our EC2 instance easier, and eventually get rid of the need to run the `ssh-agent` or use the `-i` option when using ansible.

We wrote a small bash script that gets the DNS name of our instance using the aws command line as follows:

bin/get-DNS-name-cassandra.sh - Get the DNS name of our Cassandra EC2 instance using `aws` command line

#!/bin/bash
set -e

source bin/ec2-env.sh

aws ec2 describe-instances --filters  "Name=tag:Name,Values=${EC2_INSTANCE_NAME}" \
| jq --raw-output '.Reservations[].Instances[].PublicDnsName'

We can use bin/get-DNS-name-cassandra.sh to get the DNS name of our instance as follows:

Getting the DNS name of the Cassandra EC2 instance

bin/get-DNS-name-cassandra.sh
ec2-54-218-113-95.us-west-2.compute.amazonaws.com

Now let's see the IP address associated with this instance.

EC2 Cassandra host

$ host ec2-54-218-113-95.us-west-2.compute.amazonaws.com
54.218.113.95

Note that for this discussion we are using 54.218.113.95 as the public IP address of our Cassandra node (that we created with packer and launched with the aws command line tools).

Now we can configure ~/.ssh/config to use this information.

~/.ssh/config

# Note we can use a wildcard so any host that matches this pattern will work.
Host *.us-west-2.compute.amazonaws.com
  ForwardAgent yes
  IdentityFile ~/.ssh/test_rsa
  User ansible

# Note we can use the IP address
# so if we ssh into it, we don't have to pass username and the id file
Host 54.218.113.95
  ForwardAgent yes
  IdentityFile ~/.ssh/test_rsa
  User ansible

# We even create an alias for ssh that has username and the id file.
Host cnode0
  Hostname 54.218.113.95
  ForwardAgent yes
  IdentityFile ~/.ssh/test_rsa
  User ansible

Read the comments in the file above.

Now we can log into cnode0 using ssh as follows:

ssh cnode0

$ ssh cnode0

Note that cnode0 is an alias that we set up in ~/.ssh/config and that we don't have to use the -i option to pass the identity file or use the username.
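
To confirm which settings ssh will apply for a host or alias without actually connecting, OpenSSH (6.8 and later) has the -G option, which prints the resolved configuration. The sketch below wraps it. The function name show_ssh_settings is our own, and the second argument (the config file to read) defaults to ~/.ssh/config.

```shell
#!/bin/bash
# Hypothetical helper (not part of the cassandra-image project).
# Prints the host name, user, identity file and agent forwarding setting
# that ssh resolves for a host or alias. Does not open a connection.
show_ssh_settings() {
  local host="$1"
  local config="${2:-$HOME/.ssh/config}"
  ssh -F "$config" -G "$host" \
    | grep -E '^(hostname|user|identityfile|forwardagent) '
}
```

Running show_ssh_settings cnode0 should list the user ansible and the hostname from the Host cnode0 entry, which is a quick way to catch a typo in a Host pattern.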

Would you rather need to remember ssh cnode0 or ssh -i ~/.ssh/my-long-pem-name-with-region-info.pem someuserForEachLinuxFlavor@ec2-ip-address-that-changes-every-time-i-build-a-new-instance.us-west-2.compute.amazonaws.com?

Keep in mind that you do not have to use ssh-agent or ssh-add anymore to use ansible, since we configured the identity file and username in ~/.ssh/config. Forgetting to set up the ssh-agent and add the right key file with ssh-add was error prone at best, and often left me personally confused. That source of confusion is gone. The agent is still useful, though: since we set ForwardAgent yes, once we log into a remote instance, that instance can use the keys we loaded into our local ssh-agent with ssh-add for onward connections. The private keys themselves do not have to live on the remote host. You can, for example, log into a bastion server and then ssh into a private subnet with the keys you set up with ssh-agent, and none of those private keys have to live on the remote instances (where someone else could use them). Mastering ssh, ssh-agent, and ssh key management is essential to being good at DevOps.

Given the above config, you can also log into the Cassandra dev instance with its public domain name as follows:

ssh into box using public address

$ ssh ec2-54-218-113-95.us-west-2.compute.amazonaws.com

The above uses the Host *.us-west-2.compute.amazonaws.com which is a domain pattern that would work for all ec2 instances in the us-west-2 AWS region. Since different regions will use different AWS key-pairs, you can set up a pattern/key-pair pem file for each region easily.

Attempt to get ansible to use public DNS names instead of IP addresses (does not work)

Given the above ~/.ssh/config, one might imagine that to get rid of the -u option you could configure ec2.py, the script that generates the ansible inventory, to use public domain names instead of public IP addresses. And you can.

Make this change to ec2.ini:

/etc/ansible/ec2.ini

Change vpc_destination_variable = ip_address to vpc_destination_variable = public_dns_name

vpc_destination_variable = public_dns_name

Then ec2.py will use the public domain names from EC2 instead of public IP addresses, for example, ec2-54-218-113-95.us-west-2.compute.amazonaws.com instead of 54.218.113.95.

But then all of the ansible commands stop working. It seems, as best we can tell, that ansible does not like domain names with dashes. We searched for a workaround for this and could not find one. If you know the answer, please write to us.

We even tried to add this directly to inventory.ini.

adding ec2 host direct

[aws-nodes]
# 54.186.15.163 ansible_user=ansible
ec2-54-218-113-95.us-west-2.compute.amazonaws.com ansible_user=ansible

Then we tried running the ansible commands against aws-nodes and got the same result, until we tried the fix for EC2 domain name being too long for Ansible. Even so, we never got ec2.py to work with the longer DNS names (we were able to get past parts of it).

This problem is either ansible not handling dashes or a problem with long DNS names. The fix seems to be in the comments of this fix for EC2 domain name being too long for Ansible, but it worked only in the non-dynamic config. For the most part, the fix did not work for us (we were still getting ERROR! Specified hosts and/or --limit does not match any hosts).

It is okay, though. The only real limitation here is that when you use ansible with ec2.py that you will need to pass the user and continue to use ssh-agent and ssh-add.

Having to give the username with -u is not too serious. We still wish there was a way to use ansible without passing a username and identity file, just like we have with ssh. And there is, but it involves AWS Route 53 and configuring `~/.ssh/config`.

Using ansible without passing the id file or username

Another way to use ansible with our Cassandra cluster is to create DNS names for the Cassandra nodes that we want to manage. The problem with using the public IP address or the AWS generated DNS name is that they change each time we terminate and recreate the instance. We plan on terminating and recreating the instance a lot.

This is where DNS and AWS Route 53 come in. After we create the instance, we can use an internal hosted zone of Route 53 (for VPN) or a public hosted zone and associate a stable DNS name with the IP address of our new instance. We could do this for all of the Cassandra seed nodes and all of the cluster nodes for that matter.

Before we get started let's add two more variables to our bin/ec2-env.sh, namely, HOSTED_ZONE_ID and NODE0_DNS as follows:

bin/ec2-env.sh

#!/bin/bash
set -e

export AMI_CASSANDRA=ami-abc1234
export VPC_SECURITY_GROUP=sg-abc1234

export SUBNET_CLUSTER=subnet-abc1234
export KEY_NAME_CASSANDRA=cloudurable-us-west-2
export PEM_FILE="${HOME}/.ssh/${KEY_NAME_CASSANDRA}.pem"
export IAM_PROFILE_CASSANDRA=IAM_PROFILE_CASSANDRA
export EC2_INSTANCE_NAME=cassandra-node

export HOSTED_ZONE_ID="Z1-abc1234"
export NODE0_DNS="node0.cas.dev.cloudurable.com."

Now let's define a new script that will use the aws command line. We will use the aws route53 change-resource-record-sets to associate a DNS name with the IP address as follows:

bin/associate-DNS-with-IP.sh

#!/bin/bash
set -e

source bin/ec2-env.sh

IP_ADDRESS=$(bin/get-IP-cassandra.sh)


REQUEST_BATCH="
{
\"Changes\":[
    {
        \"Action\": \"UPSERT\",
        \"ResourceRecordSet\": {
                \"Type\": \"A\",
                \"Name\": \"$NODE0_DNS\",
                \"TTL\": 300,
                \"ResourceRecords\": [{
                    \"Value\": \"$IP_ADDRESS\"
                }]
        }
    }
]
}
"

echo "$REQUEST_BATCH"

changeId=$(aws route53 change-resource-record-sets --hosted-zone-id "$HOSTED_ZONE_ID" --change-batch "$REQUEST_BATCH" \
| jq --raw-output .ChangeInfo.Id)

aws route53 wait resource-record-sets-changed --id "$changeId"

Notice that we are running this change against our Route 53 hosted zone with aws route53 change-resource-record-sets, using a change batch like this:

Change batch for Route 53 hosted zone

{
"Changes":[
    {
        "Action": "UPSERT",
        "ResourceRecordSet": {
                "Type": "A",
                "Name": "node0.cas.dev.cloudurable.com.",
                "TTL": 300,
                "ResourceRecords": [{
                    "Value": "54.218.113.95"
                }]
        }
    }
]
}

Notice we are using UPSERT which will update or add the A record to Route 53's DNS resources to associate the name node0.cas.dev.cloudurable.com. with the IP address 54.218.113.95.
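
The backslash-escaped quotes in a hand-built change batch are easy to get wrong. As one alternative, the JSON can be produced by a small function that uses a here-document, so the quotes need no escaping. This is only a sketch: the function name build_change_batch and its optional TTL argument are our own, and it assumes the DNS name and IP address contain no characters that need JSON escaping.

```shell
#!/bin/bash
# Hypothetical helper (not part of the cassandra-image project).
# Prints a Route 53 change batch that UPSERTs one A record.
# Usage: build_change_batch DNS_NAME IP_ADDRESS [TTL]
build_change_batch() {
  local dns_name="$1"
  local ip_address="$2"
  local ttl="${3:-300}"
  cat <<EOF
{
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Type": "A",
        "Name": "${dns_name}",
        "TTL": ${ttl},
        "ResourceRecords": [{ "Value": "${ip_address}" }]
      }
    }
  ]
}
EOF
}
```

In a script like bin/associate-DNS-with-IP.sh, the result could be captured with REQUEST_BATCH=$(build_change_batch "$NODE0_DNS" "$IP_ADDRESS") and then passed to --change-batch.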

Now that we have a domain name, and it is scripted/automated (we added a call to bin/associate-DNS-with-IP.sh into bin/create-ec2-cassandra.sh), we can configure ~/.ssh/config to use this domain name, which will not change the way the public IP address or the public DNS name of our Cassandra instance does.

Let's update the ~/.ssh/config to refer to our new DNS name as follows:

`~/.ssh/config` - Use new DNS naming

Host *.us-west-2.compute.amazonaws.com
  ForwardAgent yes
  IdentityFile ~/.ssh/test_rsa
  User ansible

Host *.cas.dev.cloudurable.com
  ForwardAgent yes
  IdentityFile ~/.ssh/test_rsa
  User ansible

Host cnode0
  Hostname node0.cas.dev.cloudurable.com
  ForwardAgent yes
  IdentityFile ~/.ssh/test_rsa
  User ansible

Notice we added the pattern *.cas.dev.cloudurable.com (where cas stands for Cassandra and dev means this is our development environment). We also added an alias for our Cassandra instance called cnode0 that refers to node0.cas.dev.cloudurable.com.

We can ssh into cnode0 or node0.cas.dev.cloudurable.com without passing the username or identity file (private key) each time. This config is like before but using a DNS name that does not change when we rebuild our servers. This concept is important; you would not want to modify ~/.ssh/config every time you rebuild a server.

Now let's change our inventory.ini file in the project directory (~/github/cassandra-image) to use this as follows:

~/github/cassandra-image/inventory.ini

[aws-nodes]
cnode0
node0.cas.dev.cloudurable.com

Notice that we use the short name and the long name.

Note that you truly need just one; we list two only for this article. Never put the same box twice in the same ansible group, or all commands and playbooks will run against it twice.

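Now we can run ansible ping against these servers without passing the identity file. The username comes from ~/.ssh/config as well, so the -u ansible in the commands below should be redundant (the nodetool example later in this article omits it).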

Use the ansible ping module against cnode0 and node0.cas.dev.cloudurable.com.

Run against all (see note above).

Running ansible ping against all of the "instances"

$ ansible  aws-nodes  -u ansible -m ping
cnode0 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
node0.cas.dev.cloudurable.com | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

We can also run it against just one of the instances by using that instance's name.

Run against cnode0.

ansible cnode0 -u ansible -m ping

$ ansible cnode0  -u ansible -m ping
cnode0 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

We can do this for any server.

Run against node0.cas.dev.cloudurable.com.

ansible node0.cas.dev.cloudurable.com -u ansible -m ping

$ ansible node0.cas.dev.cloudurable.com  -u ansible -m ping
node0.cas.dev.cloudurable.com | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

Keep in mind that you do not have to use ssh-agent or ssh-add anymore to use ansible, since we configured the identity file and username in ~/.ssh/config. We can rebuild our server at will. Each time we do, our creation script will update the IP address in DNS to point at the new server. Then all of our ansible scripts and playbooks will continue to work.

Using Ansible to manage our cluster

We won't do much actual cluster management in this article. And, unlike our last article that used Vagrant and ansible, we don't really have a cluster (or rather, we have a cluster of one).

We can now use Ansible to automate common DevOps tasks for our Cassandra cluster.

Ansible running nodetool against all nodes

ansible aws-nodes -a "/opt/cassandra/bin/nodetool describecluster"
cnode0 | SUCCESS | rc=0 >>
Cluster Information:
       Name: Test Cluster
       Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
       Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
       Schema versions:
               86afa796-d883-3932-aa73-6b017cef0d19: [127.0.0.1]

node0.cas.dev.cloudurable.com | SUCCESS | rc=0 >>
Cluster Information:
       Name: Test Cluster
       Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
       Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
       Schema versions:
               86afa796-d883-3932-aa73-6b017cef0d19: [127.0.0.1]

Let’s say that we wanted to update a schema or do a rolling restart of our Cassandra nodes, which could be a very common task. Perhaps before the update, we want to decommission the node and back things up. To do this sort of automation, we could create an Ansible playbook.
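
Before writing such a playbook, the idea of a rolling restart can be sketched as a shell loop over ad-hoc ansible commands, one node at a time. This is an illustration under stated assumptions, not the project's method: the function names run and rolling_restart are hypothetical, it assumes the systemd service is named cassandra and that nodetool lives in /opt/cassandra/bin, and it defaults to a dry run that only prints the commands.

```shell
#!/bin/bash
# Hypothetical sketch (not part of the cassandra-image project).
# DRY_RUN defaults to 1, which prints each command instead of running it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Restart Cassandra on each given host, one host at a time.
rolling_restart() {
  local nodetool=/opt/cassandra/bin/nodetool
  local host
  for host in "$@"; do
    # Flush memtables and stop accepting writes before the restart.
    run ansible "$host" -a "$nodetool drain" || return 1
    # Assumes a systemd unit named cassandra; -b runs the task with become.
    run ansible "$host" -b -m service -a "name=cassandra state=restarted" || return 1
    # Wait until nodetool answers again before moving to the next host.
    until run ansible "$host" -a "$nodetool status"; do
      sleep 10
    done
  done
}
```

Calling rolling_restart cnode0 prints the three commands for that node. Setting DRY_RUN=0 would execute them, which is only sensible once the sequence has been reviewed for your cluster.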

Let's run an Ansible playbook from the last article.

Running describe-cluster playbook

$ ansible-playbook playbooks/describe-cluster.yml --verbose
Using /Users/jean/github/cassandra-image/ansible.cfg as config file

PLAY [aws-nodes] ***************************************************************

TASK [Run NodeTool Describe Cluster command] ***********************************
changed: [node0.cas.dev.cloudurable.com] => {"changed": true, "cmd": ["/opt/cassandra/bin/nodetool", "describecluster"],
"delta": "0:00:02.192589", "end": "2017-03-03 08:02:58.537371", "rc": 0, "start": "2017-03-03 08:02:56.344782",
"stderr": "", "stdout": "Cluster Information:\n\tName: Test Cluster\n\tSnitch:
org.apache.cassandra.locator.DynamicEndpointSnitch\n\tPartitioner: org.apache.cassandra.dht.Murmur3Partitioner
\n\tSchema versions:\n\t\t86afa796-d883-3932-aa73-6b017cef0d19: [127.0.0.1]", "stdout_lines": ["Cluster Information:",
"\tName: Test Cluster", "\tSnitch: org.apache.cassandra.locator.DynamicEndpointSnitch",
...
PLAY RECAP *********************************************************************
cnode0                        : ok=1    changed=1    unreachable=0    failed=0
node0.cas.dev.cloudurable.com : ok=1    changed=1    unreachable=0    failed=0

Conclusion

This article covered how to create an Amazon AMI with Packer, how to provision with Packer/Ansible, how to install services as systemd services, and how to create EC2 instances and manage them with the AWS command line. We spent quite a bit of time on setting up ssh and ansible with some specifics for EC2. Lastly, we showed how to add DNS entries using the Route 53 AWS command line tools.

Ansible, Packer and the AWS command line are must-have tools for DevOps staff who want to create immutable infrastructure and provide continuous integration and deployment of Cassandra clusters using AWS.

More to come

In later articles, we will set up a working cluster in AWS using VPCs, Subnets, Availability Zones and Placement groups. We will continue to use ansible to manage our Cassandra cluster. We plan on writing guides on how to do rolling updates, schema changes, and more with ansible running against Cassandra nodes in EC2. The DevOps concepts should transcend Cassandra.

About Cloudurable™

Cloudurable™: streamline DevOps for Cassandra running on AWS. Cloudurable™ provides AMIs, CloudWatch Monitoring, CloudFormation templates and monitoring tools to support Cassandra in production running in EC2. We also teach advanced Cassandra courses that show Developers and DevOps staff how to develop, support and deploy Cassandra to production in AWS EC2.

More info

Please take some time to read the Advantage of using Cloudurable™.

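Cloudurable provides:

AWS Cassandra Deployment Guides
Cassandra Cluster/DevOps/DBA tutorial
Kafka training, Kafka consulting, Cassandra training, Cassandra consulting, Spark training, Spark consulting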

Authors

Written by R. Hightower and JP Azar.

Resources

Simplify your life with an ssh config file
Setting up ssh config file on linux
SSH config man pages
Digital Ocean: Configure Custom SSH Connection Options
Amazon Blog: Getting Started with Ansible and Dynamic Amazon EC2 Inventory Management
Ansible Docs: AWS EC2 External Inventory Script
Cool lists of things you can do with Cassandra and Ansible
Learning Ansible with Vagrant Training Video
Source code for this article
The first article in this series was about setting up a Cassandra cluster with Vagrant
First article on DZone with some additional content DZone DevOps Setting up a Cassandra Cluster with Vagrant
The second article in this series: setting up SSL for a Cassandra cluster using Vagrant
Second article on DZone with additional content: DZone DevOps Setting up a Cassandra Cluster with SSL
Cloudurable Cassandra Image for Docker, AWS, Packer and Vagrant

