Cassandra Tutorial 6: Setting up Cassandra Cluster in EC2 Part 2 Multi AZs with Ec2Snitch
Multi AZs, Ec2Snitch
{
"AWSTemplateFormatVersion": "2010-09-09",
"Description": "Setup VPC for Cassandra Cluster for Cassandra Database",
"Outputs": {
...
"subnetPrivate2Out": {
"Description": "Subnet Private Id",
"Value": {
"Ref": "subnetPrivate2"
},
"Export": {
"Name": {
"Fn::Sub": "${AWS::StackName}-subnetPrivate2"
}
}
},
"Resources": {
...
"subnetPrivate2": {
"Type": "AWS::EC2::Subnet",
"Properties": {
"CidrBlock": "10.0.2.0/24",
"AvailabilityZone": "us-west-2b",
"VpcId": {
"Ref": "vpcMain"
},
"Tags": [
{
"Key": "cloudgen",
"Value": "cassandra-test"
},
{
"Key": "Name",
"Value": "Private subnet2"
}
]
}
...
"subnetPrivate2AclAssociation": {
"Type": "AWS::EC2::SubnetNetworkAclAssociation",
"Properties": {
"NetworkAclId": {
"Ref": "networkACL"
},
"SubnetId": {
"Ref": "subnetPrivate2"
}
}
},
...
"subnetPrivate2RouteTableAssociation": {
"Type": "AWS::EC2::SubnetRouteTableAssociation",
"Properties": {
"RouteTableId": {
"Ref": "routeTablePrivate"
},
"SubnetId": {
"Ref": "subnetPrivate2"
}
}
},
...
Notice that private subnet 2 is in AZ us-west-2b (subnet 1 is in AZ us-west-2a), and that it has the
CIDR block 10.0.2.0/24 (subnet 1 has CIDR block 10.0.1.0/24).
"subnetPrivate2": {
"Type": "AWS::EC2::Subnet",
"Properties": {
"CidrBlock": "10.0.2.0/24",
"AvailabilityZone": "us-west-2b",
"VpcId": {
"Ref": "vpcMain"
},
#!/usr/bin/env bash
set -e
source bin/ec2-env.sh
aws --region ${REGION} s3 cp cloud-formation/vpc.json s3://$CLOUD_FORMER_S3_BUCKET
aws --region ${REGION} cloudformation update-stack --stack-name ${ENV}-vpc-cassandra \
--template-url "https://s3-us-west-2.amazonaws.com/$CLOUD_FORMER_S3_BUCKET/vpc.json"
bin/update-vpc-cloudformation.sh
upload: cloud-formation/vpc.json to s3://cloudurable-cloudformer-templates/vpc.json
{
"StackId": "arn:aws:cloudformation:us-west-2:821683928919:stack/dev-vpc-cassandra/39987f10-0303-11e7-b054-50a686be7356"
}
The update failed because we renamed subnetPrivate to subnetPrivate1. And then there was a conflict with the CIDR addresses. We went into the AWS CloudFormation GUI and deleted the stack manually. The moral of that story is if you plan to have multiple subnets (future subnets), then get the naming down correctly so you can update (faster) instead of delete and re-create the stack (much longer).
Deleting the CloudFormation stack that has a NATGateway takes a long time. It is like watching a complete baseball game with no beer, and 6 extra innings. Also I stop my instances at the end of a workday and they were preventing a security group or two from getting deleted which were in turn preventing the VPC and subnets from being deleted. I had to go terminate the instances. Also sometimes manually deleting a VPC through the AWS GUI console gives more detailed error messages than deleting through the CloudFormation GUI. After manually attempting a VPC delete, fixing the problem of instances being stopped instead of terminated, I then went back and ran delete from the CloudFormation GUI and it worked.
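The cleanup described above can be scripted. What follows is a minimal sketch and not one of the tutorial's scripts: the function names, the DRY_RUN switch, and the VPC id in the usage comment are made up for illustration. It looks for stopped instances that may still be holding security groups, then deletes the stack and waits for the delete to finish.

```bash
#!/usr/bin/env bash
# Sketch only: helper names and the DRY_RUN switch are illustrative assumptions.

# Run a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "DRY RUN: $*"
  else
    "$@"
  fi
}

# Print the ids of stopped instances in the given VPC (whitespace separated).
stopped_instances_in_vpc() {
  aws ec2 describe-instances \
    --filters "Name=vpc-id,Values=$1" "Name=instance-state-name,Values=stopped" \
    --query 'Reservations[].Instances[].InstanceId' --output text
}

# Delete the stack and block until CloudFormation reports the delete is done.
delete_stack_and_wait() {
  run aws cloudformation delete-stack --stack-name "$1"
  run aws cloudformation wait stack-delete-complete --stack-name "$1"
}

# Real usage needs AWS credentials, so it is shown as comments
# (vpc-0123456789abcdef0 is a placeholder id):
#   ids=$(stopped_instances_in_vpc vpc-0123456789abcdef0)
#   [ -n "$ids" ] && run aws ec2 terminate-instances --instance-ids $ids
#   delete_stack_and_wait "${ENV}-vpc-cassandra"
```

Setting DRY_RUN=1 before calling delete_stack_and_wait prints the two aws commands instead of running them, which is a cheap way to check the stack name first.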
#!/usr/bin/env bash
set -e
source bin/ec2-env.sh
aws --region ${REGION} s3 cp cloud-formation/vpc.json s3://$CLOUD_FORMER_S3_BUCKET
aws --region ${REGION} cloudformation create-stack --stack-name ${ENV}-vpc-cassandra \
--template-url "https://s3-us-west-2.amazonaws.com/$CLOUD_FORMER_S3_BUCKET/vpc.json"
bin/run-vpc-cloudformation.sh
upload: cloud-formation/vpc.json to s3://cloudurable-cloudformer-templates/vpc.json
{
"StackId": "arn:aws:cloudformation:us-west-2:821683928919:stack/dev-vpc-cassandra/f40d2dc0-0379-11e7-8ebb-503aca41a035"
}
$ aws cloudformation describe-stacks --stack-name dev-vpc-cassandra | jq '.Stacks[].Outputs[]'
{
"Description": "Subnet Private Id",
"OutputKey": "subnetPrivate1Out",
"OutputValue": "subnet-7bf75abc"
}
{
"Description": "Subnet Private Id",
"OutputKey": "subnetPrivate2Out",
"OutputValue": "subnet-e2e56abc"
}
{
"Description": "Subnet Public Id",
"OutputKey": "subnetPublicOut",
"OutputValue": "subnet-7af75abc"
}
{
"Description": "Cassandra Database Node security group for Cassandra Cluster",
"OutputKey": "securityGroupCassandraNodesOut",
"OutputValue": "sg-48ffdabc"
}
{
"Description": "Security Group Bastion for managing Cassandra Cluster Nodes with Ansible",
"OutputKey": "securityGroupBastionOut",
"OutputValue": "sg-4affdabc"
}
Modify bin/ec2-env.sh to match.
#!/bin/bash
set -e
export REGION=us-west-2
export ENV=dev
export KEY_PAIR_NAME="cloudurable-$REGION"
export PEM_FILE="${HOME}/.ssh/${KEY_PAIR_NAME}.pem"
export SUBNET_PUBLIC=subnet-7af75abc
export SUBNET_PRIVATE1=subnet-7bf75abc
export SUBNET_PRIVATE2=subnet-e2e56abc
export CLOUD_FORMER_S3_BUCKET=cloudurable-cloudformer-templates
export BASTION_NODE_SIZE=t2.small
export BASTION_SECURITY_GROUP=sg-4affdabc
export BASTION_AMI=ami-6db33abc
export BASTION_EC2_INSTANCE_NAME="bastion.${ENV}.${REGION}"
export BASTION_DNS_NAME="bastion.${ENV}.${REGION}.cloudurable.com."
export CASSANDRA_NODE_SIZE=m4.large
export CASSANDRA_AMI=ami-6db33abc
export CASSANDRA_SECURITY_GROUP=sg-48ffdabc
...
Notice that SUBNET_PRIVATE was renamed to SUBNET_PRIVATE1 and that we added SUBNET_PRIVATE2. Now it is just a matter of plugging in our CloudFormation output variables into the right locations.
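Copying the ids by hand works, but the plugging-in step can also be generated. Here is a minimal sketch, assuming the output keys and variable names used in this tutorial; the function name outputs_to_exports is made up. It reads OutputKey and OutputValue pairs and prints the matching export lines.

```bash
#!/usr/bin/env bash
# Sketch: turn CloudFormation outputs into export lines for bin/ec2-env.sh.
# Reads "OutputKey<TAB>OutputValue" lines on stdin. The key-to-variable
# mapping is an assumption based on the names used in this tutorial.
outputs_to_exports() {
  while read -r key value; do
    [ -z "$key" ] && continue   # skip blank lines
    case "$key" in
      subnetPublicOut)                echo "export SUBNET_PUBLIC=${value}" ;;
      subnetPrivate1Out)              echo "export SUBNET_PRIVATE1=${value}" ;;
      subnetPrivate2Out)              echo "export SUBNET_PRIVATE2=${value}" ;;
      securityGroupCassandraNodesOut) echo "export CASSANDRA_SECURITY_GROUP=${value}" ;;
      securityGroupBastionOut)        echo "export BASTION_SECURITY_GROUP=${value}" ;;
      *)                              echo "# unmapped output: ${key}" ;;
    esac
  done
}

# Feeding it from the AWS CLI needs credentials, so it is shown as a comment:
#   aws cloudformation describe-stacks --stack-name dev-vpc-cassandra \
#     --query 'Stacks[0].Outputs[].[OutputKey,OutputValue]' --output text | outputs_to_exports

# Offline demo using two of the output values shown above.
printf 'subnetPrivate1Out\tsubnet-7bf75abc\nsubnetPrivate2Out\tsubnet-e2e56abc\n' | outputs_to_exports
```

The demo prints the SUBNET_PRIVATE1 and SUBNET_PRIVATE2 export lines with the same ids as the env file above.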
Our bin/create-ec2-instance-cassandra.sh only handled one private subnet in one AZ. Now we need to change it to switch on AZ so we can deploy to subnetprivate1 in AZ a (us-west-2a) or subnetprivate2 in AZ b (us-west-2b). To do this AZ switch we will add an extra argument to our bash script as follows.
#!/bin/bash
set -e
source bin/ec2-env.sh
# Set the private IP to 10.0.1.10 (the seed node), if first arg empty
if [ -z "$1" ]
then
PRIVATE_IP_ADDRESS=10.0.1.10
else
PRIVATE_IP_ADDRESS=$1
fi
# Set the AZ to a if empty
if [ -z "$2" ]
then
AZ="a"
else
AZ=$2
fi
if [ "$AZ" == "a" ]
then
SUBNET_PRIVATE="$SUBNET_PRIVATE1"
else
SUBNET_PRIVATE="$SUBNET_PRIVATE2"
fi
instance_id=$(aws ec2 run-instances --image-id "$CASSANDRA_AMI" --subnet-id "$SUBNET_PRIVATE" \
--instance-type ${CASSANDRA_NODE_SIZE} --private-ip-address ${PRIVATE_IP_ADDRESS} \
--iam-instance-profile "Name=$CASSANDRA_IAM_PROFILE" \
--security-group-ids "$CASSANDRA_SECURITY_GROUP" \
--user-data file://resources/user-data/cassandra \
--key-name "$KEY_PAIR_NAME" | jq --raw-output '.Instances[].InstanceId')
## For debugging only...
# --associate-public-ip-address ADD this param to run-instances if you add Cassandra to pub subnet
echo "Cassandra Database: Cassandra Cluster Node ${instance_id} is being created"
aws ec2 wait instance-exists --instance-ids "$instance_id"
aws ec2 create-tags --resources "${instance_id}" --tags Key=Name,Value="${CASSANDRA_EC2_INSTANCE_NAME}" \
Key=Cluster,Value="Cassandra" Key=Role,Value="Cassandra_Database_Cluster_Node" Key=Env,Value="DEV"
echo "Cassandra Node ${instance_id} was tagged waiting for status ready"
aws ec2 wait instance-status-ok --instance-ids "$instance_id"
bin/create-ec2-instance-bastion.sh
bastion i-067f664ebccf23e03 is being created
i-067f664ebccf23e03 was tagged waiting to login
IP ADDRESS 55.222.233.79 bastion.dev.us-west-2.cloudurable.com.
{
"Changes":[
{
"Action": "UPSERT",
"ResourceRecordSet": {
"Type": "A",
"Name": "bastion.dev.us-west-2.cloudurable.com.",
"TTL": 300,
"ResourceRecords": [{
"Value": "55.222.233.79"
}]
}
}
]
}
IP ADDRESS 55.222.233.79
The authenticity of host '55.222.233.79' can't be established.
ECDSA key fingerprint is SHA256:ABC123.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '55.222.233.79' (ECDSA) to the list of known hosts.
bin/create-ec2-instance-cassandra.sh
Cassandra Database: Cassandra Cluster Node i-0b351a03c58245abc is being created
Cassandra Node i-0b351a03c58245abc was tagged waiting for status ready
Notice we pass no arguments, which means it will be in the default AZ A and will use the default IP address 10.0.1.10.
bin/create-ec2-instance-cassandra.sh 10.0.1.11
Cassandra Database: Cassandra Cluster Node i-0d293603230169abc is being created
Cassandra Node i-0d293603230169abc was tagged waiting for status ready
Notice we pass one argument, which means it will be in the default AZ A and will use IP address 10.0.1.11.
Lastly we launch the third server.
$ bin/create-ec2-instance-cassandra.sh 10.0.2.10 b
Cassandra Database: Cassandra Cluster Node i-0d00c98e0a2c2babc is being created
Cassandra Node i-0d00c98e0a2c2babc was tagged waiting for status ready
Notice we pass two arguments which means it is in AZ B, and uses IP address 10.0.2.10.
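One thing to watch: the AZ switch in bin/create-ec2-instance-cassandra.sh sends anything other than "a" to the second subnet, so a typo still launches a node. A stricter variant could look like the sketch below. It is not the tutorial's script; the function names are made up, the subnet ids are copied from the env file above, and the IP check assumes 10.0.1.0/24 lives in AZ a and 10.0.2.0/24 in AZ b.

```bash
#!/usr/bin/env bash
# Sketch: choose the private subnet from the AZ letter and refuse unknown AZs.
# Subnet ids are the ones from bin/ec2-env.sh above.
SUBNET_PRIVATE1=subnet-7bf75abc
SUBNET_PRIVATE2=subnet-e2e56abc

# Print the subnet id for an AZ letter, or fail for anything but a or b.
subnet_for_az() {
  case "$1" in
    a) echo "$SUBNET_PRIVATE1" ;;
    b) echo "$SUBNET_PRIVATE2" ;;
    *) echo "unknown AZ '$1' (expected a or b)" >&2; return 1 ;;
  esac
}

# Return 0 only when the IP address belongs to the subnet of the given AZ.
# Arguments: IP address, AZ letter.
ip_matches_az() {
  case "$2:$1" in
    a:10.0.1.*|b:10.0.2.*) return 0 ;;
    *) return 1 ;;
  esac
}

subnet_for_az b
ip_matches_az 10.0.2.10 a || echo "10.0.2.10 is not in AZ a"
```

With checks like these at the top of the launch script, a mismatched pair such as 10.0.2.10 with AZ a fails early instead of being rejected later by EC2.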
Let's verify that this is the case.
$ ssh cassandra.node0
[ansible@ip-10-0-1-10 ~]$ /opt/cassandra/bin/nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.0.2.10 163.11 KiB 32 73.5% a6a07305-7099-4693-ac36-65afcfaa13d6 rack1
UN 10.0.1.10 104.41 KiB 32 69.1% c1ba5f75-3cf2-4ee2-a580-2d311a4b0275 rack1
UN 10.0.1.11 75.87 KiB 32 57.4% 60af9ad7-27b0-4cd7-a846-d0958a0a2dc4 rack1
Do you see a problem? They are all on rack1. AZs correspond to racks.
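If you would rather have a script spot this than eyeball it, you can count nodes per rack. A minimal sketch, assuming the nodetool status column layout shown above, where the rack is the last field and each node line starts with a two-letter state such as UN. The function name nodes_per_rack is made up.

```bash
#!/usr/bin/env bash
# Sketch: count Cassandra nodes per rack from `nodetool status` output on stdin.
# Node lines start with a two-letter state (first letter U or D, second N, L, J or M);
# header lines do not match that pattern and are ignored.
nodes_per_rack() {
  awk '/^[UD][NLJM] / { count[$NF]++ } END { for (r in count) print r, count[r] }' | sort
}

# Offline demo with the three node lines from the output above.
nodes_per_rack <<'EOF'
UN 10.0.2.10 163.11 KiB 32 73.5% a6a07305-7099-4693-ac36-65afcfaa13d6 rack1
UN 10.0.1.10 104.41 KiB 32 69.1% c1ba5f75-3cf2-4ee2-a580-2d311a4b0275 rack1
UN 10.0.1.11 75.87 KiB 32 57.4% 60af9ad7-27b0-4cd7-a846-d0958a0a2dc4 rack1
EOF
```

A single rack line for nodes that sit in two AZs is the symptom we are looking at here.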
[cassandra-nodes]
cassandra.node0
10.0.1.11
10.0.2.10
$ ansible-playbook playbooks/describe-cluster.yml --verbose
Using /Users/jean/github/cassandra-image/ansible.cfg as config file
PLAY [cassandra-nodes] *********************************************************
TASK [Run NodeTool Describe Cluster command] ***********************************
changed: [10.0.1.11] => {"changed": true, "cmd": ["/opt/cassandra/bin/nodetool", "describecluster"],
...
[10.0.1.10, 10.0.2.10, 10.0.1.11]",
...{"changed": true, "cmd": ["/opt/cassandra/bin/nodetool", "describecluster"], ...
[10.0.2.10, 10.0.1.10, 10.0.1.11]", ...
fatal: [10.0.2.10]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh.", "unreachable": true}
to retry, use: --limit @playbooks/describe-cluster.retry
PLAY RECAP *********************************************************************
10.0.1.11 : ok=1 changed=1 unreachable=0 failed=0
10.0.2.10 : ok=0 changed=0 unreachable=1 failed=0
cassandra.node0 : ok=1 changed=1 unreachable=0 failed=0
Ok. The cluster looks set up, but we can't seem to ssh into 10.0.2.10.
$ ssh 10.0.1.11
Last login: Tue Mar 7 22:13:15 2017 from ip-10-0-0-62.us-west-2.compute.internal
[ansible@ip-10-0-1-11 ~]$ exit
logout
Shared connection to 10.0.1.11 closed.
$ ssh 10.0.2.10
ssh: connect to host 10.0.2.10 port 22: Operation timed out
Ok, so we can log into 10.0.1.11 but not 10.0.2.10.
To fix this we have to modify our ssh config as follows.
Host 10.0.2.*
ForwardAgent yes
IdentityFile ~/.ssh/test_rsa
ProxyCommand ssh bastion -W %h:%p
User ansible
ControlMaster auto
ControlPath ~/.ssh/ansible-%r@%h:%p
ControlPersist 5m
Host 10.0.1.*
ForwardAgent yes
IdentityFile ~/.ssh/test_rsa
ProxyCommand ssh bastion -W %h:%p
User ansible
ControlMaster auto
ControlPath ~/.ssh/ansible-%r@%h:%p
ControlPersist 5m
Or you can do this
Host 10.0.*
ForwardAgent yes
IdentityFile ~/.ssh/test_rsa
ProxyCommand ssh bastion -W %h:%p
User ansible
ControlMaster auto
ControlPath ~/.ssh/ansible-%r@%h:%p
ControlPersist 5m
Both work, but with the first approach you will have to remember to modify it again when you add another AZ. I prefer the second approach.
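To see why 10.0.2.10 was unreachable with only a 10.0.1.* entry, it helps to remember that ssh Host patterns use * and ? wildcards much like shell globs. The sketch below mimics that matching with a shell case statement. It is an illustration only, not how ssh itself is invoked; host_matches is a made-up helper, and 10.0.3.10 stands in for a hypothetical node in a third AZ.

```bash
#!/usr/bin/env bash
# Sketch: mimic ssh Host pattern matching with shell globs.
# Arguments: pattern, host. Returns 0 when the host matches the pattern.
host_matches() {
  case "$2" in
    $1) return 0 ;;   # unquoted $1 is treated as a glob pattern
    *)  return 1 ;;
  esac
}

for host in 10.0.1.11 10.0.2.10 10.0.3.10; do
  for pattern in '10.0.1.*' '10.0.2.*' '10.0.*'; do
    if host_matches "$pattern" "$host"; then
      echo "$host matches Host $pattern"
    fi
  done
done
```

Only the broad 10.0.* pattern covers the hypothetical third subnet. On a reasonably recent OpenSSH, running ssh -G 10.0.2.10 prints the configuration ssh would actually apply to that host, which is a handy check that the ProxyCommand is being picked up.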
Now let's retest.
$ ansible-playbook playbooks/describe-cluster.yml
PLAY [cassandra-nodes] *********************************************************
TASK [Run NodeTool Describe Cluster command] ***********************************
changed: [10.0.1.11]
changed: [cassandra.node0]
changed: [10.0.2.10]
PLAY RECAP *********************************************************************
10.0.1.11 : ok=1 changed=1 unreachable=0 failed=0
10.0.2.10 : ok=1 changed=1 unreachable=0 failed=0
cassandra.node0 : ok=1 changed=1 unreachable=0 failed=0
It can reach all nodes.
The cassandra-cloud utility, which we used in the last tutorial, has an option called -snitch. The -snitch option lets you specify the Cassandra snitch type, for example GossipingPropertyFileSnitch, PropertyFileSnitch, or Ec2Snitch. The cassandra-cloud utility defaults to "SimpleSnitch". If we instead specify Ec2Snitch, Cassandra will recognize AWS AZs as Cassandra racks.
#!/bin/bash
set -e
export BIND_IP=`curl http://169.254.169.254/latest/meta-data/local-ipv4`
/opt/cloudurable/bin/cassandra-cloud -cluster-name test \
-client-address ${BIND_IP} \
-cluster-address ${BIND_IP} \
-cluster-seeds 10.0.1.10,10.0.2.10 \
-snitch Ec2Snitch
/bin/systemctl restart cassandra
Notice we are now passing the Ec2Snitch as the snitch and we also added 10.0.2.10 as a seed server.
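Once the user-data script has run, you may want to confirm which snitch ended up in the Cassandra config. A minimal sketch: endpoint_snitch is the cassandra.yaml setting that names the snitch, but the default path below is an assumption based on the /opt/cassandra install used in this series, and it assumes cassandra-cloud writes the setting into that file. The function name configured_snitch is made up.

```bash
#!/usr/bin/env bash
# Sketch: read the configured snitch from a cassandra.yaml file.
# The default path is an assumption; pass another path as the first argument.
configured_snitch() {
  yaml="${1:-/opt/cassandra/conf/cassandra.yaml}"
  # Print the value of the first uncommented endpoint_snitch line.
  awk -F': *' '/^endpoint_snitch:/ { print $2; exit }' "$yaml"
}

# Usage on a node (shown as a comment because it needs the file to exist):
#   configured_snitch
```

If the user-data above did its job, this would be expected to print Ec2Snitch.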
Now let's start four servers.
# Create Cassandra seed nodes in AWS AZ a and AWS AZ b.
bin/create-ec2-instance-cassandra.sh 10.0.1.10 a
bin/create-ec2-instance-cassandra.sh 10.0.2.10 b
# Create two more Cassandra database servers in AZ a and b.
bin/create-ec2-instance-cassandra.sh 10.0.1.11 a
bin/create-ec2-instance-cassandra.sh 10.0.2.11 b
We are creating two seed servers (10.0.1.10 and 10.0.2.10) in two different AZs. Try to use at least three AZs for real prod applications, with at least three servers per AZ. Try to have one seed node per AZ so that an AZ outage will not prevent you from adding nodes to your Cassandra Cluster.
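The one-seed-per-AZ rule is easy to check mechanically when each AZ has its own /24 subnet, as it does here. A minimal sketch under that assumption; the helper names subnets_of and subnets_without_seed are made up.

```bash
#!/usr/bin/env bash
# Sketch: find subnets (one per AZ in this tutorial) that host nodes but no seed.

# Print the distinct /24 prefixes of a comma-separated IP list, one per line.
subnets_of() {
  printf '%s\n' "$1" | tr ',' '\n' | cut -d. -f1-3 | sort -u
}

# Arguments: all node IPs, seed IPs (both comma separated).
# Prints each /24 prefix that has nodes but no seed node.
subnets_without_seed() {
  seed_nets=$(subnets_of "$2")
  for net in $(subnets_of "$1"); do
    printf '%s\n' "$seed_nets" | grep -qxF "$net" || echo "$net"
  done
}

# The four nodes and two seeds from this tutorial: prints nothing.
subnets_without_seed "10.0.1.10,10.0.1.11,10.0.2.10,10.0.2.11" "10.0.1.10,10.0.2.10"
```

No output means every subnet with nodes also has a seed; dropping 10.0.2.10 from the seed list would make it print 10.0.2.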
$ ansible cassandra.node0 -a "/opt/cassandra/bin/nodetool status"
cassandra.node0 | SUCCESS | rc=0 >>
Datacenter: us-west-2
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.0.1.10 88.08 KiB 32 45.9% d4a0d593-9dac-40c8-8d0d-9146a923e9fe 2a
UN 10.0.2.10 107.21 KiB 32 50.1% c8f0d0e8-b4de-44ce-b9a2-b8ecc1781484 2b
UN 10.0.2.11 95.09 KiB 32 43.7% c3dabbfd-b53d-48cc-a408-6cb931935535 2b
UN 10.0.1.11 81.01 KiB 32 60.3% 2c4638f6-4d1c-4c1c-82d1-df1efce116e0 2a
The above runs /opt/cassandra/bin/nodetool status on the Cassandra seed node for AZ A (us-west-2a).
Notice that the AWS region (us-west-2) becomes a Cassandra Datacenter. Also notice that there are two racks
corresponding to the two AZs that we launched our Cassandra nodes into, namely, 2a, and 2b, which correspond
to AWS AZs us-west-2a and us-west-2b. The Ec2Snitch maps the Cassandra Datacenters to AWS regions and Cassandra Racks to AWS AZs.
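The naming seen above can be reproduced with two shell parameter expansions. This is only a sketch that mirrors the two AZs used in this tutorial: the real mapping happens inside Cassandra's Ec2Snitch, and other regions or Cassandra versions may be named differently. The function name az_to_dc_rack is made up.

```bash
#!/usr/bin/env bash
# Sketch: derive the datacenter and rack names shown above from an AZ name.
az_to_dc_rack() {
  az="$1"
  dc="${az%?}"       # drop the trailing AZ letter: us-west-2a -> us-west-2
  rack="${az##*-}"   # keep the last dash-separated part: us-west-2a -> 2a
  echo "$dc $rack"
}

# On an instance, the AZ could be read from the metadata service, e.g.:
#   curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone

az_to_dc_rack us-west-2a   # prints: us-west-2 2a
az_to_dc_rack us-west-2b   # prints: us-west-2 2b
```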
Just to be sure, let's add 10.0.2.11 to the inventory.ini under cassandra-nodes and run the status check
on all of the servers using ansible.
10.0.1.11 | SUCCESS | rc=0 >>
Datacenter: us-west-2
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.0.1.10 88.08 KiB 32 45.9% d4a0d593-9dac-40c8-8d0d-9146a923e9fe 2a
UN 10.0.2.10 107.21 KiB 32 50.1% c8f0d0e8-b4de-44ce-b9a2-b8ecc1781484 2b
UN 10.0.2.11 95.09 KiB 32 43.7% c3dabbfd-b53d-48cc-a408-6cb931935535 2b
UN 10.0.1.11 81.01 KiB 32 60.3% 2c4638f6-4d1c-4c1c-82d1-df1efce116e0 2a
10.0.2.10 | SUCCESS | rc=0 >>
Datacenter: us-west-2
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.0.1.10 88.08 KiB 32 45.9% d4a0d593-9dac-40c8-8d0d-9146a923e9fe 2a
UN 10.0.2.10 107.21 KiB 32 50.1% c8f0d0e8-b4de-44ce-b9a2-b8ecc1781484 2b
UN 10.0.2.11 95.09 KiB 32 43.7% c3dabbfd-b53d-48cc-a408-6cb931935535 2b
UN 10.0.1.11 81.01 KiB 32 60.3% 2c4638f6-4d1c-4c1c-82d1-df1efce116e0 2a
cassandra.node0 | SUCCESS | rc=0 >>
Datacenter: us-west-2
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.0.1.10 88.08 KiB 32 45.9% d4a0d593-9dac-40c8-8d0d-9146a923e9fe 2a
UN 10.0.2.10 107.21 KiB 32 50.1% c8f0d0e8-b4de-44ce-b9a2-b8ecc1781484 2b
UN 10.0.2.11 95.09 KiB 32 43.7% c3dabbfd-b53d-48cc-a408-6cb931935535 2b
UN 10.0.1.11 81.01 KiB 32 60.3% 2c4638f6-4d1c-4c1c-82d1-df1efce116e0 2a
10.0.2.11 | SUCCESS | rc=0 >>
Datacenter: us-west-2
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.0.1.10 88.08 KiB 32 45.9% d4a0d593-9dac-40c8-8d0d-9146a923e9fe 2a
UN 10.0.2.10 107.21 KiB 32 50.1% c8f0d0e8-b4de-44ce-b9a2-b8ecc1781484 2b
UN 10.0.2.11 95.09 KiB 32 43.7% c3dabbfd-b53d-48cc-a408-6cb931935535 2b
UN 10.0.1.11 81.01 KiB 32 60.3% 2c4638f6-4d1c-4c1c-82d1-df1efce116e0 2a
Looks like they all agree on the status of the Cassandra Cluster. We can see that each Cassandra Database node is up and in the normal state.
Cloudurable™: streamline DevOps for Cassandra running on AWS provides AMIs, CloudWatch Monitoring, CloudFormation templates and monitoring tools to support Cassandra in production running in EC2. We also teach advanced Cassandra courses which teaches how one could develop, support and deploy Cassandra to production in AWS EC2 for Developers and DevOps.
Please take some time to read the Advantage of using Cloudurable™.
Cloudurable provides:
- Subscription Amazon Cassandra support to streamline DevOps (Support subscription pricing for Cassandra and Kafka in AWS)
- Cassandra Course
- Cassandra Consulting: Quick Start
- Cassandra Consulting: Architecture Analysis
- Cassandra Cluster Tutorial 1 - Vagrant Cassandra Cluster, DevOps testing
- Cassandra Cluster Tutorial 2 - SSL Security for Cassandra Cluster
- Cassandra Cluster Tutorial 3 - Ansible, DevOps, SSH config for Cassandra Cluster
- Cassandra Cluster Tutorial 4 - AWS, Packer, DevOps Ansible, SSH config for Cassandra Cluster
- Cassandra Cluster Tutorial 5 - AWS, VPC, Subnets, NACL, CloudFormation, DevOps Ansible Playbook for Cassandra Cluster
- Cassandra Cluster Tutorial 6 - AWS, multi-AZ, EC2Snitch
- Cassandra Cluster Tutorial 7 - AWS, multi-region, Ec2MultiRegionSnitch
Kafka training, Kafka consulting, Cassandra training, Cassandra consulting, Spark training, Spark consulting