Cassandra Tutorial 7: Setting up Cassandra Cluster in EC2 Part 2 Multi Region with Ec2MultiRegionSnitch

jeanpaulazara edited this page Mar 13, 2017 · 25 revisions

In the previous Cassandra Cluster tutorial, we set up a Cassandra cluster that works in two AZs. In this Cassandra Cluster tutorial, we will pick up where the last one left off and connect two regions. You might recall that with the Cassandra Database Ec2Snitch, regions are datacenters and racks are Amazon Availability Zones (AZs). Now it is time to have more than one datacenter, i.e., it is time to run our Cassandra cluster in more than one AWS region.

The Cassandra Database Ec2MultiRegionSnitch uses the public IP designated in the broadcast_address of the Cassandra YAML config. Specifying the broadcast_address allows the Cassandra Cluster to connect to another AWS region, which acts as a second datacenter.

As before, we set the listen_address in cassandra.yaml to the private IP address of the Cassandra node. Now we also set the broadcast_address to the public IP address of the Cassandra node EC2 instance.

This allows the Cassandra Cluster nodes in one AWS region to connect to nodes in another AWS region (Cassandra datacenter). For traffic in the same AWS region/Cassandra datacenter, Cassandra switches to the private IP after establishing a connection.

Just like in the last tutorial we will use the cassandra-cloud project to generate the cassandra.yaml config. We can get the broadcast_address with curl http://169.254.169.254/latest/meta-data/public-ipv4.

We have to ensure that the storage_port or ssl_storage_port (Cassandra yaml) is routable and open. This means for this Cassandra Cluster tutorial that we will need to launch our Cassandra clusters in the public subnet (unless you have a VPC gateway setup).

Each node will have two addresses in a multi-datacenter/multi-region installation: a private IP and a public IP. Our aim is to deploy our Cassandra cluster across multiple Amazon EC2 regions using the Ec2MultiRegionSnitch. We need to follow these configuration steps (cassandra.yaml):

  • Set listen_address to Cassandra node's private IP or hostname
  • Set broadcast_address to the Cassandra node's public IP or VPN routable IP for communication between AWS regions
  • Set listen_on_broadcast_address to true
  • If the Cassandra node is a seed node, add the node's public IP address or hostname to the seeds list
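
The settings in the list above can be sketched as a small shell function that prints the matching cassandra.yaml fragment. This is only an illustration and not what cassandra-cloud generates: the function name render_multi_region_yaml and the sample addresses are assumptions made for this example, and the endpoint_snitch and seed_provider lines are added so the fragment reads as a whole.

```shell
# Print the cassandra.yaml settings for one node in a multi-region cluster.
# Arguments: private IP, public IP, comma-separated seed list.
render_multi_region_yaml() {
  local private_ip public_ip seeds
  private_ip=$1
  public_ip=$2
  seeds=$3
  cat <<EOF
endpoint_snitch: Ec2MultiRegionSnitch
listen_address: ${private_ip}
broadcast_address: ${public_ip}
listen_on_broadcast_address: true
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "${seeds}"
EOF
}

# Example with documentation-range public addresses (hypothetical values).
render_multi_region_yaml 10.0.1.10 203.0.113.10 "203.0.113.10,198.51.100.10"
```

On an EC2 instance the two addresses would come from the instance metadata paths used in this tutorial (local-ipv4 for the private IP, public-ipv4 for the public IP).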

AWS CloudFormation updates: Relaunch Cassandra Cluster Nodes in public subnet

I am not a fan of this, but unless we are using a private AWS VPN Customer gateway, we will need to launch our nodes into public subnets. This means we will need to lock down access to the Cassandra Cluster's AWS Security Group (AWS firewall). We could also lock down the Network ACL, but will leave that as an exercise for the reader.
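
One way to lock the security group down is to allow the inter-node storage ports only from the public IPs of the nodes in the other region instead of from 0.0.0.0/0. The sketch below only prints the aws ec2 authorize-security-group-ingress commands so that they can be reviewed before anything is changed. The function name, the security group id and the peer addresses are placeholders invented for this example.

```shell
# Print (do not run) one ingress rule per peer node for the storage ports
# 7000-7001, restricted to that peer's public IP as a /32.
print_storage_port_rules() {
  local region group_id peer_ip
  region=$1
  group_id=$2
  shift 2
  for peer_ip in "$@"; do
    echo "aws --region ${region} ec2 authorize-security-group-ingress" \
         "--group-id ${group_id} --protocol tcp --port 7000-7001" \
         "--cidr ${peer_ip}/32"
  done
}

# Hypothetical security group id and peer public IPs (documentation range).
print_storage_port_rules us-west-2 sg-EXAMPLE 203.0.113.10 203.0.113.11
```

Printing first keeps the sketch safe to run anywhere; piping the output to a shell would apply the rules.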

Also, calling the cluster subnets subnetPrivate1 and subnetPrivate2 no longer makes sense, so let's rename them to subnetCluster1 and subnetCluster2. We don't need a NatGateway any longer since the Cassandra nodes will live in public subnets (even saying this hurts me; see the note below). We will also need to create a new route table for the clusters and route to the InternetGateway via this new route table.

Note: If you did not get this already, the only time I would use Ec2MultiRegionSnitch is if you are first using a Customer VPN Gateway (Amazon VPC VPN to an AWS Customer Gateway).

Another change to the Cassandra Cluster CloudFormation is that we can no longer hard-code AWS regions and AWS Availability Zones. Instead we have to use this crazy CloudFormation syntax "AvailabilityZone": { "Fn::Join" : [ "", [ { "Ref": "AWS::Region"}, "a"] ] } to look up the current region and form an AZ string. Fn::Join joins a list of strings with a delimiter (here the empty string). Ref looks up references to things like the current region or actual AWS ids.
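
To make the Fn::Join expression less mysterious, here is the same string operation written as a shell function. It is only an analogy for what CloudFormation evaluates; the function name az_name is made up for this example.

```shell
# Mirror of { "Fn::Join": [ "", [ { "Ref": "AWS::Region" }, "a" ] ] }:
# join the region and the zone letter with an empty delimiter.
az_name() {
  local region zone_letter
  region=$1
  zone_letter=$2
  printf '%s%s\n' "$region" "$zone_letter"
}

az_name us-east-2 a   # the AZ string for subnetCluster1 when the stack runs in Ohio
az_name us-east-2 b   # the AZ string for subnetCluster2 when the stack runs in Ohio
```

The approach assumes that the target region offers zones with the letters a and b to your account.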

vpc.json - CloudFormation

{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "Setup VPC for Cassandra Cluster for Cassandra Database",

  "Outputs": {
    "subnetPublicOut": {
      "Description": "Subnet Public Id",
      "Value": {
        "Ref": "subnetPublic"
      },
      "Export": {
        "Name": {
          "Fn::Sub": "${AWS::StackName}-subnetPublic"
        }
      }
    },
    "subnetCluster1Out": {
      "Description": "Subnet Cluster 1 Id",
      "Value": {
        "Ref": "subnetCluster1"
      },
      "Export": {
        "Name": {
          "Fn::Sub": "${AWS::StackName}-subnetCluster1"
        }
      }
    },
    "subnetCluster2Out": {
      "Description": "Subnet Cluster 2 Id",
      "Value": {
        "Ref": "subnetCluster2"
      },
      "Export": {
        "Name": {
          "Fn::Sub": "${AWS::StackName}-subnetCluster2"
        }
      }
    },
    "securityGroupBastionOut": {
      "Description": "Security Group Bastion for managing Cassandra Cluster Nodes with Ansible",
      "Value": {
        "Ref": "securityGroupBastion"
      },
      "Export": {
        "Name": {
          "Fn::Sub": "${AWS::StackName}-securityGroupBastion"
        }
      }
    },
    "securityGroupCassandraNodesOut": {
      "Description": "Cassandra Database Node security group for Cassandra Cluster",
      "Value": {
        "Ref": "securityGroupCassandraNodes"
      },
      "Export": {
        "Name": {
          "Fn::Sub": "${AWS::StackName}-securityGroupCassandraNodes"
        }
      }
    }
  },
  "Resources": {
    "vpcMain": {
      "Type": "AWS::EC2::VPC",
      "Properties": {
        "CidrBlock": "10.0.0.0/16",
        "InstanceTenancy": "default",
        "EnableDnsSupport": "true",
        "EnableDnsHostnames": "true",
        "Tags": [
          {
            "Key": "cloudgen",
            "Value": "cassandra-test"
          },
          {
            "Key": "Name",
            "Value": "CassandraTestCluster"
          }
        ]
      }
    },
    "subnetPublic": {
      "Type": "AWS::EC2::Subnet",
      "Properties": {
        "CidrBlock": "10.0.0.0/24",
        "AvailabilityZone": { "Fn::Join" : [ "", [ { "Ref": "AWS::Region"}, "a"] ] },
        "VpcId": {
          "Ref": "vpcMain"
        },
        "Tags": [
          {
            "Key": "cloudgen",
            "Value": "cassandra-test"
          },
          {
            "Key": "Name",
            "Value": "Public subnet"
          }
        ]
      }
    },
    "subnetCluster1": {
      "Type": "AWS::EC2::Subnet",
      "Properties": {
        "CidrBlock": "10.0.1.0/24",
        "AvailabilityZone": { "Fn::Join" : [ "", [ { "Ref": "AWS::Region"}, "a"] ] },
        "VpcId": {
          "Ref": "vpcMain"
        },
        "Tags": [
          {
            "Key": "cloudgen",
            "Value": "cassandra-test"
          },
          {
            "Key": "Name",
            "Value": "Cluster subnet1"
          }
        ]
      }
    },
    "subnetCluster2": {
      "Type": "AWS::EC2::Subnet",
      "Properties": {
        "CidrBlock": "10.0.2.0/24",
        "AvailabilityZone": { "Fn::Join" : [ "", [ { "Ref": "AWS::Region"}, "b"] ] },
        "VpcId": {
          "Ref": "vpcMain"
        },
        "Tags": [
          {
            "Key": "cloudgen",
            "Value": "cassandra-test"
          },
          {
            "Key": "Name",
            "Value": "Cluster subnet2"
          }
        ]
      }
    },
    "internetGateway": {
      "Type": "AWS::EC2::InternetGateway",
      "Properties": {
      }
    },
    "dhcpOptions": {
      "Type": "AWS::EC2::DHCPOptions",
      "Properties": {
        "DomainName": { "Fn::Join" : [ "", [ { "Ref": "AWS::Region"}, ".compute.internal"] ] },
        "DomainNameServers": [
          "AmazonProvidedDNS"
        ]
      }
    },
    "networkACL": {
      "Type": "AWS::EC2::NetworkAcl",
      "Properties": {
        "VpcId": {
          "Ref": "vpcMain"
        },
        "Tags": [
          {
            "Key": "cloudgen",
            "Value": "cassandra-test"
          },
          {
            "Key": "Name",
            "Value": "CassandraTestNACL"
          }
        ]
      }
    },
    "routeTableMain": {
      "Type": "AWS::EC2::RouteTable",
      "Properties": {
        "VpcId": {
          "Ref": "vpcMain"
        },
        "Tags": [
          {
            "Key": "cloudgen",
            "Value": "cassandra-test"
          }
        ]
      }
    },
    "routeTablePublic": {
      "Type": "AWS::EC2::RouteTable",
      "Properties": {
        "VpcId": {
          "Ref": "vpcMain"
        },
        "Tags": [
          {
            "Key": "cloudgen",
            "Value": "cassandra-test"
          }
        ]
      }
    },
    "routeTableCluster": {
      "Type": "AWS::EC2::RouteTable",
      "Properties": {
        "VpcId": {
          "Ref": "vpcMain"
        },
        "Tags": [
          {
            "Key": "cloudgen",
            "Value": "cassandra-test"
          }
        ]
      }
    },
    "securityGroupDefault": {
      "Type": "AWS::EC2::SecurityGroup",
      "Properties": {
        "GroupDescription": "default VPC security group",
        "VpcId": {
          "Ref": "vpcMain"
        },
        "Tags": [
          {
            "Key": "cloudgen",
            "Value": "cassandra-test"
          },
          {
            "Key": "Name",
            "Value": "CassandraTestSG"
          }
        ]
      }
    },
    "aclEntryAllowAllEgress": {
      "Type": "AWS::EC2::NetworkAclEntry",
      "Properties": {
        "CidrBlock": "0.0.0.0/0",
        "Egress": "true",
        "Protocol": "-1",
        "RuleAction": "allow",
        "RuleNumber": "100",
        "NetworkAclId": {
          "Ref": "networkACL"
        }
      }
    },
    "aclEntryAllowAllIngress": {
      "Type": "AWS::EC2::NetworkAclEntry",
      "Properties": {
        "CidrBlock": "0.0.0.0/0",
        "Protocol": "-1",
        "RuleAction": "allow",
        "RuleNumber": "100",
        "NetworkAclId": {
          "Ref": "networkACL"
        }
      }
    },
    "subnetAclAssociationPublic": {
      "Type": "AWS::EC2::SubnetNetworkAclAssociation",
      "Properties": {
        "NetworkAclId": {
          "Ref": "networkACL"
        },
        "SubnetId": {
          "Ref": "subnetPublic"
        }
      }
    },
    "subnetCluster1AclAssociation": {
      "Type": "AWS::EC2::SubnetNetworkAclAssociation",
      "Properties": {
        "NetworkAclId": {
          "Ref": "networkACL"
        },
        "SubnetId": {
          "Ref": "subnetCluster1"
        }
      }
    },
    "subnetCluster2AclAssociation": {
      "Type": "AWS::EC2::SubnetNetworkAclAssociation",
      "Properties": {
        "NetworkAclId": {
          "Ref": "networkACL"
        },
        "SubnetId": {
          "Ref": "subnetCluster2"
        }
      }
    },
    "vpcGatewayAttachment": {
      "Type": "AWS::EC2::VPCGatewayAttachment",
      "Properties": {
        "VpcId": {
          "Ref": "vpcMain"
        },
        "InternetGatewayId": {
          "Ref": "internetGateway"
        }
      }
    },
    "subnetRouteTableAssociationPublic": {
      "Type": "AWS::EC2::SubnetRouteTableAssociation",
      "Properties": {
        "RouteTableId": {
          "Ref": "routeTablePublic"
        },
        "SubnetId": {
          "Ref": "subnetPublic"
        }
      }
    },
    "subnetCluster1RouteTableAssociation": {
      "Type": "AWS::EC2::SubnetRouteTableAssociation",
      "Properties": {
        "RouteTableId": {
          "Ref": "routeTableCluster"
        },
        "SubnetId": {
          "Ref": "subnetCluster1"
        }
      }
    },
    "subnetCluster2RouteTableAssociation": {
      "Type": "AWS::EC2::SubnetRouteTableAssociation",
      "Properties": {
        "RouteTableId": {
          "Ref": "routeTableCluster"
        },
        "SubnetId": {
          "Ref": "subnetCluster2"
        }
      }
    },
    "routePublicSubnetToInternetGateway": {
      "Type": "AWS::EC2::Route",
      "Properties": {
        "DestinationCidrBlock": "0.0.0.0/0",
        "RouteTableId": {
          "Ref": "routeTablePublic"
        },
        "GatewayId": {
          "Ref": "internetGateway"
        }
      },
      "DependsOn": "vpcGatewayAttachment"
    },
    "routeClusterToInternetGateway": {
      "Type": "AWS::EC2::Route",
      "Properties": {
        "DestinationCidrBlock": "0.0.0.0/0",
        "RouteTableId": {
          "Ref": "routeTableCluster"
        },
        "GatewayId": {
          "Ref": "internetGateway"
        }
      },
      "DependsOn": "vpcGatewayAttachment"
    },
    "vpcDHCPOptionsAssociation": {
      "Type": "AWS::EC2::VPCDHCPOptionsAssociation",
      "Properties": {
        "VpcId": {
          "Ref": "vpcMain"
        },
        "DhcpOptionsId": {
          "Ref": "dhcpOptions"
        }
      }
    },
    "securityGroupIngressDefault": {
      "Type": "AWS::EC2::SecurityGroupIngress",
      "Properties": {
        "GroupId": {
          "Ref": "securityGroupDefault"
        },
        "IpProtocol": "-1",
        "SourceSecurityGroupId": {
          "Ref": "securityGroupDefault"
        }
      }
    },
    "securityGroupEgressDefault": {
      "Type": "AWS::EC2::SecurityGroupEgress",
      "Properties": {
        "GroupId": {
          "Ref": "securityGroupDefault"
        },
        "IpProtocol": "-1",
        "CidrIp": "0.0.0.0/0"
      }
    },
    "securityGroupBastion": {
      "Type": "AWS::EC2::SecurityGroup",
      "Properties": {
        "GroupDescription": "Security group for bastion server.",
        "VpcId": {
          "Ref": "vpcMain"
        },
        "SecurityGroupIngress": [
          {
            "IpProtocol": "tcp",
            "FromPort": "22",
            "ToPort": "22",
            "CidrIp": "0.0.0.0/0"
          }
        ],
        "SecurityGroupEgress": [
          {
            "IpProtocol": "-1",
            "CidrIp": "0.0.0.0/0"
          }
        ],
        "Tags": [
          {
            "Key": "Name",
            "Value": "bastionSecurityGroup"
          },
          {
            "Key": "cloudgen",
            "Value": "cassandra-test"
          }
        ]
      }
    },
    "securityGroupCassandraNodes": {
      "Type": "AWS::EC2::SecurityGroup",
      "Properties": {
        "GroupDescription": "Security group for Cassandra Database nodes in Cassandra Cluster",
        "VpcId": {
          "Ref": "vpcMain"
        },
        "SecurityGroupIngress": [
          {
            "IpProtocol": "-1",
            "CidrIp": "10.0.0.0/8"
          },
          {
            "IpProtocol": "TCP",
            "CidrIp": "0.0.0.0/0",
            "FromPort": "7000", "ToPort":"7001"
          },
          {
            "IpProtocol": "TCP",
            "CidrIp": "0.0.0.0/0",
            "FromPort": "9042", "ToPort":"9042"
          }
        ],
        "SecurityGroupEgress": [
          {
            "IpProtocol": "-1",
            "CidrIp": "0.0.0.0/0"
          }
        ],
        "Tags": [
          {
            "Key": "Name",
            "Value": "cassandraSecurityGroup"
          },
          {
            "Key": "cloudgen",
            "Value": "cassandra-test"
          }
        ]
      }
    }
  }
}

We need to be able to pass the region when launching the CloudFormation script.

bin/run-vpc-cloudformation.sh - Changing the launch script to take AWS Region

#!/usr/bin/env bash
set -e
source bin/ec2-env.sh

# Set aws-region
if [ -z "$1" ]
    then
        AWS_REGION=${REGION}
    else
        AWS_REGION=$1
fi

aws --region ${REGION} s3 cp cloud-formation/vpc.json s3://$CLOUD_FORMER_S3_BUCKET
aws --region ${AWS_REGION} cloudformation create-stack --stack-name ${ENV}-vpc-cassandra \
--template-url "https://s3-us-west-2.amazonaws.com/$CLOUD_FORMER_S3_BUCKET/vpc.json" 

Notice that the AWS region (AWS_REGION) can be passed as the first argument.
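
The "first argument or default" logic can also be written with shell parameter expansion. The sketch below is an alternative to the if/else block, not the script from the repository; the function name resolve_region and the default value assigned to REGION are assumptions for this example.

```shell
# Resolve the target region: use the requested one if given,
# otherwise fall back to the default.
resolve_region() {
  local requested default_region
  requested=$1
  default_region=$2
  echo "${requested:-$default_region}"
}

REGION=us-west-2    # assumed default; the real value comes from bin/ec2-env.sh
AWS_REGION=$(resolve_region "${1:-}" "$REGION")
echo "Using region ${AWS_REGION}"
```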

To run it against Ohio do the following.

Running bin/run-vpc-cloudformation.sh us-east-2

$ bin/run-vpc-cloudformation.sh   us-east-2
upload: cloud-formation/vpc.json to s3://cloudurable-cloudformer-templates/vpc.json
{
    "StackId": "arn:aws:cloudformation:us-east-2:821683928919:stack/dev-vpc-cassandra/93f8d830-039c-11e7-9dce-503f3136a135"
}

This script/CloudFormation runs everywhere but in Virginia. If you can figure out why, drop us a note; we are hiring.

Delete the earlier Cassandra Cluster CloudFormation

Terminate all instances (bastion and cassandra), and delete the other Cassandra Cluster CloudFormation since we once again did a major refactor of naming.

To run the Cassandra Cluster CloudFormation after the delete is complete, use bin/run-vpc-cloudformation.sh us-west-2.

Using the output from the CloudFormation, modify the bin/ec2-env.sh file to use the new subnets, and security groups. Then start up the instances for Cassandra and the bastion.

With the original cluster in DC us-west-2

$ bin/create-ec2-instance-bastion.sh 
$ bin/create-ec2-instance-cassandra.sh 10.0.1.10 a
$ bin/create-ec2-instance-cassandra.sh 10.0.2.10 b
$ bin/create-ec2-instance-cassandra.sh 10.0.1.11 a
$ bin/create-ec2-instance-cassandra.sh 10.0.2.11 b

Verify connectivity and status of Cassandra Cluster with ansible

Let's ensure the bastion and servers are up.

Ping Cassandra Cluster nodes with ansible

$ ansible cassandra-nodes -m ping
10.0.2.11 | SUCCESS => {
    "changed": false, 
    "ping": "pong"
}
cassandra.node0 | SUCCESS => {
    "changed": false, 
    "ping": "pong"
}
10.0.1.11 | SUCCESS => {
    "changed": false, 
    "ping": "pong"
}
10.0.2.10 | SUCCESS => {
    "changed": false, 
    "ping": "pong"
}

Check node health.

$ ansible cassandra.node0 -a "/opt/cassandra/bin/nodetool status"
cassandra.node0 | SUCCESS | rc=0 >>
Datacenter: us-west-2
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens       Owns (effective)  Host ID                               Rack
UN  10.0.2.10  102.25 KiB  32           58.8%             662f8f8d-5f04-4c96-977a-f96d4d03c863  2b
UN  10.0.1.10  88.06 KiB   32           56.8%             9a1d5f8d-ffa8-4710-9d9d-999606d24474  2a
UN  10.0.2.11  119.25 KiB  32           40.0%             a622a2b7-1519-46f8-8589-536f782688d3  2b
UN  10.0.1.11  95.06 KiB   32           44.4%             b6dc5ac7-d96d-4707-8790-ed66a010b67c  2a
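
Reading the status table by eye works for four nodes, but a small filter helps once there are more. The sketch below counts nodes whose status/state column is anything other than UN (Up/Normal). The function name count_unhealthy_nodes is an assumption for this example, and the sample input uses shortened host ids.

```shell
# Count nodes whose status/state column is not "UN" (Up/Normal).
# Reads `nodetool status` output on stdin.
count_unhealthy_nodes() {
  awk '$1 ~ /^[UD][NLJM]$/ && $1 != "UN" { bad++ } END { print bad + 0 }'
}

# Sample input shaped like the output above, with one node marked down.
count_unhealthy_nodes <<'EOF'
--  Address    Load        Tokens  Owns (effective)  Host ID   Rack
UN  10.0.2.10  102.25 KiB  32      58.8%             662f8f8d  2b
DN  10.0.1.10  88.06 KiB   32      56.8%             9a1d5f8d  2a
EOF
```

It could be fed from the command above, for example ansible cassandra.node0 -a "/opt/cassandra/bin/nodetool status" | count_unhealthy_nodes.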

Running servers in the other Region

In our case the other region is Ohio (us-east-2) as our first region was Oregon (us-west-2). In order to create these other instances, we need to have an image in the other region. We can use Packer to recreate the image in the other region like the following.

Invoke Packer to recreate the Cassandra Database EC2 AMI in us-east-2

$ packer build --var-file=packer-vars-us-east-2.json packer-ec2.json

This will recreate the image in the new region. https://www.packer.io/docs/templates/user-variables.html

Alternatively, you can copy the AMI with aws ec2 copy-image.

http://docs.aws.amazon.com/cli/latest/reference/ec2/copy-image.html

aws ec2 copy-image

$ aws --region us-east-2 ec2 copy-image --source-region us-west-2 \
      --source-image-id ami-6db3310d --name CassandraClusterAMI

The aws ec2 copy-image seems a lot faster than recreating the image, and it ensures that you have the same image/AMI running in both regions. Exactly the same!

Add ability to pass EC2 Region as a param to all scripts

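Up until now, we have only been supporting one region. We need our bash scripts to work for multiple regions. Subnet, security group and AMI ids differ per region, so we move them into one environment file per region and load the file that matches AWS_REGION.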

bin/ec2-env-region.sh - Region env variables

#!/bin/bash
set -e

source "bin/ec2-env-region-${AWS_REGION}.sh"

Then we define two files bin/ec2-env-region-us-west-2.sh and bin/ec2-env-region-us-east-2.sh as follows:

bin/ec2-env-region-us-west-2.sh

#!/bin/bash
set -e

export SUBNET_PUBLIC=subnet-2bf57abc
export SUBNET_PRIVATE1=subnet-28f57abc
export SUBNET_PRIVATE2=subnet-a1ef2abc
export BASTION_AMI=ami-fa6d4abc
export BASTION_SECURITY_GROUP=sg-78cc5abc
export CASSANDRA_AMI=ami-fa6d4abc
export CASSANDRA_SECURITY_GROUP=sg-66cc5abc
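
If a region has no environment file yet, the plain source line fails with a terse "No such file" error. A slightly more defensive loader could look like the sketch below. The function name load_region_env and the optional directory argument are assumptions for this example and are not part of the project scripts.

```shell
# Source <dir>/ec2-env-region-<region>.sh, failing with a clear message
# when no file exists for the requested region. <dir> defaults to bin.
load_region_env() {
  local region env_dir env_file
  region=$1
  env_dir=${2:-bin}
  env_file="${env_dir}/ec2-env-region-${region}.sh"
  if [ ! -f "$env_file" ]; then
    echo "No environment file for region '${region}': ${env_file}" >&2
    return 1
  fi
  . "$env_file"
}
```

A script would call it as load_region_env "$AWS_REGION" right after the region has been resolved.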

Then we change all of our scripts to take an extra region command-line argument. For brevity we will show just one; feel free to read the source, as the scripts are small.

bin/create-ec2-instance-bastion.sh

#!/bin/bash
set -e

source bin/ec2-env.sh

if [ "$1" == "-h" ]
    then
        echo "Usage:"
        echo "    bin/create-ec2-instance-bastion.sh <AWS_REGION|Optional>"
        echo "Example:"
        echo "    $ bin/create-ec2-instance-bastion.sh us-east-2"
        exit 0
fi

# Set aws-region
if [ -z "$1" ]
    then
        export AWS_REGION=${REGION}
    else
        export AWS_REGION=$1
fi

source bin/ec2-env-region.sh

BASTION_EC2_INSTANCE_NAME="bastion.${ENV}.${AWS_REGION}"
BASTION_DNS_NAME="bastion.${ENV}.${AWS_REGION}.cloudurable.com."

instance_id=$(aws --region ${AWS_REGION} ec2 run-instances --image-id "$BASTION_AMI" \
 --subnet-id  "$SUBNET_PUBLIC" \
 --instance-type "$BASTION_NODE_SIZE" --iam-instance-profile "Name=$CASSANDRA_IAM_PROFILE" \
 --associate-public-ip-address --security-group-ids "$BASTION_SECURITY_GROUP" | jq --raw-output '.Instances[].InstanceId')

echo "bastion ${instance_id} is being created"

aws --region ${AWS_REGION} ec2 wait instance-exists --instance-ids "$instance_id"

echo "Tagging instance"
aws --region ${AWS_REGION} ec2 create-tags --resources "${instance_id}" --tags Key=Name,Value="${BASTION_EC2_INSTANCE_NAME}" \
Key=Role,Value="Bastion" Key=Env,Value="DEV"

echo "${instance_id} was tagged waiting to login"

aws --region ${AWS_REGION} ec2 wait instance-status-ok --instance-ids "$instance_id"

echo "Associate DNS"
bin/associate-route53-DNS-with-IP.sh ${BASTION_EC2_INSTANCE_NAME} ${BASTION_DNS_NAME} ${AWS_REGION}

The bin/create-ec2-instance-cassandra.sh script can also load an AWS user-data file per region by employing --user-data "file://resources/user-data/cassandra-${AWS_REGION}" as follows.

bin/create-ec2-instance-cassandra.sh - load user-data per region

instance_id=$(aws --region ${AWS_REGION} ec2 run-instances --image-id "$CASSANDRA_AMI" \
 --subnet-id  "$SUBNET_PRIVATE" \
 --instance-type ${CASSANDRA_NODE_SIZE} --private-ip-address ${PRIVATE_IP_ADDRESS}  \
 --iam-instance-profile "Name=$CASSANDRA_IAM_PROFILE" \
 --security-group-ids "$CASSANDRA_SECURITY_GROUP" \
 --user-data "file://resources/user-data/cassandra-${AWS_REGION}" | jq --raw-output '.Instances[].InstanceId')

Then we define a user-data file per region.

resources/user-data/cassandra-us-west-2

#!/bin/bash
set -e

export BIND_IP=`curl http://169.254.169.254/latest/meta-data/local-ipv4`

/opt/cloudurable/bin/cassandra-cloud -cluster-name test \
                -client-address  ${BIND_IP} \
                -cluster-address  ${BIND_IP} \
                -cluster-seeds 10.0.1.10,10.0.2.10 \
                -snitch Ec2Snitch


/bin/systemctl restart  cassandra


resources/user-data/cassandra-us-east-2

#!/bin/bash
set -e

export BIND_IP=`curl http://169.254.169.254/latest/meta-data/local-ipv4`

/opt/cloudurable/bin/cassandra-cloud -cluster-name test \
                -client-address  ${BIND_IP} \
                -cluster-address  ${BIND_IP} \
                -cluster-seeds 10.0.1.10,10.0.2.10 \
                -snitch Ec2Snitch


/bin/systemctl restart  cassandra

Launch and test servers in new Region

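With the AMI available in the new region and the scripts taking a region argument, we can launch the Cassandra nodes and a bastion in us-east-2.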

Launch servers in new region

bin/create-ec2-instance-cassandra.sh 10.0.1.10 a us-east-2
bin/create-ec2-instance-cassandra.sh 10.0.2.10 b us-east-2
bin/create-ec2-instance-cassandra.sh 10.0.1.11 a us-east-2
bin/create-ec2-instance-cassandra.sh 10.0.2.11 b us-east-2

Launch bastion in new region

bin/create-ec2-instance-bastion.sh us-east-2

Verify connectivity ansible for new Cassandra Datacenter in new AWS Region

Let's ensure the bastion and servers are up.

For now modify bastion in ssh config

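Both VPCs currently use the same private addresses (10.0.x.x), so the same node IPs exist in both regions. To reach the nodes in the new region, the bastion entry has to point at the bastion in us-east-2.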

ssh config ~/.ssh/config or ${project_dir}/ssh/ssh.config

Host bastion
  Hostname bastion.dev.us-east-2.cloudurable.com
  ForwardAgent yes
  IdentityFile ~/.ssh/test_rsa
  User ansible

Ping Cassandra Datacenter nodes with ansible in new Region

$ ansible cassandra-nodes -m ping
10.0.2.11 | SUCCESS => {
    "changed": false, 
    "ping": "pong"
}
cassandra.node0 | SUCCESS => {
    "changed": false, 
    "ping": "pong"
}
10.0.1.11 | SUCCESS => {
    "changed": false, 
    "ping": "pong"
}
10.0.2.10 | SUCCESS => {
    "changed": false, 
    "ping": "pong"
}

Check node health.

$ ansible cassandra.node0 -a "/opt/cassandra/bin/nodetool status"
cassandra.node0 | SUCCESS | rc=0 >>
Datacenter: us-east-2
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens       Owns (effective)  Host ID                               Rack
UN  10.0.2.10  99.44 KiB   32           47.9%             b341b653-0f34-4262-b5a1-eb2063b3351e  2b
UN  10.0.1.10  104.41 KiB  32           58.0%             d6d7fef7-3711-4133-9844-bd329fa13218  2a
UN  10.0.1.11  100.07 KiB  32           44.3%             64f2f3cd-6361-4fc2-99e5-bff455bda403  2a
UN  10.0.2.11  95.09 KiB   32           49.8%             4f77bad6-ee73-4baf-9ddd-e6f75ed09332  2b

Notice the datacenter is us-east-2.

The problem is that we can only address the nodes in one datacenter at a time, either datacenter:us-west-2 or datacenter:us-east-2, because our CIDR addresses overlap: both VPCs use 10.0.0.0/16, so an address like 10.0.1.10 exists in both regions. What we need is a new CIDR addressing scheme that gives each datacenter its own range.
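
Whether two CIDR blocks collide can be checked with plain shell arithmetic: convert each block to its first and last address as integers and compare the ranges. The sketch below does that for IPv4 only. All three function names are made up for this example, and the input is assumed to be well formed (a.b.c.d/bits).

```shell
# Convert a dotted IPv4 address to an integer.
ip_to_int() {
  local ip a b c d
  ip=$1
  a=${ip%%.*}; ip=${ip#*.}
  b=${ip%%.*}; ip=${ip#*.}
  c=${ip%%.*}; d=${ip#*.}
  echo "$(( ($a << 24) | ($b << 16) | ($c << 8) | $d ))"
}

# Print "first last" (as integers) for a CIDR block such as 10.1.0.0/16.
cidr_range() {
  local base bits mask first last
  base=$(ip_to_int "${1%/*}")
  bits=${1#*/}
  mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
  first=$(( $base & $mask ))
  last=$(( $first | ($mask ^ 0xFFFFFFFF) ))
  echo "$first $last"
}

# Succeed (exit status 0) when the two CIDR blocks share any address.
cidrs_overlap() {
  local r1 r2 first1 last1 first2 last2
  r1=$(cidr_range "$1")
  r2=$(cidr_range "$2")
  first1=${r1% *}; last1=${r1#* }
  first2=${r2% *}; last2=${r2#* }
  [ "$first1" -le "$last2" ] && [ "$first2" -le "$last1" ]
}

if cidrs_overlap 10.0.0.0/16 10.0.0.0/16; then
  echo "10.0.0.0/16 in both regions: overlap"
fi
if ! cidrs_overlap 10.1.0.0/16 10.2.0.0/16; then
  echo "10.1.0.0/16 and 10.2.0.0/16: no overlap"
fi
```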

Modify CloudFormation script to pass CIDR address

We set up the CIDR addresses as follows.

Datacenter 1 VPC (us-west-2)
VPC CIDR:      10.1.0.0/16

Public:
Subnet CIDR:   10.1.0.0/24

Clusters:
Subnet 1 CIDR: 10.1.1.0/24
Subnet 2 CIDR: 10.1.2.0/24

Datacenter 2 VPC (us-east-2)
VPC CIDR:      10.2.0.0/16

Public:
Subnet CIDR:   10.2.0.0/24

Clusters:
Subnet 1 CIDR: 10.2.1.0/24
Subnet 2 CIDR: 10.2.2.0/24
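
The scheme follows a simple pattern: datacenter N owns 10.N.0.0/16, with the public subnet at 10.N.0.0/24 and the two cluster subnets at 10.N.1.0/24 and 10.N.2.0/24. As a sketch, the values for any datacenter number can be derived like this. The function name cidrs_for_datacenter is an assumption; the variable names follow the per-region environment files used in this tutorial.

```shell
# Print the VPC and subnet CIDRs for datacenter number N (10.N.x.x).
cidrs_for_datacenter() {
  local n
  n=$1
  echo "SUBNET_VPC_CIDR=10.${n}.0.0/16"
  echo "SUBNET_PUBLIC_CIDR=10.${n}.0.0/24"
  echo "SUBNET_CLUSTER1_CIDR=10.${n}.1.0/24"
  echo "SUBNET_CLUSTER2_CIDR=10.${n}.2.0/24"
}

cidrs_for_datacenter 1   # us-west-2
cidrs_for_datacenter 2   # us-east-2
```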

Add Parameters to Cassandra Cluster CloudFormation

{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "Setup VPC for Cassandra Cluster for Cassandra Database",
  "Parameters": {
    "vpcCidr": {
      "Description": "Enter VPC CIDR",
      "Type": "String",
      "Default": "10.1.0.0/16",
      "AllowedValues": [
        "10.0.0.0/16",
        "10.1.0.0/16",
        "10.2.0.0/16",
        "10.3.0.0/16"
      ]
    },
    "subnetPublicCidr": {
      "Description": "Enter Public Subnet CIDR",
      "Type": "String",
      "Default": "10.1.0.0/24",
      "AllowedValues": [
        "10.0.0.0/24",
        "10.1.0.0/24",
        "10.2.0.0/24",
        "10.3.0.0/24"
      ]
    },
    "subnetCluster1Cidr": {
      "Description": "Enter Cluster Subnet Rack 1 CIDR",
      "Type": "String",
      "Default": "10.1.1.0/24",
      "AllowedValues": [
        "10.0.1.0/24",
        "10.1.1.0/24",
        "10.2.1.0/24",
        "10.3.1.0/24"
      ]
    },
    "subnetCluster2Cidr": {
      "Description": "Enter Cluster Subnet Rack 2 CIDR",
      "Type": "String",
      "Default": "10.1.2.0/24",
      "AllowedValues": [
        "10.0.2.0/24",
        "10.1.2.0/24",
        "10.2.2.0/24",
        "10.3.2.0/24"
      ]
    }
  },

Then we have to use those CloudFormation parameters.

vpc.json - use CloudFormation parameters

  "Resources": {
    "vpcMain": {
      "Type": "AWS::EC2::VPC",
      "Properties": {
        "CidrBlock": {"Ref": "vpcCidr"},
        ...
      }
    },
    "subnetPublic": {
      "Type": "AWS::EC2::Subnet",
      "Properties": {
        "CidrBlock": {"Ref": "subnetPublicCidr"},
        ...
      }
    },
    "subnetCluster1": {
      "Type": "AWS::EC2::Subnet",
      "Properties": {
        "CidrBlock": {"Ref": "subnetCluster1Cidr"},
        ...
      }
    },
    "subnetCluster2": {
      "Type": "AWS::EC2::Subnet",
      "Properties": {
        "CidrBlock": {"Ref": "subnetCluster2Cidr"},
        ...
      }
    },

Now we just modify the bash scripts to configure CIDRs per region for VPC and subnets as follows.

bin/ec2-env-region-us-west-2.sh

...
export SUBNET_VPC_CIDR=10.1.0.0/16
export SUBNET_PUBLIC_CIDR=10.1.0.0/24
export SUBNET_CLUSTER1_CIDR=10.1.1.0/24
export SUBNET_CLUSTER2_CIDR=10.1.2.0/24

Repeat for the east.

bin/ec2-env-region-us-east-2.sh

...
export SUBNET_VPC_CIDR=10.2.0.0/16
export SUBNET_PUBLIC_CIDR=10.2.0.0/24
export SUBNET_CLUSTER1_CIDR=10.2.1.0/24
export SUBNET_CLUSTER2_CIDR=10.2.2.0/24

Now use these from bin/update-vpc-cloudformation.sh and bin/run-vpc-cloudformation.sh as follows.

bin/run-vpc-cloudformation.sh - load and use CIDRs per AWS Region / Cassandra Datacenter

#!/usr/bin/env bash
set -e
source bin/ec2-env.sh

# Set aws-region
if [ -z "$1" ]
    then
        AWS_REGION=${REGION}
    else
        AWS_REGION=$1
fi

source bin/ec2-env-region.sh


aws --region ${REGION} s3 cp cloud-formation/vpc.json s3://$CLOUD_FORMER_S3_BUCKET
aws --region ${AWS_REGION} cloudformation create-stack --stack-name ${ENV}-vpc-cassandra \
--template-url "https://s3-us-west-2.amazonaws.com/$CLOUD_FORMER_S3_BUCKET/vpc.json" \
--parameters ParameterKey=vpcCidr,ParameterValue=${SUBNET_VPC_CIDR} \
    ParameterKey=subnetPublicCidr,ParameterValue=${SUBNET_PUBLIC_CIDR} \
    ParameterKey=subnetCluster1Cidr,ParameterValue=${SUBNET_CLUSTER1_CIDR} \
    ParameterKey=subnetCluster2Cidr,ParameterValue=${SUBNET_CLUSTER2_CIDR}

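The script now sources bin/ec2-env-region.sh, which loads the CIDR variables for the chosen AWS region, and passes them to the CloudFormation stack as the vpcCidr, subnetPublicCidr, subnetCluster1Cidr and subnetCluster2Cidr parameters via --parameters.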
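
Because a missing variable would silently produce an empty ParameterValue, it can help to build the parameter list in one place and check it first. The following is a sketch of that idea and not the project's script; the function name stack_parameters is an assumption for this example.

```shell
# Build the value list for `aws cloudformation create-stack --parameters`
# from the per-region CIDR variables, refusing to continue if one is unset.
stack_parameters() {
  local name value
  for name in SUBNET_VPC_CIDR SUBNET_PUBLIC_CIDR SUBNET_CLUSTER1_CIDR SUBNET_CLUSTER2_CIDR; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Missing ${name}; was bin/ec2-env-region.sh sourced?" >&2
      return 1
    fi
  done
  echo "ParameterKey=vpcCidr,ParameterValue=${SUBNET_VPC_CIDR}" \
       "ParameterKey=subnetPublicCidr,ParameterValue=${SUBNET_PUBLIC_CIDR}" \
       "ParameterKey=subnetCluster1Cidr,ParameterValue=${SUBNET_CLUSTER1_CIDR}" \
       "ParameterKey=subnetCluster2Cidr,ParameterValue=${SUBNET_CLUSTER2_CIDR}"
}

# Values for us-east-2 from the addressing scheme in this tutorial.
SUBNET_VPC_CIDR=10.2.0.0/16
SUBNET_PUBLIC_CIDR=10.2.0.0/24
SUBNET_CLUSTER1_CIDR=10.2.1.0/24
SUBNET_CLUSTER2_CIDR=10.2.2.0/24
stack_parameters
```

The printed list is what would follow --parameters in the create-stack call.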

Let's clean up the old Cassandra Cluster CloudFormation stacks. First, terminate all bastion and Cassandra database EC2 instances in both regions in AWS. Then delete the old Cassandra Cluster CloudFormation stacks in both the first region (Oregon/us-west-2) and the second region (Ohio/us-east-2).

Next we want to recreate our VPCs.

Recreate the VPCs with the parameterized CIDR

$ bin/run-vpc-cloudformation.sh   us-west-2
upload: cloud-formation/vpc.json to s3://cloudurable-cloudformer-templates/vpc.json
{
    "StackId": "arn:aws:cloudformation:us-west-2:821683928919:stack/dev-vpc-cassandra/ce52ae30-03d2-11e7-a467-503a90a9c435"
}

$ bin/run-vpc-cloudformation.sh   us-east-2
upload: cloud-formation/vpc.json to s3://cloudurable-cloudformer-templates/vpc.json
{
    "StackId": "arn:aws:cloudformation:us-east-2:821683928919:stack/dev-vpc-cassandra/d472d380-03d2-11e7-b212-50a68a27082a"
}

Once the CloudFormation stacks are done creating, it is time to launch our instances. But first modify the environment variables to match our new VPC, subnets and security groups.

But before we do that we need to change the seeds per region in the user-data.

resources/user-data/cassandra-us-west-2

#!/bin/bash
set -e

export BIND_IP=`curl http://169.254.169.254/latest/meta-data/local-ipv4`

/opt/cloudurable/bin/cassandra-cloud -cluster-name test \
                -client-address  ${BIND_IP} \
                -cluster-address  ${BIND_IP} \
                -cluster-seeds 10.1.1.10,10.1.2.10 \
                -snitch Ec2Snitch


/bin/systemctl restart  cassandra


resources/user-data/cassandra-us-east-2

#!/bin/bash
set -e

export BIND_IP=`curl http://169.254.169.254/latest/meta-data/local-ipv4`

/opt/cloudurable/bin/cassandra-cloud -cluster-name test \
                -client-address  ${BIND_IP} \
                -cluster-address  ${BIND_IP} \
                -cluster-seeds 10.2.1.10,10.2.2.10 \
                -snitch Ec2Snitch


/bin/systemctl restart  cassandra

Launching 4 Cassandra Database servers and a bastion into each region

# Create Bastion Node for Oregon/us-west-2
bin/create-ec2-instance-bastion.sh us-west-2

# Create Cassandra Nodes for Oregon/us-west-2
bin/create-ec2-instance-cassandra.sh 10.1.1.10 a us-west-2
bin/create-ec2-instance-cassandra.sh 10.1.2.10 b us-west-2
bin/create-ec2-instance-cassandra.sh 10.1.1.11 a us-west-2
bin/create-ec2-instance-cassandra.sh 10.1.2.11 b us-west-2


# Create Bastion for Ohio/us-east-2
bin/create-ec2-instance-bastion.sh us-east-2


# Create Cassandra nodes for Ohio/us-east-2
bin/create-ec2-instance-cassandra.sh 10.2.1.10 a us-east-2
bin/create-ec2-instance-cassandra.sh 10.2.2.10 b us-east-2
bin/create-ec2-instance-cassandra.sh 10.2.1.11 a us-east-2
bin/create-ec2-instance-cassandra.sh 10.2.2.11 b us-east-2
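
Since the node addresses follow the 10.N.rack.host pattern, the launch commands for a datacenter can be generated instead of typed, which avoids copy-and-paste slips. The sketch below only prints the commands. The function name print_cassandra_launch_commands is an assumption for this example.

```shell
# Print the four create commands for datacenter number N in a region:
# hosts .10 and .11 in rack a (cluster subnet 1) and rack b (cluster subnet 2).
print_cassandra_launch_commands() {
  local dc region host
  dc=$1
  region=$2
  for host in 10 11; do
    echo "bin/create-ec2-instance-cassandra.sh 10.${dc}.1.${host} a ${region}"
    echo "bin/create-ec2-instance-cassandra.sh 10.${dc}.2.${host} b ${region}"
  done
}

print_cassandra_launch_commands 1 us-west-2
print_cassandra_launch_commands 2 us-east-2
```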

Modify ansible and ssh config for bastion1 and bastion2

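Now that the two datacenters use different address ranges, we can define one bastion per region (bastion1 and bastion2) and route 10.1.* through bastion1 and 10.2.* through bastion2.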

ssh config (~/.ssh/config or ${projectDir}/ssh.config)


Host bastion1
  Hostname bastion.dev.us-west-2.cloudurable.com
  ForwardAgent yes
  IdentityFile ~/.ssh/test_rsa
  User ansible

Host bastion2
  Hostname bastion.dev.us-east-2.cloudurable.com
  ForwardAgent yes
  IdentityFile ~/.ssh/test_rsa
  User ansible

Host 10.1.*
  ForwardAgent yes
  IdentityFile ~/.ssh/test_rsa
  ProxyCommand ssh bastion1  -W  %h:%p
  User ansible
  ControlMaster auto
  ControlPath ~/.ssh/ansible-%r@%h:%p
  ControlPersist 5m

Host 10.2.*
  ForwardAgent yes
  IdentityFile ~/.ssh/test_rsa
  ProxyCommand ssh bastion2  -W  %h:%p
  User ansible
  ControlMaster auto
  ControlPath ~/.ssh/ansible-%r@%h:%p
  ControlPersist 5m

inventory.ini

[seed1]
10.1.1.10

[seed2]
10.2.1.10

[seeds]
10.1.1.10
10.1.2.10
10.2.1.10
10.2.2.10

[dc1-nodes]
10.1.1.10
10.1.1.11
10.1.2.10
10.1.2.11


[dc2-nodes]
10.2.1.10
10.2.1.11
10.2.2.10
10.2.2.11

[all-nodes]
10.1.1.10
10.1.1.11
10.1.2.10
10.1.2.11
10.2.1.10
10.2.1.11
10.2.2.10
10.2.2.11

[bastions]
bastion1
bastion2

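The inventory defines one seed group per region (seed1 and seed2), a seeds group with the seed nodes of both regions, one group per datacenter (dc1-nodes for us-west-2 and dc2-nodes for us-east-2), an all-nodes group with all eight Cassandra nodes, and a bastions group.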
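
To double-check which hosts a group contains before running a playbook against it, the inventory can be read with a few lines of awk. This is a minimal sketch. `hosts_in_group` is a hypothetical helper, and it assumes the simple one-host-per-line layout shown above, with no host variables, comments or child groups.

```shell
#!/bin/bash
# List the hosts of one group in an INI-style Ansible inventory.
# Arguments: group name, path to the inventory file.
hosts_in_group() {
  awk -v group="[$1]" '
    /^\[/          { in_group = ($0 == group); next }  # a section header starts
    in_group && NF { print $1 }                        # non-empty line in the wanted group
  ' "$2"
}
```

For example, `hosts_in_group dc2-nodes inventory.ini` lists the four us-east-2 addresses. Ansible reports the same information with `ansible dc2-nodes --list-hosts`.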

Ping all of the servers

ansible all-nodes -m ping
10.1.1.11 | SUCCESS => {
    "changed": false, 
    "ping": "pong"
}
10.1.2.11 | SUCCESS => {
    "changed": false, 
    "ping": "pong"
}
10.1.1.10 | SUCCESS => {
    "changed": false, 
    "ping": "pong"
}
10.1.2.10 | SUCCESS => {
    "changed": false, 
    "ping": "pong"
}
10.2.1.10 | SUCCESS => {
    "changed": false, 
    "ping": "pong"
}
10.2.1.11 | SUCCESS => {
    "changed": false, 
    "ping": "pong"
}
10.2.2.10 | SUCCESS => {
    "changed": false, 
    "ping": "pong"
}
10.2.2.11 | SUCCESS => {
    "changed": false, 
    "ping": "pong"
}

All eight nodes answer, so Ansible can reach the Cassandra nodes of both regions through their bastions.

Check config

TODO

nodetool status on seed1

$ ansible seed1 -a "/opt/cassandra/bin/nodetool status"
10.1.1.10 | SUCCESS | rc=0 >>
Datacenter: us-west-2
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens       Owns (effective)  Host ID                               Rack
UN  10.1.2.11  95.01 KiB  32           43.4%             aa5e2391-324d-4500-b71e-e72ede2ecfd0  2b
UN  10.1.1.11  75.92 KiB  32           54.1%             67cba82d-45d0-40c6-b73a-db0330d7ccb0  2a
UN  10.1.2.10  84.74 KiB  32           51.4%             bda4cea4-16ca-46e1-b3b0-ecff038d3007  2b
UN  10.1.1.10  70.56 KiB  32           51.1%             702ba933-fd4c-4236-bc8c-25bd17acd67b  2a

The seed in us-west-2 lists only the four nodes of its own datacenter.

nodetool status on seed2

$ ansible seed2 -a "/opt/cassandra/bin/nodetool status"
10.2.1.10 | SUCCESS | rc=0 >>
Datacenter: us-east-2
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens       Owns (effective)  Host ID                               Rack
UN  10.2.1.10  88.08 KiB  32           44.3%             eb4d29eb-00ab-4708-b0bc-9bae6145107b  2a
UN  10.2.2.10  65.66 KiB  32           45.1%             1db995a1-9cb4-4df3-beb5-8e3c7aa3984e  2b
UN  10.2.2.11  114.16 KiB 32           53.4%             be4b5cef-b900-405f-ad90-34e1c663cd6b  2b
UN  10.2.1.11  75.92 KiB  32           57.2%             e5fb0233-f24b-4e4a-8eaf-a61201bbc23e  2a


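Likewise, the seed in us-east-2 lists only its own four nodes. At this point we have two datacenters that do not see each other: each region was configured with only its own private seed addresses and with Ec2Snitch, which is meant for single-region deployments.

To connect the two regions, we switch to Ec2MultiRegionSnitch.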

Switch to Ec2MultiRegionSnitch

playbooks/connect-regions.yml

---
- hosts: all-nodes
  gather_facts: no
  become: true
  remote_user: ansible
  vars:
    cluster_name: test
    seeds: 52.14.159.203,52.14.108.53,54.202.57.109,54.149.126.26
    aws_meta: http://169.254.169.254/latest/meta-data
    bind_ip: "{{aws_meta}}/local-ipv4"
    broadcast_ip: "{{aws_meta}}/public-ipv4"


  tasks:

  - name : "Copy template"
    copy:
      src: ../resources/opt/cassandra/conf/cassandra-yaml.template
      dest: /opt/cassandra/conf/cassandra-yaml.template
      owner: cassandra
      group: cassandra
      mode: "u=rw,g=r,o=r"

  - name : "Get bind_ip"
    command: curl {{bind_ip}}
    register: bind_ip_contents

  - name : "Get broadcast_ip"
    command: curl {{broadcast_ip}}
    register: broadcast_ip_contents

  - name: Configure Cassandra
    command: "/opt/cloudurable/bin/cassandra-cloud -cluster-name  {{cluster_name}} \
                              -client-address  {{bind_ip_contents.stdout}} \
                              -cluster-address  {{bind_ip_contents.stdout}} \
                              -cluster-seeds {{seeds}} \
                              -cluster-broadcast-address {{broadcast_ip_contents.stdout}} \
                              -snitch Ec2MultiRegionSnitch"

  - name: Restart Cassandra
    command: "/bin/systemctl restart  cassandra"
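
For a single node, the configuration step of the playbook boils down to one cassandra-cloud call. The sketch below only prints that call, so the arguments can be reviewed before anything is changed. `configure_command` is a hypothetical helper. The flags are the ones used in the playbook above, and the cluster name `test` is hard-coded as in this tutorial.

```shell
#!/bin/bash
# Print the cassandra-cloud call the playbook makes on one node (dry run).
# Arguments: private IP, public IP, comma-separated public seed IPs.
configure_command() {
  local bind_ip="$1" broadcast_ip="$2" seeds="$3"
  # echo joins its arguments with single spaces, giving one command line.
  echo "/opt/cloudurable/bin/cassandra-cloud -cluster-name test" \
       "-client-address ${bind_ip} -cluster-address ${bind_ip}" \
       "-cluster-seeds ${seeds}" \
       "-cluster-broadcast-address ${broadcast_ip}" \
       "-snitch Ec2MultiRegionSnitch"
}
```

On a node, the two addresses could be taken from the metadata service, for example `configure_command "$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4)" "$(curl -s http://169.254.169.254/latest/meta-data/public-ipv4)" "$SEEDS"`, where `SEEDS` holds the public seed addresses. Note that the listen addresses stay private while the seeds and the broadcast address are public.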


Running the playbook.

Running playbooks/connect-regions.yml

$ ansible-playbook playbooks/connect-regions.yml 

PLAY [all-nodes] ***************************************************************

TASK [Copy template] ***********************************************************
ok: [10.1.1.10]
ok: [10.1.1.11]
ok: [10.1.2.11]
ok: [10.1.2.10]
ok: [10.2.1.10]
ok: [10.2.1.11]
ok: [10.2.2.11]
ok: [10.2.2.10]

TASK [Get bind_ip] *************************************************************
changed: [10.1.2.11]
 [WARNING]: Consider using get_url or uri module rather than running curl

changed: [10.1.1.10]
changed: [10.1.2.10]
changed: [10.1.1.11]
changed: [10.2.1.10]
changed: [10.2.1.11]
changed: [10.2.2.10]
changed: [10.2.2.11]

TASK [Get broadcast_ip] ********************************************************
changed: [10.1.1.10]
changed: [10.1.2.11]
changed: [10.1.1.11]
changed: [10.1.2.10]
changed: [10.2.1.10]
changed: [10.2.1.11]
changed: [10.2.2.11]
changed: [10.2.2.10]

TASK [Configure Cassandra] *****************************************************
changed: [10.1.1.10]
changed: [10.1.2.11]
changed: [10.1.1.11]
changed: [10.1.2.10]
changed: [10.2.1.10]
changed: [10.2.1.11]
changed: [10.2.2.10]
changed: [10.2.2.11]

TASK [Restart Cassandra] *******************************************************
changed: [10.1.1.10]
changed: [10.1.1.11]
changed: [10.1.2.11]
changed: [10.1.2.10]
changed: [10.2.1.10]
changed: [10.2.2.10]
changed: [10.2.1.11]
changed: [10.2.2.11]

PLAY RECAP *********************************************************************
10.1.1.10                  : ok=5    changed=4    unreachable=0    failed=0   
10.1.1.11                  : ok=5    changed=4    unreachable=0    failed=0   
10.1.2.10                  : ok=5    changed=4    unreachable=0    failed=0   
10.1.2.11                  : ok=5    changed=4    unreachable=0    failed=0   
10.2.1.10                  : ok=5    changed=4    unreachable=0    failed=0   
10.2.1.11                  : ok=5    changed=4    unreachable=0    failed=0   
10.2.2.10                  : ok=5    changed=4    unreachable=0    failed=0   
10.2.2.11                  : ok=5    changed=4    unreachable=0    failed=0  

Check config again

TBD

Verify new setup

status on seed1

$ ansible seed1  -a "/opt/cassandra/bin/nodetool status"
10.1.1.10 | SUCCESS | rc=0 >>
Datacenter: us-east-2
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Tokens       Owns (effective)  Host ID                               Rack
UN  52.14.159.203   198.57 KiB  32           28.1%             3b268d9b-56dd-48a7-b614-2f29f1757593  2b
UN  52.14.156.148   241.97 KiB  32           24.5%             c73e4c97-6b3e-4fc5-ac4d-59afacceb219  2b
UN  52.14.108.53    239.15 KiB  32           26.1%             38a314ff-a0a0-40b2-8ead-426fb975f2f4  2a
UN  52.14.139.245   195.85 KiB  32           25.4%             9fa3c21d-7d12-4a1a-8116-77ce89319d23  2a
Datacenter: us-west-2
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Tokens       Owns (effective)  Host ID                               Rack
UN  54.202.137.143  194.06 KiB  32           24.1%             4afd0d96-6536-40af-8983-28b2581148d2  2a
UN  54.202.57.109   234.12 KiB  32           26.5%             32002e7b-2867-47ce-b637-ed135ff2b8eb  2b
UN  54.187.138.240  220.24 KiB  32           20.1%             f13f3aef-035b-47a1-a1f3-a9b0d077408d  2b
UN  54.149.126.26   194.18 KiB  32           25.3%             440eef64-1471-4711-987d-21f5b8819f8e  2a

status on seed2

$ ansible seed2  -a "/opt/cassandra/bin/nodetool status"
10.2.1.10 | SUCCESS | rc=0 >>
Datacenter: us-east-2
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Tokens       Owns (effective)  Host ID                               Rack
UN  52.14.159.203   198.57 KiB  32           28.1%             3b268d9b-56dd-48a7-b614-2f29f1757593  2b
UN  52.14.156.148   241.97 KiB  32           24.5%             c73e4c97-6b3e-4fc5-ac4d-59afacceb219  2b
UN  52.14.108.53    239.15 KiB  32           26.1%             38a314ff-a0a0-40b2-8ead-426fb975f2f4  2a
UN  52.14.139.245   195.85 KiB  32           25.4%             9fa3c21d-7d12-4a1a-8116-77ce89319d23  2a
Datacenter: us-west-2
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Tokens       Owns (effective)  Host ID                               Rack
UN  54.202.137.143  194.06 KiB  32           24.1%             4afd0d96-6536-40af-8983-28b2581148d2  2a
UN  54.202.57.109   234.12 KiB  32           26.5%             32002e7b-2867-47ce-b637-ed135ff2b8eb  2b
UN  54.187.138.240  220.24 KiB  32           20.1%             f13f3aef-035b-47a1-a1f3-a9b0d077408d  2b
UN  54.149.126.26   194.18 KiB  32           25.3%             440eef64-1471-4711-987d-21f5b8819f8e  2a
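
Both seeds now report two datacenters, and the nodes are listed by their public (broadcast) addresses. With more nodes the output becomes tedious to read by eye, so a small filter can reduce it to one line per datacenter. This is a sketch only. `dc_summary` is a hypothetical helper that assumes the `nodetool status` layout shown above.

```shell
#!/bin/bash
# Summarize `nodetool status` output read from stdin: one line per
# datacenter with the number of nodes in state UN and the total node count.
dc_summary() {
  awk '
    /^Datacenter:/ { dc = $2; order[++n] = dc; next }             # start of a datacenter block
    /^[UD][NLJM] / { total[dc]++; if ($1 == "UN") up[dc]++ }      # one line per node
    END {
      for (i = 1; i <= n; i++)
        printf "%s %d/%d UN\n", order[i], up[order[i]], total[order[i]]
    }
  '
}
```

Usage: `ansible seed1 -a "/opt/cassandra/bin/nodetool status" | dc_summary`. For the output above this prints `us-east-2 4/4 UN` and `us-west-2 4/4 UN`. Before the switch to Ec2MultiRegionSnitch, each seed would have reported only its own datacenter.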

About us

Cloudurable™ streamlines DevOps for Cassandra running on AWS. It provides AMIs, CloudWatch monitoring, CloudFormation templates and monitoring tools to support Cassandra in production on EC2. We also teach advanced Cassandra courses for developers and DevOps that cover how to develop, support and deploy Cassandra to production in AWS EC2.

More info

Please take some time to read the Advantage of using Cloudurable™.

Cloudurable provides:

AWS Cassandra Deployment Guides

Cassandra Cluster/DevOps/DBA tutorial

Kafka training, Kafka consulting, Cassandra training, Cassandra consulting, Spark training, Spark consulting
