There seem to be endless options for deploying Cassandra clusters to Amazon Web Services. As an engineer at Ippon Technologies, I have deployed, tested, and productionalized both Apache and Datastax flavors of Cassandra and seen the pitfalls and benefits of several approaches with EC2. I will be giving special attention to the impacts on your monthly cloud bill and engineering resources.
First let us walk through some numbers on instance-store vs EBS. In our scenario we want to have a 12 node cluster so we can replicate our data across 3 availability zones in US-East-1. All pricing was done using the AWS Simple Monthly Calculator. We will be looking at basic pricing for 2xlarge VMs from the i3 and m5 families.
Why m5 and i3? These EC2 families are recommended by Datastax and the Apache Cassandra Community as optimal depending on choice of instance-store or EBS volumes. These sizes are typical of production use but consider tweaking them if you intend on tuning for more vCPUs. You can find more specific tuning information in the Apache Cassandra docs.
You'll also notice, and need to consider, that EBS volumes must be sized at 3.5 terabytes to ensure the 10,000 IOPS recommended for Cassandra performance.
Finally, as a disclaimer, these numbers are heavily estimated and there are always other hidden costs to consider when using the cloud.
Numbers at a glance
12 nodes in US-East-1:
|m5.2xlarge||8/32||3.5 TB EBS provisioned||$ 7573.08|
|i3.2xlarge||8/61||1 x 1.9 TB provided||$ 5481.24|
Without any other constraints, you can see m5 instances with EBS are significantly more expensive. But what if we want to take EBS Snapshots?
Being charitable, lets say there's a 5% daily change for incremental EBS snapshots. With 42000 gigabytes of S3 storage at 5% change, you've added $3753.84 to your monthly bill for a total of $11326.92.
On the surface, EBS starts getting expensive quickly when looking at storage costs for disaster recovery. But what about recovery options for instance-store volumes? Numerous types of Cassandra snapshot options exist but ultimately you will still need to store the snapshots somewhere. For a deeper dive into these options, check out The Last Pickle's blog post.
Cassandra snapshots need not be the same size as the volume. Where EBS must store the entire volume on S3, you can get by with just storing your actual Cassandra data in-use. If done incrementally, we need to calculate for daily change. With roughly 1 terabyte of data per node, the monthly cost for instance-store rises by roughly $300.
You might look at this number and wonder why it's so much cheaper than the EBS snapshots on S3, even after accounting for the difference in size. You can think of EBS snapshots as a managed service that you're subscribing to. The service costs significantly more than normal S3 operations and data.
Our New (Estimated) Numbers
We can estimate that using EBS volumes almost doubles our monthly bill. Why would you want to take on such costs for EBS?
Making a Case for EBS
As an engineering consultant that has dealt with building EBS-backed Cassandra clusters, I can attest to their operational simplicity. There is no better option when it comes to easy, safe, disaster recovery. Again, I want to encourage you to take a look at The Last Pickle's blog on this topic as he does a deeper dive from an operational point-of-view. But I want to add a developer's perspective.
#1 You just received an alert that AWS is retiring the hardware that one of your nodes is running on
- Detach the EBS volume
- Decommission old node VM
- Bootstrap new node and attach old EBS volume
- Bootstrap new node
- Configure new node to replace old node in cassandra.yaml
- Stream data into new node
- Decommission old node
Not much difference here. EBS has the advantage of recovery time because data is already present in the reattached volume.
#2 Your organization requires that you refresh all services with new AMIs every quarter
Basically the same as above, for each node in the cluster.
One-by-one, detach EBS volume, decommission old node, and then bootstrap new node and attach old volume. Only do one at a time. As long as your replication factor is higher than one, this method has no downtime. This is very simple and safe. Can be made even easier if you use the same private ips for new nodes as the ones they are replacing.
There are several ways you could approach this but ultimately they will be just as complex as the following:
- Create new cluster and run it alongside old one
- Change topology on old cluster to add new cluster as a datacenter
- Change configuration in cassandra.yaml on both old and new such that the clusters recognize each other
- Nodetool rebuild on new cluster from old cluster
- Swap DNS to new cluster and kick cache or redeploy integrating software to refresh DNS
- Decommission old cluster
This process is arduous, difficult to automate, and vastly inferior to the EBS option. I want to stress the hidden costs of engineering time spent on a process like this. It is relatively impossible to estimate the real cost and time spent on such a process, especially with regard to development time for automation. Factor in the topology concerns as well when your cluster is running multi-region or in several availability zones.
#3 The worst has happened, your cluster was bricked and you must implement the disaster recovery plan
- Create volumes from the snapshots you've been taking, for each node
- Bootstrap new cluster and attach EBS volumes from snapshot
That's mostly it. It's dead simple disaster recovery. Your recovery time is short and can be done manually or automated painlessly.
- Create new cluster
- Bulk load the new cluster with SSTableLoader from the incremental backups on S3
- Nodetool refresh
Not very complex at a glance, but difficult to automate. Recovery time is very long as the nodes have to replay some of the snapshot data to determine changes needed on SSTables. The bulk loader will need time to stream data to the new cluster. Consider the time and resources needed to automate the creation and removal of snapshots from the nodes to S3. As a warning, I've never seen this option work and I can't find any real data on anyone attempting it, so this is mostly theoretical. In a perfect world, your cluster data is replicated in several availability zones and you only lost one AZ to whatever disaster occurred.
Making a case for Instance-Store
Thus far I think I've made it clear that EBS shines for operational simplicity. Maybe it works for your budget and you don't want to expend the resources to automate backups and restoration for instance-store volumes. I believe there are still significant advantages to using instance-store.
Looking at it from the perspective of IOPS, the i3 family of EC2 is capable of 3.3M IOPS. This is a big advantage for an I/O heavy workload like Cassandra.
Long-term, you will be saving a significant amount of money on monthly bills if you choose to go with instance-store. I think the engineering expenditure needed to build out the automation, recovery, and backups, is justified.
When you look at the larger picture, I believe this can be summed up as follows: EBS volumes offer data integrity, operational simplicity, and recovery options, but you must consider the costs to your monthly bill. Instance-store volumes, offered by the community approved i3 family of instances, are a cheaper option if you can spare the engineering resources to build out backup and recovery.