As discussed earlier on our Lambda Architecture posts here and here, we are long time users of Hadoop. This post is about how we manage Hadoop infrastructure on AWS using Matsya, a tool we built that reduced cost and improved reliability.
Our Hadoop1 cluster deployments consist of one Master EC2 instance which hosts NameNode and JobTracker and slave machines that hosts DataNode and TaskTracker processes. We have our slaves configured via Amazon’s Auto Scaling Group (ASG), so we can scale our cluster anytime by just modifying the desired value. Auto Scaling Groups offer a wide range of features which makes it a perfect candidate to run the slaves on them. Some of the relevant features include:
By default, the machines that are launched are On-Demand Instances. You’re charged a fixed price based on the chosen instance type with no long term commitments. While this is a really compelling start, one of the major concerns here is the cost of running such an infrastructure.
For example, if you want to run 200 c3.2xlarge machines for 24 hours, you would spend -> 0.420 x 200 x 24, which comes to $2,016 per day.
In order to circumvent this, AWS also has another type of instance launching mechanism called Spot. Using this, we can significantly reduce the overall cost of running infrastructure for various type of applications. AWS allows us to bid on excess unused compute capacity at a lower cost compared to its On-Demand counterparts. Please note that not all applications can be run on Spot infrastructure. We run Hadoop on Spot because the application can handle dynamic node addition/ deletion and can also identify if a node isn’t responding and will work accordingly.
Running Spot instances comes at the cost of “no guarantee” on how long the instances are available (except when using Spot blocks). The reason for that is that if the cost of the instance type crosses your bid price in the Spot market, the instance will be taken away from you automatically. Figure 1 shows what happens during a Spot outage.
AWS has a feature called Spot Termination Notice, which can help in determining if the instance is going to be taken away because of Spot. Unfortunately not all applications are AWS-aware.
Common parameters to consider while working with Spot instances are:
Spot Placement is choosing the right instance type and availability zone for your Spot request. Each combination of instance type, availability zone and region makes a Spot market. Generally, changes in one market don’t affect the other. You can use tools like Spot Bid Advisor to see which instance types on which regions are more reliable. There’s no known correlation between the past and the future, but it’s a good data point to look at nonetheless.
Let’s say we go on to split the instances across four Availability Zones. Auto Scaling Group has the ability to distribute the instances evenly across the Availability Zones. Now, even if the price of the instance type is skyrocketing on one of the Availability Zones, it is less likely it will affect the others. Figure 2 represents how it looks in terms of instance layout.
We had the above solution implemented and running for some time, until we started seeing our Data Transfer charges going up. For systems like HDFS, the data transfer is directly proportional to the amount of data stored and the replication factor of the cluster. It compounds even more when we’re on the Spot infrastructure with frequent loss of machines. Each time a new machine joins the cluster the replication starts automatically in the background.
At this point, we started looking at the problem on multiple levels. We realized that we had to move to a single AZ-based model but shouldn’t be affected by the Spot market fluctuations. Our requirements at this point were:
We built Matsya to solve the above use cases. Matsya is a tool that tries to keep your fleet running by periodically jumping across AZs based on the instance Spot prices. It is expected to be run as a Cron task within your cluster. On startup, you pass in a configuration file that describes the fleet properties and Matsya does the rest.
Matsya periodically checks the instance cost in the current availability zone where the ASG is configured to run. Whenever the instance price goes beyond a threshold value with respect to the bid price, it will try to look for alternative availability zones which match the threshold and it will update the respective ASG to the cheapest AZ. If it can’t find any AZs that match the threshold, we can also optionally configure it to revert as backup to On-Demand instances, if that appears to be a cheaper option for the moment.
Matsya always optimizes for cost and tries to keep the entire fleet running. Amazon’s Auto Scaling will take care of slowly over-provisioning the machines on the new AZ and terminating the older ones in order to keep the throughput same. Remember the assumption made here is that the nodes in the cluster are stateless or the state can be rebuilt if a new instance replaces the existing one.
We’re happy to open source Matsya to the community under an Apache 2 License. Fork away!