Terraform AWS Contribution Experience

A couple of weeks ago I took it upon myself to create a new Terraform data source. Overall, it was an extremely pleasant experience. I learned a lot and wandered a little outside my comfort zone. Here are my thoughts.

The Problem

While designing and building some infrastructure in AWS, I discovered a potential problem. When defining an Auto Scaling group, you provide a min_size, max_size, and desired_capacity. min_size and max_size are self-explanatory. However, desired_capacity is where things get tricky. If you deploy a new Auto Scaling group, it will have a static desired_capacity. This means that if the existing group has scaled up or down, deploying a new group will reset the desired_capacity to its original value.

For example, let’s say you have an existing group. This group had an original desired_capacity of 4 instances, but recent high traffic has triggered a “scale up” policy, so now there are 8 instances. Then you notice a flaw in your current deployment and need to deploy an update, so you update your Terraform configuration and apply it. Except now your desired_capacity is reset to 4 instances and your total capacity is cut in half during a period of high volume. Congratulations, you are now writing an incident report.
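
To make the scenario concrete, here is a minimal sketch of the kind of configuration that causes it. The resource name, group name, size limits, and variables are all hypothetical and only there for illustration; it is written in current Terraform syntax.

```hcl
# Hypothetical inputs; replace with your own values.
variable "launch_configuration_name" {
  type = string
}

variable "subnet_ids" {
  type = list(string)
}

resource "aws_autoscaling_group" "web" {
  name                 = "web"
  launch_configuration = var.launch_configuration_name
  vpc_zone_identifier  = var.subnet_ids

  min_size = 2
  max_size = 10

  # Static value: applying this configuration sets the group to 4 instances,
  # even if a scaling policy has since raised the live group to 8.
  desired_capacity = 4
}
```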

The Solution

There are a couple of solutions to this problem. You could have the aws_autoscaling_group resource not update the desired_capacity if it is different from what is currently active. This is, in fact, how CloudFormation handles it.

Another solution would be to query the current desired_capacity and use that value in your new configuration. This is the solution I chose to implement. The downside to this approach is that there can be a lag between the time you query for the value and the time you apply the new configuration. For example, the Auto Scaling group (ASG) might be scaling up when you query it: at the time of the query the value is 4, but by the time you get around to applying the configuration the ASG has scaled up to 6.

I deemed this an acceptable risk, and perhaps the aws_autoscaling_group resource will support this functionality in the future. However, the provider still lacked a data source for reading an individual Auto Scaling group. Creating this data source let me contribute something the provider needed and gave me a workable solution to my problem.
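
As a rough sketch of the idea, the configuration below reads an existing group through the aws_autoscaling_group data source and feeds its current desired_capacity into a new group. The group names ("web-blue", "web-green"), the resource labels, and the variables are assumptions made up for this example, and it presumes the existing group is already there to be read.

```hcl
# Hypothetical inputs; replace with your own values.
variable "launch_configuration_name" {
  type = string
}

variable "subnet_ids" {
  type = list(string)
}

# Read the group that is currently serving traffic.
# "web-blue" is an assumed name for illustration.
data "aws_autoscaling_group" "current" {
  name = "web-blue"
}

resource "aws_autoscaling_group" "next" {
  name                 = "web-green"
  launch_configuration = var.launch_configuration_name
  vpc_zone_identifier  = var.subnet_ids

  min_size = data.aws_autoscaling_group.current.min_size
  max_size = data.aws_autoscaling_group.current.max_size

  # Start at whatever capacity the existing group reports.
  # The value is read when the plan is built, so it can be stale
  # by the time the configuration is applied.
  desired_capacity = data.aws_autoscaling_group.current.desired_capacity
}
```

The comment on desired_capacity is the lag described above: the number is only as fresh as the moment the data source was read.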

The Guts

Most of my time went into setting up my development environment. This was a standard setup and I will not spend time going over it here.

The rest of my time was in learning the contributor requirements of the AWS Terraform Provider. These were quite simple and everything was contained in the Makefile.

Overall, from my inexperienced point of view, this project has extremely tight and predictable technical requirements. I created my first draft of the data source and submitted it for review. A few days later a very friendly and helpful maintainer redlined most of my code. The community is very particular about the quality of code that goes into their project. In my eyes, this is very much a good thing. Brian was extremely helpful in getting me on my way and pushed me to greatly improve the quality of my code.

In Conclusion

This was a very positive experience for me. I got some real-world experience writing Go and was able to dive into a heavily used open source project. Both my confidence in the AWS provider and my knowledge of Terraform internals have increased. I would highly recommend contributing to this project.

For more information on using this data source, please refer to the documentation.

An Update (Jan 7, 2019)

Well, it seems like I spoke too soon. While the data source I created works just fine, there are a lot of use cases where you can’t load the “current” resource while you are simultaneously updating that resource. This is gonna take some thinking.