Terraform Scope - where to use and not to use Terraform
So far we have discussed where and how Terraform can be used to manage a cloud environment. While Terraform may provide the core deployment capability in cloud IaC, it’s just as important to note its limitations, and also where the cost of implementing IaC may not be justified.
Terraform’s strength is in the deployment of cloud infrastructure. While that may seem obvious, it’s worth looking at both of the terms “deployment” and “infrastructure” to further refine our definition.
Infrastructure
In many definitions of cloud infrastructure you will see only the following types defined:
- Compute (Servers)
- Networking
- Storage

This may have been true of very early clouds, but mature, modern clouds offer a much broader range of services. The three base areas have been extended with higher-value and more abstracted services, together with enhanced services within those three main categories. Here are some examples that don’t easily fit into those categories:
- Load-balancing service
- DNS service
- Fully managed database service
- Fully managed analytics service
- Identity Management service
What qualifies any of these as an “infrastructure” service is whether Terraform can deploy it natively, that is, whether the Terraform _provider_ supports that particular resource type. Each vendor’s cloud defines a different set of resources that make up its infrastructure offering. As the cloud vendors bring new services online, these are added to the _provider_ and so can be deployed by Terraform. You can find OCI’s latest list in the [HashiCorp OCI Provider Documentation].
So, in terms of the scope of “what” Terraform can deploy, it is everything that the provider supports.
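As a minimal sketch of that point, the configuration below declares two services that sit outside the classic compute, networking and storage categories: a DNS zone and a load balancer. It assumes the OCI provider's `oci_dns_zone` and `oci_load_balancer_load_balancer` resource types. The variable names, zone name and bandwidth figures are illustrative assumptions, not values from this guide.

```hcl
# Illustrative sketch: names and values below are assumptions.
terraform {
  required_providers {
    oci = {
      source = "oracle/oci"
    }
  }
}

variable "compartment_ocid" {
  type        = string
  description = "OCID of an existing compartment that will own the resources."
}

variable "subnet_ocid" {
  type        = string
  description = "OCID of an existing subnet for the load balancer."
}

# A DNS zone is "infrastructure" here simply because the provider exposes it.
resource "oci_dns_zone" "example" {
  compartment_id = var.compartment_ocid
  name           = "example.com"
  zone_type      = "PRIMARY"
}

# The same applies to a managed load-balancing service.
resource "oci_load_balancer_load_balancer" "example" {
  compartment_id = var.compartment_ocid
  display_name   = "example-lb"
  shape          = "flexible"
  subnet_ids     = [var.subnet_ocid]

  shape_details {
    minimum_bandwidth_in_mbps = 10
    maximum_bandwidth_in_mbps = 100
  }
}
```

If a service has no corresponding resource type in the provider, it falls outside what Terraform can deploy natively, whatever category it belongs to.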
Deployment
The second term that we should look at is “deployment”, and this is a little trickier. Terraform is generally good at deploying the infrastructure resources named above, but it is very poor at deploying anything else, and very poor at any post-deployment configuration of the contents of those resources.
While we often speak about deploying a web server or a database, we need to take care over what that means. While Terraform can, and should, deploy a virtual-machine image of a web server, we should not look to configure that web server through Terraform. In some cases, we might use Terraform to redeploy a web server with a new version, and perhaps we might run some basic scripts to install core software (e.g. through yum), but we should steer away from any serious configuration of software. These activities are much better executed through a more focused configuration tool such as Chef, Puppet or Ansible.
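The boundary described above might look like the following sketch. Terraform deploys the virtual machine from an image and passes a very small bootstrap script as cloud-init user data, and everything beyond that is left to a configuration tool. It assumes the OCI provider's `oci_core_instance` resource and an image that runs cloud-init. All variable names, the display name and the choice of `httpd` are hypothetical.

```hcl
# Illustrative sketch: variable names and values are assumptions.
variable "compartment_ocid" { type = string }
variable "availability_domain" { type = string }
variable "subnet_ocid" { type = string }
variable "web_image_ocid" { type = string }

variable "instance_shape" {
  type        = string
  description = "A fixed shape is assumed; flexible shapes also need a shape_config block."
}

locals {
  # Keep the bootstrap minimal: install core packages only.
  bootstrap_script = <<-EOT
    #!/bin/bash
    yum install -y httpd
    systemctl enable --now httpd
  EOT
}

resource "oci_core_instance" "web" {
  availability_domain = var.availability_domain
  compartment_id      = var.compartment_ocid
  shape               = var.instance_shape
  display_name        = "web-01"

  # Deploy from a virtual-machine image.
  source_details {
    source_type = "image"
    source_id   = var.web_image_ocid
  }

  create_vnic_details {
    subnet_id = var.subnet_ocid
  }

  # cloud-init runs this once at first boot; virtual hosts, certificates
  # and application settings are left to Chef, Puppet or Ansible.
  metadata = {
    user_data = base64encode(local.bootstrap_script)
  }
}
```

Keeping the script this small means a change to the web server's configuration does not require Terraform to replace the instance.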
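Similarly, Terraform can deploy a database service, but the configuration inside that database, such as schemas, data loading and tuning, is better handled by database-specific tooling.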
Long-lived and Complex Resources
As we saw in the benefits section, the value of Terraform is gained through standardisation, re-use and automation. A side-effect of this is that if a particular architecture is to be deployed once, and only once, then using Terraform might be overkill and may never deliver any benefit for the investment. Let’s look at a few examples to clarify this point:
- Terraform could be used to set up the basic connectivity between on-premises and a public cloud, but this is highly likely to be specific to the locations involved and would not be used again.
- A global DNS service designed to be used by an entire organisation is unlikely ever to be created from scratch a second time.
- A mission-critical, high-availability production application will likely have its resilience delivered by multiple application-level capabilities, and these are unlikely to be compatible with a Terraform deployment approach.
- A large data warehouse could be initially deployed by Terraform, but those Terraform scripts are unlikely ever to be used again in production.

In all of these cases, the investment in configuration after deployment is huge and will likely include a great deal of manual activity. Automating the Terraform aspects brings almost no value and will likely complicate the live service substantially.
Development and Test only
In the latter two examples (mission-critical application and data warehouse) it is primarily the complexity, size and resilience of the production service that limits the use of Terraform. However, in other environments where frequent re-builds are required, Terraform can deliver value.
For example, even in a large monolithic deployment, for development, test, training and patch-testing environments, there is a significant demand for the creation of production clones or reduced-size production clones. This can enable faster CI/CD and also general regression testing.
In these circumstances, attempting to make Terraform compatible with re-building the complex production systems will be a huge cost for no gain. Consideration should instead be given to using Terraform only in the non-production environments, and using other tools in the production environment.
While this at first seems to defeat the objective of IaC, we should remember that in these complex systems there are probably 1-2 production environments, whereas there may be 8-10 other environments across development and test teams, many of which are frequently destroyed and recreated. Terraform can still deliver significant benefits in a non-production world alone.
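One way to picture the non-production case is a single reusable module stamped out once per environment. The sketch below assumes Terraform 0.13 or later (for `for_each` on modules) and a hypothetical local module at `./modules/app-environment` that accepts `environment_name` and `instance_count` inputs. The environment names and sizes are illustrative only.

```hcl
# Illustrative sketch: the module path and its inputs are hypothetical.
variable "non_prod_environments" {
  description = "Reduced-size clones of production, keyed by environment name."
  type = map(object({
    instance_count = number
  }))
  default = {
    dev      = { instance_count = 1 }
    test     = { instance_count = 2 }
    training = { instance_count = 1 }
  }
}

# One module instance per environment; production is deliberately absent
# and is assumed to be managed with other tools.
module "environment" {
  source   = "./modules/app-environment"
  for_each = var.non_prod_environments

  environment_name = each.key
  instance_count   = each.value.instance_count
}
```

Adding or removing an entry in the map creates or destroys a whole environment on the next apply, which is where the re-use benefit appears when environments are rebuilt frequently.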