A builder approach to cloud configuration using Terraform 0.13

September 18, 2020

The Panaseer Team

Hi, I’m Anthony. I work in the DevOps team at Panaseer. In this blog I’ll discuss how I’ve overcome the common challenge of maintaining lots of Terraform configuration to administer a vast multi-account public cloud estate. To solve this issue, I constructed a builder Terraform module that describes common desired functionality across many AWS regions and accounts. Modules within this builder module can be dynamically installed depending on the business requirements of the region and account, using a new Terraform 0.13 feature. This process has increased automation, reduced the operations overhead within the DevOps team and improved our confidence in operating securely in the cloud. Using this new approach, the number of files in our code base needed to maintain a single account has fallen by roughly 80%. Read on to find out more.

The Panaseer product is hosted using a multi-account, multi-region AWS cloud architecture to provide high guarantees of data separation between customers. This presents the challenge of ensuring that each cloud environment has a consistent configuration of infrastructure alongside a standardised and comprehensive suite of security features. Our software configuration is expressed using a selection of well scoped Terraform modules and deployment is abstracted across each account using Terragrunt.

It has been possible to keep our codebase reasonably DRY (don’t repeat yourself) by structuring our code following Terragrunt’s suggested approach. However, having to declare a Terragrunt configuration file for each desired module, per region and per account where we want the infrastructure deployed, creates opportunities for error and a maintenance overhead that grows as the solution scales with the number of accounts under management.

As an example, imagine a situation in which you have 100 production accounts each needing the installation of a new Terraform module in two AWS regions. To achieve this with the standard approach would require 200 new Terragrunt configuration files. Then you require a further Terraform module in a particular region across the same 100 accounts which means another 100 files to commit. And so on. Not only do you have at least 300 files to maintain but now you also have to remember some business logic around why particular modules are installed in some regions and not others. Nightmare.

We had too many files similar to this in our infrastructure git repo:

## terragrunt.hcl
terraform {
  source = "git@github.com:Panaseer/repo.git//modules/module?ref=v0.0.1"
}

include {
  path = find_in_parent_folders()
}

inputs = merge(
  {
    "this_region": "us-east-1"
  }
)

At Panaseer we decided to try things a bit differently. Terraform’s eagerly anticipated version 0.13 release contains a new feature which we found particularly exciting: the for_each attribute on a module. Instead of installing a module declaratively (i.e. writing a Terragrunt file whenever we want it), the for_each attribute can be used to set up a procedural evaluation, where logic expresses whether a given module is enabled in a region or not.

How we implemented a feature-flag approach to manage modules dynamically
First, we assembled all our common Terraform modules describing potential desired functionality into a new main.tf, together with all the required providers and variable declarations. Using Terraform 0.13, a for_each argument was added to each module block, allowing it to be switched on and off dynamically:

## main.tf
##
# Modules
##
module "aws_config" {
  providers = {
    aws = aws.aws_config
  }
  for_each = var.aws_config_enabled ? { RegionEnabled = "True" } : {}
  source   = "git@github.com:Panaseer/repo.git//modules/aws_config?ref=v0.0.1"
}

module "aws_config_aggregator" {
  providers = {
    aws = aws.aws_config_aggregator
  }
  for_each = var.aws_config_aggregator_enabled ? { RegionEnabled = "True" } : {}
  source   = "git@github.com:Panaseer/repo.git//modules/aws_config_aggregator?ref=v0.0.1"
}

module "guard_duty" {
  providers = {
    aws = aws.guard_duty
  }
  for_each = var.guard_duty_enabled ? { RegionEnabled = "True" } : {}
  source   = "git@github.com:Panaseer/repo.git//modules/guard_duty?ref=v0.0.1"
}

module "cloudtrail" {
  providers = {
    aws = aws.cloudtrail
  }
  for_each = var.cloudtrail_enabled ? { RegionEnabled = "True" } : {}
  source   = "git@github.com:Panaseer/repo.git//modules/cloudtrail?ref=v0.0.1"
}

##
# Variables
##
variable "aws_config_enabled" {
  type    = bool
  default = true
}

variable "aws_config_aggregator_enabled" {
  type    = bool
  default = false
}

variable "guard_duty_enabled" {
  type    = bool
  default = true
}

variable "cloudtrail_enabled" {
  type    = bool
  default = false
}

This new main.tf has become a region “builder” module, allowing complex, repeatable infrastructure for our product to be declared with minimal code. In the example above, a default value for enabling each module is set, and these defaults can be overridden per environment when necessary. In practice, the greatest benefit of this approach is realised when there are many common modules that need to be enabled by default.

Next, we created a Terragrunt configuration file for an AWS region which consumes this new builder Terraform module, instead of multiple configuration files which each consume and instantiate a single Terraform module:

## terragrunt.hcl
terraform {
  source = "git@github.com:Panaseer/repo.git//modules/builder?ref=v0.0.1"
}

include {
  path = find_in_parent_folders()
}

inputs = merge(
  {
    "this_region": "eu-west-1",
    "aws_config_aggregator_enabled": true
  }
)
Variables are merged as inputs in Terragrunt to enable and disable the desired functionality for an AWS region. In this example, the AWS Config aggregator module has been enabled for this specific region and will be installed. The AWS Config and GuardDuty modules are already enabled by default in the builder module and will also be installed. The CloudTrail module has not been enabled and will not be installed.

In this example, using the builder approach reduces the number of Terragrunt configuration files from three to one per region, per account versus the standard method. If the CloudTrail module is also required, the reduction is from four files to one per region, per account. The result is a Terragrunt configuration file per region rather than per app, per region, which is lean whilst maintaining the flexibility to choose which modules to install.
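To make the file reduction concrete, here is a hypothetical repository layout before and after adopting the builder module (the account name and paths are illustrative, not our actual structure):

```
# Before: one terragrunt.hcl per module, per region
live/account-1/eu-west-1/aws_config/terragrunt.hcl
live/account-1/eu-west-1/guard_duty/terragrunt.hcl
live/account-1/eu-west-1/aws_config_aggregator/terragrunt.hcl

# After: one terragrunt.hcl per region, consuming the builder module
live/account-1/eu-west-1/builder/terragrunt.hcl
```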

The benefits of Panaseer’s approach are significant if you have adopted the recommended practice of packaging your infrastructure into small units and have many common modules to deploy across many accounts. At Panaseer, we accepted the well-documented trade-offs of this approach versus the standard practice. Panaseer’s approach is similar to adding a dependencies configuration block in a terragrunt.hcl file, but it is more versatile for standardising configuration across multiple regions and accounts: you don’t have to maintain a dependencies block in each file, and the modules are not necessarily dependent on one another, which a dependencies block would misleadingly imply.
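For comparison, a minimal sketch of the standard Terragrunt dependencies mechanism looks something like this (the relative paths are illustrative):

```hcl
## terragrunt.hcl
terraform {
  source = "git@github.com:Panaseer/repo.git//modules/module?ref=v0.0.1"
}

# Declares an apply ordering between otherwise separate module directories
dependencies {
  paths = ["../aws_config", "../guard_duty"]
}
```

A block like this must be repeated and kept current in every file that needs it, and it implies an apply ordering between the modules even when none exists.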

In Panaseer’s implementation of this code architecture, default values for enabling modules across all customers are set in a common properties default_modules.json file which is loaded into Terragrunt across multiple regions and multiple accounts:

inputs = merge(
  jsondecode(file(find_in_parent_folders("default_modules.json"))),
  {
    "this_region": "eu-west-1",
    "aws_config_aggregator_enabled": true
  }
)
The result is that we never have to remember to install a security module in a region for a customer; it’s done automatically. The values can also be overridden on a per-account or per-region basis if we wish. All core security modules required in an account, such as GuardDuty and Inspector, are installed by default, which improves confidence in our cloud security, removes scope for error, increases automation and reduces operational costs.
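For illustration, a default_modules.json along these lines would express fleet-wide defaults; the keys match the builder module’s variable names, so jsondecode can feed them straight into inputs (the exact contents here are a hypothetical sketch):

```json
{
  "aws_config_enabled": true,
  "guard_duty_enabled": true,
  "aws_config_aggregator_enabled": false,
  "cloudtrail_enabled": false
}
```

Because the per-region map is merged after the decoded JSON, any key set in a region’s terragrunt.hcl overrides the fleet-wide default.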