Installing NPM Dependencies with Terraform

Terraform has the capability to create a zip file containing Function code, and for simpler functions that’s fine. For more complex functions it quickly becomes necessary to install dependencies to avoid the need to write every line of code ourselves.
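
For reference, the simple case looks something like this (a minimal sketch, with placeholder paths):

data "archive_file" "simple" {
  type        = "zip"
  source_file = "${path.module}/src/index.js"
  output_path = "${path.module}/simple-bundle.zip"
}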

The problem with Terraform is that it only wants to do things that have a Provider resource. Installing dependencies with npm needs to happen outside of this, and while we could run npm install manually, there is no way to guarantee we will remember to do it every time we apply.

In order to keep this inside our Terraform workflow we need to abuse some resources a little.

Triggering Dependency Installation

To convince Terraform to install our dependencies, we need to know

  • Has package.json changed?
  • Has package-lock.json changed?
  • Do we have all the files we expected?

We also need to store something in the state file that we can compare to in the future.

resource "null_resource" "lambda_dependencies" {
  provisioner "local-exec" {
    command = "cd ${path.module}/src && npm install"
  }

  triggers = {
    index = sha256(file("${path.module}/src/index.js"))
    package = sha256(file("${path.module}/src/package.json"))
    lock = sha256(file("${path.module}/src/package-lock.json"))
    node = sha256(join("",fileset(path.module, "src/**/*.js")))
  }
}

A null_resource doesn’t do anything in itself, but it acts as a container that we can use to create one resource based on a set of trigger values.

local-exec allows us to run a system command, which in our case is to move into the Lambda Function source directory and execute npm install.

We can set triggers based on our requirements, i.e. has index.js, package.json or package-lock.json changed, by checking their sha256 hashes. We can also get a list of every .js file path in the source directory, join it into a single string, and take the sha256 of that string; note this hashes the file paths rather than their contents, so it tells us when files have been added or removed.
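
If you also wanted this trigger to fire when a nested .js file’s contents change, not just when files are added or removed, one option is to hash each file’s contents instead. This is a sketch, assuming Terraform 0.12.4 or later for the filesha256 function:

node = sha256(join("", [
  for f in fileset(path.module, "src/**/*.js") :
  filesha256("${path.module}/${f}")
]))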

These trigger values will be stored in the state file, so even if we run this on another machine we’re going to be able to tell if anything doesn’t match between the previous deployment and this one. If something doesn’t match, then the local-exec command will be executed, and our dependencies installed!

Building the Bundle

So far so good, but we need to enforce ordering in Terraform’s dependency graph. If we just add an archive_file data source there’s no guarantee it will wait for the dependencies to be installed before creating the bundle.

To fix this we add an intermediate resource called a null_data_source. These data sources take a set of inputs which can be computed from the outputs of other resources or from variables. Until all of the inputs are known, Terraform won’t allow anything depending on the null_data_source to be created or modified.

data "null_data_source" "wait_for_lambda_exporter" {
  inputs = {
    lambda_dependency_id = "${null_resource.lambda_dependencies.id}"
    source_dir           = "${path.module}/src/"
  }
}

The lambda_dependency_id input won’t be known to Terraform until the previous null_resource has completed, which means our npm install has either finished, or nothing had changed and it wasn’t required.

The source_dir input can be pre-computed by Terraform, and we will use this value to tell the archive_file what to add to the bundle.

Finally we create our archive_file:

data "archive_file" "lambda" {
  output_path = "${path.module}/lambda-bundle.zip"
  source_dir  = "${data.null_data_source.wait_for_lambda_exporter.outputs["source_dir"]}"
  type        = "zip"
}

So now we’re creating our archive_file.lambda from the path given to us by null_data_source.wait_for_lambda_exporter, which won’t give us an answer until null_resource.lambda_dependencies is satisfied, ensuring the correct dependency ordering is maintained!
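
From here the bundle can be wired into the Lambda Function as usual. As a minimal sketch (the function name, handler, runtime and role are placeholder assumptions, not part of the original setup):

resource "aws_lambda_function" "example" {
  function_name    = "example"
  filename         = data.archive_file.lambda.output_path
  source_code_hash = data.archive_file.lambda.output_base64sha256
  handler          = "index.handler"
  runtime          = "nodejs12.x"
  role             = aws_iam_role.lambda.arn # an IAM role assumed to exist elsewhere
}

The source_code_hash means the Function is only redeployed when the bundle contents actually change.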

Less than ideal

There is one minor issue with this method. On the first run there is no node_modules directory, so the null_resource.lambda_dependencies.triggers.node value gets an initial sha256 value, and that causes npm install to be executed. This changes the sha256 that would be computed for the src directory, so a second run will also create a new bundle. Subsequent runs won’t do this, but in an ideal world a single run should be all that is needed to converge our desired and actual states.
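
If the extra run bothers you, one possible mitigation (an untested sketch) is to exclude node_modules from the trigger hash using regexall, at the cost of no longer noticing a deleted node_modules directory:

triggers = {
  # ...
  node = sha256(join("", [
    for f in fileset(path.module, "src/**/*.js") :
    f if length(regexall("node_modules", f)) == 0
  ]))
}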

It’s certainly not perfect, but if you can tolerate this one caveat, it’s a relatively simple way to include node dependencies without having to do anything more elaborate than a normal deployment.

