Uploading File Trees to S3 with Terraform

Uploading a single file to S3 with Terraform is pretty simple, but sometimes you need to upload a whole folder. If you're serving the files straight from S3 as a website, or through CloudFront, you also need to make sure the correct MIME types and ETags are set. It's actually a whole lot simpler than you might think!

Single File Uploads

A single file is easy enough to upload:

resource "aws_s3_bucket_object" "single_file" {
  bucket       = aws_s3_bucket.web.id
  key          = "index.hmtl"
  source       = "${path.module}/content/index.html"
}

While that will upload the file, it will get a MIME type of application/octet-stream, which means that if you try to serve it through CloudFront the browser will download it rather than rendering it as the right document type. To set the right MIME type, all that's needed is:

resource "aws_s3_bucket_object" "content" {
  ...
  content_type = "text/html"
}

A browser will often send a conditional request to check whether a file has changed since it last downloaded it. The easiest way to answer that is the ETag header, which contains a hash of the file's contents. We can set that automatically too:

resource "aws_s3_bucket_object" "content" {
  ...
  etag         = filemd5("${path.module}/content/index.html")
}
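
Putting those pieces together, a complete single-file resource looks something like this (using the same bucket and content path as the snippets above):

resource "aws_s3_bucket_object" "single_file" {
  bucket       = aws_s3_bucket.web.id
  key          = "index.html"
  source       = "${path.module}/content/index.html"
  content_type = "text/html"
  etag         = filemd5("${path.module}/content/index.html")
}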

Extending to Multiple Files

Modern versions of Terraform let us create multiple resources from a single block using the for_each meta-argument. To get the list of files we want to upload we can use the fileset function, which collects the files at a path that match a pattern.

fileset takes two parameters: a root path and a pattern to match files against. The pattern can search a folder tree recursively using the ** globbing operator.
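
As a quick illustration, here's a hypothetical terraform console session showing what fileset returns, assuming the module's content folder contains just index.html and docs/about.html:

> fileset("${path.module}/content", "**/*.html")
toset([
  "docs/about.html",
  "index.html",
])

Wiring that into for_each, we could for example do something like: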

resource "aws_s3_bucket_object" "content" {
  for_each = fileset("../web/build", "**/*.hmtl")

  bucket       = aws_s3_bucket.web.id
  key          = each.key
  source       = "${path.module}/content/${each.key}"
  content_type = "text/html"
  etag         = filemd5("${path.module}/content/${each.key}")
}

Which is a pretty simple way of finding all the files ending in .html and adding them to the bucket with the right MIME type. It's a bit cumbersome working this way though, because now we need to add a new block for JS, another for CSS, one for JPGs, and so on.

Mime-Type Lookups

Using some clever indexing we can save ourselves an awful lot of effort by creating a map from file extensions to the MIME type each one should get. Inside the for_each loop we can dissect the filename to get the file extension, and then look up the MIME type it maps to.

Our map of MIME types looks like this:

locals {
  mime_types = {
    "css"  = "text/css"
    "html" = "text/html"
    "ico"  = "image/vnd.microsoft.icon"
    "js"   = "application/javascript"
    "json" = "application/json"
    "map"  = "application/json"
    "png"  = "image/png"
    "svg"  = "image/svg+xml"
    "txt"  = "text/plain"
  }
}

We need to cover every file type we have, and Terraform will fail to plan if a file extension is added that we don't have a type for, but at least we don't end up with the wrong MIME type!

To do the lookup we split the file name and path on the . character, take the last element, and use it as the key in a lookup against the map. Using a local actually gives us an object rather than a map, but we can convert it with the tomap function.
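
To sanity-check that expression, here's what it produces for a hypothetical key like assets/app.min.js in terraform console:

> element(split(".", "assets/app.min.js"), length(split(".", "assets/app.min.js")) - 1)
"js"

Putting it all together: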

resource "aws_s3_bucket_object" "content" {
  for_each = fileset("${path.module}/content", "**/*.*")

  bucket       = aws_s3_bucket.web.id
  key          = each.key
  source       = "${path.module}/content/${each.key}"
  content_type = lookup(tomap(local.mime_types), element(split(".", each.key), length(split(".", each.key)) - 1))
  etag         = filemd5("${path.module}/content/${each.key}")
}

Now every file will be found, mapped to a valid MIME type, have an ETag calculated, and be individually uploaded to S3! Since the etag is set, subsequent runs will only upload files that have changed!

