August 24, 2022

Best practices for deploying S3 buckets using Terraform

Securing and deploying a Terraform S3 bucket can be challenging. In this post, we’ll go through some examples of using Terraform to define and enforce S3 bucket security standards across your organization.

The Simple Storage Service, or S3, is one of the oldest offerings from Amazon Web Services. It’s been around since day one and is the workhorse behind countless internal- and external-facing systems across the globe.

AWS customers have used S3 to store their public web assets, user uploads, configuration documents, or even sensitive items such as personnel or financial records.

Unfortunately, it can be very challenging to secure S3 buckets correctly, especially when dealing with hundreds or even thousands of them. Even a handful of buckets can feel overwhelming. 

Big companies like Verizon, Time Warner Cable, and Twilio have discovered, to their cost, how difficult S3 bucket security can be. Unsecured buckets have exposed personally identifiable information (PII), private internal IT information, and login credentials to the public. In Twilio’s case, the company left its SDK download location publicly writable, which could have resulted in malicious code injection.

Strategies for securing S3 buckets

Access permissions to S3 buckets can be hard to understand. There’s Identity and Access Management (IAM) to consider, there are Access Control Lists (ACLs), and there are a variety of policies on top of those. Security strategies for S3 have changed a lot over the past few years, and there’s a bewildering array of online tutorials explaining bucket security in seemingly conflicting ways. We’re going to keep things simple.

Unless you need to control access differently for individual S3 objects, there is no need to use ACLs. To keep things simple, we’ll disable those.

Once we have disabled ACLs, only policies remain as our security tool. There are two types of policy associated with S3:

  • Bucket policies are attached to buckets and describe which users can access them. 
  • User policies are attached to users and describe what buckets they can access.

Working with policies via the CLI is a pain. You must write your policies in JSON, create a file containing this JSON, and then run a command to attach the policy to the bucket. Policies require references to various users and buckets, so you’ve got a lot of information that needs copying and pasting from disparate sources.

Terraform really comes into its own here. Your users, buckets, and policies can all be defined in a single file. The user and bucket references needed by your policies are simple to define. Deployment occurs with a single command. Best of all, this pattern can be repeated over and over for subsequent buckets so that you can maintain consistent policies across your company.

Use cases for S3 buckets

Before we get to the details of S3 configuration in Terraform, let’s consider a few use cases for S3 buckets. One of the best-known is hosting a web page, site, or app from a bucket. In the early days of S3, the common approach was to create a bucket with a name that matched your site’s name, leave the bucket wide open for anyone to read, and call it done. However, this approach has several problems:

  1. You can only serve the site from a single domain.
  2. You can only serve via HTTP and not HTTPS.
  3. Permitting public read-only access is just a single “oops” away from permitting public read-write access.

The recommended approach these days is to create a bucket with whatever name you please, heavily restrict access, and serve the pages via a CloudFront CDN. This method provides the opportunity to serve the contents of a single bucket over HTTPS from any number of hostnames. It removes the risk of accidental write access. We’ll show you how to do this in Terraform soon.

Another use case for S3 buckets is supporting user uploads (and subsequent downloads or serving). In this case, we need to ensure that only the appropriate people can access uploaded assets at the appropriate time. Again, we’ll provide an example of how we do this with Terraform.

Getting started with Terraform and AWS

To manage AWS resources with Terraform, we need to tell Terraform how to talk to our AWS account.

The first step is to set up your AWS Provider. The specific details of authenticating this provider are beyond the scope of this post, but you can learn more in the Terraform Registry. In our example, we will rely on a profile called terraform, which defines the access key, access secret, and region.

Here’s the first part of our Terraform file:
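A minimal sketch (the version constraints here are illustrative; pin whatever versions you’ve tested against):

```hcl
terraform {
  # The Terraform CLI version we rely on.
  required_version = ">= 1.2.0"

  required_providers {
    aws = {
      # The AWS provider and the version range we want.
      source  = "hashicorp/aws"
      version = "~> 4.0"
    }
  }
}

provider "aws" {
  # Credentials and region come from the "terraform" profile
  # in your local AWS config.
  profile = "terraform"
}
```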

Here, we’ve identified the version of the provider we want to use and the version of Terraform that we will rely on.

Creating a bucket for a website

Creating the bucket itself is really simple:
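Something like this (the bucket name is illustrative; yours must be globally unique):

```hcl
resource "aws_s3_bucket" "frontend" {
  bucket = "my-website-frontend"
}
```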

The default ACL is private, so we’ve left it off here. Since we’re not using ACLs for access control, there’s no more configuration to worry about here.

Next, if you’re hosting resources that might get accessed from different domains, you can set up CORS rules, like so:
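A sketch, assuming your pages are served from www.mywebsite.com (swap in your own origins):

```hcl
resource "aws_s3_bucket_cors_configuration" "frontend" {
  bucket = aws_s3_bucket.frontend.id

  cors_rule {
    allowed_headers = ["*"]
    allowed_methods = ["GET", "HEAD"]
    allowed_origins = ["https://www.mywebsite.com"]
    max_age_seconds = 3000
  }
}
```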

You can add as many CORS rules as you need to your bucket, but they must all be in the same CORS Configuration resource.

At this point, you have an S3 bucket, but nobody can access the files you upload to it. We need to create a policy that permits access to the appropriate folks.

S3 access policies

For controlling access to the data in this bucket, we need to decide between a user and a bucket policy.

Since this is a website, you want people who aren’t users within your AWS account to be able to access the files. As a result, a user policy isn’t an option. 

Therefore, we’re going to have to go with a bucket policy. The problem is that a bucket policy says which users can access objects in a bucket.

We’ve just established that we don’t have any users to work with.

An origin access identity (OAI) comes to our rescue here.

An OAI is a special “user” associated with CloudFront, the CDN service provided by AWS. It allows the creation of a bucket policy that permits the OAI to access files within the S3 bucket. Let’s see how we do this with Terraform:
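A minimal sketch (the resource labels are our own):

```hcl
# The OAI itself needs no configuration.
resource "aws_cloudfront_origin_access_identity" "frontend" {}

# A policy document allowing the OAI to read objects from the bucket.
data "aws_iam_policy_document" "frontend" {
  statement {
    actions   = ["s3:GetObject"]
    resources = ["${aws_s3_bucket.frontend.arn}/*"]

    principals {
      type        = "AWS"
      identifiers = [aws_cloudfront_origin_access_identity.frontend.iam_arn]
    }
  }
}

# Attach the policy to the front-end bucket.
resource "aws_s3_bucket_policy" "frontend" {
  bucket = aws_s3_bucket.frontend.id
  policy = data.aws_iam_policy_document.frontend.json
}
```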

As you can see, the OAI has no properties to set. We then create a data resource for the IAM policy allowing the OAI to get objects from the S3 bucket. Finally, we attach that policy to the front-end bucket that we previously created.

Creating the CloudFront distribution

The final step is to create the CloudFront distribution that will give members of the public access to your files. The specifics of CloudFront configuration are beyond the scope of this post, so we’ll keep things simple here:
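A minimal sketch (the origin ID and cache settings are illustrative):

```hcl
resource "aws_cloudfront_distribution" "frontend" {
  # The S3 bucket is the origin, accessed via the OAI.
  origin {
    domain_name = aws_s3_bucket.frontend.bucket_regional_domain_name
    origin_id   = "s3-frontend"

    s3_origin_config {
      origin_access_identity = aws_cloudfront_origin_access_identity.frontend.cloudfront_access_identity_path
    }
  }

  enabled             = true
  default_root_object = "index.html"

  # Direct all GET and HEAD requests to the S3 origin.
  default_cache_behavior {
    allowed_methods        = ["GET", "HEAD"]
    cached_methods         = ["GET", "HEAD"]
    target_origin_id       = "s3-frontend"
    viewer_protocol_policy = "redirect-to-https"

    forwarded_values {
      query_string = false
      cookies {
        forward = "none"
      }
    }
  }

  # The restrictions block is required; here we disable geo restrictions.
  restrictions {
    geo_restriction {
      restriction_type = "none"
    }
  }

  # The simplest option: the default CloudFront certificate.
  viewer_certificate {
    cloudfront_default_certificate = true
  }
}
```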

A few things are happening here, so we’ll go through them individually.

First, we are creating an Origin. This is the source of files for a CDN distribution. We’re making our S3 bucket the origin and telling CloudFront to use the OAI we created to access the files. Next, we enable the distribution and set the default root object to index.html. This just means that folks who navigate to “www.mywebsite.com” without specifying a particular page will be served the index page.

The default cache behavior tells CloudFront how to handle inbound requests. CloudFront allows the creation of multiple “Cache Behaviors” which allow you to direct different requests to different origins. The default cache behavior is usually enough and, in our case, it directs all GET and HEAD requests to our S3 origin.

The restrictions block is a requirement in Terraform; here, we’re just disabling geographical restrictions.

Finally, we set up our SSL certificate. We’ve gone with the simplest version in our example, which is to use the default CloudFront certificate. For a real-world example, we’d create an SSL certificate that matches our public web domain and use that instead of the CloudFront default.

Putting it all together

With the use of 5 resources and a data block, we now have a very simple pattern for a website hosted in an S3 bucket and accessed through a CDN distribution. Files in the bucket are secure against inappropriate write access and can only be accessed via the CloudFront URL and not the S3 bucket URL. In a future post, we’ll talk about creating modules to make this even easier to standardize, but this should provide a good starting point.

Creating a bucket for user uploads

Things get a little more complicated when we want to allow users to upload files for future use. The key differences here are:

  1. We want to let unknown users write to our S3 bucket.
  2. We want to provide highly controlled access to these files for future reference.

As you’ll see, Terraform can help simplify this too.

Creating the bucket is the same as before. This time, though, we need a new CORS rule to support uploads:
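A sketch (the bucket name and allowed origin are illustrative):

```hcl
resource "aws_s3_bucket" "uploads" {
  bucket = "my-website-uploads"
}

resource "aws_s3_bucket_cors_configuration" "uploads" {
  bucket = aws_s3_bucket.uploads.id

  cors_rule {
    allowed_headers = ["*"]
    # PUT supports browser uploads; GET supports downloads.
    allowed_methods = ["PUT", "GET"]
    allowed_origins = ["https://www.mywebsite.com"]
    max_age_seconds = 3000
  }
}
```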

S3 access policies

For uploading, we’re going to employ a user policy. This is because uploads and downloads will occur via signed URLs and we will use an IAM user to generate signatures.

We start by creating a policy document that allows putting and getting objects to and from the bucket. We then create a policy from this document, create a user, and attach the policy to that user:
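A sketch (the user and policy names are our own):

```hcl
data "aws_iam_policy_document" "uploads" {
  statement {
    actions   = ["s3:PutObject", "s3:GetObject"]
    resources = ["${aws_s3_bucket.uploads.arn}/*"]
  }
}

resource "aws_iam_policy" "uploads" {
  name   = "uploads-access"
  policy = data.aws_iam_policy_document.uploads.json
}

resource "aws_iam_user" "uploader" {
  name = "uploads-signer"
}

resource "aws_iam_user_policy_attachment" "uploads" {
  user       = aws_iam_user.uploader.name
  policy_arn = aws_iam_policy.uploads.arn
}
```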

There’s actually another way to create a policy that removes the need for the policy document data block. The method is to embed the JSON directly in the policy resource:
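For instance, using Terraform’s jsonencode function (a heredoc JSON string works just as well):

```hcl
resource "aws_iam_policy" "uploads" {
  name = "uploads-access"

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect   = "Allow"
        Action   = ["s3:PutObject", "s3:GetObject"]
        Resource = "${aws_s3_bucket.uploads.arn}/*"
      }
    ]
  })
}
```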

Either approach is fine. Embedding the JSON does make it a little easier to move between your Terraform files and the policies in the AWS console, but it really boils down to your preference. You can even go one step further and create your policy in a separate JSON file and include it thus:
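A sketch, assuming a file named uploads-policy.json sits alongside your configuration (note that a plain JSON file can’t use Terraform interpolation, so the bucket ARN has to be written out literally):

```hcl
resource "aws_iam_policy" "uploads" {
  name   = "uploads-access"
  policy = file("${path.module}/uploads-policy.json")
}
```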

Again, whatever you prefer is fine.

Accessing this bucket

A simple way to access this bucket would be to use a service running on your server. An upload would involve transferring data from a user client to your server and then transferring it from your server to the bucket. Downloads would be the same, but from bucket to server to client. This means every file is effectively transferred twice.

Instead, pre-signed URLs can permit direct upload and download from the bucket. The user associated with the user policy created above can request these URLs. They are cryptographically signed, restricted in time (i.e., they can only be used in, say, the next 5 minutes), and only permitted for a GET or a PUT request. An SDK in the language of your choosing provides functions for creating these.

Next Steps

This just scratches the surface of what you can do with S3 buckets and how you might secure them. While Terraform provides the tools necessary to construct security policies simply and consistently across your products, it can be helpful to have a security partner. Take our free Terraform Security Test to see how you’re doing.