Configure AWS S3 backend for Terraform

This is a demonstration of how to share a Terraform state file through an S3 bucket.

Let's say you are working on a project, and at some point you are no longer the only one working on and maintaining it. Your team needs a convenient way to work together, and sharing the state file is what lets more than one person run Terraform against the same infrastructure.
For this demonstration we will use a flat TF script that creates some minimal resources: a VPC, a public subnet with routes, and a t2.micro EC2 machine. The resources don't matter much; we are here to deal with the state file.

 

Prerequisites:

       Terraform installed and configured.

       AWS access with permissions to create resources.

What is a terraform state (.tfstate) file?

A .tfstate file is a crucial component of Terraform, an open-source infrastructure as code (IaC) software tool created by HashiCorp. Terraform uses the .tfstate file to manage and maintain the state of the infrastructure it provisions and manages.

Key Points about .tfstate File

    Purpose:

        The primary purpose of the .tfstate file is to keep track of the state of your infrastructure. It stores information about the resources that Terraform manages, such as virtual machines, databases, and networking components.

        It enables Terraform to know what resources exist, their current configurations, and any dependencies between them.

    Contents:

        The .tfstate file is a JSON-formatted file that contains a mapping between Terraform resource definitions and the real-world resources they correspond to.

        It includes details such as resource IDs, properties, and metadata about the infrastructure components.

 

   Functionality:

        Plan: When you run terraform plan, TF uses the .tfstate file to determine what changes need to be made to achieve the desired state defined in your configuration files.

        Apply: When you run terraform apply, TF updates the infrastructure according to the plan and then updates the .tfstate file to reflect the new state of the infrastructure.

        Refresh: Terraform can refresh the state file so that it accurately represents the current state of the infrastructure, using the terraform refresh command (deprecated in recent Terraform versions in favor of terraform apply -refresh-only).

    State Locking:

        Terraform supports state locking to prevent multiple TF processes from conflicting with each other. This is especially important in team environments where multiple users might run Terraform commands simultaneously.

    Remote State:

        For collaboration and team environments, Terraform supports storing the state file remotely using backends such as AWS S3, Azure Blob Storage, Google Cloud Storage, and Terraform Cloud. This gives every team member access to the same, up-to-date state file; combined with state locking, it keeps concurrent runs from conflicting.

        Remote state also provides additional benefits such as encryption, versioning, and state locking.

    Security:

        The .tfstate file may contain sensitive information such as access keys, passwords, and other secrets. Therefore, it’s crucial to handle it securely, restrict access, and avoid exposing it publicly.

For more information, see the Terraform documentation on state.

The default TF ‘backend’ configuration is local: without any specific config, the state is saved on your local machine, and you will find a ‘terraform.tfstate’ file created in the working directory right after running ‘terraform apply’.
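
To make that default visible, having no backend block at all behaves roughly like declaring the local backend explicitly. This is only a minimal sketch; the path shown is the local backend's default, written out for illustration.

```hcl
# Same effect as omitting the backend block: state is kept on local disk.
terraform {
  backend "local" {
    # Default location, relative to the working directory.
    path = "terraform.tfstate"
  }
}
```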

But if we want to configure it to be a remote state there are some steps to perform.

       1.    Create an S3 bucket to host the state.
       2.    Create a DynamoDB table to manage the lock mechanism.
       3.    Configure a Terraform backend that points to the S3 bucket.

Alright, as I said, we have this simple flat TF script. It can be anything you like; this is our starting point…

provider "aws" {
  region = "eu-central-1" 
}
resource "aws_vpc" "main" {
  cidr_block           = "10.0.50.0/23"
  enable_dns_support   = true
  enable_dns_hostnames = true
  tags = {
    Name = "main-vpc"
  }
}
resource "aws_subnet" "public" {
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.0.50.0/24"
  map_public_ip_on_launch = true
  availability_zone       = "eu-central-1a" 
  tags = {
    Name = "public-subnet"
  }
}
resource "aws_internet_gateway" "igw" {
  vpc_id = aws_vpc.main.id
  tags = {
    Name = "main-igw"
  }
}
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id
  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.igw.id
  }
  tags = {
    Name = "public-route-table"
  }
}
resource "aws_route_table_association" "public" {
  subnet_id      = aws_subnet.public.id
  route_table_id = aws_route_table.public.id
}
resource "aws_instance" "web" {
  ami                         = "ami-08188dffd130a1ac2" # Amazon Linux 2 AMI for eu-central-1
  instance_type               = "t2.micro"              # Free tier eligible instance type
  subnet_id                   = aws_subnet.public.id
  associate_public_ip_address = true
  tags = {
    Name = "web-instance"
  }
}

Let's start with number one: add the following code block to create the bucket that will host the state file.

resource "aws_s3_bucket" "terraform_state" {
    bucket = "terraform-state-elibukin"
    lifecycle {
        prevent_destroy = true
    }
    versioning {
        enabled = true
    }
    server_side_encryption_configuration {
        rule {
            apply_server_side_encryption_by_default {
                sse_algorithm = "AES256"
            }
        }
    }
}

Let's break down each part. This Terraform configuration defines an AWS S3 bucket with the following properties:

1.    Bucket Name: terraform-state-elibukin
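2.    Lifecycle: prevent_destroy makes Terraform reject any plan that would destroy the bucket, which protects it from accidental deletion.
3.    Versioning: Enables versioning so that every earlier version of the state file is kept and can be restored. This is a lifesaver!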
4.    Server-Side Encryption: Applies AES256 encryption to all objects stored in the bucket by default.
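
One caveat: in version 4 and later of the AWS provider, the inline versioning and server_side_encryption_configuration blocks are deprecated in favor of standalone resources. The sketch below shows what the same bucket might look like in that style, assuming a recent provider version. The public access block is an extra hardening step that is not part of the original script, and the resource labels are illustrative.

```hcl
resource "aws_s3_bucket" "terraform_state" {
  bucket = "terraform-state-elibukin"

  # Terraform refuses any plan that would destroy this bucket.
  lifecycle {
    prevent_destroy = true
  }
}

# Keep every version of the state file so older states can be restored.
resource "aws_s3_bucket_versioning" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Encrypt all objects at rest with AES256 by default.
resource "aws_s3_bucket_server_side_encryption_configuration" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}

# Assumption: not in the original script. Blocks all public access,
# since the state file may contain secrets.
resource "aws_s3_bucket_public_access_block" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
```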

Next we have to define the ‘DynamoDB’ table.
This DynamoDB table is typically used in conjunction with Terraform to manage state locking. State locking is a mechanism that ensures that only one process can modify the state at a time, preventing concurrent updates and potential state corruption. When Terraform performs actions that modify the state, it acquires a lock in the DynamoDB table, ensuring that no other process can make changes until the lock is released.

Add the following block to our TF script.

resource "aws_dynamodb_table" "terraform_lock" {
    name         = "terraform-state-lock"
    billing_mode = "PAY_PER_REQUEST"
    hash_key     = "LockID"
    attribute {
        name = "LockID"
        type = "S"
    }
}

Let’s break it down:

1.    resource "aws_dynamodb_table" "terraform_lock":   This line declares a resource of type aws_dynamodb_table with the name terraform_lock. Terraform uses this block to manage a DynamoDB table in AWS.
2.    name = "terraform-state-lock":   This specifies the name of the DynamoDB table as terraform-state-lock.
3.    billing_mode = "PAY_PER_REQUEST":   This sets the billing mode for the table to "PAY_PER_REQUEST", meaning you are billed based on the number of read and write requests rather than pre-provisioned capacity. This is useful for unpredictable workloads.
4.    hash_key = "LockID":   This specifies that the primary key for the table will be the LockID attribute. The primary key uniquely identifies each item in the table, and the S3 backend expects it to be named exactly LockID.
5.    attribute { … }:   This block defines the attributes for the DynamoDB table.
6.    name = "LockID":   This specifies the name of the attribute as LockID.
7.    type = "S":   This sets the data type of the LockID attribute to S, which stands for string.

Now, run the script to actually create the bucket and the table.

terraform apply
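
If you add a couple of output blocks before applying, Terraform prints the bucket and table names at the end of the run, which makes it easy to confirm the values the backend configuration will need. This is a small optional sketch; the output names are made up for illustration, while the resource references match the blocks defined above.

```hcl
# Hypothetical output names; the referenced resources come from the script above.
output "state_bucket_name" {
  description = "Name of the S3 bucket that will host the state file"
  value       = aws_s3_bucket.terraform_state.bucket
}

output "lock_table_name" {
  description = "Name of the DynamoDB table used for state locking"
  value       = aws_dynamodb_table.terraform_lock.name
}
```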

Alright, we created the bucket and the DynamoDB table. The final thing to do is to configure Terraform to use the S3 bucket as a backend.

Add the following block to the script.

terraform {
    backend "s3" {
        bucket         = "terraform-state-elibukin"
        key            = "global/s3/terraform.tfstate"
        region         = "eu-central-1"
        dynamodb_table = "terraform-state-lock"
        encrypt        = true
    }
}

Let’s break it down:

1.    terraform { … }:   This block specifies settings related to Terraform’s behavior and configurations, including backend configuration.
2.    backend "s3" { … }:   This indicates that the backend for storing Terraform state is AWS S3. The backend is where Terraform stores its state files, which track the state of your infrastructure.
3.    bucket = "terraform-state-elibukin":   This specifies the name of the S3 bucket where the Terraform state file will be stored. In this case, the bucket is named terraform-state-elibukin.
4.    key = "global/s3/terraform.tfstate":   This specifies the path within the S3 bucket where the state file will be stored. Here, the state file will be stored under global/s3/terraform.tfstate.
5.    region = "eu-central-1":   This specifies the AWS region where the S3 bucket is located. In this case, it is the eu-central-1 region (Frankfurt).
6.    dynamodb_table = "terraform-state-lock":   This specifies the name of the DynamoDB table that will be used for state locking. It must be the table's name (terraform-state-lock), not the Terraform resource label (terraform_lock). The table ensures that only one instance of Terraform can modify the state at a time, preventing concurrent changes that could corrupt the state.
7.    encrypt = true:   This indicates that the state file stored in the S3 bucket should be encrypted at rest using server-side encryption.
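
One thing to keep in mind is that a backend block cannot reference variables or resource attributes, which is why the values are hard-coded. If you would rather keep them out of the main script, Terraform supports partial backend configuration: the script declares only an empty backend "s3" {} block, and the settings are supplied at init time. Below is a minimal sketch of such a settings file, assuming it is called backend.hcl (the file name is an arbitrary choice) and using the bucket, key, and table names from this walkthrough.

```hcl
# backend.hcl (hypothetical file name), passed to Terraform with:
#   terraform init -backend-config=backend.hcl
bucket         = "terraform-state-elibukin"
key            = "global/s3/terraform.tfstate"
region         = "eu-central-1"
dynamodb_table = "terraform-state-lock"
encrypt        = true
```

This approach can be handy when the same script is used with different buckets per environment, since only the settings file changes.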

After adding the backend block, we have to run ‘terraform init’ again so the new configuration is applied. Terraform detects the backend change and prompts you to confirm copying the existing local state to the S3 bucket.

terraform init

After that procedure your local state file will still exist, but it will be empty; Terraform now reads and writes the tfstate file located in the S3 bucket.

This is how it looks after adding all the new stuff.

terraform {
    backend "s3" {
        bucket         = "terraform-state-elibukin"
        key            = "global/s3/terraform.tfstate"
        region         = "eu-central-1"
        dynamodb_table = "terraform-state-lock"
        encrypt        = true
    }
}

provider "aws" {
  region = "eu-central-1"
}

resource "aws_s3_bucket" "terraform_state" {
    bucket = "terraform-state-elibukin"

    lifecycle {
        prevent_destroy = true
    }

    versioning {
        enabled = true
    }

    server_side_encryption_configuration {
        rule {
            apply_server_side_encryption_by_default {
                sse_algorithm = "AES256"
            }
        }
    }
}

resource "aws_dynamodb_table" "terraform_lock" {
    name         = "terraform-state-lock"
    billing_mode = "PAY_PER_REQUEST"
    hash_key     = "LockID"

    attribute {
        name = "LockID"
        type = "S"
    }
}

resource "aws_vpc" "main" {
  cidr_block           = "10.0.50.0/23"
  enable_dns_support   = true
  enable_dns_hostnames = true

  tags = {
    Name = "main-vpc"
  }
}

resource "aws_subnet" "public" {
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.0.50.0/24"
  map_public_ip_on_launch = true
  availability_zone       = "eu-central-1a" 

  tags = {
    Name = "public-subnet"
  }
}

resource "aws_internet_gateway" "igw" {
  vpc_id = aws_vpc.main.id

  tags = {
    Name = "main-igw"
  }
}

resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.igw.id
  }

  tags = {
    Name = "public-route-table"
  }
}

resource "aws_route_table_association" "public" {
  subnet_id      = aws_subnet.public.id
  route_table_id = aws_route_table.public.id
}

resource "aws_instance" "web" {
  ami                         = "ami-08188dffd130a1ac2" # Amazon Linux 2 AMI for eu-central-1
  instance_type               = "t2.micro"              # Free tier eligible instance type
  subnet_id                   = aws_subnet.public.id
  associate_public_ip_address = true

  tags = {
    Name = "web-instance"
  }
}
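
For the rest of the team to use this backend, each person's AWS identity needs access to the bucket and the lock table. The following is a sketch of a narrowly scoped policy, based on the permissions the S3 backend documentation lists for state storage and DynamoDB locking. The data source and policy names are hypothetical, and the ARNs assume the bucket name, state key, region, and table name used in this walkthrough.

```hcl
# Hypothetical names; adjust the ARNs if your bucket, key, or table differ.
data "aws_iam_policy_document" "terraform_state_access" {
  # Terraform lists the bucket to locate the state object.
  statement {
    sid       = "ListStateBucket"
    actions   = ["s3:ListBucket"]
    resources = ["arn:aws:s3:::terraform-state-elibukin"]
  }

  # Read and write access limited to the state file itself.
  statement {
    sid       = "ReadWriteStateObject"
    actions   = ["s3:GetObject", "s3:PutObject"]
    resources = ["arn:aws:s3:::terraform-state-elibukin/global/s3/terraform.tfstate"]
  }

  # Acquire and release locks in the DynamoDB table.
  statement {
    sid = "StateLocking"
    actions = [
      "dynamodb:DescribeTable",
      "dynamodb:GetItem",
      "dynamodb:PutItem",
      "dynamodb:DeleteItem",
    ]
    resources = ["arn:aws:dynamodb:eu-central-1:*:table/terraform-state-lock"]
  }
}

resource "aws_iam_policy" "terraform_state_access" {
  name   = "terraform-state-access"
  policy = data.aws_iam_policy_document.terraform_state_access.json
}
```

The resulting policy can then be attached to whichever IAM group or role your teammates use to run Terraform.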