
How can I create two separate environments (dev and prod) for deploying AWS resources using Terraform within a single AWS account, and what’s the best approach for managing multiple data pipelines?

Key details:

  • I have one AWS account and want to set up two distinct environments for deploying resources using Terraform.

  • I’m using Bitbucket for version control, with separate branches for dev and prod.

When deploying resources:

  • For the dev environment, resource names should have a *_dev suffix

  • For the prod environment, resource names should have a *_prod suffix

I need to build around 30 data pipelines with the following characteristics:

  • Source: API endpoints

  • Targets: S3 or Redshift

Questions:

What’s the best way to structure this setup in Terraform to manage these two environments within a single AWS account?

Should I use a single repository to build all 30 data pipelines, or should I create separate repositories for each pipeline?

Here is the current structure which is running in my thoughts:

    project-root/
    ├── environments/
    │   ├── dev/
    │   │   ├── main.tf
    │   │   ├── variables.tf
    │   │   ├── outputs.tf
    │   │   ├── terraform.tfvars
    │   │   └── backend.tf
    │   └── prod/
    │       ├── main.tf
    │       ├── variables.tf
    │       ├── outputs.tf
    │       ├── terraform.tfvars
    │       └── backend.tf
    ├── modules/
    │   ├── lambda/
    │   │   ├── main.tf
    │   │   ├── variables.tf
    │   │   └── outputs.tf
    │   ├── step_function/
    │   │   ├── main.tf
    │   │   ├── variables.tf
    │   │   └── outputs.tf
    │   ├── iam/
    │   │   ├── main.tf
    │   │   ├── variables.tf
    │   │   └── outputs.tf
    │   └── eventbridge/
    │       ├── main.tf
    │       ├── variables.tf
    │       └── outputs.tf
    ├── lambda/
    │   ├── lambda1/
    │   │   ├── main.py
    │   │   └── requirements.txt
    │   └── lambda2/
    │       ├── main.py
    │       └── requirements.txt
    ├── step_functions/
    │   └── sage_log_processing.json
    ├── bitbucket-pipelines.yml
    └── README.md

2 Answers


  1. First of all, there is no single "best way". Apart from some general rules, it also depends on your preferred way of working.

    I can highly recommend Nicki Watt’s talk on Evolving your Infrastructure with Terraform: https://www.youtube.com/watch?v=wgzgVm7Sqlk

    To summarise it briefly:

    • Separate your terraform code into re-usable modules, which are collections of terraform resources.
    • Create logical components, which are compositions of the modules you created and optionally additional resources.
    • Create environment-specific configurations of those logical components.

    The key is to use the same code for every environment and let only the configuration differ, i.e. the suffix in your case.

    Your current structure already looks very much like this approach.
    The environments/*/main.tf should use only the logical components, with minimal configuration (e.g. the _dev suffix); see the sketch below.
    Without knowing anything about the pipelines and their logic, a data pipeline could itself be a logical component.
    Hence, keeping all data pipelines in one repository is feasible, whereas one repository per pipeline would introduce a lot of complexity.
    This follows the first rule of the 12-factor app, "One codebase tracked in revision control, many deploys": https://12factor.net/codebase
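
    A minimal sketch of such an environment configuration, assuming a hypothetical reusable data_pipeline module under modules/ (names and values are illustrative only):

    # environments/dev/main.tf
    module "orders_pipeline" {
      source = "../../modules/data_pipeline"

      name_suffix   = "_dev"               # the only thing that differs between environments
      target_bucket = "orders-data-dev"
      schedule      = "rate(1 hour)"
    }

    # environments/prod/main.tf contains the identical module call,
    # only with name_suffix = "_prod" and prod-specific values.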

    Consider separating the dev and prod environments into isolated VPCs (see the sketch below).
    Best practice would be to have separate AWS accounts: https://docs.aws.amazon.com/whitepapers/latest/organizing-your-aws-environment/organizing-your-aws-environment.html
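
    A rough sketch of per-environment VPC isolation within one account (the module layout and CIDR ranges are assumptions, not part of the original answer):

    # modules/network/main.tf
    variable "environment" { type = string }
    variable "cidr_block"  { type = string }

    resource "aws_vpc" "this" {
      cidr_block = var.cidr_block

      tags = {
        Name        = "pipelines-vpc-${var.environment}"
        Environment = var.environment
      }
    }

    # environments/dev/main.tf:  cidr_block = "10.0.0.0/16"
    # environments/prod/main.tf: cidr_block = "10.1.0.0/16"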

    Another talk I can recommend:

  2. To set up two separate environments (dev and prod) in a single AWS account using Terraform and manage multiple data pipelines, follow these steps:

    Directory Structure:

    Organize your project with separate directories for each environment and shared modules:

    project-root/
    ├── environments/
    │   ├── dev/
    │   │   └── terraform files...
    │   ├── prod/
    │   │   └── terraform files...
    ├── modules/
    │   ├── lambda/
    │   ├── step_function/
    │   ├── iam/
    │   └── eventbridge/
    ├── bitbucket-pipelines.yml
    

    Resource Naming:
    Define a suffix variable once in variables.tf and set its value per environment in terraform.tfvars:

    # variables.tf
    variable "suffix" {
      type = string
    }

    # environments/dev/terraform.tfvars:  suffix = "_dev"
    # environments/prod/terraform.tfvars: suffix = "_prod"

    resource "aws_s3_bucket" "data_bucket" {
      # S3 bucket names may not contain underscores, so swap "_" for "-" here
      bucket = "my-bucket${replace(var.suffix, "_", "-")}"
    }
    

    Terraform Backend:
    Give each environment its own backend configuration. Note that the backend block must sit inside a terraform block, and the S3 backend also requires a key:

    # environments/dev/backend.tf
    terraform {
      backend "s3" {
        bucket = "my-terraform-state-dev"
        key    = "dev/terraform.tfstate"   # set region here or via AWS_REGION
      }
    }

    # environments/prod/backend.tf: bucket "my-terraform-state-prod", key "prod/terraform.tfstate"
    

    Bitbucket Pipelines (Assumption):
    Define separate pipelines for the dev and prod branches, running terraform init before apply:

    image: hashicorp/terraform:latest

    pipelines:
      branches:
        dev:
          - step: { script: [ "cd environments/dev", "terraform init", "terraform apply -auto-approve" ] }
        prod:
          - step: { script: [ "cd environments/prod", "terraform init", "terraform apply -auto-approve" ] }
    

    Repo Strategy:
    For 30 pipelines, use either:

    1. Single Repo: one directory (or module instance) per pipeline, sharing the same modules and CI/CD; see the sketch below.
    2. Multiple Repos: separate repositories for stronger isolation, at the cost of duplicated tooling and maintenance overhead.
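
    If you stay with a single repository, one possible pattern (a sketch, assuming a hypothetical data_pipeline module and illustrative API URLs) is to drive all 30 pipelines from one map and stamp them out with for_each:

    # environments/dev/main.tf
    locals {
      pipelines = {
        orders  = { api_url = "https://api.example.com/orders",  target = "s3" }
        billing = { api_url = "https://api.example.com/billing", target = "redshift" }
        # ...one entry per pipeline, ~30 in total
      }
    }

    module "pipeline" {
      for_each = local.pipelines
      source   = "../../modules/data_pipeline"

      name    = "${each.key}${var.suffix}"
      api_url = each.value.api_url
      target  = each.value.target
    }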

    This setup modularizes resources, maintains environment separation, and integrates CI/CD with Bitbucket.
