How can I create two separate environments (dev and prod) for deploying AWS resources using Terraform within a single AWS account, and what’s the best approach for managing multiple data pipelines?
Key details:
- I have one AWS account and want to set up two distinct environments for deploying resources using Terraform.
- I’m using Bitbucket for version control, with separate branches for dev and prod.
When deploying resources:
- For the dev environment, resource names should have a *_dev suffix.
- For the prod environment, resource names should have a *_prod suffix.
I need to build around 30 data pipelines with the following characteristics:
- Source: API endpoints
- Targets: S3 or Redshift
Questions:
What’s the best way to structure this setup in Terraform to manage these two environments within a single AWS account?
Should I use a single repository to build all 30 data pipelines, or should I create separate repositories for each pipeline?
Here is the structure I currently have in mind:
project-root/
├── environments/
│   ├── dev/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   ├── outputs.tf
│   │   ├── terraform.tfvars
│   │   └── backend.tf
│   └── prod/
│       ├── main.tf
│       ├── variables.tf
│       ├── outputs.tf
│       ├── terraform.tfvars
│       └── backend.tf
├── modules/
│   ├── lambda/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   ├── step_function/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   ├── iam/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   └── eventbridge/
│       ├── main.tf
│       ├── variables.tf
│       └── outputs.tf
├── lambda/
│   ├── lambda1/
│   │   ├── main.py
│   │   └── requirements.txt
│   └── lambda2/
│       ├── main.py
│       └── requirements.txt
├── step_functions/
│   └── sage_log_processing.json
├── bitbucket-pipelines.yml
└── README.md
2 Answers
First of all, there is no "best way". Beyond some general rules, it also depends on how you prefer to work.
I can highly recommend Nicki Watt’s talk on Evolving your Infrastructure with Terraform: https://www.youtube.com/watch?v=wgzgVm7Sqlk
To summarise it briefly:
The key is to have the same code for each environment, differing only in configuration, i.e. the suffix in your case.
Your current structure already looks pretty much like this approach.
The environments/*/main.tf files should only instantiate the logical components, with minimal environment-specific configuration (e.g. the _dev suffix; see the sketch below). Without knowing anything about the pipelines and their logic, a data pipeline could be such a logical component.
Hence, keeping all data pipelines in one repository is feasible; using one repository per pipeline would introduce a lot of complexity.
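As a sketch (the data_pipeline module name and its inputs are illustrative, not taken from the question), environments/dev/main.tf could be as thin as:

# environments/dev/main.tf (sketch; module name and inputs are assumptions)
module "sales_pipeline" {
  source = "../../modules/data_pipeline"

  env            = "dev"                              # drives the *_dev suffix inside the module
  source_api_url = "https://api.example.com/sales"
  target_bucket  = "sales-landing-dev"
}

The prod environment would contain the same module call with env = "prod" and the production values.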
This follows the first rule of the twelve-factor app, "One codebase tracked in revision control, many deploys": https://12factor.net/codebase
Consider separating the dev and prod environments into isolated VPCs. Best practice would be to use separate AWS accounts: https://docs.aws.amazon.com/whitepapers/latest/organizing-your-aws-environment/organizing-your-aws-environment.html
Another talk I can recommend:
To set up two separate environments (dev and prod) in a single AWS account using Terraform and manage multiple data pipelines, follow these steps:
Directory Structure:
Organize your project with separate directories for each environment and shared modules, as in the structure shown in the question.
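For example, the per-environment terraform.tfvars files could be the only place where the two environments differ (the values below are illustrative):

# environments/dev/terraform.tfvars (illustrative values)
env        = "dev"
aws_region = "eu-central-1"

# environments/prod/terraform.tfvars (illustrative values)
env        = "prod"
aws_region = "eu-central-1"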
Resource Naming:
In variables.tf, define a suffix for environment-specific resource names:
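A minimal sketch, assuming a single env variable drives the suffix (the resource and names below are illustrative):

# environments/*/variables.tf (sketch)
variable "env" {
  description = "Deployment environment; used as a resource-name suffix"
  type        = string

  validation {
    condition     = contains(["dev", "prod"], var.env)
    error_message = "env must be dev or prod."
  }
}

# Example usage inside a module: names end in _dev or _prod
resource "aws_cloudwatch_log_group" "pipeline_logs" {
  name = "/aws/lambda/ingest_sales_${var.env}"
}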
Terraform Backend:
Ensure unique backend configurations for each environment:
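A sketch, assuming the state lives in S3 with DynamoDB locking (bucket and table names are made up); only the key differs between environments:

# environments/dev/backend.tf (sketch; bucket/table names are assumptions)
terraform {
  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "data-pipelines/dev/terraform.tfstate"  # prod: data-pipelines/prod/terraform.tfstate
    region         = "eu-central-1"
    dynamodb_table = "terraform-locks"
    encrypt        = true
  }
}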
Bitbucket Pipelines (Assumption):
Define separate pipelines for the dev and prod branches:
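A minimal bitbucket-pipelines.yml sketch (the Terraform image tag and the steps are assumptions; adapt them to your actual workflow):

image: hashicorp/terraform:1.9

pipelines:
  branches:
    dev:
      - step:
          name: Deploy dev
          script:
            - cd environments/dev
            - terraform init -input=false
            - terraform plan -out=tfplan -input=false
            - terraform apply -input=false tfplan
    prod:
      - step:
          name: Deploy prod
          script:
            - cd environments/prod
            - terraform init -input=false
            - terraform plan -out=tfplan -input=false
            - terraform apply -input=false tfplan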
Repo Strategy:
For 30 pipelines, use either a single repository with each pipeline defined as a module instance, or one repository per pipeline; the single-repository approach keeps shared modules and CI/CD in one place (see the first answer).
This setup modularizes resources, maintains environment separation, and integrates CI/CD with Bitbucket.