The .tf file below is used to create an RDS Postgres instance with Multi-AZ and a read replica, and an EC2 instance for microservice deployment that will access the RDS instance. This is on AWS.
Running terraform apply is slow, taking about 15 minutes. terraform destroy takes about 20 minutes and fails with the following error. I end up running aws-nuke to clean up the resources that terraform leaves.
Can anyone see what is causing this?
aws_internet_gateway.example: Still destroying... [id=igw-0acac09118fd268dc, 20m0s elapsed]
╷
│ Error: deleting RDS DB Parameter Group (pg15): operation error RDS: DeleteDBParameterGroup, https response error StatusCode: 400, RequestID: 74a4dcce-118c-46d4-9f46-42bbd90334a2, InvalidDBParameterGroupState: One or more database instances are still members of this parameter group pg15, so the group cannot be deleted
│
│
╵
╷
│ Error: deleting EC2 Internet Gateway (igw-0acac09118fd268dc): detaching EC2 Internet Gateway (igw-0acac09118fd268dc) from VPC (vpc-045dd1b8ec6453e33): DependencyViolation: Network vpc-045dd1b8ec6453e33 has some mapped public address(es). Please unmap those public address(es) before detaching the gateway.
│ status code: 400, request id: 6d7c4aad-39e5-4c5a-b9fe-4443058b6109
│
│
╵
╷
│ Error: final_snapshot_identifier is required when skip_final_snapshot is false
│
│
╵
Here is my .tf file:
provider "aws" {
  region = "us-west-2"
}

# Create a VPC
resource "aws_vpc" "example" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
  enable_dns_support   = true
  tags = {
    Name = "example-vpc"
  }
}

# Create internet gateway
resource "aws_internet_gateway" "example" {
  vpc_id = aws_vpc.example.id
}

# Create subnets
resource "aws_subnet" "example" {
  vpc_id            = aws_vpc.example.id
  cidr_block        = "10.0.1.0/24"
  availability_zone = "us-west-2b"
  tags = {
    Name = "example-subnet-1"
  }
}

resource "aws_subnet" "example2" {
  vpc_id            = aws_vpc.example.id
  cidr_block        = "10.0.2.0/24"
  availability_zone = "us-west-2c"
  tags = {
    Name = "example-subnet-2"
  }
}

# Create route table
resource "aws_route_table" "example" {
  vpc_id = aws_vpc.example.id
  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.example.id
  }
}

# Associate route table with subnets
resource "aws_route_table_association" "example" {
  subnet_id      = aws_subnet.example.id
  route_table_id = aws_route_table.example.id
}

resource "aws_route_table_association" "example2" {
  subnet_id      = aws_subnet.example2.id
  route_table_id = aws_route_table.example.id
}

# Create security group for RDS
resource "aws_security_group" "rds_sg" {
  name        = "example-rds-sg"
  description = "Security group for RDS"
  vpc_id      = aws_vpc.example.id
  ingress {
    from_port   = 5432
    to_port     = 5432
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# Create security group for EC2
resource "aws_security_group" "ec2_sg" {
  name        = "example-ec2-sg"
  description = "Security group for EC2"
  vpc_id      = aws_vpc.example.id
  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_db_parameter_group" "pg15" {
  name   = "pg15"
  family = "postgres15"
  parameter {
    name  = "log_connections"
    value = "1"
  }
  lifecycle {
    create_before_destroy = true
  }
}

# Create RDS instance
resource "aws_db_instance" "example" {
  engine                  = "postgres"
  engine_version          = "15.4"
  instance_class          = "db.t3.micro"
  identifier              = "example-db"
  username                = "exampleuser"
  password                = "examplepassword"
  parameter_group_name    = "pg15"
  backup_retention_period = 5
  multi_az                = true
  publicly_accessible     = true
  vpc_security_group_ids  = [aws_security_group.rds_sg.id]
  db_subnet_group_name    = aws_db_subnet_group.example.name
  allocated_storage       = 20
  tags = {
    Name = "example-db"
  }
}

# Create RDS subnet group
resource "aws_db_subnet_group" "example" {
  name       = "example-db-subnet-group"
  subnet_ids = [aws_subnet.example.id, aws_subnet.example2.id]
}
2 Answers
You will have to configure the timeouts settings of the resources that take longer to destroy.
You may also need to add this configuration to other resources that exceed their default creation/deletion timeout. The defaults vary by resource; the internet gateway in the log above gave up after 20 minutes.
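A minimal sketch of what such a timeouts configuration might look like, assuming the resource names from the question and a provider version that supports a timeouts block on both resources. The durations are illustrative assumptions, not recommended values:

```hcl
resource "aws_db_instance" "example" {
  # ... engine, instance_class, and the other arguments from the question ...

  # Illustrative values only; adjust to how long the operations really take.
  timeouts {
    create = "60m"
    delete = "2h"
  }
}

resource "aws_internet_gateway" "example" {
  vpc_id = aws_vpc.example.id

  # Illustrative value only.
  timeouts {
    delete = "30m"
  }
}
```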
What you have here is an example of a "hidden dependency", where the remote API has some dependency relationships that Terraform cannot infer automatically based on the reference expressions in your configuration.
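Specifically, a publicly-accessible RDS instance needs a public IP address, and in EC2 a public IP address is allocated through an Internet Gateway. Therefore the RDS instance depends on your internet gateway, but the remote API infers that automatically by following several steps:
1. The RDS instance belongs to the DB subnet group.
2. The DB subnet group refers to the subnets.
3. The subnets belong to the VPC.
4. The VPC has the internet gateway attached to it.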
Terraform can infer the first, second, and third of these relationships, but the AWS provider is designed so that the relationship between VPC and internet gateway is configured "backwards", with the internet gateway referring to the VPC rather than the other way around.
When an infrastructure description has hidden dependencies like this, you’ll need to use the depends_on meta-argument to tell Terraform explicitly about the dependency relationship. In this case, the RDS instance effectively depends on the internet gateway, because internally RDS will request a public IP address from that internet gateway and so the internet gateway cannot be destroyed as long as the RDS instance is still using its assigned IP address.
To specify that, you can add a depends_on argument that lists aws_internet_gateway.example to your aws_db_instance configuration.
Terraform will then know that it must create the internet gateway before creating the RDS instance, and conversely that it must destroy the RDS instance before it can destroy the internet gateway.
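The explicit dependency described above could be sketched as follows, using the resource names from the question. The other two changes go beyond this answer and are assumptions aimed at the remaining errors in the question's log: referencing the parameter group by resource rather than by the literal string "pg15" gives Terraform a dependency it can infer on its own, and skip_final_snapshot assumes a throwaway environment where no final snapshot is wanted.

```hcl
resource "aws_db_instance" "example" {
  # ... engine, instance_class, and the other arguments from the question ...

  # Reference the resource instead of the literal "pg15" so that Terraform
  # sees the dependency and deletes the instance before the parameter group.
  parameter_group_name = aws_db_parameter_group.pg15.name

  # Assumption: a throwaway environment where no final snapshot is wanted.
  skip_final_snapshot = true

  # Hidden dependency: the instance's public IP address relies on the
  # internet gateway, which Terraform cannot infer from references alone.
  depends_on = [aws_internet_gateway.example]
}
```

Changes like these most likely need to be applied with terraform apply before running terraform destroy, since the destroy is planned from the state that was last recorded.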