Implementing Terraform for MenoTime Infrastructure
This guide provides a comprehensive approach to adopting Terraform for managing MenoTime's AWS infrastructure. Whether you're migrating existing resources or building new infrastructure from scratch, this guide covers architecture, implementation, and best practices specific to our environment.
Why Terraform?
The Case for Infrastructure as Code
Terraform transforms AWS infrastructure management from manual, error-prone processes into repeatable, version-controlled automation. For MenoTime, Terraform delivers several critical benefits:
Reproducibility & Consistency
- Every environment (dev, staging, production) follows the exact same configuration
- Reduces "works on my machine" issues in infrastructure
- Eliminates the manual misconfigurations that cause drift across environments
- All changes tracked in Git with a full audit trail

Region Migration Made Simple
- Currently in us-west-1 but migrating to us-east-1? Change one variable
- Re-create the entire infrastructure stack in minutes instead of days
- Disaster recovery and multi-region deployment become practical
- Test infrastructure changes in dev before applying them to production

Cost Control & Visibility
- Document exactly what resources exist and why
- Easily identify and remove orphaned resources
- Estimate costs before applying changes
- Prevent accidental infrastructure sprawl

Disaster Recovery
- The entire infrastructure definition lives in Git
- Recovery from a regional outage: apply Terraform in a new region
- Fewer manual steps mean a faster recovery time objective (RTO)

Team Collaboration
- Review infrastructure changes like application code
- Onboard new team members with existing infrastructure patterns
- Documentation lives alongside the code (as comments)
Recommended Project Structure
Organize your Terraform project to match MenoTime's architecture and environment separation:
terraform/
├── README.md
├── .terraform.lock.hcl # Dependency lock file (commit to Git)
├── .gitignore # Exclude local tfstate files
│
├── main.tf # Root module - orchestration
├── variables.tf # Root module input variables
├── outputs.tf # Root outputs
├── terraform.tfvars.example # Example tfvars template
├── versions.tf # Provider version constraints
│
├── environments/
│ ├── dev.tfvars # Dev environment variables
│ ├── staging.tfvars # Staging environment variables
│ └── prod.tfvars # Production environment variables
│
├── modules/
│ ├── vpc/
│ │ ├── main.tf
│ │ ├── variables.tf
│ │ └── outputs.tf
│ ├── rds/
│ │ ├── main.tf
│ │ ├── variables.tf
│ │ └── outputs.tf
│ ├── ecs/
│ │ ├── main.tf
│ │ ├── variables.tf
│ │ └── outputs.tf
│ ├── security/
│ │ ├── main.tf
│ │ ├── variables.tf
│ │ └── outputs.tf
│ ├── monitoring/
│ │ ├── main.tf
│ │ ├── variables.tf
│ │ └── outputs.tf
│ └── dns/
│ ├── main.tf
│ ├── variables.tf
│ └── outputs.tf
│
└── terraform-state/ # S3 + DynamoDB for remote state
├── main.tf
└── outputs.tf
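Each environment gets its own tfvars file. A minimal sketch of what environments/dev.tfvars could contain (the values are illustrative, not MenoTime's actual sizing):

```hcl
environment           = "dev"
aws_region            = "us-west-1"
rds_instance_class    = "db.t4g.medium" # smaller than a prod-sized instance
rds_allocated_storage = 50
container_image       = "123456789012.dkr.ecr.us-west-1.amazonaws.com/menotime-api:latest"
ecs_cpu               = 512
ecs_memory            = 1024
ecs_desired_count     = 1
enable_nat_gateway    = true
```

Select the file at plan time with terraform plan -var-file=environments/dev.tfvars.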
Getting Started
1. Install Terraform
Download and install Terraform v1.5+:
# macOS with Homebrew
brew tap hashicorp/tap
brew install hashicorp/tap/terraform
# Verify installation
terraform version
# Terraform v1.5.x
2. Initialize AWS Provider
Create versions.tf to define provider requirements:
terraform {
required_version = ">= 1.5"
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
}
}
provider "aws" {
  region = var.aws_region

  default_tags {
    tags = {
      Environment = var.environment
      Project     = "menotime"
      ManagedBy   = "terraform"
      # Avoid timestamp() in default_tags: its value changes on every plan,
      # which would show a tag diff on every resource, every run.
    }
  }
}
3. Set Up Remote State with S3 + DynamoDB
Remote state is critical for team collaboration. First, create the state infrastructure:
terraform-state/main.tf:
# S3 bucket for Terraform state
resource "aws_s3_bucket" "terraform_state" {
bucket = "menotime-terraform-state-${data.aws_caller_identity.current.account_id}"
}
# Enable versioning for state recovery
resource "aws_s3_bucket_versioning" "terraform_state" {
bucket = aws_s3_bucket.terraform_state.id
versioning_configuration {
status = "Enabled"
}
}
# Block public access to state
resource "aws_s3_bucket_public_access_block" "terraform_state" {
bucket = aws_s3_bucket.terraform_state.id
block_public_acls = true
block_public_policy = true
ignore_public_acls = true
restrict_public_buckets = true
}
# Enable encryption
resource "aws_s3_bucket_server_side_encryption_configuration" "terraform_state" {
bucket = aws_s3_bucket.terraform_state.id
rule {
apply_server_side_encryption_by_default {
sse_algorithm = "AES256"
}
}
}
# DynamoDB table for state locking
resource "aws_dynamodb_table" "terraform_locks" {
name = "menotime-terraform-locks"
billing_mode = "PAY_PER_REQUEST"
hash_key = "LockID"
attribute {
name = "LockID"
type = "S"
}
tags = {
Name = "menotime-terraform-locks"
}
}
data "aws_caller_identity" "current" {}
output "state_bucket" {
value = aws_s3_bucket.terraform_state.id
}
output "locks_table" {
value = aws_dynamodb_table.terraform_locks.name
}
After creating this infrastructure:
cd terraform-state
terraform init
terraform apply
# Note the bucket and table names in outputs
Then create terraform/backend.tf. Backend blocks cannot reference variables, so the values must be literals; replace the account ID below with the bucket name from the outputs above:
terraform {
  backend "s3" {
    bucket         = "menotime-terraform-state-123456789012"
    key            = "menotime/terraform.tfstate"
    region         = "us-west-1"
    dynamodb_table = "menotime-terraform-locks"
    encrypt        = true
  }
}
Initialize Terraform with remote state:
cd terraform
terraform init
# Terraform will confirm state migration to S3
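If you prefer one state file per environment rather than a single shared state, a common pattern (sketched here, not required by this setup) is partial backend configuration: omit key from backend.tf and supply the per-environment values from a separate file at init time. The file name below is a hypothetical convention:

```hcl
# environments/dev.backend.hcl (hypothetical file)
bucket         = "menotime-terraform-state-123456789012"
key            = "menotime/dev/terraform.tfstate"
region         = "us-west-1"
dynamodb_table = "menotime-terraform-locks"
encrypt        = true
```

Then run terraform init -backend-config=environments/dev.backend.hcl in each environment.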
Terraform Modules for Core Components
VPC Module
The VPC module manages network isolation, subnet placement, and NAT gateways.
modules/vpc/variables.tf:
variable "environment" {
type = string
description = "Environment name (dev, staging, prod)"
}
variable "vpc_cidr" {
type = string
description = "CIDR block for VPC"
default = "10.0.0.0/16"
}
variable "availability_zones" {
type = list(string)
description = "Availability zones for subnets"
default = ["us-west-1a", "us-west-1b"]
}
variable "enable_nat_gateway" {
type = bool
description = "Enable NAT Gateway for private subnets"
default = true
}
modules/vpc/main.tf:
# VPC
resource "aws_vpc" "main" {
cidr_block = var.vpc_cidr
enable_dns_hostnames = true
enable_dns_support = true
tags = {
Name = "${var.environment}-vpc"
}
}
# Public Subnets (for ALB, NAT Gateway)
resource "aws_subnet" "public" {
  count  = length(var.availability_zones)
  vpc_id = aws_vpc.main.id
  # Derive subnet CIDRs from the VPC CIDR so a change to var.vpc_cidr stays consistent
  cidr_block              = cidrsubnet(var.vpc_cidr, 8, count.index + 1)
  availability_zone       = var.availability_zones[count.index]
  map_public_ip_on_launch = true
  tags = {
    Name = "${var.environment}-public-subnet-${count.index + 1}"
  }
}
# Private Subnets (for ECS, RDS)
resource "aws_subnet" "private" {
  count             = length(var.availability_zones)
  vpc_id            = aws_vpc.main.id
  cidr_block        = cidrsubnet(var.vpc_cidr, 8, count.index + 101)
  availability_zone = var.availability_zones[count.index]
  tags = {
    Name = "${var.environment}-private-subnet-${count.index + 1}"
  }
}
# Internet Gateway
resource "aws_internet_gateway" "main" {
vpc_id = aws_vpc.main.id
tags = {
Name = "${var.environment}-igw"
}
}
# Public Route Table
resource "aws_route_table" "public" {
vpc_id = aws_vpc.main.id
route {
cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.main.id
}
tags = {
Name = "${var.environment}-public-rt"
}
}
resource "aws_route_table_association" "public" {
count = length(aws_subnet.public)
subnet_id = aws_subnet.public[count.index].id
route_table_id = aws_route_table.public.id
}
# Elastic IPs and NAT Gateways
resource "aws_eip" "nat" {
count = var.enable_nat_gateway ? length(var.availability_zones) : 0
domain = "vpc"
depends_on = [aws_internet_gateway.main]
tags = {
Name = "${var.environment}-eip-${count.index + 1}"
}
}
resource "aws_nat_gateway" "main" {
count = var.enable_nat_gateway ? length(var.availability_zones) : 0
allocation_id = aws_eip.nat[count.index].id
subnet_id = aws_subnet.public[count.index].id
depends_on = [aws_internet_gateway.main]
tags = {
Name = "${var.environment}-nat-${count.index + 1}"
}
}
# Private Route Table (routes through NAT Gateway)
resource "aws_route_table" "private" {
count = var.enable_nat_gateway ? length(var.availability_zones) : 0
vpc_id = aws_vpc.main.id
route {
cidr_block = "0.0.0.0/0"
nat_gateway_id = aws_nat_gateway.main[count.index].id
}
tags = {
Name = "${var.environment}-private-rt-${count.index + 1}"
}
}
resource "aws_route_table_association" "private" {
  count          = var.enable_nat_gateway ? length(aws_subnet.private) : 0
  subnet_id      = aws_subnet.private[count.index].id
  route_table_id = aws_route_table.private[count.index].id
}
# VPC Endpoints (S3, Secrets Manager)
data "aws_region" "current" {}

resource "aws_vpc_endpoint" "s3" {
  vpc_id = aws_vpc.main.id
  # Build the service name from the current region instead of hardcoding us-west-1
  service_name = "com.amazonaws.${data.aws_region.current.name}.s3"
  route_table_ids = concat(
    [aws_route_table.public.id],
    aws_route_table.private[*].id
  )
  tags = {
    Name = "${var.environment}-s3-endpoint"
  }
}
resource "aws_vpc_endpoint" "secrets_manager" {
  vpc_id              = aws_vpc.main.id
  service_name        = "com.amazonaws.${data.aws_region.current.name}.secretsmanager"
  vpc_endpoint_type   = "Interface"
  private_dns_enabled = true
  subnet_ids          = aws_subnet.private[*].id
  security_group_ids  = [aws_security_group.vpc_endpoints.id]
  tags = {
    Name = "${var.environment}-secrets-endpoint"
  }
}
# Security Group for VPC Endpoints
resource "aws_security_group" "vpc_endpoints" {
name = "${var.environment}-vpc-endpoints"
description = "Security group for VPC endpoints"
vpc_id = aws_vpc.main.id
ingress {
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = [var.vpc_cidr]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
Name = "${var.environment}-vpc-endpoints-sg"
}
}
modules/vpc/outputs.tf:
output "vpc_id" {
value = aws_vpc.main.id
}
output "public_subnet_ids" {
value = aws_subnet.public[*].id
}
output "private_subnet_ids" {
value = aws_subnet.private[*].id
}
output "vpc_endpoint_s3_id" {
value = aws_vpc_endpoint.s3.id
}
output "vpc_endpoints_security_group_id" {
value = aws_security_group.vpc_endpoints.id
}
RDS Module
Manages PostgreSQL databases with proper backup and security configuration.
modules/rds/variables.tf:
variable "environment" {
type = string
description = "Environment name"
}
variable "db_name" {
type = string
description = "Database name"
default = "menotime"
}
variable "db_username" {
type = string
description = "Master database username"
sensitive = true
}
variable "db_password" {
type = string
description = "Master database password"
sensitive = true
}
variable "instance_class" {
type = string
description = "RDS instance class"
default = "db.m7g.large"
}
variable "allocated_storage" {
type = number
description = "Allocated storage in GB"
default = 100
}
variable "backup_retention_period" {
type = number
description = "Backup retention days"
default = 30
}
variable "vpc_id" {
type = string
}
variable "private_subnet_ids" {
type = list(string)
}
variable "kms_key_id" {
type = string
description = "KMS key for encryption"
}
modules/rds/main.tf:
# DB Subnet Group
resource "aws_db_subnet_group" "main" {
name = "${var.environment}-db-subnet-group"
subnet_ids = var.private_subnet_ids
tags = {
Name = "${var.environment}-db-subnet-group"
}
}
# Security Group for RDS
resource "aws_security_group" "rds" {
name = "${var.environment}-rds"
description = "Security group for RDS"
vpc_id = var.vpc_id
ingress {
from_port = 5432
to_port = 5432
protocol = "tcp"
cidr_blocks = ["10.0.0.0/16"] # Adjust to match VPC CIDR
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
Name = "${var.environment}-rds-sg"
}
}
# RDS Instance
resource "aws_db_instance" "main" {
identifier = "${var.environment}-menotime-db"
engine = "postgres"
engine_version = "15.3"
instance_class = var.instance_class
allocated_storage = var.allocated_storage
db_name = var.db_name
username = var.db_username
password = var.db_password
parameter_group_name = aws_db_parameter_group.main.name
db_subnet_group_name = aws_db_subnet_group.main.name
vpc_security_group_ids = [aws_security_group.rds.id]
# Backup & Recovery
backup_retention_period = var.backup_retention_period
backup_window = "03:00-04:00"
maintenance_window = "mon:04:00-mon:05:00"
# Encryption
storage_encrypted = true
kms_key_id = var.kms_key_id
# Monitoring
enabled_cloudwatch_logs_exports = ["postgresql"]
monitoring_interval = 60
monitoring_role_arn = aws_iam_role.rds_monitoring.arn
# High Availability
multi_az            = var.environment == "prod"
deletion_protection = var.environment == "prod"
skip_final_snapshot = var.environment != "prod"
# Keep this identifier static: timestamp() changes on every plan and forces perpetual diffs
final_snapshot_identifier = "${var.environment}-menotime-db-final-snapshot"
tags = {
Name = "${var.environment}-menotime-db"
}
}
# Parameter Group
resource "aws_db_parameter_group" "main" {
family = "postgres15"
name = "${var.environment}-menotime-params"
# Statement logging (log_statement = "all" logs every query; consider "ddl" in prod to cut volume)
parameter {
  name  = "log_statement"
  value = "all"
}
parameter {
  name  = "log_duration"
  value = "on"
}
# Slow query logging: log anything over 1 second
parameter {
  name  = "log_min_duration_statement"
  value = "1000"
}
}
tags = {
Name = "${var.environment}-menotime-params"
}
}
# IAM Role for RDS Monitoring
resource "aws_iam_role" "rds_monitoring" {
name = "${var.environment}-rds-monitoring-role"
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [{
Action = "sts:AssumeRole"
Effect = "Allow"
Principal = {
Service = "monitoring.rds.amazonaws.com"
}
}]
})
}
resource "aws_iam_role_policy_attachment" "rds_monitoring" {
role = aws_iam_role.rds_monitoring.name
policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonRDSEnhancedMonitoringRole"
}
# RDS Enhanced Monitoring IAM Policy (required for CloudWatch)
resource "aws_cloudwatch_log_group" "rds_logs" {
name = "/aws/rds/instance/${var.environment}-menotime-db/postgresql"
retention_in_days = 7
tags = {
Name = "${var.environment}-rds-logs"
}
}
modules/rds/outputs.tf:
output "db_endpoint" {
value = aws_db_instance.main.endpoint
}
output "db_address" {
value = aws_db_instance.main.address
}
output "db_port" {
value = aws_db_instance.main.port
}
output "db_name" {
value = aws_db_instance.main.db_name
}
output "security_group_id" {
  value = aws_security_group.rds.id
}
output "db_instance_id" {
  value = aws_db_instance.main.identifier
}
ECS Module
Manages Fargate container orchestration with task definitions and services.
modules/ecs/variables.tf:
variable "environment" {
type = string
}
variable "container_port" {
type = number
default = 3000
}
variable "container_image" {
type = string
description = "Docker image URI from ECR"
}
variable "container_name" {
type = string
default = "menotime-api"
}
variable "cpu" {
type = number
description = "Task CPU (256, 512, 1024, 2048, 4096)"
}
variable "memory" {
type = number
description = "Task memory in MB (512, 1024, 2048, ...)"
}
variable "desired_count" {
type = number
description = "Number of tasks to run"
default = 2
}
variable "vpc_id" {
type = string
}
variable "private_subnet_ids" {
type = list(string)
}
variable "alb_target_group_arn" {
type = string
}
variable "log_group_name" {
type = string
}
variable "kms_key_id" {
type = string
}
variable "task_role_arn" {
type = string
}
variable "execution_role_arn" {
type = string
}
modules/ecs/main.tf:
# ECS Cluster
resource "aws_ecs_cluster" "main" {
name = "${var.environment}-menotime-cluster"
setting {
name = "containerInsights"
value = "enabled"
}
tags = {
Name = "${var.environment}-menotime-cluster"
}
}
# CloudWatch Log Group
resource "aws_cloudwatch_log_group" "ecs" {
name = "/ecs/${var.environment}-menotime"
retention_in_days = var.environment == "prod" ? 30 : 7
kms_key_id = var.kms_key_id
tags = {
Name = "${var.environment}-ecs-logs"
}
}
# ECS Task Definition
resource "aws_ecs_task_definition" "main" {
family = "${var.environment}-menotime"
network_mode = "awsvpc"
requires_compatibilities = ["FARGATE"]
cpu = var.cpu
memory = var.memory
execution_role_arn = var.execution_role_arn
task_role_arn = var.task_role_arn
container_definitions = jsonencode([{
name = var.container_name
image = var.container_image
essential = true
portMappings = [{
containerPort = var.container_port
hostPort = var.container_port
protocol = "tcp"
}]
logConfiguration = {
logDriver = "awslogs"
options = {
"awslogs-group" = aws_cloudwatch_log_group.ecs.name
"awslogs-region" = "us-west-1"
"awslogs-stream-prefix" = "ecs"
}
}
environment = [
{
name = "ENVIRONMENT"
value = var.environment
}
]
}])
tags = {
Name = "${var.environment}-menotime-task-def"
}
}
# Security Group for ECS
resource "aws_security_group" "ecs" {
name = "${var.environment}-ecs"
description = "Security group for ECS tasks"
vpc_id = var.vpc_id
ingress {
from_port = var.container_port
to_port = var.container_port
protocol = "tcp"
security_groups = [aws_security_group.alb.id]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
Name = "${var.environment}-ecs-sg"
}
}
# Security Group for ALB
resource "aws_security_group" "alb" {
name = "${var.environment}-alb"
description = "Security group for ALB"
vpc_id = var.vpc_id
ingress {
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
Name = "${var.environment}-alb-sg"
}
}
# ECS Service
resource "aws_ecs_service" "main" {
name = "${var.environment}-menotime-service"
cluster = aws_ecs_cluster.main.id
task_definition = aws_ecs_task_definition.main.arn
desired_count = var.desired_count
launch_type = "FARGATE"
network_configuration {
subnets = var.private_subnet_ids
security_groups = [aws_security_group.ecs.id]
assign_public_ip = false
}
load_balancer {
target_group_arn = var.alb_target_group_arn
container_name = var.container_name
container_port = var.container_port
}
depends_on = [aws_ecs_task_definition.main]
tags = {
Name = "${var.environment}-menotime-service"
}
}
# Auto Scaling Target
resource "aws_appautoscaling_target" "ecs" {
max_capacity = var.environment == "prod" ? 10 : 4
min_capacity = var.desired_count
resource_id = "service/${aws_ecs_cluster.main.name}/${aws_ecs_service.main.name}"
scalable_dimension = "ecs:service:DesiredCount"
service_namespace = "ecs"
}
# Auto Scaling Policy - CPU
resource "aws_appautoscaling_policy" "ecs_cpu" {
name = "${var.environment}-ecs-cpu-scaling"
policy_type = "TargetTrackingScaling"
resource_id = aws_appautoscaling_target.ecs.resource_id
scalable_dimension = aws_appautoscaling_target.ecs.scalable_dimension
service_namespace = aws_appautoscaling_target.ecs.service_namespace
target_tracking_scaling_policy_configuration {
predefined_metric_specification {
predefined_metric_type = "ECSServiceAverageCPUUtilization"
}
target_value = 70.0
}
}
# Auto Scaling Policy - Memory
resource "aws_appautoscaling_policy" "ecs_memory" {
name = "${var.environment}-ecs-memory-scaling"
policy_type = "TargetTrackingScaling"
resource_id = aws_appautoscaling_target.ecs.resource_id
scalable_dimension = aws_appautoscaling_target.ecs.scalable_dimension
service_namespace = aws_appautoscaling_target.ecs.service_namespace
target_tracking_scaling_policy_configuration {
predefined_metric_specification {
predefined_metric_type = "ECSServiceAverageMemoryUtilization"
}
target_value = 80.0
}
}
modules/ecs/outputs.tf:
output "cluster_name" {
value = aws_ecs_cluster.main.name
}
output "service_name" {
value = aws_ecs_service.main.name
}
output "ecs_security_group_id" {
value = aws_security_group.ecs.id
}
output "alb_security_group_id" {
value = aws_security_group.alb.id
}
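The task definition above injects only plain environment variables. Sensitive values should come from Secrets Manager instead, which ECS supports natively via a secrets list in the container definition. A sketch of what could be added alongside environment (db_password_secret_arn is a hypothetical new module variable, wired from the security module's db_password_secret_arn output):

```hcl
# Inside the container definition in aws_ecs_task_definition.main
secrets = [
  {
    name      = "DATABASE_PASSWORD"        # exposed to the app as an env var
    valueFrom = var.db_password_secret_arn # hypothetical new variable
  }
]
```

The task execution role already has secretsmanager:GetSecretValue on menotime/${environment}/* secrets (see the Security Module), so no extra IAM work is needed for this pattern.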
Security Module
Manages IAM roles, KMS encryption keys, and Secrets Manager integration.
modules/security/variables.tf:
variable "environment" {
type = string
}
variable "kms_key_deletion_window" {
type = number
default = 30
}
modules/security/main.tf:
# KMS Key for encryption
resource "aws_kms_key" "menotime" {
description = "KMS key for ${var.environment} MenoTime encryption"
deletion_window_in_days = var.kms_key_deletion_window
enable_key_rotation = true
tags = {
Name = "${var.environment}-menotime-key"
}
}
resource "aws_kms_alias" "menotime" {
name = "alias/${var.environment}-menotime"
target_key_id = aws_kms_key.menotime.key_id
}
# ECS Task Execution Role
resource "aws_iam_role" "ecs_task_execution_role" {
name = "${var.environment}-ecs-task-execution-role"
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [{
Action = "sts:AssumeRole"
Effect = "Allow"
Principal = {
Service = "ecs-tasks.amazonaws.com"
}
}]
})
}
resource "aws_iam_role_policy_attachment" "ecs_task_execution_role_policy" {
role = aws_iam_role.ecs_task_execution_role.name
policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
}
# ECS Task Execution Role - Secrets Manager Access
resource "aws_iam_role_policy" "ecs_task_execution_secrets" {
name = "${var.environment}-ecs-task-execution-secrets"
role = aws_iam_role.ecs_task_execution_role.id
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Effect = "Allow"
Action = [
"secretsmanager:GetSecretValue",
"secretsmanager:DescribeSecret"
]
Resource = "arn:aws:secretsmanager:us-west-1:*:secret:menotime/${var.environment}/*"
},
{
Effect = "Allow"
Action = [
"kms:Decrypt",
"kms:DescribeKey"
]
Resource = aws_kms_key.menotime.arn
}
]
})
}
# ECS Task Role (for application permissions)
resource "aws_iam_role" "ecs_task_role" {
name = "${var.environment}-ecs-task-role"
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [{
Action = "sts:AssumeRole"
Effect = "Allow"
Principal = {
Service = "ecs-tasks.amazonaws.com"
}
}]
})
}
# ECS Task Role - ECR Access
resource "aws_iam_role_policy" "ecs_task_ecr" {
name = "${var.environment}-ecs-task-ecr"
role = aws_iam_role.ecs_task_role.id
policy = jsonencode({
Version = "2012-10-17"
Statement = [{
Effect = "Allow"
Action = [
"ecr:GetAuthorizationToken",
"ecr:BatchGetImage",
"ecr:GetDownloadUrlForLayer"
]
Resource = "*"
}]
})
}
# ECS Task Role - S3 Access
resource "aws_iam_role_policy" "ecs_task_s3" {
name = "${var.environment}-ecs-task-s3"
role = aws_iam_role.ecs_task_role.id
policy = jsonencode({
Version = "2012-10-17"
Statement = [{
Effect = "Allow"
Action = [
"s3:GetObject",
"s3:PutObject",
"s3:DeleteObject",
"s3:ListBucket"
]
# The bucket name includes the account ID suffix (see the root module), so match with a wildcard
Resource = [
  "arn:aws:s3:::menotime-${var.environment}-*",
  "arn:aws:s3:::menotime-${var.environment}-*/*"
]
}]
})
}
# ECS Task Role - CloudWatch Logs
resource "aws_iam_role_policy" "ecs_task_logs" {
name = "${var.environment}-ecs-task-logs"
role = aws_iam_role.ecs_task_role.id
policy = jsonencode({
Version = "2012-10-17"
Statement = [{
Effect = "Allow"
Action = [
"logs:CreateLogGroup",
"logs:CreateLogStream",
"logs:PutLogEvents"
]
Resource = "arn:aws:logs:us-west-1:*:log-group:/ecs/${var.environment}-*"
}]
})
}
# Secrets Manager Secret
resource "aws_secretsmanager_secret" "db_password" {
name = "menotime/${var.environment}/db-password"
recovery_window_in_days = 7
kms_key_id = aws_kms_key.menotime.id
tags = {
Name = "${var.environment}-db-password"
}
}
resource "aws_secretsmanager_secret" "api_keys" {
name = "menotime/${var.environment}/api-keys"
recovery_window_in_days = 7
kms_key_id = aws_kms_key.menotime.id
tags = {
Name = "${var.environment}-api-keys"
}
}
resource "aws_secretsmanager_secret" "jwt_secret" {
name = "menotime/${var.environment}/jwt-secret"
recovery_window_in_days = 7
kms_key_id = aws_kms_key.menotime.id
tags = {
Name = "${var.environment}-jwt-secret"
}
}
modules/security/outputs.tf:
output "kms_key_id" {
value = aws_kms_key.menotime.id
}
output "kms_key_arn" {
value = aws_kms_key.menotime.arn
}
output "ecs_task_execution_role_arn" {
value = aws_iam_role.ecs_task_execution_role.arn
}
output "ecs_task_role_arn" {
value = aws_iam_role.ecs_task_role.arn
}
output "db_password_secret_arn" {
value = aws_secretsmanager_secret.db_password.arn
}
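Note that the module creates the secret containers but never their values. One way to seed the DB password (a sketch for modules/security/main.tf; db_password is shown as a hypothetical new module variable, and a random_password resource would work equally well):

```hcl
# Seed the secret's value; without a version, the secret exists but is empty
resource "aws_secretsmanager_secret_version" "db_password" {
  secret_id     = aws_secretsmanager_secret.db_password.id
  secret_string = var.db_password # hypothetical variable, or random_password.db.result
}
```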
Monitoring Module
Manages CloudWatch dashboards, alarms, and GuardDuty security monitoring.
modules/monitoring/variables.tf:
variable "environment" {
type = string
}
variable "kms_key_id" {
type = string
}
variable "db_instance_id" {
type = string
}
variable "rds_endpoint" {
type = string
}
variable "ecs_cluster_name" {
type = string
}
variable "ecs_service_name" {
  type = string
}
modules/monitoring/main.tf:
# CloudWatch Log Groups
resource "aws_cloudwatch_log_group" "guardduty" {
name = "/aws/guardduty/${var.environment}"
retention_in_days = var.environment == "prod" ? 90 : 30
kms_key_id = var.kms_key_id
tags = {
Name = "${var.environment}-guardduty-logs"
}
}
# GuardDuty
resource "aws_guardduty_detector" "main" {
enable = true
datasources {
s3_logs {
enable = true
}
kubernetes {
audit_logs {
enable = true
}
}
}
tags = {
Name = "${var.environment}-guardduty"
}
}
# CloudWatch Alarms - RDS CPU
resource "aws_cloudwatch_metric_alarm" "rds_cpu" {
  alarm_name          = "${var.environment}-rds-high-cpu"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 2
  metric_name         = "CPUUtilization"
  namespace           = "AWS/RDS"
  period              = 300
  statistic           = "Average"
  threshold           = 80
  alarm_description   = "Alert when RDS CPU > 80%"
  alarm_actions       = [aws_sns_topic.alarms.arn] # module-local topic (defined below)
  dimensions = {
    DBInstanceIdentifier = var.db_instance_id
  }
}
# CloudWatch Alarms - RDS Storage
resource "aws_cloudwatch_metric_alarm" "rds_storage" {
  alarm_name          = "${var.environment}-rds-low-storage"
  comparison_operator = "LessThanThreshold"
  evaluation_periods  = 1
  metric_name         = "FreeStorageSpace"
  namespace           = "AWS/RDS"
  period              = 300
  statistic           = "Average"
  threshold           = 10737418240 # 10 GB in bytes
  alarm_description   = "Alert when RDS free storage < 10 GB"
  alarm_actions       = [aws_sns_topic.alarms.arn]
  dimensions = {
    DBInstanceIdentifier = var.db_instance_id
  }
}
# CloudWatch Alarms - ECS CPU
resource "aws_cloudwatch_metric_alarm" "ecs_cpu" {
  alarm_name          = "${var.environment}-ecs-high-cpu"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 2
  metric_name         = "CPUUtilization"
  namespace           = "AWS/ECS"
  period              = 300
  statistic           = "Average"
  threshold           = 75
  alarm_description   = "Alert when ECS service CPU > 75%"
  alarm_actions       = [aws_sns_topic.alarms.arn]
  dimensions = {
    ClusterName = var.ecs_cluster_name
    ServiceName = var.ecs_service_name
  }
}
# CloudWatch Alarms - ECS Memory
resource "aws_cloudwatch_metric_alarm" "ecs_memory" {
  alarm_name          = "${var.environment}-ecs-high-memory"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 2
  metric_name         = "MemoryUtilization"
  namespace           = "AWS/ECS"
  period              = 300
  statistic           = "Average"
  threshold           = 85
  alarm_description   = "Alert when ECS service memory > 85%"
  alarm_actions       = [aws_sns_topic.alarms.arn]
  dimensions = {
    ClusterName = var.ecs_cluster_name
    ServiceName = var.ecs_service_name
  }
}
# SNS Topic for Alarms
resource "aws_sns_topic" "alarms" {
name = "${var.environment}-menotime-alarms"
kms_master_key_id = var.kms_key_id
tags = {
Name = "${var.environment}-alarms"
}
}
resource "aws_sns_topic_subscription" "alarms_email" {
topic_arn = aws_sns_topic.alarms.arn
protocol = "email"
endpoint = "ops-${var.environment}@timelessbiotech.com"
}
# CloudWatch Dashboard
resource "aws_cloudwatch_dashboard" "main" {
  dashboard_name = "${var.environment}-menotime"
  dashboard_body = jsonencode({
    widgets = [
      {
        type = "metric"
        properties = {
          # Dimensions are required for the widget to resolve a specific instance
          metrics = [
            ["AWS/RDS", "CPUUtilization", "DBInstanceIdentifier", var.db_instance_id],
            [".", "DatabaseConnections", ".", "."],
            [".", "FreeStorageSpace", ".", "."]
          ]
          period = 300
          stat   = "Average"
          region = "us-west-1"
          title  = "RDS Metrics"
        }
      },
      {
        type = "metric"
        properties = {
          # Note: AWS/ECS only publishes CPU and memory utilization; task counts
          # require the Container Insights (ECS/ContainerInsights) namespace
          metrics = [
            ["AWS/ECS", "CPUUtilization", "ClusterName", var.ecs_cluster_name, "ServiceName", var.ecs_service_name],
            [".", "MemoryUtilization", ".", ".", ".", "."]
          ]
          period = 300
          stat   = "Average"
          region = "us-west-1"
          title  = "ECS Service Metrics"
        }
      }
    ]
  })
}
modules/monitoring/outputs.tf:
output "sns_topic_arn" {
value = aws_sns_topic.alarms.arn
}
output "guardduty_detector_id" {
value = aws_guardduty_detector.main.id
}
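The RDS and ECS alarms cover compute; request-level failures surface on the ALB. A sketch of a 5xx alarm that could be added to this module's main.tf (it would need the load balancer's ARN suffix passed in as a new variable, shown here as the hypothetical alb_arn_suffix):

```hcl
resource "aws_cloudwatch_metric_alarm" "alb_5xx" {
  alarm_name          = "${var.environment}-alb-5xx"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 2
  metric_name         = "HTTPCode_Target_5XX_Count"
  namespace           = "AWS/ApplicationELB"
  period              = 300
  statistic           = "Sum"
  threshold           = 10
  alarm_description   = "Alert on sustained 5xx responses from the service"
  alarm_actions       = [aws_sns_topic.alarms.arn]
  dimensions = {
    LoadBalancer = var.alb_arn_suffix # hypothetical new variable
  }
}
```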
DNS Module
Manages Route 53 DNS records for menotime.ai domain.
modules/dns/variables.tf:
variable "environment" {
type = string
}
variable "domain_name" {
type = string
default = "menotime.ai"
}
variable "alb_dns_name" {
type = string
}
variable "alb_zone_id" {
type = string
}
variable "cloudfront_domain_name" {
type = string
}
variable "cloudfront_zone_id" {
type = string
}
modules/dns/main.tf:
# Route 53 Hosted Zone (assume it already exists)
data "aws_route53_zone" "menotime" {
name = var.domain_name
}
# DNS Record for ALB (staging/dev)
resource "aws_route53_record" "alb" {
count = var.environment != "prod" ? 1 : 0
zone_id = data.aws_route53_zone.menotime.zone_id
name = "${var.environment}.${var.domain_name}"
type = "A"
alias {
name = var.alb_dns_name
zone_id = var.alb_zone_id
evaluate_target_health = true
}
}
# DNS Record for CloudFront (production)
resource "aws_route53_record" "cloudfront" {
count = var.environment == "prod" ? 1 : 0
zone_id = data.aws_route53_zone.menotime.zone_id
name = var.domain_name
type = "A"
alias {
name = var.cloudfront_domain_name
zone_id = var.cloudfront_zone_id
evaluate_target_health = false
}
}
# MX Records for SES
resource "aws_route53_record" "mx" {
zone_id = data.aws_route53_zone.menotime.zone_id
name = var.domain_name
type = "MX"
ttl = 3600
records = [
"10 inbound-smtp.us-west-1.amazonaws.com"
]
}
# TXT Record for SPF (email authentication)
# Note: SES domain verification is a separate TXT record at _amazonses.<domain>
# whose value is the token from aws_ses_domain_identity (not shown here),
# and DKIM uses CNAME records, not TXT
resource "aws_route53_record" "spf" {
  zone_id = data.aws_route53_zone.menotime.zone_id
  name    = var.domain_name
  type    = "TXT"
  ttl     = 1800
  records = [
    "v=spf1 include:amazonses.com ~all"
  ]
}
modules/dns/outputs.tf:
output "zone_id" {
value = data.aws_route53_zone.menotime.zone_id
}
output "nameservers" {
value = data.aws_route53_zone.menotime.name_servers
}
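The ALB security group allows port 443, but the listener shown in the root module terminates only HTTP. Serving HTTPS needs an ACM certificate validated via DNS; a sketch of how that could look in this module (resource names are illustrative, the resources and attributes are the standard AWS provider ones):

```hcl
resource "aws_acm_certificate" "main" {
  domain_name       = var.environment == "prod" ? var.domain_name : "${var.environment}.${var.domain_name}"
  validation_method = "DNS"
  lifecycle {
    create_before_destroy = true
  }
}

# One validation record per domain option on the certificate
resource "aws_route53_record" "cert_validation" {
  for_each = {
    for dvo in aws_acm_certificate.main.domain_validation_options : dvo.domain_name => dvo
  }
  zone_id         = data.aws_route53_zone.menotime.zone_id
  name            = each.value.resource_record_name
  type            = each.value.resource_record_type
  ttl             = 300
  records         = [each.value.resource_record_value]
  allow_overwrite = true
}

# Blocks until validation completes; pass this ARN to an HTTPS listener
resource "aws_acm_certificate_validation" "main" {
  certificate_arn         = aws_acm_certificate.main.arn
  validation_record_fqdns = [for r in aws_route53_record.cert_validation : r.fqdn]
}
```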
Root Module Configuration
main.tf:
module "vpc" {
source = "./modules/vpc"
environment = var.environment
availability_zones = var.availability_zones
enable_nat_gateway = var.enable_nat_gateway
}
module "security" {
source = "./modules/security"
environment = var.environment
}
module "rds" {
source = "./modules/rds"
environment = var.environment
instance_class = var.rds_instance_class
allocated_storage = var.rds_allocated_storage
db_username = var.rds_username
db_password = var.rds_password
vpc_id = module.vpc.vpc_id
private_subnet_ids = module.vpc.private_subnet_ids
kms_key_id = module.security.kms_key_id
}
module "ecs" {
source = "./modules/ecs"
environment = var.environment
container_image = var.container_image
cpu = var.ecs_cpu
memory = var.ecs_memory
desired_count = var.ecs_desired_count
vpc_id = module.vpc.vpc_id
private_subnet_ids = module.vpc.private_subnet_ids
log_group_name = "menotime-${var.environment}"
kms_key_id = module.security.kms_key_id
task_role_arn = module.security.ecs_task_role_arn
execution_role_arn = module.security.ecs_task_execution_role_arn
alb_target_group_arn = aws_lb_target_group.menotime.arn
}
module "monitoring" {
  source           = "./modules/monitoring"
  environment      = var.environment
  kms_key_id       = module.security.kms_key_id
  db_instance_id   = module.rds.db_instance_id # the instance identifier, not its endpoint
  rds_endpoint     = module.rds.db_address
  ecs_cluster_name = module.ecs.cluster_name
  ecs_service_name = module.ecs.service_name
  # The SNS topic is created inside the module, so it is not passed in here:
  # a module cannot consume its own output without creating a cycle.
}
module "dns" {
source = "./modules/dns"
environment = var.environment
alb_dns_name = aws_lb.menotime.dns_name
alb_zone_id = aws_lb.menotime.zone_id
cloudfront_domain_name = var.cloudfront_domain_name
cloudfront_zone_id = var.cloudfront_zone_id
}
# Application Load Balancer
resource "aws_lb" "menotime" {
name = "${var.environment}-menotime-alb"
internal = false
load_balancer_type = "application"
security_groups = [module.ecs.alb_security_group_id]
subnets = module.vpc.public_subnet_ids
tags = {
Name = "${var.environment}-menotime-alb"
}
}
resource "aws_lb_target_group" "menotime" {
name = "${var.environment}-menotime-tg"
port = 3000
protocol = "HTTP"
vpc_id = module.vpc.vpc_id
target_type = "ip"
health_check {
healthy_threshold = 2
unhealthy_threshold = 2
timeout = 3
interval = 30
path = "/health"
matcher = "200"
}
tags = {
Name = "${var.environment}-menotime-tg"
}
}
resource "aws_lb_listener" "menotime" {
load_balancer_arn = aws_lb.menotime.arn
port = "80"
protocol = "HTTP"
default_action {
type = "forward"
target_group_arn = aws_lb_target_group.menotime.arn
}
}
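The listener above serves plain HTTP on port 80, which is fine for dev but not for production traffic. A minimal sketch of an HTTPS listener, assuming a hypothetical `var.acm_certificate_arn` variable holding an ACM certificate ARN (not defined elsewhere in this guide):

```hcl
# Sketch: TLS termination at the ALB. var.acm_certificate_arn is an
# assumption -- you would add it to variables.tf and each tfvars file.
resource "aws_lb_listener" "menotime_https" {
  load_balancer_arn = aws_lb.menotime.arn
  port              = 443
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-TLS13-1-2-2021-06"
  certificate_arn   = var.acm_certificate_arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.menotime.arn
  }
}
```

With this in place, the port-80 listener's default action can be switched to a `redirect` action (status `HTTP_301` to port 443) so all traffic is forced onto TLS.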
# S3 Bucket for assets
resource "aws_s3_bucket" "menotime" {
bucket = "menotime-${var.environment}-${data.aws_caller_identity.current.account_id}"
tags = {
Name = "${var.environment}-menotime-assets"
}
}
resource "aws_s3_bucket_versioning" "menotime" {
bucket = aws_s3_bucket.menotime.id
versioning_configuration {
status = "Enabled"
}
}
resource "aws_s3_bucket_server_side_encryption_configuration" "menotime" {
bucket = aws_s3_bucket.menotime.id
rule {
apply_server_side_encryption_by_default {
sse_algorithm = "aws:kms"
kms_master_key_id = module.security.kms_key_id
}
}
}
resource "aws_s3_bucket_public_access_block" "menotime" {
bucket = aws_s3_bucket.menotime.id
block_public_acls = true
block_public_policy = true
ignore_public_acls = true
restrict_public_buckets = true
}
data "aws_caller_identity" "current" {}
variables.tf:
variable "environment" {
type = string
description = "Environment (dev, staging, prod)"
}
variable "aws_region" {
type = string
description = "AWS region"
default = "us-west-1"
}
variable "availability_zones" {
type = list(string)
description = "Availability zones"
default = ["us-west-1a", "us-west-1b"]
}
variable "enable_nat_gateway" {
type = bool
description = "Enable NAT Gateway"
default = true
}
variable "rds_instance_class" {
type = string
description = "RDS instance class"
}
variable "rds_allocated_storage" {
type = number
description = "RDS allocated storage"
}
variable "rds_username" {
type = string
description = "RDS master username"
sensitive = true
}
variable "rds_password" {
type = string
description = "RDS master password"
sensitive = true
}
variable "container_image" {
type = string
description = "Container image URI"
}
variable "ecs_cpu" {
type = number
description = "ECS task CPU"
}
variable "ecs_memory" {
type = number
description = "ECS task memory"
}
variable "ecs_desired_count" {
type = number
description = "Desired number of ECS tasks"
}
variable "cloudfront_domain_name" {
type = string
description = "CloudFront domain name for prod"
default = ""
}
variable "cloudfront_zone_id" {
type = string
description = "CloudFront hosted zone ID"
default = "Z2FDTNDATAQYW2"
}
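Since several of these variables only accept a small set of values, a `validation` block catches typos at plan time instead of at apply time. For example, the `environment` variable could be hardened like this (optional, not required by anything else in this guide):

```hcl
# Sketch: reject any environment name outside the three known values
variable "environment" {
  type        = string
  description = "Environment (dev, staging, prod)"

  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "environment must be one of: dev, staging, prod."
  }
}
```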
outputs.tf:
output "alb_dns_name" {
value = aws_lb.menotime.dns_name
}
output "rds_endpoint" {
value = module.rds.db_endpoint
}
output "s3_bucket_name" {
value = aws_s3_bucket.menotime.id
}
Environment-Specific Configuration
Create separate *.tfvars files for each environment:
environments/dev.tfvars:
environment = "dev"
rds_instance_class = "db.m7g.large"
rds_allocated_storage = 100
rds_username = "menotime_admin"
container_image = "123456789012.dkr.ecr.us-west-1.amazonaws.com/menotime:latest"
ecs_cpu = 256
ecs_memory = 512
ecs_desired_count = 1
enable_nat_gateway = true
environments/staging.tfvars:
environment = "staging"
rds_instance_class = "db.m7g.large"
rds_allocated_storage = 150
rds_username = "menotime_admin"
container_image = "123456789012.dkr.ecr.us-west-1.amazonaws.com/menotime:staging"
ecs_cpu = 512
ecs_memory = 1024
ecs_desired_count = 2
enable_nat_gateway = true
environments/prod.tfvars:
environment = "prod"
rds_instance_class = "db.m7g.large"
rds_allocated_storage = 200
rds_username = "menotime_admin"
container_image = "123456789012.dkr.ecr.us-west-1.amazonaws.com/menotime:stable"
ecs_cpu = 1024
ecs_memory = 2048
ecs_desired_count = 3
enable_nat_gateway = true
cloudfront_domain_name = "d1234567890.cloudfront.net"
Using tfvars files:
# Deploy dev environment
terraform plan -var-file="environments/dev.tfvars"
terraform apply -var-file="environments/dev.tfvars"
# Deploy staging
terraform plan -var-file="environments/staging.tfvars"
terraform apply -var-file="environments/staging.tfvars"
# Deploy prod
terraform plan -var-file="environments/prod.tfvars"
terraform apply -var-file="environments/prod.tfvars"
CI/CD Integration with GitHub Actions
Create .github/workflows/terraform.yml:
name: Terraform CI/CD

on:
  push:
    branches: [main]
    paths:
      - 'terraform/**'
  pull_request:
    branches: [main]
    paths:
      - 'terraform/**'

jobs:
  terraform:
    runs-on: ubuntu-latest
    permissions:
      id-token: write      # required for OIDC role assumption
      contents: read
      pull-requests: write # required to comment the plan on PRs
    env:
      AWS_REGION: us-west-1
      TF_VERSION: 1.5.0
    defaults:
      run:
        working-directory: terraform
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v2
        with:
          terraform_version: ${{ env.TF_VERSION }}
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v2
        with:
          role-to-assume: arn:aws:iam::123456789012:role/github-actions-terraform
          aws-region: ${{ env.AWS_REGION }}
      - name: Terraform Init
        run: terraform init
      - name: Terraform Format
        run: terraform fmt -check
      - name: Terraform Validate
        run: terraform validate
      - name: Terraform Plan (Dev)
        if: github.event_name == 'pull_request'
        run: |
          terraform plan -var-file="environments/dev.tfvars" -out=dev.tfplan
          terraform show -no-color dev.tfplan > dev-plan.txt
      - name: Upload Plan
        if: github.event_name == 'pull_request'
        uses: actions/upload-artifact@v4
        with:
          name: tfplans
          path: terraform/*.tfplan
      - name: Comment PR
        if: github.event_name == 'pull_request'
        uses: actions/github-script@v6
        with:
          script: |
            const fs = require('fs');
            // Plan files are binary; read the rendered text output instead
            const plan = fs.readFileSync('terraform/dev-plan.txt', 'utf8');
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: `## Terraform Plan (Dev)\n\`\`\`\n${plan.substring(0, 500)}\n\`\`\``
            });
      - name: Terraform Apply (Dev)
        if: github.ref == 'refs/heads/main' && github.event_name == 'push'
        run: terraform apply -var-file="environments/dev.tfvars" -auto-approve
      - name: Terraform Apply (Staging)
        if: github.ref == 'refs/heads/main' && github.event_name == 'push'
        run: terraform apply -var-file="environments/staging.tfvars" -auto-approve

Note: if dev and staging share one backend state, the second apply will overwrite the first. Give each environment its own state via separate backend keys or Terraform workspaces before enabling both apply steps.
Importing Existing Resources into Terraform
For resources already created manually in AWS, use terraform import:
# Import existing RDS instance
terraform import module.rds.aws_db_instance.main dev-menotime-db
# Import existing ECS cluster
terraform import module.ecs.aws_ecs_cluster.main dev-menotime-cluster
# Import existing VPC
terraform import module.vpc.aws_vpc.main vpc-1234567890abcdef0
# Import existing security group
terraform import module.vpc.aws_security_group.rds sg-0123456789abcdef0
# Import S3 bucket
terraform import aws_s3_bucket.menotime menotime-dev-123456789012
After importing, update the corresponding Terraform code to match the imported resource configuration. Verify with:
terraform plan
# Should show "No changes. Your infrastructure matches the configuration."
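Terraform 1.5+ (the version pinned in the CI workflow above) also supports declarative import blocks, which keep the import in code review instead of in someone's shell history. A sketch for the S3 bucket:

```hcl
# Sketch: declarative import (Terraform 1.5+). On the next plan/apply,
# Terraform binds the existing bucket to aws_s3_bucket.menotime.
import {
  to = aws_s3_bucket.menotime
  id = "menotime-dev-123456789012"
}
```

Running `terraform plan -generate-config-out=generated.tf` will even draft matching resource configuration for imported objects, which you can then clean up and move into the modules.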
Region Migration with Terraform
Migrating from us-west-1 to us-east-1? Terraform makes this straightforward:
Step 1: Create a provider alias (for resources that must exist in both regions during cutover)
Add to versions.tf:
provider "aws" {
alias = "us_east_1"
region = "us-east-1"
}
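The alias is used by passing it explicitly to resources or modules that should land in the new region while the default provider still points at us-west-1. A sketch, assuming the vpc module accepts the inputs shown:

```hcl
# Sketch: stand up a parallel VPC in us-east-1 via the aliased provider,
# while the default "aws" provider keeps managing us-west-1 resources.
module "vpc_east" {
  source = "./modules/vpc"

  providers = {
    aws = aws.us_east_1
  }

  environment        = var.environment
  availability_zones = ["us-east-1a", "us-east-1b"]
}
```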
Step 2: Create migration script
#!/bin/bash
# migrate-region.sh - Migrate infrastructure to us-east-1
# Pull current state
terraform state pull > terraform.backup.state
# Create new state for us-east-1
terraform workspace new us-east-1
# Update variables for new region
export TF_VAR_aws_region="us-east-1"
export TF_VAR_availability_zones='["us-east-1a","us-east-1b"]'
# Apply in new region
terraform apply -var-file="environments/prod.tfvars"
# Update DNS to new region
# Manually verify before switching production traffic
Step 3: Execute migration
chmod +x migrate-region.sh
./migrate-region.sh
# Verify in us-east-1
terraform state list
# Only after verification, update Route 53 records
# and switch production traffic
Best Practices
1. State Management
- Always use remote state with S3 + DynamoDB locking
- Enable versioning on the S3 bucket for disaster recovery
- Never commit .tfstate files to Git; use a .gitignore:
# .gitignore
*.tfstate
*.tfstate.*
.terraform/
override.tf
override.tf.json
*_override.tf
*_override.tf.json
crash.log
.env
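The remote state itself is configured once in a backend block. A minimal sketch of the S3 + DynamoDB setup (the bucket name here is hypothetical; the lock table name matches the one used in the troubleshooting section below):

```hcl
# Sketch: S3 backend with DynamoDB state locking and encryption at rest
terraform {
  backend "s3" {
    bucket         = "menotime-terraform-state" # hypothetical bucket name
    key            = "menotime/terraform.tfstate"
    region         = "us-west-1"
    dynamodb_table = "menotime-terraform-locks"
    encrypt        = true
  }
}
```

Backend blocks cannot interpolate variables, so per-environment state keys are usually supplied at init time with `terraform init -backend-config="key=..."` or separated via workspaces.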
2. Secrets Handling
Never hardcode secrets in Terraform code:
# Bad - Never do this
db_password = "MySecurePassword123"
# Good - Use environment variables
export TF_VAR_rds_password=$(aws secretsmanager get-secret-value --secret-id menotime/prod/db-password --query SecretString --output text)
terraform apply
# Even better - Use Secrets Manager directly in application
# and reference via Secrets Manager secret ARN in container definitions
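A sketch of that "even better" approach: look up the secret in Terraform and hand its ARN to ECS, so the plaintext value never enters Terraform state. The secret name matches the one used in the export example above; the container-definition snippet is illustrative, not the guide's actual task definition:

```hcl
# Sketch: resolve the Secrets Manager secret ARN without reading its value
data "aws_secretsmanager_secret" "db_password" {
  name = "menotime/prod/db-password"
}

# Inside the ECS task's container definition (JSON-encoded), the "secrets"
# entry makes ECS inject the value at container start:
#   "secrets": [{
#     "name":      "DB_PASSWORD",
#     "valueFrom": "${data.aws_secretsmanager_secret.db_password.arn}"
#   }]
```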
3. Module Versioning
Create separate Git repositories for shared modules:
module "vpc" {
source = "git::https://github.com/timelessbiotech/terraform-aws-vpc.git?ref=v1.0.0"
}
module "ecs" {
source = "git::https://github.com/timelessbiotech/terraform-aws-ecs.git?ref=v1.0.0"
}
4. Code Review Workflow
Before applying to production:
# 1. Create branch
git checkout -b feature/update-ecs-cpu
# 2. Make changes
vim modules/ecs/variables.tf
# 3. Plan changes
terraform plan -var-file="environments/prod.tfvars" -out=prod.tfplan
# 4. Show plan to team
terraform show prod.tfplan
# 5. After approval, apply
terraform apply prod.tfplan
# 6. Commit to Git
git add terraform/
git commit -m "Increase ECS CPU to 1024"
git push origin feature/update-ecs-cpu
5. Cost Estimation
Before applying changes:
# Use Infracost for cost estimation
infracost breakdown --path terraform/ \
--terraform-var-file="environments/prod.tfvars"
# Output shows estimated cost impact of changes
6. State Locking Best Practices
- DynamoDB locking prevents concurrent modifications
- If a lock is stuck:
# First verify no other terraform processes are running
ps aux | grep terraform
# Then release the stale lock
terraform force-unlock <LOCK_ID>
# Retry the apply
Troubleshooting
Common Issues
"Error acquiring the lock: ... ConditionalCheckFailedException"
State is locked from a previous operation. Check if another deployment is running:
# List all locks
aws dynamodb scan --table-name menotime-terraform-locks
# Force unlock (use with caution)
terraform force-unlock <LOCK_ID>
"InvalidClientTokenId: The security token included in the request is invalid"
AWS credentials not configured or expired:
# Verify credentials
aws sts get-caller-identity
# Re-authenticate
aws configure
# or
export AWS_PROFILE=menotime-dev
"Terraform has detected inconsistencies in your configuration"
Likely caused by module changes. Sync state with real infrastructure (terraform refresh is deprecated since 0.15 in favor of -refresh-only):
terraform apply -refresh-only -var-file="environments/dev.tfvars"
terraform plan -var-file="environments/dev.tfvars"
"Error: Error reading EC2 Network Interface"
Resource was deleted outside of Terraform. Refresh and replan:
terraform apply -refresh-only
terraform plan
# The missing resource will be marked for recreation
Migration Checklist
If you're migrating existing infrastructure to Terraform:
- [ ] Install Terraform and configure AWS provider
- [ ] Create S3 bucket and DynamoDB table for remote state
- [ ] Set up Git repository for terraform/ directory
- [ ] Create module structure (vpc, rds, ecs, security, monitoring, dns)
- [ ] Write module code with input variables and outputs
- [ ] Create environment-specific tfvars files
- [ ] Import existing resources with terraform import ...
- [ ] Validate with terraform plan (should show no changes)
- [ ] Set up GitHub Actions workflow for plan/apply
- [ ] Create PR and perform code review
- [ ] Test apply in dev environment
- [ ] Gradually roll out to staging, then production
- [ ] Document any manual resources that can't be automated
- [ ] Set up cost monitoring with Infracost
- [ ] Train team on Terraform workflow