Terraform & Terragrunt¶

Before You Read¶

This page describes how infrastructure changes are made. For step-by-step change procedures see Change Management. For module documentation see Terraform Modules Reference.

Why Terragrunt?¶

Terraform alone handles resource provisioning but requires duplication when configuring multiple environments. Terragrunt adds:

DRY configuration — write module calls once, override only what differs per environment
Remote state management — each project/environment has its own GCS bucket and prefix
Dependency ordering — dependency blocks ensure modules apply in the right order
Before/after hooks — run scripts before or after Terraform commands
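
As a rough illustration of the four features above, a single Terragrunt unit might look like the sketch below. The file layout, the `root.hcl` parent file, the `../network` unit, the `network_name` output, the per-unit prefix, and the hook command are assumptions made for illustration and are not taken from this repository.

```hcl
# Hypothetical terragrunt.hcl for one unit (for example, a k8s unit in dev).
# All paths, names, and outputs below are illustrative assumptions.

# DRY configuration: inherit shared settings from a parent file.
include "root" {
  path = find_in_parent_folders("root.hcl") # assumed parent file name
}

terraform {
  # Module call written once; only inputs differ per environment.
  source = "../../../modules/k8s" # assumed relative path

  # Before hook: run a formatting check ahead of plan and apply.
  before_hook "fmt_check" {
    commands = ["plan", "apply"]
    execute  = ["terraform", "fmt", "-check"]
  }
}

# Remote state: a per-environment GCS bucket and prefix.
# In practice this block often lives in the shared parent file instead.
remote_state {
  backend = "gcs"
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }
  config = {
    bucket = "oro-dev-infra"
    prefix = "terraform/oro/dev/k8s" # assumed per-unit prefix
  }
}

# Dependency ordering: the network unit must be applied before this one.
dependency "network" {
  config_path = "../network" # assumed sibling unit
}

inputs = {
  network_name = dependency.network.outputs.network_name # assumed output name
}
```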

Repository Structure¶

infrastructure-management/
├── modules/                    ← Reusable Terraform modules (never apply directly)
│   ├── k8s/                   ← GKE cluster
│   ├── network/               ← VPC, subnets, firewall, NAT, static IPs
│   ├── datastore/             ← Cloud SQL MySQL
│   ├── redis/                 ← Cloud Memorystore Redis
│   ├── artifacts/             ← Artifact Registry repositories
│   ├── secretmanager/         ← GCP Secret Manager secrets
│   ├── service-accounts/      ← GCP service accounts + IAM + Workload Identity
│   ├── helm/                  ← Helm releases (Istio, cert-manager, ArgoCD, app ingress)
│   ├── dns/                   ← Cloud DNS records
│   ├── buckets/               ← GCS buckets
│   ├── users-access/          ← User IAM bindings
│   ├── cert-monitor/          ← Certificate monitoring (Python script + Docker)
│   ├── kms/                   ← Cloud KMS key rings and keys
│   ├── cloudsql-root-password/         ← Auto-generate + store root password
│   ├── cloudsql-microservice-credentials/ ← Per-service DB users + secrets
│   └── secretmanager-version/ ← Secret version management
│
└── projects/                   ← Environment-specific assemblies
    ├── orofi-dev/              ← Dev environment
    │   ├── local.tf            ← Variables + backend config
    │   ├── network.tf          ← Calls modules/network
    │   ├── k8s.tf              ← Calls modules/k8s
    │   ├── sql.tf              ← Calls modules/datastore
    │   ├── redis.tf            ← Calls modules/redis
    │   ├── service-accounts.tf ← Calls modules/service-accounts (×8 services)
    │   ├── secrets.tf          ← Calls modules/secretmanager (×40+ secrets)
    │   ├── dns.tf              ← Calls modules/dns
    │   └── ...
    ├── orofi-staging/          ← Staging environment (same structure)
    └── orofi-prod/             ← Production environment (sparse — managed manually)

State Management¶

Each environment stores its Terraform state in a dedicated GCS bucket:

Environment	Bucket	Prefix
Dev	`oro-dev-infra`	`terraform/oro/dev`
Staging	`oro-infra-stag`	`terraform/automation/staging`
Production	`oro-infra-production`	`terraform/automation/production`

State is never shared between environments, so applying a change in dev cannot modify staging or production state.

Backend Configuration (Dev Example)¶

# infrastructure-management/projects/orofi-dev/local.tf
terraform {
  backend "gcs" {
    bucket = "oro-dev-infra"
    prefix = "terraform/oro/dev"
  }
}

provider "google" {
  project = "orofi-dev-cloud"
  region  = "us-central1"
}

Module Architecture¶

Modules are pure Terraform — no Terragrunt. Each module accepts variables and creates GCP resources.
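
To make the "variables in, GCP resources out" shape concrete, here is a minimal sketch of what a small module could contain. It is not the real interface of any module in this repository: the variable names, the bucket name, and the output are assumptions.

```hcl
# Hypothetical contents for a small module such as modules/buckets.
# Variable and output names are assumptions, not the real module interface.

variable "project_id" {
  type        = string
  description = "GCP project that owns the bucket."
}

variable "env" {
  type        = string
  description = "Environment name, used as a naming suffix."
}

variable "region" {
  type        = string
  description = "Location for the bucket."
}

resource "google_storage_bucket" "this" {
  project                     = var.project_id
  name                        = "example-assets-${var.env}" # illustrative name
  location                    = var.region
  uniform_bucket_level_access = true
}

output "bucket_name" {
  value       = google_storage_bucket.this.name
  description = "Name of the created bucket, for use by callers."
}
```

A project directory would then call such a module with `source = "../../modules/buckets"` and pass its locals as inputs.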

The `modules/helm` Module¶

This is the most complex module. It deploys Kubernetes components via Helm:

Istio (base + istiod + ingressgateway + egressgateway)
cert-manager
ArgoCD
App Ingress (Gateway resource + ArgoCD VirtualService)

The Istio versions and values are defined in modules/helm/main.tf.
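
For orientation, Helm-based Istio installs in Terraform typically chain `helm_release` resources so that the CRDs land before the control plane and gateways. The sketch below shows that ordering only; it is not the contents of modules/helm/main.tf, and the `istio_version` variable, release names, and namespace choices are assumptions.

```hcl
# Illustrative only: not the contents of modules/helm/main.tf.
# Assumes a configured helm provider pointing at the target cluster.

variable "istio_version" {
  type        = string
  description = "Istio chart version to install (hypothetical variable)."
}

locals {
  istio_repo = "https://istio-release.storage.googleapis.com/charts"
}

# CRDs and cluster-wide resources.
resource "helm_release" "istio_base" {
  name             = "istio-base"
  repository       = local.istio_repo
  chart            = "base"
  version          = var.istio_version
  namespace        = "istio-system"
  create_namespace = true
}

# Control plane; needs the CRDs that the base chart installs.
resource "helm_release" "istiod" {
  name       = "istiod"
  repository = local.istio_repo
  chart      = "istiod"
  version    = var.istio_version
  namespace  = "istio-system"

  depends_on = [helm_release.istio_base]
}

# Ingress gateway; installed after the control plane is in place.
resource "helm_release" "istio_ingressgateway" {
  name       = "istio-ingressgateway"
  repository = local.istio_repo
  chart      = "gateway"
  version    = var.istio_version
  namespace  = "istio-system" # assumed; gateways are often given their own namespace

  depends_on = [helm_release.istiod]
}
```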

The `modules/cloudsql-microservice-credentials` Module¶

This module creates a complete per-service database identity:

1. A MySQL user in the Cloud SQL instance
2. A secret in GCP Secret Manager with the connection string
3. IAM binding for the microservice's service account to read the secret

This pattern ensures each microservice has its own database user and no service can access another service's database.
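
The three pieces of a per-service database identity could be wired together roughly as follows. This is a sketch under stated assumptions, not the module's actual code: the variable names, the use of `random_password`, the secret naming, and the connection string format are all hypothetical, and database creation and grants are left out.

```hcl
# Illustrative sketch; names are assumptions, not the real module interface.

variable "project_id" {
  type = string
}

variable "instance_name" {
  type = string
}

variable "db_host" {
  type = string
}

variable "service_name" {
  type = string
}

variable "service_account_email" {
  type = string
}

# Assumed: the password is generated by Terraform.
resource "random_password" "db" {
  length  = 32
  special = false
}

# 1. MySQL user in the Cloud SQL instance.
resource "google_sql_user" "service" {
  project  = var.project_id
  instance = var.instance_name
  name     = var.service_name
  host     = "%"
  password = random_password.db.result
}

# 2. Secret holding the connection string.
resource "google_secret_manager_secret" "conn" {
  project   = var.project_id
  secret_id = "${var.service_name}-db-connection" # assumed naming scheme

  replication {
    auto {} # google provider 5.x syntax; 4.x used `automatic = true`
  }
}

resource "google_secret_manager_secret_version" "conn" {
  secret = google_secret_manager_secret.conn.id
  # Assumed connection string format.
  secret_data = "mysql://${google_sql_user.service.name}:${random_password.db.result}@${var.db_host}:3306/${var.service_name}"
}

# 3. Only this service's account is granted read access to the secret.
resource "google_secret_manager_secret_iam_member" "reader" {
  project   = var.project_id
  secret_id = google_secret_manager_secret.conn.secret_id
  role      = "roles/secretmanager.secretAccessor"
  member    = "serviceAccount:${var.service_account_email}"
}
```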

The `modules/service-accounts` Module¶

For each microservice this module:

1. Creates a GCP service account
2. Grants IAM roles (workload identity + any storage/KMS roles)
3. Creates Workload Identity binding to the Kubernetes namespace/service account
4. Grants access to the service's secrets in Secret Manager
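
A minimal sketch of these four steps, assuming hypothetical variable names and a `-sa` naming suffix (the real module's inputs may differ):

```hcl
# Illustrative sketch; variable names and naming scheme are assumptions.

variable "project_id" {
  type = string
}

variable "service_name" {
  type = string
}

variable "k8s_namespace" {
  type = string
}

variable "k8s_service_account" {
  type = string
}

variable "project_roles" {
  type    = set(string)
  default = []
}

variable "secret_ids" {
  type    = set(string)
  default = []
}

# 1. GCP service account for the microservice.
resource "google_service_account" "service" {
  project      = var.project_id
  account_id   = "${var.service_name}-sa" # assumed naming scheme
  display_name = "Service account for ${var.service_name}"
}

# 2. Project-level roles (for example storage or KMS roles).
resource "google_project_iam_member" "roles" {
  for_each = var.project_roles
  project  = var.project_id
  role     = each.value
  member   = "serviceAccount:${google_service_account.service.email}"
}

# 3. Workload Identity: let the Kubernetes service account act as the GCP one.
resource "google_service_account_iam_member" "workload_identity" {
  service_account_id = google_service_account.service.name
  role               = "roles/iam.workloadIdentityUser"
  member             = "serviceAccount:${var.project_id}.svc.id.goog[${var.k8s_namespace}/${var.k8s_service_account}]"
}

# 4. Read access to the service's own secrets.
resource "google_secret_manager_secret_iam_member" "secrets" {
  for_each  = var.secret_ids
  project   = var.project_id
  secret_id = each.value
  role      = "roles/secretmanager.secretAccessor"
  member    = "serviceAccount:${google_service_account.service.email}"
}
```

The Kubernetes service account also needs the `iam.gke.io/gcp-service-account` annotation pointing at the GCP service account's email for the binding to take effect.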

Environment Variables and Inputs¶

Each project directory defines its variables in local.tf:

# infrastructure-management/projects/orofi-staging/local.tf
locals {
  project_id   = "orofi-stage-cloud"
  env          = "stage"
  region       = "us-central1"
  zone         = "us-central1-a"
  domain       = "*.stage.orofi.xyz"
  network_cidr = "11.0.0.0/16"
}

These locals flow into every module call in the same project directory:

# infrastructure-management/projects/orofi-staging/k8s.tf
module "k8s" {
  source     = "../../modules/k8s"
  project_id = local.project_id
  env        = local.env
  region     = local.region
  zone       = local.zone
  ...
}

Atlantis (Planned)¶

The atlantis-integration-plan.md file describes planned automation in which Bitbucket pull requests automatically trigger `terragrunt plan`, and changes are applied only after someone posts an `atlantis apply` comment on the PR.

Until Atlantis is running, all Terraform changes are applied manually by engineers with the appropriate service account credentials.

Planned Atlantis configuration:

- Server: GCE VM e2-medium in us-east1
- Autodiscover: All Terragrunt directories
- Workflow: terragrunt plan on PR open, terragrunt apply on comment
- Ignored paths: modules/, common/
- Parallel plans, sequential applies
- Production: Manual only

For current manual procedures see Change Management.

CI/CD for Infrastructure¶

The infrastructure-management/bitbucket-pipelines.yml handles:

1. Python linting for modules/cert-monitor/scripts/monitor.py
2. Building and pushing the cert-monitor Docker image on version tags

Infrastructure changes (Terraform apply) are not automated via CI — they require manual execution.