Environments¶
Environment Overview¶
Orofi runs three environments. All use the same Terraform modules and Kubernetes manifests — differences are controlled entirely through variable values, not separate code paths.
| Property | Development | Staging | Production |
|---|---|---|---|
| GCP Project | `orofi-dev-cloud` | `orofi-stage-cloud` | [NEEDS TEAM INPUT] |
| Domain | `*.dev.orofi.xyz` | `*.stage.orofi.xyz` | `*.orofi.xyz` |
| GKE Cluster | `orofi-dev-cloud-dev-k8s-cluster` | `orofi-stage-cloud-stage-k8s-cluster` | [NEEDS TEAM INPUT] |
| Region/Zone | `us-central1-a` | `us-central1-a` | [NEEDS TEAM INPUT] |
| VPC CIDR | `10.0.0.0/16` | `11.0.0.0/16` | [NEEDS TEAM INPUT] |
| Terraform State Bucket | `oro-dev-infra` | `oro-infra-stag` | `oro-infra-production` |
| Terraform State Prefix | `terraform/oro/dev` | `terraform/automation/staging` | `terraform/automation/production` |
| Terraform SA | `terraform-mnl@orofi-dev-cloud` | `orofi-mnl-sa-terraform@orofi-stage-cloud` | [NEEDS TEAM INPUT] |
| IaC Automation | Manual + Bitbucket | Manual + Bitbucket | Manual only (planned: Atlantis) |
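Since the environments share one set of Terraform modules, the differences in the table are typically expressed purely as variable values. A minimal sketch of what a staging tfvars file might look like — the variable names here are illustrative, not necessarily the repo's actual inputs:

```hcl
# stage.tfvars — illustrative variable names, values taken from the table above
project_id    = "orofi-stage-cloud"
zone          = "us-central1-a"
vpc_cidr      = "11.0.0.0/16"
cluster_name  = "orofi-stage-cloud-stage-k8s-cluster"
domain_suffix = "stage.orofi.xyz"
zero_trust    = false
```

The same module tree applied with `dev.tfvars` would produce the dev column; no environment-specific code paths are involved.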
What Changes Per Environment¶
Database Tier¶
Dev uses micro/HDD for cost savings. Staging uses a larger tier with SSD and regional HA to mirror production behavior.
| Property | Dev | Staging | Production |
|---|---|---|---|
| Instance tier | `db-f1-micro` | `db-n1-standard-1` | [NEEDS TEAM INPUT] |
| Disk type | `PD_HDD` | `PD_SSD` | [NEEDS TEAM INPUT] |
| Disk size | 20 GB | 100 GB | [NEEDS TEAM INPUT] |
| Availability | `ZONAL` | `REGIONAL` | [NEEDS TEAM INPUT] |
| Backup retention | [NEEDS TEAM INPUT] | 30 backups | [NEEDS TEAM INPUT] |
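These properties map directly onto the `settings` block of the `google_sql_database_instance` Terraform resource. A hedged sketch of the staging shape — the attribute names are the provider's real ones, but the resource name and database engine are assumptions:

```hcl
resource "google_sql_database_instance" "main" {
  name             = "orofi-stage-db"  # illustrative name
  database_version = "POSTGRES_15"     # assumed engine/version

  settings {
    tier              = "db-n1-standard-1"
    disk_type         = "PD_SSD"
    disk_size         = 100        # GB
    availability_type = "REGIONAL" # dev would use "ZONAL"

    backup_configuration {
      enabled = true
      backup_retention_settings {
        retained_backups = 30
      }
    }
  }
}
```

Dev differs only in the values (`db-f1-micro`, `PD_HDD`, 20 GB, `ZONAL`), which is what keeps both environments on the same module.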
GKE Autoscaling¶
Both clusters autoscale from 1 to 15 nodes. The autoscaler itself never drops below 1 node; the clusters only reach zero nodes (evicting all workloads) through the manual scale-down pipelines, which resize the node pools directly rather than going through the autoscaler.
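In Terraform, these bounds correspond to the `autoscaling` block of `google_container_node_pool` — a sketch with an illustrative pool name:

```hcl
resource "google_container_node_pool" "primary" {
  name     = "primary-pool" # illustrative
  cluster  = "orofi-dev-cloud-dev-k8s-cluster"
  location = "us-central1-a"

  autoscaling {
    min_node_count = 1
    max_node_count = 15
  }
}
```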
Kafka Configuration¶
| Property | Dev | Staging |
|---|---|---|
| Controller replicas | 1 | 3 |
| Broker replicas | 1 | 3 |
| Topic replication factor | 1 | 3 |
| Min ISR | N/A | 2 |
| Network threads | default | 8 |
| IO threads | default | 16 |
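The staging values correspond to standard Kafka broker settings. How they are injected depends on how Kafka is deployed (e.g. a Helm values file or a Strimzi `Kafka` CR), which this page does not specify — the fragment below only names the underlying broker properties:

```properties
# Staging broker overrides — real Kafka property names,
# delivery mechanism (Helm/Strimzi/etc.) is an assumption
default.replication.factor=3
min.insync.replicas=2
num.network.threads=8
num.io.threads=16
```

In dev, replication factor 1 with no `min.insync.replicas` means a single broker restart makes topics briefly unavailable, which is acceptable there but not in staging.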
In dev, Kafka runs as a single broker — no replication. Staging mirrors production HA configuration.
MongoDB¶
| Property | Dev | Staging |
|---|---|---|
| Replica set size | 1 | 3 |
| WiredTiger cache | 0.2 GB | [NEEDS TEAM INPUT] |
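The dev cache cap corresponds to the standard `mongod` configuration key — a minimal fragment:

```yaml
# Dev mongod.conf fragment — caps the WiredTiger cache at 0.2 GB
storage:
  wiredTiger:
    engineConfig:
      cacheSizeGB: 0.2
```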
Redis Backup Cadence¶
| Property | Dev | Staging |
|---|---|---|
| RDB snapshot interval | Every 12 hours | Every 6 hours |
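In `redis.conf` terms, these intervals map onto the `save <seconds> <changes>` directive, assuming snapshots are driven by the standard RDB mechanism:

```conf
# Dev: snapshot if at least 1 key changed in the last 12 h (43200 s)
save 43200 1
# Staging would use: save 21600 1  (6 h)
```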
Zero-Trust Firewall¶
Zero-trust network policy is enabled in dev and disabled in staging:
- Dev: `zero_trust = true` — the GCP firewall denies all traffic except from known IPs.
- Staging: `zero_trust = false` — the firewall is open and Istio controls access.
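One way a boolean flag like this can gate the firewall in Terraform is a conditional `count` on the deny rule — a hedged sketch, not the repo's actual resource or variable names:

```hcl
# Illustrative: deny-all ingress rule that exists only when zero_trust is set.
# Allow-rules for the known IPs would use a lower (higher-priority) number.
resource "google_compute_firewall" "deny_all_ingress" {
  count   = var.zero_trust ? 1 : 0
  name    = "deny-all-ingress"
  network = var.network # illustrative variable

  direction     = "INGRESS"
  priority      = 65000
  source_ranges = ["0.0.0.0/0"]

  deny {
    protocol = "all"
  }
}
```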
Deployment Controls¶
| Property | Dev | Staging | Production |
|---|---|---|---|
| ArgoCD auto-sync | [NEEDS TEAM INPUT] | [NEEDS TEAM INPUT] | [NEEDS TEAM INPUT] |
| Manual scale pipelines | Available | Available | N/A |
| Terraform automation | Manual CLI | Manual CLI | Manual only |
Why Three Environments?¶
Development is the integration environment where all services are deployed after a successful build. It should reflect the current state of the main (or develop) branch. It uses cheaper resources because cost matters more than reliability at this stage.
Staging is a production-mirror environment. It uses production-equivalent resource sizes (REGIONAL Cloud SQL, HA Kafka with 3 replicas) so that performance and reliability issues surface before production. Staging deployments precede production deployments.
Production is the live environment serving real users. [NEEDS TEAM INPUT: describe production deployment gate criteria — who approves, what tests must pass].
Manual Scale-Down / Scale-Up¶
To reduce costs, the dev and staging clusters can be scaled to zero nodes using Bitbucket pipeline custom triggers:
| Pipeline | Effect |
|---|---|
| `scale-down-dev` | Scales the dev GKE cluster to 0 nodes |
| `scale-up-dev` | Scales the dev GKE cluster back to normal |
| `scale-down-staging` | Scales the staging GKE cluster to 0 nodes |
| `scale-up-staging` | Scales the staging GKE cluster back to normal |
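A custom trigger like `scale-down-dev` would typically wrap a `gcloud container clusters resize` call in `bitbucket-pipelines.yml`. A sketch — the node-pool name and step layout are assumptions, only the pipeline name and cluster details come from this page:

```yaml
# bitbucket-pipelines.yml sketch (pool name "primary-pool" is illustrative)
pipelines:
  custom:
    scale-down-dev:
      - step:
          name: Scale dev cluster to zero
          script:
            - gcloud container clusters resize orofi-dev-cloud-dev-k8s-cluster
              --node-pool primary-pool --num-nodes 0
              --zone us-central1-a --quiet
```

The matching `scale-up-*` trigger would run the same command with the normal node count.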
These pipelines use the `k8s-scaler-cross` service account, which has `roles/container.admin` on the target clusters.
Scale-down impacts
Scaling to zero nodes will evict all pods. ArgoCD will re-deploy everything when the cluster scales back up. Expect 5–10 minutes for services to be fully healthy after scale-up.
Cross-Environment Access¶
The staging Private Service Connect (PSC) endpoint accepts auto-connections from orofi-devops-cloud. This allows the devops project to reach the staging Cloud SQL instance directly over the private network — used by migration tooling and Flyway.
The k8s-scaler-cross service account in staging has cross-project IAM bindings that allow scaling the dev cluster from staging automation.
Environment Variable Differences¶
Each environment has its own set of secrets in GCP Secret Manager, prefixed with `dev-` or `stage-`. The External Secrets Operator in each cluster only reads from that environment's secrets.
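With the External Secrets Operator, that prefix convention shows up in the `remoteRef.key` of each `ExternalSecret`. A hedged sketch — the store name, secret names, and keys are illustrative, only the operator and the prefix convention come from this page:

```yaml
# Illustrative ExternalSecret in the dev cluster
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: orders-service-secrets # hypothetical service
spec:
  secretStoreRef:
    name: gcp-secret-manager # illustrative store name
    kind: ClusterSecretStore
  target:
    name: orders-service-secrets
  data:
    - secretKey: DATABASE_URL
      remoteRef:
        key: dev-orders-database-url # dev- prefix scopes it to this environment
```

The staging cluster would reference `stage-`-prefixed keys via its own store, so a manifest promoted between environments never reads the wrong environment's secrets.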
See Environment Variables Guide for per-service configuration details.
See Also¶
- Infrastructure Topology — resource-level comparison
- Terraform Modules — module inputs that differ per environment
- Scaling Events Runbook — how to manually scale clusters