
Security Incident Runbook

Severity: Critical


Symptoms

  • Unexpected API calls in GCP audit logs using a service account
  • Secret accessed from an IP outside known ranges
  • Anomalous Kubernetes API calls (unusual kubectl exec, pod creation in unexpected namespaces)
  • Leaked credentials in a public Git repository
  • Alerts from GCP Security Command Center
  • Reports of unauthorized data access

Impact

  • Potential data exfiltration
  • Service compromise or backdoor installation
  • Credential chain compromise (one leaked secret can expose others)
  • Regulatory/compliance implications

Prerequisites

  • GCP IAM Admin access
  • kubectl access to the affected cluster
  • Access to GCP audit logs (Cloud Logging)
  • Access to GCP Secret Manager

Steps

Phase 1: Contain (Immediate — within minutes)

Do not investigate before containing. Every minute of delay increases the blast radius.

1. Identify the compromised credential/resource

The first question: What was accessed and how?

# Check GCP audit logs for anomalous service account activity
gcloud logging read \
  'protoPayload.authenticationInfo.principalEmail="{suspect-sa}@{project}.iam.gserviceaccount.com"
   AND protoPayload.@type="type.googleapis.com/google.cloud.audit.AuditLog"' \
  --project=orofi-{env}-cloud \
  --freshness=1h \
  --format=json | jq '.[] | {time: .timestamp, method: .protoPayload.methodName, caller: .protoPayload.requestMetadata.callerIp}'

2. Revoke the compromised credential immediately

If a GCP service account key is compromised:

# List keys for the service account
gcloud iam service-accounts keys list \
  --iam-account={sa}@{project}.iam.gserviceaccount.com \
  --project={project}

# Delete the compromised key
gcloud iam service-accounts keys delete {key-id} \
  --iam-account={sa}@{project}.iam.gserviceaccount.com \
  --project={project}

If a GCP Secret Manager secret is compromised:

# Disable the current secret version
gcloud secrets versions disable {version} \
  --secret={secret-name} \
  --project={project}

If a Kubernetes service account token is compromised:

# Delete the token secret (applies to legacy long-lived token secrets; K8s will re-issue a new one)
kubectl delete secret {token-secret-name} -n {namespace}
# Note: on Kubernetes 1.24+ pods use short-lived projected tokens instead of
# secrets — delete or restart the pod to invalidate its token

3. If a pod is compromised — isolate it

# Apply a network policy that denies all traffic to/from the pod
# First, label the pod for isolation
kubectl label pod {pod-name} -n {namespace} security=isolated

# Apply a deny-all NetworkPolicy for pods with this label
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: isolate-compromised-pod
  namespace: {namespace}
spec:
  podSelector:
    matchLabels:
      security: isolated
  policyTypes:
  - Ingress
  - Egress
EOF

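Before terminating the pod, capture its state for forensics. A sketch — the output file names are illustrative, and `{pod-name}`/`{namespace}` are the placeholders used throughout this runbook:

```shell
# Preserve the pod spec, logs, and recent namespace events before deletion
kubectl get pod {pod-name} -n {namespace} -o yaml > evidence-pod-spec.yaml
kubectl logs {pod-name} -n {namespace} --all-containers --timestamps > evidence-pod-logs.txt
kubectl get events -n {namespace} --sort-by=.lastTimestamp > evidence-events.txt
```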
# Then, once evidence has been preserved, terminate the pod
kubectl delete pod {pod-name} -n {namespace} --grace-period=0 --force

Phase 2: Assess Scope

4. Determine what was accessed

# Review Secret Manager access logs
gcloud logging read \
  'resource.type="secretmanager.googleapis.com/Secret"
   AND protoPayload.methodName="google.cloud.secretmanager.v1.SecretManagerService.AccessSecretVersion"' \
  --project=orofi-{env}-cloud \
  --freshness=24h \
  --format=json | jq '.[] | {time: .timestamp, secret: .resource.labels.secret_id, caller: .protoPayload.authenticationInfo.principalEmail}'

# Review Cloud SQL access logs
gcloud logging read \
  'resource.type="cloudsql_database"' \
  --project=orofi-{env}-cloud \
  --freshness=24h \
  --limit=100

# Review Kubernetes audit logs for exec/attach/portforward
gcloud logging read \
  'resource.type="k8s_cluster"
   AND protoPayload.methodName=~"exec|attach|portforward"' \
  --project=orofi-{env}-cloud \
  --freshness=24h

5. Identify all credentials that may have been exposed

If a pod was compromised, it had access to:

  • All environment variables (including secrets mounted as env vars)
  • All volume-mounted secrets
  • The GCP service account (via Workload Identity)

List all secrets for the compromised namespace:

kubectl get secrets -n {namespace}
kubectl get externalsecrets -n {namespace}
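To narrow the list from "everything in the namespace" to what the compromised pod could actually read, you can inspect its spec. A sketch using jsonpath; `{pod-name}` and `{namespace}` are placeholders as above:

```shell
# Secrets referenced as environment variables by the pod's containers
kubectl get pod {pod-name} -n {namespace} \
  -o jsonpath='{range .spec.containers[*].env[*]}{.valueFrom.secretKeyRef.name}{"\n"}{end}' | sort -u

# Secrets mounted as volumes
kubectl get pod {pod-name} -n {namespace} \
  -o jsonpath='{range .spec.volumes[*]}{.secret.secretName}{"\n"}{end}' | grep -v '^$' | sort -u
```

Every secret that appears in either list should be treated as exposed and rotated in Phase 3.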


Phase 3: Remediate

6. Rotate ALL exposed credentials

For each compromised secret:

# Generate new value
NEW_SECRET=$(openssl rand -base64 32)

# Create a new version in Secret Manager
echo -n "$NEW_SECRET" | gcloud secrets versions add {secret-name} \
  --data-file=- --project={project}

# Disable the old version
gcloud secrets versions disable {old-version} \
  --secret={secret-name} --project={project}

# Force ESO resync
kubectl annotate externalsecret {secret-name} -n {namespace} \
  force-sync=$(date +%s) --overwrite

For database credentials: follow the rotation procedure in the Secrets Management Guide.
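For Cloud SQL specifically, the shape of a rotation looks like the following — a sketch only, assuming a Cloud SQL instance with password-authenticated users; `{instance}`, `{db-user}`, and `{db-secret-name}` are hypothetical placeholders, and the Secrets Management Guide remains authoritative:

```shell
# Generate a new password and apply it to the database user
NEW_DB_PASSWORD=$(openssl rand -base64 32)
gcloud sql users set-password {db-user} \
  --instance={instance} \
  --password="$NEW_DB_PASSWORD" \
  --project={project}

# Store it in Secret Manager so ESO syncs it to the cluster, then disable the old version
echo -n "$NEW_DB_PASSWORD" | gcloud secrets versions add {db-secret-name} \
  --data-file=- --project={project}
gcloud secrets versions disable {old-version} \
  --secret={db-secret-name} --project={project}
```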

7. Rotate the Firebase service account if exposed

[NEEDS TEAM INPUT: document Firebase service account key rotation procedure]

8. Rotate JWT private key (if identity service compromised)

High Impact

Rotating microservice-identity-jwt-private-key-secret will invalidate all active user sessions. All logged-in users will be signed out. Schedule during a maintenance window if possible, or proceed immediately if the compromise is confirmed.
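If the compromise is confirmed and rotation must proceed, the mechanics follow the same Secret Manager pattern as step 6. A sketch assuming the key is an RSA PEM — confirm the actual key type and format the identity service expects before running this:

```shell
# Generate a new RSA keypair (assumption: the service signs JWTs with RSA)
openssl genrsa -out jwt-private.pem 2048
openssl rsa -in jwt-private.pem -pubout -out jwt-public.pem

# Add the new private key as a new secret version
gcloud secrets versions add microservice-identity-jwt-private-key-secret \
  --data-file=jwt-private.pem --project={project}

# Restart the identity service so it picks up the new key
kubectl rollout restart deployment -n microservice-identity

# Remove the local key material once the rollout succeeds
shred -u jwt-private.pem
```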

9. Restart all services that used compromised credentials

for ns in microservice-communication microservice-identity microservice-monolith microservice-analytics \
  api-gateway-public api-gateway-account api-gateway-oro api-gateway-admin-dashboard; do
  kubectl rollout restart deployment -n $ns
done

10. Review and harden

  • Remove unused service account keys (all service accounts should use Workload Identity, not static keys)
  • Verify no secrets are committed to Git: git log --all --pickaxe-regex -S "password|secret|key" (note: without --pickaxe-regex, git log -S treats its argument as a literal string)
  • Check if the zero-trust firewall in dev is correctly configured
  • Review which GCP users have roles/secretmanager.secretAccessor on sensitive secrets
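The first hardening item can be checked mechanically. A sketch that enumerates user-managed (static) keys across the project's service accounts — under Workload Identity there should be none:

```shell
# Flag any service account carrying static (user-managed) keys
for sa in $(gcloud iam service-accounts list --project={project} --format='value(email)'); do
  keys=$(gcloud iam service-accounts keys list --iam-account="$sa" \
    --managed-by=user --format='value(name)' --project={project})
  [ -n "$keys" ] && echo "Static keys found on: $sa"
done
```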

Verification

# Confirm old credentials are disabled
gcloud secrets versions list {secret-name} --project={project}
# Old version should show state: DISABLED

# Confirm new credentials work
# (test by restarting the affected service and checking it starts cleanly)
kubectl rollout status deployment/{service} -n {namespace}

# Confirm no anomalous activity in the last 1 hour
gcloud logging read \
  'protoPayload.authenticationInfo.principalEmail="{suspect-sa}@{project}.iam.gserviceaccount.com"' \
  --project={project} --freshness=1h

Post-Incident

  1. Preserve evidence before making changes when possible (screenshot logs, export audit trails)
  2. Write a security incident report — include: timeline, what was accessed, how it was contained, what was rotated
  3. Notify affected parties — [NEEDS TEAM INPUT: legal/compliance notification requirements]
  4. Regulatory obligations — [NEEDS TEAM INPUT: does this trigger GDPR/SOC2 notification requirements?]
  5. Root cause analysis — how did the credential leak? Fix the process gap.
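For items 1 and 2, export the audit window to files that can be attached to the incident report. The time bounds below are illustrative — replace them with the actual incident window:

```shell
# Export all audit-log entries for the incident window as evidence
gcloud logging read \
  'protoPayload.@type="type.googleapis.com/google.cloud.audit.AuditLog"
   AND timestamp>="2024-01-01T00:00:00Z" AND timestamp<="2024-01-02T00:00:00Z"' \
  --project={project} --format=json > audit-evidence.json
```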

Escalation

  1. [NEEDS TEAM INPUT: security contact / CISO]
  2. [NEEDS TEAM INPUT: legal/compliance contact]
  3. [NEEDS TEAM INPUT: GCP security support]

See Also