Security Incident Runbook
Severity: Critical
Symptoms
- Unexpected API calls in GCP audit logs using a service account
- Secret accessed from an IP outside known ranges
- Anomalous Kubernetes API calls (unusual kubectl exec, pod creation in unexpected namespaces)
- Leaked credentials in a public Git repository
- Alerts from GCP Security Command Center
- Reports of unauthorized data access
Impact
- Potential data exfiltration
- Service compromise or backdoor installation
- Credential chain compromise (one leaked secret can expose others)
- Regulatory/compliance implications
Prerequisites
- GCP IAM Admin access
- kubectl access to the affected cluster
- Access to GCP audit logs (Cloud Logging)
- Access to GCP Secret Manager
Steps
Phase 1: Contain (Immediate, within minutes)
Do not investigate before containing. Every minute of delay increases the blast radius.
1. Identify the compromised credential/resource
The first question: What was accessed and how?
# Check GCP audit logs for anomalous service account activity
gcloud logging read \
'protoPayload.authenticationInfo.principalEmail="{suspect-sa}@{project}.iam.gserviceaccount.com"
AND protoPayload.@type="type.googleapis.com/google.cloud.audit.AuditLog"' \
--project=orofi-{env}-cloud \
--freshness=1h \
--format=json | jq '.[] | {time: .timestamp, method: .protoPayload.methodName, caller: .protoPayload.requestMetadata.callerIp}'
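To connect this log output to the "IP outside known ranges" symptom, the caller IPs can be compared against an allowlist. The following is a minimal sketch rather than part of the original procedure: `KNOWN_PREFIXES` and `flag_unknown_ips` are hypothetical names, the prefixes are placeholders, and a plain string-prefix match stands in for real CIDR matching.

```shell
# Hypothetical allowlist of trusted address prefixes (placeholders; replace
# with your real ranges). Prefix matching is a simplification of CIDR checks.
KNOWN_PREFIXES="10. 192.168."

# Read one caller IP per line on stdin. Print each IP that matches no known
# prefix, preceded by the number of calls seen from it, busiest first.
flag_unknown_ips() {
  while IFS= read -r ip; do
    [ -n "$ip" ] || continue
    known=0
    for prefix in $KNOWN_PREFIXES; do
      case "$ip" in
        "$prefix"*) known=1; break ;;
      esac
    done
    if [ "$known" -eq 0 ]; then
      printf '%s\n' "$ip"
    fi
  done | sort | uniq -c | sort -rn
}

# Usage (same filter as above, emitting only the caller IP):
#   gcloud logging read '<filter>' --project=orofi-{env}-cloud --freshness=1h \
#     --format='value(protoPayload.requestMetadata.callerIp)' | flag_unknown_ips
```

An empty result suggests every call came from a listed prefix. Any line printed is a candidate for closer review, not proof of compromise.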
2. Revoke the compromised credential immediately
If a GCP service account key is compromised:
# List keys for the service account
gcloud iam service-accounts keys list \
--iam-account={sa}@{project}.iam.gserviceaccount.com \
--project={project}
# Delete the compromised key
gcloud iam service-accounts keys delete {key-id} \
--iam-account={sa}@{project}.iam.gserviceaccount.com \
--project={project}
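If it is unclear which key leaked, one option is to delete every user-managed key on the service account. The sketch below assumes that is acceptable for the workloads involved. `run` and `delete_sa_keys` are hypothetical helpers, and the default is a dry run that only prints the commands.

```shell
# Print the command when DRY_RUN=1 (the default); execute it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Delete every key ID read from stdin (one per line) for a service account.
# $1 = service account email, $2 = project ID.
delete_sa_keys() {
  sa_email=$1
  project=$2
  while IFS= read -r key_id; do
    [ -n "$key_id" ] || continue
    # </dev/null keeps the command from consuming the list of key IDs.
    run gcloud iam service-accounts keys delete "$key_id" \
      --iam-account="$sa_email" --project="$project" --quiet </dev/null
  done
}

# Usage (review the dry-run output, then repeat with DRY_RUN=0):
#   gcloud iam service-accounts keys list --managed-by=user \
#     --iam-account={sa}@{project}.iam.gserviceaccount.com --project={project} \
#     --format='value(name.basename())' \
#     | delete_sa_keys {sa}@{project}.iam.gserviceaccount.com {project}
```

Restricting the list to `--managed-by=user` leaves Google-managed keys alone, since those cannot be deleted by hand.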
If a GCP Secret Manager secret is compromised:
# Disable the current secret version
gcloud secrets versions disable {version} \
--secret={secret-name} \
--project={project}
If a Kubernetes service account token is compromised:
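# Delete the service account token secret; this invalidates the token
# (clusters older than Kubernetes 1.24 re-issue one automatically; newer ones do not auto-create token secrets)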
kubectl delete secret {token-secret-name} -n {namespace}
3. If a pod is compromised — isolate it
# Apply a network policy that denies all traffic to/from the pod
# First, label the pod for isolation
kubectl label pod {pod-name} -n {namespace} security=isolated
# Apply a deny-all NetworkPolicy for pods with this label
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: isolate-compromised-pod
  namespace: {namespace}
spec:
  podSelector:
    matchLabels:
      security: isolated
  policyTypes:
    - Ingress
    - Egress
EOF
# Once evidence has been captured, terminate the pod
# (deleting it discards the pod's logs and in-memory state)
kubectl delete pod {pod-name} -n {namespace} --grace-period=0
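Because deleting the pod discards its logs and runtime state, it can help to capture evidence before the final delete command in step 3. The sketch below only prints a capture plan and executes nothing. `evidence_plan` and the output file names are hypothetical, and it assumes the output directory path contains no spaces.

```shell
# Print the kubectl commands that save a pod's spec, events, and logs.
# $1 = pod name, $2 = namespace, $3 = output directory.
# Nothing runs here; review the plan, then pipe it to `sh` to execute it.
evidence_plan() {
  pod=$1
  ns=$2
  out_dir=$3
  printf 'mkdir -p %s\n' "$out_dir"
  printf 'kubectl get pod %s -n %s -o yaml > %s/pod.yaml\n' "$pod" "$ns" "$out_dir"
  printf 'kubectl describe pod %s -n %s > %s/describe.txt\n' "$pod" "$ns" "$out_dir"
  printf 'kubectl logs %s -n %s --all-containers --timestamps > %s/logs.txt\n' "$pod" "$ns" "$out_dir"
}

# Usage:
#   evidence_plan {pod-name} {namespace} ./evidence        # review
#   evidence_plan {pod-name} {namespace} ./evidence | sh   # execute
```

This covers only what the Kubernetes API exposes. Files written inside the container are not captured by these commands.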
Phase 2: Assess Scope
4. Determine what was accessed
# Review Secret Manager access logs
gcloud logging read \
'resource.type="secretmanager.googleapis.com/Secret"
AND protoPayload.methodName="google.cloud.secretmanager.v1.SecretManagerService.AccessSecretVersion"' \
--project=orofi-{env}-cloud \
--freshness=24h \
--format=json | jq '.[] | {time: .timestamp, secret: .resource.labels.secret_id, caller: .protoPayload.authenticationInfo.principalEmail}'
# Review Cloud SQL access logs
gcloud logging read \
'resource.type="cloudsql_database"' \
--project=orofi-{env}-cloud \
--freshness=24h \
--limit=100
# Review Kubernetes audit logs for exec/attach/portforward
gcloud logging read \
'resource.type="k8s_cluster"
AND protoPayload.methodName=~"exec|attach|portforward"' \
--project=orofi-{env}-cloud \
--freshness=24h
5. Identify all credentials that may have been exposed
If a pod was compromised, it had access to:
- All environment variables (including secrets mounted as env vars)
- All volume-mounted secrets
- The GCP service account (via Workload Identity)
List all secrets for the compromised namespace:
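The command for this step is not included here. A minimal sketch follows, assuming standard kubectl access and the External Secrets Operator resources used elsewhere in this runbook. `unique_names` is a hypothetical helper, and the jsonpath expression covers only secrets referenced through `env[].valueFrom.secretKeyRef`, not `envFrom` or volume mounts.

```shell
# Reduce a stream of secret names (one per line, possibly with blank lines
# and duplicates) to a sorted list of unique names.
unique_names() {
  grep -v '^[[:space:]]*$' | sort -u
}

# Usage (requires cluster access; {namespace} is a placeholder):
#   kubectl get secrets -n {namespace}
#   kubectl get externalsecret -n {namespace}
#   # Secrets referenced as env vars by pods in the namespace:
#   kubectl get pods -n {namespace} \
#     -o jsonpath='{range .items[*].spec.containers[*].env[*]}{.valueFrom.secretKeyRef.name}{"\n"}{end}' \
#     | unique_names
```

Every name in the combined output is a rotation candidate for Phase 3.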
Phase 3: Remediate
6. Rotate ALL exposed credentials
For each compromised secret:
# Generate new value
NEW_SECRET=$(openssl rand -base64 32)
# Create a new version in Secret Manager
echo -n "$NEW_SECRET" | gcloud secrets versions add {secret-name} \
--data-file=- --project={project}
# Disable the old version
gcloud secrets versions disable {old-version} \
--secret={secret-name} --project={project}
# Force ESO resync
kubectl annotate externalsecret {secret-name} -n {namespace} \
force-sync=$(date +%s) --overwrite
For database credentials, follow the rotation procedure in the Secrets Management Guide.
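For the "Disable the old version" step in the procedure above, a secret may have more than one older version still enabled. The sketch below picks every enabled version except the newest. `older_versions` is a hypothetical helper, and the `--filter` and `--format` expressions in the usage comment are assumptions, so check their output before acting on it.

```shell
# Read enabled version numbers (one per line) on stdin and print all but the
# highest, i.e. every version older than the one that was just added.
older_versions() {
  sort -n | sed '$d'
}

# Usage (assumed filter/format; confirm the output is a list of numbers):
#   gcloud secrets versions list {secret-name} --project={project} \
#     --filter='state:enabled' --format='value(name.basename())' \
#     | older_versions \
#     | while IFS= read -r v; do
#         gcloud secrets versions disable "$v" \
#           --secret={secret-name} --project={project} </dev/null
#       done
```

If only one version is enabled, the helper prints nothing, so the loop disables nothing.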
7. Rotate the Firebase service account if exposed
[NEEDS TEAM INPUT: document Firebase service account key rotation procedure]
8. Rotate JWT private key (if identity service compromised)
High Impact
Rotating microservice-identity-jwt-private-key-secret will invalidate all active user sessions. All logged-in users will be signed out. Schedule during a maintenance window if possible, or proceed immediately if the compromise is confirmed.
9. Restart all services that used compromised credentials
for ns in microservice-communication microservice-identity microservice-monolith microservice-analytics \
api-gateway-public api-gateway-account api-gateway-oro api-gateway-admin-dashboard; do
kubectl rollout restart deployment -n $ns
done
10. Review and harden
- Remove unused service account keys (all service accounts should use Workload Identity, not static keys)
- Verify no secrets are committed to Git:
git log --all --pickaxe-regex -S "password|secret|key"
- Check if the zero-trust firewall in dev is correctly configured
- Review which GCP users have roles/secretmanager.secretAccessor on sensitive secrets
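The Git history search above can be paired with a scan of the current working tree. The sketch below is a starting point, not a substitute for a dedicated secret scanner. `scan_for_secrets` is a hypothetical helper, its two patterns (PEM private key headers and the `private_key_id` field found in GCP service account key files) are illustrative, and `--exclude-dir` assumes GNU or BSD grep.

```shell
# Search a directory tree for strings that commonly indicate committed
# credentials. Prints file:line:match for each hit.
# Exit status 0 means at least one match was found; 1 means none.
scan_for_secrets() {
  dir=$1
  grep -rnE --exclude-dir=.git \
    -e '-----BEGIN [A-Z ]*PRIVATE KEY-----' \
    -e '"private_key_id"[[:space:]]*:' \
    "$dir"
}

# Usage:
#   scan_for_secrets . || echo "no matches in working tree"
```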
Verification
# Confirm old credentials are disabled
gcloud secrets versions list {secret-name} --project={project}
# Old version should show state: DISABLED
# Confirm new credentials work
# (test by restarting the affected service and checking it starts cleanly)
kubectl rollout status deployment/{service} -n {namespace}
# Confirm no anomalous activity in the last 1 hour
gcloud logging read \
'protoPayload.authenticationInfo.principalEmail="{suspect-sa}@{project}.iam.gserviceaccount.com"' \
--project={project} --freshness=1h
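The "old version should show DISABLED" check can be scripted so it fails loudly rather than relying on a visual scan. A minimal sketch: `version_is_disabled` is a hypothetical helper, the `--format` expression in the usage comment is an assumption, and the state comparison ignores letter case because the exact casing of the CLI output is not confirmed here.

```shell
# Read "<version> <state>" lines (tab- or space-separated) on stdin.
# Succeed only if the given version is listed and its state is "disabled".
version_is_disabled() {
  want=$1
  awk -v v="$want" '
    $1 == v { found = 1; if (tolower($2) == "disabled") ok = 1 }
    END { exit !(found && ok) }
  '
}

# Usage (assumed format expression; confirm it prints version and state):
#   gcloud secrets versions list {secret-name} --project={project} \
#     --format='value(name.basename(),state)' \
#     | version_is_disabled {old-version} || echo "old version is NOT disabled"
```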
Post-Incident
- Preserve evidence before making changes when possible (screenshot logs, export audit trails)
- Write a security incident report — include: timeline, what was accessed, how it was contained, what was rotated
- Notify affected parties — [NEEDS TEAM INPUT: legal/compliance notification requirements]
- Regulatory obligations — [NEEDS TEAM INPUT: does this trigger GDPR/SOC2 notification requirements?]
- Root cause analysis — how did the credential leak? Fix the process gap.
Escalation
- [NEEDS TEAM INPUT: security contact / CISO]
- [NEEDS TEAM INPUT: legal/compliance contact]
- [NEEDS TEAM INPUT: GCP security support]