Reading Logs
Log Architecture
flowchart LR
App["Application Pod\n(stdout/stderr)"]
Node["Node\n(kubelet)"]
Promtail["Promtail\n(log agent)"]
Loki["Loki\n(log storage)"]
Grafana["Grafana\n(UI)"]
App --> Node --> Promtail --> Loki --> Grafana
All application logs go to stdout/stderr → collected by Promtail → stored in Loki → visualized in Grafana.
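The collection step of this pipeline is driven by Promtail's scrape configuration. As a rough sketch only (the Loki URL, job name, and label choices here are illustrative assumptions, not this cluster's actual config), the part that makes `{namespace=...}` and `pod=...` selectors work looks roughly like:

```yaml
# Promtail sketch — endpoint and labels are assumptions, not the real config.
clients:
  - url: http://loki.monitoring.svc:3100/loki/api/v1/push

scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Expose pod metadata as Loki labels, matching the
      # {namespace="..."} / pod="..." selectors used below.
      - source_labels: [__meta_kubernetes_namespace]
        target_label: namespace
      - source_labels: [__meta_kubernetes_pod_name]
        target_label: pod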
Grafana Log Explorer
The primary log interface is Grafana's Explore view with the Loki data source.
- Dev: https://grafana.dev.orofi.xyz
- Staging: https://grafana.stage.orofi.xyz
Basic LogQL Queries
# All logs from a specific service
{namespace="microservice-identity"}
# Filter by log level
{namespace="microservice-identity"} |= "ERROR"
# Filter by multiple namespaces
{namespace=~"microservice-.*"}
# Logs from a specific pod
{namespace="microservice-identity", pod="microservice-identity-abc123"}
# Parse JSON logs and filter by field
{namespace="microservice-identity"} | json | level="error"
# Count errors per minute
count_over_time({namespace="microservice-identity"} |= "ERROR" [1m])
# All gateway logs
{namespace=~"api-gateway-.*"} |= "ERROR"
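Log filters can also be combined with metric functions to graph trends instead of reading raw lines. A sketch (this assumes services emit JSON logs with a `level` field, as in the sample further down this page):

```logql
# Per-namespace error rate over the last 5 minutes
sum by (namespace) (
  rate({namespace=~"microservice-.*"} | json | level="error" [5m])
)
```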
Time Range
Use the time picker in the top right. For incidents, set it to a window around when the issue started. Start broad (1 hour), then narrow.
kubectl Logs (Quick Triage)
For quick checks without opening Grafana:
# Last 100 lines from a service
kubectl logs -n microservice-identity \
-l app=microservice-identity \
--tail=100
# Follow logs in real time
kubectl logs -n microservice-identity \
-l app=microservice-identity \
--follow
# Logs from the previous pod restart (useful when pod is crash-looping)
kubectl logs -n microservice-identity \
-l app=microservice-identity \
--previous
# Logs from all containers in a namespace
kubectl logs -n microservice-communication \
-l app=microservice-communication \
--all-containers=true \
--tail=50
# Logs from specific pod
kubectl logs -n microservice-identity \
pod/microservice-identity-5d8b9c-xkp7z \
--tail=200
Finding the Right Pod
# List all pods in a namespace
kubectl get pods -n microservice-identity
# Get the pod name matching a label
kubectl get pod -n microservice-identity -l app=microservice-identity -o name
Istio Access Logs
Istio logs all HTTP requests via its Envoy proxy sidecar. These are useful for debugging routing issues.
# Istio proxy (Envoy) access logs for a pod
kubectl logs -n microservice-identity \
pod/microservice-identity-5d8b9c-xkp7z \
-c istio-proxy \
--tail=100
Format: [timestamp] "METHOD /path HTTP/version" status_code response_bytes latency
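Because these access-log lines are plain text, they can be filtered with standard tools once saved to a file. A small sketch on fabricated sample lines (field positions follow the format above; real Envoy output typically includes additional fields):

```shell
# Fabricated sample access-log lines, for illustration only.
cat <<'EOF' > /tmp/access.log
[2026-04-01T10:00:00.000Z] "POST /auth/login HTTP/1.1" 200 512 45
[2026-04-01T10:00:01.000Z] "GET /users/me HTTP/1.1" 503 0 3001
EOF

# Keep only failed requests: the status code is the 5th
# whitespace-separated field in the format above.
awk '$5 >= 400' /tmp/access.log
```

The same `awk` filter works on piped `kubectl logs ... -c istio-proxy` output when triaging live.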
IngressGateway Logs
For requests that aren't reaching services at all:
# Check Istio IngressGateway logs
kubectl logs -n istio-system \
-l app=istio-ingressgateway \
--tail=100
# Look for routing errors
kubectl logs -n istio-system \
-l app=istio-ingressgateway \
--tail=200 | grep "error\|WARN\|503\|404"
Log Levels by Environment
| Environment | Expected Log Level | What to look for |
|---|---|---|
| Dev | DEBUG or INFO | Normal startup, request logs, all debug info |
| Staging | INFO | Request logs, business events |
| Production | WARN or above | Only warnings and errors |
[NEEDS TEAM INPUT: confirm the log level configured per environment for each service.]
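To check what a service is actually emitting at each level, a breakdown query in Grafana Explore is handy (this assumes the service logs JSON with a `level` field):

```logql
# Count of log lines per level over the last 5 minutes
sum by (level) (
  count_over_time({namespace="microservice-identity"} | json [5m])
)
```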
Common Log Patterns
Successful Request
{"level":"info","timestamp":"2026-04-01T10:00:00Z","service":"microservice-identity","method":"POST","path":"/auth/login","status":200,"duration_ms":45}
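Individual fields can be pulled out of such a line directly in the shell. A quick sketch using `grep -o` (`jq` would be cleaner where it is installed):

```shell
# Sample structured log line (copied from the example above).
line='{"level":"info","timestamp":"2026-04-01T10:00:00Z","service":"microservice-identity","method":"POST","path":"/auth/login","status":200,"duration_ms":45}'

# Extract individual numeric fields.
echo "$line" | grep -o '"status":[0-9]*'       # "status":200
echo "$line" | grep -o '"duration_ms":[0-9]*'  # "duration_ms":45
```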
Database Connection Error
ERROR: Unable to acquire JDBC Connection; nested exception is org.hibernate.exception.JDBCConnectionException
→ Check database connectivity from the pod:
kubectl exec -it -n microservice-identity <pod> -- nc -zv microservice-identity-db.{env}.orofi.xyz 3306
Secret Not Found
→ Check ESO sync: kubectl get externalsecret -n microservice-identity
OOMKilled
→ Check memory usage in Grafana, increase resources.limits.memory in Helm values.
See Also
- Tracing Requests — follow a request end-to-end
- Common Issues — known problems and solutions
- Production Outage Runbook