Kubernetes Security Best Practices: Production Checklist for Real Clusters

SCR Security Research Team
May 8, 2026

Production Kubernetes Security Is Mostly About Restraint

Most Kubernetes incidents do not begin with an exotic exploit. They begin with a cluster that was too open, too trusting, or too hard to reason about under pressure.

That is why production Kubernetes security is less about buying one more tool and more about tightening defaults.

The public lesson has repeated itself for years. In the widely reported 2018 Tesla cryptojacking incident, attackers found an exposed Kubernetes console and used the access to run mining workloads and reach sensitive cloud resources. The exact environment details were unusual, but the core pattern was familiar: an exposed management surface, weak access boundaries, and too much trust inside the cluster.

If you are running Kubernetes in production, this checklist is a sound baseline to start from.


1. Lock Down Access First

If RBAC is loose, every other control becomes optional, because anyone holding an over-privileged credential can simply change or bypass it.

Production checklist:

  • No human users bound to cluster-admin for day-to-day work
  • Separate roles for deploy, read-only, break-glass, and cluster operations
  • Short-lived federated access instead of long-lived local credentials
  • automountServiceAccountToken: false unless the workload truly needs API access
  • Periodic review of service accounts, ClusterRoleBindings, and namespace-level roles

Example:

An internal metrics service does not need permission to list secrets, create pods, or read all configmaps in the namespace. It usually needs almost nothing. Yet many teams still deploy it with a broad default service account because it is convenient.
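
A minimal sketch of what "almost nothing" could look like for that metrics service. The namespace, service account, and ConfigMap names are hypothetical. The Role is only relevant if the workload genuinely needs API access; in that case the pod spec would set automountServiceAccountToken: true, which takes precedence over the service account setting.

```yaml
# Hypothetical names throughout: payments, metrics-exporter, metrics-exporter-config.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: metrics-exporter
  namespace: payments
# Pods using this account get no API token unless their pod spec opts in.
automountServiceAccountToken: false
---
# Optional: read access to exactly one ConfigMap, nothing else.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: metrics-exporter-config-reader
  namespace: payments
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames: ["metrics-exporter-config"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: metrics-exporter-config-reader
  namespace: payments
subjects:
  - kind: ServiceAccount
    name: metrics-exporter
    namespace: payments
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: metrics-exporter-config-reader
```

Using a dedicated service account per workload, rather than the namespace default, also makes the periodic review above much easier to reason about.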


2. Enforce Restricted Pod Security by Default

Production workloads should not run as root, should not add Linux capabilities casually, and should not get a writable root filesystem without a reason.

Minimum baseline:

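# Container-level securityContext (set on each container in the pod spec)
securityContext:
  runAsNonRoot: true
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  # The restricted Pod Security Standard also expects all capabilities dropped
  capabilities:
    drop: ["ALL"]
  seccompProfile:
    type: RuntimeDefault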

Add Pod Security Admission labels to production namespaces and make exceptions rare and explicit.
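
As an illustration, Pod Security Admission is driven by labels on the Namespace object. The sketch below assumes a hypothetical namespace called payments-prod and enforces the restricted profile while also surfacing warnings and audit annotations.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: payments-prod  # hypothetical namespace name
  labels:
    # Reject pods that violate the restricted profile.
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    # Also warn on apply and annotate audit events for violations.
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted
```

A common rollout pattern is to start with only the warn and audit labels, fix what they report, and then add enforce. Pinning enforce-version to a specific Kubernetes minor version instead of latest avoids surprises during cluster upgrades.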


3. Treat the Cluster Network as Hostile

By default, every pod can reach every other pod in the cluster. That flat network is one of Kubernetes' worst habits in production.

Checklist:

  • Default-deny ingress and egress at namespace level
  • Explicit allow rules between frontend, API, data, and observability tiers
  • No broad east-west access for convenience
  • Private control plane access where the platform allows it
  • Internet exposure only through reviewed ingress paths

Case pattern:

One compromised pod should not become a sightseeing pass for the rest of the cluster.
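
A minimal sketch of the default-deny pattern with one explicit allow rule. The namespace, the tier labels, and port 8080 are assumptions for illustration, and the policies only take effect if the cluster's network plugin enforces NetworkPolicy.

```yaml
# Deny all ingress and egress for every pod in the namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: payments-prod  # hypothetical namespace
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
---
# Explicitly allow frontend pods to reach API pods on one port.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-api
  namespace: payments-prod
spec:
  podSelector:
    matchLabels:
      tier: api  # hypothetical label
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              tier: frontend  # hypothetical label
      ports:
        - protocol: TCP
          port: 8080
```

Because the default-deny policy also blocks egress, the frontend pods would need a matching egress rule toward the API tier, and most workloads need an egress rule for DNS before anything resolves.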


4. Stop Storing Secrets Like Configuration Files

Base64 is a transport encoding, not protection. Anyone who can read a Secret object can decode its values.

In production, prefer:

  • External Secrets Operator with Vault, AWS Secrets Manager, Azure Key Vault, or GCP Secret Manager
  • Encryption at rest for etcd-backed secrets
  • Narrow namespace and workload access to secrets
  • Rotation plans for database credentials, API keys, and signing keys

If an etcd snapshot leak or backup exposure would reveal plaintext credentials, the design is still fragile.
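
As a sketch of the first option, an ExternalSecret asks the External Secrets Operator to sync one value from an external store into a Kubernetes Secret. The store name, secret names, and remote key below are hypothetical, and the apiVersion depends on the operator release installed (newer releases serve external-secrets.io/v1), so check what your cluster serves before copying this.

```yaml
apiVersion: external-secrets.io/v1beta1  # assumption: adjust to the version your operator serves
kind: ExternalSecret
metadata:
  name: orders-db-credentials
  namespace: payments-prod  # hypothetical namespace
spec:
  # Re-read the external store periodically so rotations propagate.
  refreshInterval: 1h
  secretStoreRef:
    kind: SecretStore
    name: vault-backend  # hypothetical SecretStore defined separately
  target:
    # Name of the Kubernetes Secret the operator creates and owns.
    name: orders-db-credentials
    creationPolicy: Owner
  data:
    - secretKey: password
      remoteRef:
        key: prod/orders/db  # hypothetical path in the external store
        property: password
```

The synced Secret still lives in etcd, so encryption at rest and narrow RBAC on secrets remain necessary alongside this.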


5. Add Admission Control Before Developers Add Exceptions Everywhere

Use Kyverno or OPA Gatekeeper to block the patterns you already know are dangerous.

High-value policies:

  • Reject privileged containers
  • Reject :latest image tags
  • Require image digests for production
  • Enforce resource limits
  • Block hostPath, hostPID, and hostNetwork unless approved
  • Require non-root execution and seccomp profile
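
One of these policies, sketched for Kyverno. This follows the general shape of Kyverno's validate rules; the policy and rule names are made up, and the exact field for the enforcement action has moved between Kyverno releases, so treat the validationFailureAction line as an assumption to verify against the installed version.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-latest-tag  # hypothetical policy name
spec:
  # Assumption: newer Kyverno releases prefer a per-rule failureAction field.
  validationFailureAction: Enforce
  background: true
  rules:
    - name: reject-latest-image-tag
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Images must not use the mutable ':latest' tag."
        pattern:
          spec:
            containers:
              # Every container image must NOT match *:latest.
              - image: "!*:latest"
```

This sketch checks only spec.containers. Init containers and ephemeral containers would need their own patterns, and an image with no tag at all also resolves to latest, so a second rule requiring an explicit tag or digest is usually paired with it.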

6. Sign, Scan, and Pin Images

Production clusters should not be pulling whatever tag happens to exist today.

Checklist:

  • Scan images in CI with Trivy or equivalent
  • Pin images to immutable digests
  • Prefer minimal base images
  • Sign release images with Cosign
  • Restrict image sources to approved registries

This is also where software supply chain discipline starts paying off.
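
For illustration, pinning means the pod spec references the image by digest rather than by tag. The registry, image name, and digest below are placeholders, not real values; the digest would come from your CI build output or registry.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: orders-api  # hypothetical workload
  namespace: payments-prod
spec:
  containers:
    - name: orders-api
      # Placeholder: replace <digest> with the sha256 digest produced by the build.
      image: "registry.example.com/payments/orders-api@sha256:<digest>"
      imagePullPolicy: IfNotPresent
```

A digest reference always resolves to the same image content, which is what makes scan results and signatures meaningful for what actually runs.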


7. Watch Runtime Behavior, Not Just Manifests

Static reviews find misconfigurations. Runtime monitoring catches what changed after deployment or what only becomes visible during an intrusion.

High-signal runtime events:

  • Shell spawned inside an application container
  • Unexpected outbound traffic from a normally quiet workload
  • Reads of sensitive files such as /proc/1/environ
  • New binaries written to a container filesystem
  • Container processes contacting mining pools or suspicious control servers

Falco and managed runtime detection tools can help here, but only if somebody owns triage.
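
A rough sketch of the first event as a custom Falco rule. It assumes the stock Falco ruleset is loaded first, because it relies on the spawned_process and container macros defined there; the rule name, shell list, and priority are illustrative choices.

```yaml
- rule: Shell Spawned In Application Container
  desc: Detect an interactive shell starting inside a container
  # spawned_process and container are macros from the default Falco rules (assumed loaded).
  condition: >
    spawned_process and container
    and proc.name in (bash, sh, zsh, ash, dash)
  output: >
    Shell spawned in container
    (user=%user.name container=%container.name
    image=%container.image.repository cmdline=%proc.cmdline)
  priority: WARNING
  tags: [container, shell]
```

Rules like this tend to be noisy for images that legitimately run shell entrypoints, so expect to add exceptions per workload before routing alerts to whoever owns triage.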


8. Practice Cluster Recovery, Not Just Cluster Creation

Production Kubernetes security is also about how you recover from a bad day.

Make sure you have:

  • Backup strategy for manifests, state, and supporting data stores
  • Documented secret rotation path after compromise
  • A way to isolate namespaces or workloads quickly
  • Audit logs retained somewhere attackers cannot silently erase
  • A tested playbook for compromised service account or leaked kubeconfig scenarios
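
Audit retention starts with deciding what the API server records. The sketch below is an audit policy that a self-managed kube-apiserver could load through its --audit-policy-file flag; managed platforms usually expose their own audit settings instead. The choice of levels per resource is an assumption to tune, not a recommendation from the original checklist.

```yaml
apiVersion: audit.k8s.io/v1
kind: Policy
# Skip the noisy RequestReceived stage for all rules.
omitStages:
  - RequestReceived
rules:
  # Record who touched Secrets and ConfigMaps, but never log their contents.
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets", "configmaps"]
  # Keep full request and response bodies for RBAC changes.
  - level: RequestResponse
    resources:
      - group: "rbac.authorization.k8s.io"
        resources: ["roles", "rolebindings", "clusterroles", "clusterrolebindings"]
  # Catch-all: metadata for everything else, including exec and port-forward calls.
  - level: Metadata
```

The policy controls what is logged, not where it lives. Shipping the resulting log to storage outside the cluster, with separate credentials, is what keeps it out of reach of someone who has compromised the cluster.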

Production Kubernetes Checklist

  • No routine use of cluster-admin
  • Pod Security Admission enforced on production namespaces
  • Default-deny network policies in place
  • Secrets externalized or tightly controlled
  • Admission control blocks known-bad patterns
  • Images are scanned, signed, and pinned
  • Runtime detection configured and monitored
  • Audit logs retained and reviewed
  • Break-glass access documented and time-bounded
  • Recovery procedures tested

The safest production clusters are usually the least surprising ones. Tight roles, narrow network paths, boring workload defaults, and tested recovery plans still beat cleverness.
