CoSign with Kubernetes: Ensure integrity of images before deployment

Notary vs CoSign? Is CoSign a good alternative? Can we automate key & signature rotation?

During the post-exploitation phase, attackers try to enumerate & exploit systems in stealth mode. With containers, it's very easy to run a malicious service by just changing the image name of a deployment. No SOC/IR team will get an alert for this kind of operation because it looks like a regular deployment, yet it can act as a backdoor & open a gateway for large-scale data exfiltration.

Hence, in cloud & containerized environments, ensuring the integrity of the images being deployed is more crucial than ever.

Notary & CoSign are prominent in the industry for signing & validating the integrity of images.

Thanks to GiantSwarm for the ValidatingWebHook boilerplate template.

TL;DR

High-level overview

  1. We create a private & public key pair (CoSign generates an ECDSA-P256 key pair). Use a CI pipeline to perform this operation.
  2. These keys need to be stored in a KMS solution like HashiCorp Vault, AWS KMS, etc.
  3. Image signing happens via the CI pipeline.
  4. The private key is fetched from the KMS provider, stored in the CI secret store & used for signing images.
  5. We use the public key for validating images.

Image repository snapshot

CoSign-Random-Image-Dockerhub.PNG

Push signed information to the registry

CoSign-Random-Image-Signature-Pushed.PNG

Image repository snapshot (Updated)

CoSign-Random-Image-Signature-Dockerhub.PNG

CoSign Workflow

Deployments can be triggered manually or in an automated fashion by leveraging solutions like Argo CD.

CoSign-Workflow.jpg

We will be using a ValidatingWebHook to perform integrity validation of images. The admission controller that performs this validation is written in Go.

Notary vs CoSign

| Notary | CoSign |
| --- | --- |
| Notary uses TUF (The Update Framework) to sign & manage signatures. This framework is quite complex to maintain. | CoSign doesn't use the TUF framework. |
| Notary creates multiple keys (root, timestamp, snapshot, targets, delegation, and so on), making things further complex. | Unlike the TUF framework, there's no structure of different keys. We can of course use the TUF framework with CoSign, but that level of complexity isn't required in our environment. |
| Notary uses the same keys for signing & validation. | CoSign uses a private key for signing & a public key for validation. |
| There's no direct KMS support for key management, making things further complex. This requires a lot of manual effort in securing & rotating the keys. | KMS support is available for standard providers like GCP, AWS, HashiCorp. |
| Notary requires a separate database for storing the signature data. | No additional database required. All the image signatures are pushed directly to the registry. |
| HA isn't available by default; we need to build our own solutions to handle HA/auto-scaling. | Doesn't require additional hardware to run. It's just a single binary, so there is no requirement for HA/auto-scaling. |
| We need to ensure the Notary server is up & running for seamless integration. | CoSign connects directly with the image registry, so a health check isn't applicable. |
| Notably, Notary validation is possible only on the Docker runtime. Most environments use containerd, which can't differentiate between signed & unsigned images. | Same limitation with the container runtime for CoSign. |
| Due to the above limitation of validating things at runtime, we have to use the Notary client to validate the images. This is an additional component to maintain in the future. | We can use a ValidatingAdmissionWebhook to achieve image validation. |
| Notary doesn't have RBAC capabilities, allowing anyone to perform privileged operations. To fix this, we have to build Notary from source with limited capabilities. | CoSign uses different keys for signing & validation, hence the lack of RBAC capabilities won't be an issue. |
| Synchronizing the signature information between multiple data centers is not practical in real time. | Synchronizing the signature data between data centers is very easy. |
| Without signature data duplication, we cannot validate images in other data center deployments. Additionally, we have to distribute multiple keys for image validation across data centers, which is a real overhead. | Since we have only one public & private key pair, we can reuse it in different data centers without much hassle. |
| Since pushing & maintaining tens of keys across multiple data centers is not feasible, we have to use a different set of keys in each data center for image signing & validation. We have to re-sign the images & validate them again. | Using one set of keys everywhere is easily achievable with CoSign. All signature information is stored in registries along with images. |
| If we use different root keys, target keys, delegation keys, and so on in different data centers, that violates the basic trust principle of a single source of truth. | CoSign allows us to ensure we follow the single source of truth principle. |

Image signing

  1. After key generation, the CI pipeline pushes private & public keys to the KMS provider.
  2. The private key is pulled from the KMS & stored in the CI secret store.
  3. We use the CI pipeline to sign images with the private key and push the signatures to the registry.

Image validation

We need to validate the signature of the images before deployment. We will use ValidatingWebHook in Kubernetes to verify the signatures.

  1. Create a ValidatingWebHook to validate the image (a sample PoC is linked in the POC section).
  2. The ValidatingWebHook deployment fetches the public key from the KMS for validation.
  3. If a valid signature exists, the webhook allows the deployment; otherwise, it rejects the deployment request.

ValidatingWebHook

ValidatingWebHook can be a SPOF (Single Point of Failure). So, precautionary measures should be taken to ensure it doesn't go down.

  1. Health checks at 30-second intervals using liveness probes. If the webhook is down, kill the pod & spin up a new one.
  2. Enable HPA (Horizontal Pod Autoscaler) for this deployment, since traffic in production is very high.
  3. Every deployment/pod creation in the Kubernetes cluster hits our ValidatingWebHook pod for image validation, so it is mandatory to run multiple pods to handle the load.

Key Management/Rotation

To comply with security guidelines, it's better to automate the key rotation process using pipelines.

  1. Build pipelines should have a script to retrieve the list of all signed images from the repository.
  2. A clean-up has to be done on all the signature data in the registry.
  3. A new key pair needs to be created by the pipeline & the keys sent to the KMS provider.
  4. Pull the private key from KMS & put it in the CI secret store.
  5. From step 1, we have the list of previously signed images. Re-sign all the images with the new private key & push them to the registry.

POC

Hands-on experience: https://github.com/rewanthtammana/grumpy

Conclusion

Considering the design, usage, maintenance & architectural advantages, CoSign is the better choice to achieve our goal.