Overview of third-party add-ons for EKS (Datree, GuardDuty EKS Runtime Monitoring)


This is the third post in our series exploring EKS add-ons. The first covered Kubecost, Dynatrace, and Istio; the second covered Teleport. In this one, we take a look at Datree, which secures your Kubernetes clusters by blocking the deployment of misconfigured resources. We also briefly describe Amazon GuardDuty EKS Runtime Monitoring, a new feature released at the end of March 2023.

Datree installation

Datree can be installed like any other EKS add-on. Make sure you have completed the subscription and that your EKS version supports the add-on.

Then sign up on the Datree website and choose a pricing plan.

You can use the 14-day free tier to try all the features.

After that, generate a new token and set it with the following command (replace <YOUR_TOKEN> with your token):

kubectl get deployment datree-webhook-server -n datree -o yaml | yq '(.spec.template.spec.containers[].env[] | select(.name=="DATREE_TOKEN")) |= .value="<YOUR_TOKEN>"' | kubectl replace -f -

Your EKS clusters will then appear on the website.

Datree capabilities

Manifest validation

The first and simplest feature is scanning Kubernetes manifests locally. It is available even in the free version. First, install the Datree CLI and configure it with your token.

$ curl https://get.datree.io | /bin/bash
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2218  100  2218    0     0   8872      0 --:--:-- --:--:-- --:--:--  8872
Installing Datree...
[V] Downloaded Datree
[V] Finished Installation

Let’s scan a simple Kubernetes manifest:

$ cat k8s-demo.yaml 
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rss-site
  namespace: test
  labels:
    owner: --
    environment: prod
    app: web
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      namespace: test
      labels:
        app: web
    spec:
      containers:
        - name: front-end
          image: nginx:latest
          readinessProbe:
            tcpSocket:
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
          resources:
            requests:
              memory: "64Mi"
              cpu: "64m"
            limits:
              cpu: "500m"
          ports:
            - containerPort: 80
        - name: rss-reader
          image: datree/nginx@sha256:45b23dee08af5e43a7fea6c4cf9c25ccf269ee113168c19722f87876677c5cb2
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
              httpHeaders:
                - name: Custom-Header
                  value: Awesome
          readinessProbe:
            tcpSocket:
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
          resources:
            requests:
              cpu: "64m"
              memory: "128Mi"
            limits:
              memory: "128Mi"
              cpu: "500m"
          ports:
            - containerPort: 88

Run the datree test command:

$ datree test k8s-demo.yaml 
>>  File: k8s-demo.yaml

[V] YAML validation
[V] Kubernetes schema validation

[X] Policy check

❌  Ensure each container has a read-only root filesystem  [2 occurrences]
    - metadata.name: rss-site (kind: Deployment)
      > key: spec.template.spec.containers.0 (line: 22:11)
      > key: spec.template.spec.containers.1 (line: 37:11)

💡  Incorrect value for key `readOnlyRootFilesystem` - set to 'true' to protect filesystem from potential attacks

❌  Ensure each container image has a pinned (tag) version  [1 occurrence]
    - metadata.name: rss-site (kind: Deployment)
      > key: spec.template.spec.containers.0.image (line: 23:18)

💡  Incorrect value for key `image` - specify an image version to avoid unpleasant "version surprises" in the future

❌  Prevent containers from escalating privileges  [2 occurrences]
    - metadata.name: rss-site (kind: Deployment)
      > key: spec.template.spec.containers.0 (line: 22:11)
      > key: spec.template.spec.containers.1 (line: 37:11)

💡  Missing key `allowPrivilegeEscalation` - set to false to prevent attackers from exploiting escalated container privileges

❌  Ensure each container has a configured liveness probe  [1 occurrence]
    - metadata.name: rss-site (kind: Deployment)
      > key: spec.template.spec.containers.0 (line: 22:11)

💡  Missing property object `livenessProbe` - add a properly configured livenessProbe to catch possible deadlocks

❌  Ensure each container has a configured memory limit  [1 occurrence]
    - metadata.name: rss-site (kind: Deployment)
      > key: spec.template.spec.containers.0.resources.limits (line: 34:15)

💡  Missing property object `limits.memory` - value should be within the accepted boundaries recommended by the organization

❌  Ensure workload has valid label values  [1 occurrence]
    - metadata.name: rss-site (kind: Deployment)
      > key: metadata.labels.owner (line: 7:12)

💡  Incorrect value for key(s) under `labels` - the vales syntax is not valid so the Kubernetes engine will not accept it


(Summary)
- Passing YAML validation: 1/1
- Passing Kubernetes (1.20.0) schema validation: 1/1
- Passing policy check: 0/1

+-----------------------------------+-----------------------+
| Enabled rules in policy "Starter" | 34                    |
| Configs tested against policy     | 1                     |
| Total rules evaluated             | 34                    |
| Total rules skipped               | 0                     |
| Total rules failed                | 6                     |
| Total rules passed                | 28                    |
| See all rules in policy           | https://app.datree.io |
+-----------------------------------+-----------------------+

The result is also saved in the web interface.
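The six failed rules in the report above map to small changes in the manifest. As a rough sketch, the snippet below writes a revised copy to k8s-demo-fixed.yaml. The owner label value, the pinned nginx tag, the front-end liveness probe, and the front-end memory limit are my own illustrative choices rather than values from the original setup, and I have not re-run datree test against this file.

```shell
# Write a revised manifest that targets the six failed rules from the report.
# Assumed values (not from the original manifest): owner label, nginx tag,
# front-end liveness probe, front-end memory limit.
cat > k8s-demo-fixed.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rss-site
  namespace: test
  labels:
    owner: platform-team          # was "--", which is not a valid label value
    environment: prod
    app: web
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      namespace: test
      labels:
        app: web
    spec:
      containers:
        - name: front-end
          image: nginx:1.23.4     # pinned tag instead of "latest"
          securityContext:
            readOnlyRootFilesystem: true
            allowPrivilegeEscalation: false
          livenessProbe:          # mirrors the existing readiness probe
            tcpSocket:
              port: 8080
          readinessProbe:
            tcpSocket:
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
          resources:
            requests:
              memory: "64Mi"
              cpu: "64m"
            limits:
              memory: "128Mi"     # memory limit was missing
              cpu: "500m"
          ports:
            - containerPort: 80
        - name: rss-reader
          image: datree/nginx@sha256:45b23dee08af5e43a7fea6c4cf9c25ccf269ee113168c19722f87876677c5cb2
          securityContext:
            readOnlyRootFilesystem: true
            allowPrivilegeEscalation: false
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
              httpHeaders:
                - name: Custom-Header
                  value: Awesome
          readinessProbe:
            tcpSocket:
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
          resources:
            requests:
              cpu: "64m"
              memory: "128Mi"
            limits:
              memory: "128Mi"
              cpu: "500m"
          ports:
            - containerPort: 88
EOF
```

Running datree test k8s-demo-fixed.yaml afterwards should show which rules still fail. Keep in mind that nginx usually needs a few writable paths at runtime, so a read-only root filesystem may also require emptyDir volumes, which this sketch leaves out. The probe ports are copied from the original manifest and should match the port each container actually listens on.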

Cluster-level enforcement and monitoring

The first interesting feature is the check for deprecated API versions, which helps when you upgrade the EKS control plane.

In my case, nothing needs to be updated. Keep in mind, though, that once an API version is removed, the Kubernetes API server rejects any resource that still uses it.

For example, here is what needs to be checked and fixed before upgrading to v1.24:

Deprecated API version    Supported API version
storage.k8s.io/v1beta1    storage.k8s.io/v1
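Datree performs this check against the cluster, but the same idea can be applied to manifests kept in a repository. The helper below is a minimal sketch: find_deprecated_api is a name I made up, and it only does a text search for the apiVersion line, defaulting to the deprecated version from the table above.

```shell
# List manifest lines that still reference a deprecated API version.
# Usage: find_deprecated_api <directory> [deprecated-apiVersion]
find_deprecated_api() {
  # -r: recurse, -n: print line numbers, -F: match the pattern as a fixed string
  grep -rnF --include='*.yaml' --include='*.yml' \
    "apiVersion: ${2:-storage.k8s.io/v1beta1}" "$1"
}
```

For example, find_deprecated_api ./manifests prints every matching file and line number, and returns a non-zero exit code when nothing matches. It does not inspect resources that are already running in the cluster.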

The next capability helps you follow EKS security best practices.

By default, Datree scans everything, but you can easily configure it to skip a namespace, a resource, or a specific rule for a single object.

When you select a namespace, you can see the resources with failing rules.

Every resource has a list of failing rules and fix instructions.

Nothing special here, just EKS security best practice suggestions.

The default policy is “Monitor”, but you can also enforce your rules and block resources that are not compliant.

To do so, update one Helm value:

helm upgrade -n datree datree-webhook datree-webhook/datree-admission-webhook --reuse-values --set datree.enforce="true"

The action on policy failure has now changed.

Now let’s try to apply the previously checked manifest:

$ kubectl apply -f k8s-demo.yaml
Error from server: error when creating "k8s-demo.yaml": admission webhook "datree-webhook-server.datree.svc" denied the request: 
---
webhook-rss-site-Deployment.tmp.yaml

[V] YAML validation
[V] Kubernetes schema validation

[X] Policy check

❌  Ensure each container has a read-only root filesystem  [2 occurrences]
    - metadata.name: rss-site (kind: Deployment)
      > key: spec.template.spec.containers.0 (line: 151:9)
      > key: spec.template.spec.containers.1 (line: 173:9)

💡  Incorrect value for key `readOnlyRootFilesystem` - set to 'true' to protect filesystem from potential attacks

❌  Prevent containers from having unnecessary system call privileges  [3 occurrences]
    - metadata.name: rss-site (kind: Deployment)
      > key: spec.template.spec.securityContext (line: 211:24)
      > key: spec.template.spec.containers.0 (line: 151:9)
      > key: spec.template.spec.containers.1 (line: 173:9)

💡  Incorrect value for key seccompProfile - set an explicit value to prevent malicious use of system calls within the container

❌  Ensure each container image has a pinned (tag) version  [1 occurrence]
    - metadata.name: rss-site (kind: Deployment)
      > key: spec.template.spec.containers.0.image (line: 151:16)

💡  Incorrect value for key `image` - specify an image version to avoid unpleasant "version surprises" in the future

❌  Prevent containers from escalating privileges  [2 occurrences]
    - metadata.name: rss-site (kind: Deployment)
      > key: spec.template.spec.containers.0 (line: 151:9)
      > key: spec.template.spec.containers.1 (line: 173:9)

💡  Missing key `allowPrivilegeEscalation` - set to false to prevent attackers from exploiting escalated container privileges

❌  Ensure each container has a configured liveness probe  [1 occurrence]
    - metadata.name: rss-site (kind: Deployment)
      > key: spec.template.spec.containers.0 (line: 151:9)

💡  Missing property object `livenessProbe` - add a properly configured livenessProbe to catch possible deadlocks

❌  Ensure each container has a configured memory limit  [1 occurrence]
    - metadata.name: rss-site (kind: Deployment)
      > key: spec.template.spec.containers.0.resources.limits (line: 167:13)

💡  Missing property object `limits.memory` - value should be within the accepted boundaries recommended by the organization

❌  Ensure each container fully utilizes CPU with no limitations  [2 occurrences]
    - metadata.name: rss-site (kind: Deployment)
      > key: spec.template.spec.containers.0.resources.limits (line: 167:13)
      > key: spec.template.spec.containers.1.resources.limits (line: 201:13)

💡  Invalid key `limits.cpu` - refrain from setting a CPU limit to better utilize the CPU and prevent starvation

❌  Ensure container memory request and memory limit are equal  [1 occurrence]
    - metadata.name: rss-site (kind: Deployment)
      > key: spec.template.spec.containers.0.resources (line: 166:11)

💡  Invalid value for memory request and/or memory limit - ensure they are equal to prevent unpredictable behavior


(Summary)

- Passing YAML validation: 1/1

- Passing Kubernetes (v1.24.10-eks-48e63af) schema validation: 1/1

- Passing policy check: 0/1

+-----------------------------------+-----------------------+
| Enabled rules in policy "Starter" | 57                    |
| Configs tested against policy     | 1                     |
| Total rules evaluated             | 57                    |
| Total rules skipped               | 0                     |
| Total rules failed                | 8                     |
| Total rules passed                | 49                    |
| See all rules in policy           | https://app.datree.io |
+-----------------------------------+-----------------------+

The report is also available in the web UI.
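The denial message is long, so it can help to reduce it to the rule titles. The function below is a small sketch that assumes plain-text output formatted like the listings above, where each failed rule starts with a cross mark and ends with an occurrence count. The name failed_rules is my own.

```shell
# Print the title of every failed rule found in Datree output read from stdin.
failed_rules() {
  # Keep only the lines that start with the cross mark, then strip the mark
  # and the trailing "[N occurrence(s)]" counter.
  grep '^❌' | sed -e 's/^❌ *//' -e 's/ *\[[0-9]* occurrences*\] *$//'
}
```

One way to use it is kubectl apply -f k8s-demo.yaml 2>&1 | failed_rules, since kubectl writes the webhook error to standard error. If the output format differs from the listings in this post, the patterns will need adjusting.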

Amazon GuardDuty EKS Runtime Monitoring

A new GuardDuty feature was introduced on March 30, 2023: a lightweight, fully managed security agent that can be installed as an EKS add-on. It monitors on-host, operating-system-level behavior such as file access, process execution, and network connections.

Make sure you have enabled it in the GuardDuty console.

A DaemonSet is deployed:

$ kubectl get all -n amazon-guardduty
NAME                            READY   STATUS    RESTARTS        AGE
pod/aws-guardduty-agent-7c8sr   1/1     Running   4 (2m53s ago)   3m36s
pod/aws-guardduty-agent-pbr26   1/1     Running   4 (2m50s ago)   3m36s

NAME                                 DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/aws-guardduty-agent   2         2         2       2            2           <none>          13m
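In the listing above, both agent pods report four restarts shortly after start-up, which is worth keeping an eye on. As a small sketch, the helper below reads the table printed by kubectl get pods and lists the pods whose restart count is above zero. The name restarted_pods is my own, and the column positions are assumed to match the default output shown above.

```shell
# Read `kubectl get pods` table output on stdin and print "<pod> <restarts>"
# for every pod that has restarted at least once.
restarted_pods() {
  # Column 4 is RESTARTS; adding 0 turns values such as "4" into a number.
  awk '$1 != "NAME" && NF >= 4 && $4 + 0 > 0 { print $1, $4 }'
}
```

It can be used as kubectl get pods -n amazon-guardduty | restarted_pods. An empty result means no pod in the namespace has restarted.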

Then you can see that your EKS cluster is under protection.

The GuardDuty agent analyzes dozens of runtime events and, if it detects a potential threat, generates an EKS Runtime Monitoring finding. The available finding types are listed below:

  • CryptoCurrency:Runtime/BitcoinTool.B
  • Backdoor:Runtime/C&CActivity.B
  • UnauthorizedAccess:Runtime/TorRelay
  • UnauthorizedAccess:Runtime/TorClient
  • Trojan:Runtime/BlackholeTraffic
  • Trojan:Runtime/DropPoint
  • CryptoCurrency:Runtime/BitcoinTool.B!DNS
  • Backdoor:Runtime/C&CActivity.B!DNS
  • Trojan:Runtime/BlackholeTraffic!DNS
  • Trojan:Runtime/DropPoint!DNS
  • Trojan:Runtime/DGADomainRequest.C!DNS
  • Trojan:Runtime/DriveBySourceTraffic!DNS
  • Trojan:Runtime/PhishingDomainRequest!DNS
  • Impact:Runtime/AbusedDomainRequest.Reputation
  • Impact:Runtime/BitcoinDomainRequest.Reputation
  • Impact:Runtime/MaliciousDomainRequest.Reputation
  • Impact:Runtime/SuspiciousDomainRequest.Reputation
  • UnauthorizedAccess:Runtime/MetadataDNSRebind
  • Execution:Runtime/NewBinaryExecuted
  • PrivilegeEscalation:Runtime/DockerSocketAccessed
  • PrivilegeEscalation:Runtime/RuncContainerEscape
  • PrivilegeEscalation:Runtime/CGroupsReleaseAgentModified
  • Execution:Runtime/ReverseShell
  • DefenseEvasion:Runtime/FilelessExecution
  • Impact:Runtime/CryptoMinerExecuted
  • Execution:Runtime/NewLibraryLoaded
  • PrivilegeEscalation:Runtime/ContainerMountsHostDirectory
  • PrivilegeEscalation:Runtime/UserfaultfdUsage

Each of these finding types contains “:Runtime/” in its name.
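Because the “:Runtime/” marker is part of the type name, it is easy to pick these findings out of a mixed list. The sketch below assumes you already have finding types as plain text, one per line (for instance exported from GuardDuty by whatever means you prefer). It keeps the runtime ones and counts them per threat purpose, which is the part before the first colon. The function name runtime_findings_by_purpose is my own.

```shell
# Read finding types (one per line) on stdin, keep the EKS Runtime Monitoring
# ones, and print "<threat purpose> <count>" sorted by purpose.
runtime_findings_by_purpose() {
  grep -F ':Runtime/' |            # keep only runtime finding types
    cut -d: -f1 |                  # threat purpose is the text before ':'
    sort | uniq -c |               # count identical purposes
    awk '{ print $2, $1 }'         # reorder to "<purpose> <count>"
}
```

For example, printf '%s\n' 'Trojan:Runtime/DropPoint' 'Execution:Runtime/ReverseShell' | runtime_findings_by_purpose prints one line per purpose with its count. Lines without the marker are ignored.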

GuardDuty can now identify specific containers within your EKS clusters that are potentially compromised and detect attempts to escalate privileges from an individual container to the underlying Amazon EC2 host and the broader AWS environment.

Conclusion

In this post, we focused on security add-ons for EKS clusters: Datree, which helps you monitor and enforce security best practices for Kubernetes resources, and the new Amazon GuardDuty integration, which adds more events for analysis and threat detection at EKS runtime.