Measure cluster performance impact of Amazon GuardDuty EKS Agent
Introduction
Amazon GuardDuty is a threat detection service that continuously monitors your AWS environment for malicious activity and anomalous behavior. Since its launch in 2017, Amazon GuardDuty has expanded its visibility and threat detection coverage. Amazon GuardDuty is capable of analyzing tens of billions of events per minute across multiple AWS data sources, such as AWS CloudTrail event logs, Amazon Virtual Private Cloud (Amazon VPC) Flow Logs, DNS query logs, Amazon Simple Storage Service (Amazon S3) data plane events, Amazon Relational Database Service (Amazon RDS) login events, and Amazon Elastic Kubernetes Service (Amazon EKS) audit logs. In addition, as of March 30, 2023, Amazon GuardDuty also analyzes Amazon EKS runtime events.
With the release of Amazon GuardDuty EKS Runtime Monitoring, over 30 new security findings can be generated based on Amazon EKS event data originating from processes inside containers and hosts. These new findings are made possible by an eBPF agent that inspects activities occurring inside the container runtime environment such as process execution, file access, and network connections. Amazon GuardDuty EKS Runtime Monitoring can be enabled for your whole organization with just a few clicks.
How Amazon EKS Runtime Monitoring works
Amazon EKS Runtime Monitoring captures runtime activity from your Amazon EKS workloads through an agent installed on your nodes (or Amazon EC2 instances). The Amazon EKS Runtime Monitoring agent was built using eBPF, a Linux technology that lets you extend the capabilities of the Linux kernel by loading and running custom programs inside the kernel in a safe, sandboxed environment. eBPF was chosen for Amazon GuardDuty EKS Runtime Monitoring due to its simplicity, safety, portability, and the detailed telemetry it can get from the kernel. The Amazon GuardDuty agent is packaged as an Amazon EKS add-on, which makes it easy to deploy and manage. While Amazon GuardDuty supports automated deployment and updates of the add-on across all clusters (i.e., within an AWS organization), the add-on can also be managed manually, allowing you to choose exactly which clusters you’d like protected.
The Amazon EKS Runtime Monitoring agent is deployed as a DaemonSet. The DaemonSet runs an instance of the agent on every matching node in an Amazon EKS cluster. The agent loads an eBPF probe directly into the kernel, where it runs in a sandboxed environment. Once installed, the agent starts capturing data from the underlying kernel, including host-level events and container processes. Data from the kernel is then enriched with additional metadata gathered from userspace, such as the Kubernetes Pod name, the namespace the Pod is running in, and the cluster name.
From there, the event data is forwarded to Amazon GuardDuty’s backend through a managed VPC endpoint. The agent uses the Amazon EC2 instance identity role to obtain temporary credentials, which it uses to securely send the telemetry data to the Amazon GuardDuty endpoint. Finally, Amazon GuardDuty ingests the events from the agent, analyzes them for threat activity, and generates findings as needed.
Monitoring cluster performance impact
Amazon GuardDuty EKS Runtime Monitoring, like all features of Amazon GuardDuty, was designed to have a negligible impact on the performance of your cluster and its workloads. The agent has upper limits of 1000m (one full CPU core) and 1 GB for CPU and memory, respectively. Accessing runtime event data requires some presence on the node, but the only observable activity is the eBPF agent collecting data and forwarding it to Amazon GuardDuty for analysis. Customers who would like to observe the impact of the agent on their cluster’s compute resources can explore using Inspektor Gadget and the top command, as discussed in the following sections.
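To put those upper limits in context, the following minimal sketch converts Kubernetes-style resource quantities into plain numbers and expresses the agent's limits as a share of a node's capacity. The node size (4 vCPUs, 16 GiB) is an assumed example rather than a measured environment, the 1 GB limit is read as the decimal quantity "1G", and the parser handles only the few suffixes needed here.

```python
# Minimal sketch: express the agent's upper limits as a share of node capacity.
# The node size used below is an assumption chosen purely for illustration.

def parse_cpu(quantity: str) -> float:
    """Convert a Kubernetes CPU quantity ("1000m" or "4") to cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)

# Only a subset of Kubernetes memory suffixes is handled in this sketch.
_MEMORY_UNITS = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30,
    "k": 10**3, "M": 10**6, "G": 10**9,
}

def parse_memory(quantity: str) -> int:
    """Convert a Kubernetes memory quantity ("1G", "16Gi") to bytes."""
    for suffix, factor in _MEMORY_UNITS.items():
        if quantity.endswith(suffix):
            return int(float(quantity[:-len(suffix)]) * factor)
    return int(quantity)

def limit_share(limit: float, capacity: float) -> float:
    """Return the limit as a percentage of capacity."""
    return 100 * limit / capacity

agent_cpu = parse_cpu("1000m")    # agent CPU limit from the text
agent_mem = parse_memory("1G")    # agent memory limit from the text
node_cpu = parse_cpu("4")         # assumed node size
node_mem = parse_memory("16Gi")   # assumed node size

cpu_share = limit_share(agent_cpu, node_cpu)
mem_share = limit_share(agent_mem, node_mem)
print(f"CPU limit: {cpu_share:.1f}% of node, memory limit: {mem_share:.1f}% of node")
```

These figures are ceilings, not typical usage; the measurements discussed in the following sections show what the agent actually consumes.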
Inspektor Gadget is an eBPF-based toolset for debugging and inspecting Kubernetes resources and applications. Inspektor Gadget is well integrated with Kubernetes and spins up Pods that inject eBPF programs into the kernel. These programs then extract and display information about the activities occurring within Pods that are running on that node.
The following screenshot shows the usage and performance of eBPF programs running in an Amazon EKS cluster with Amazon GuardDuty Runtime Monitoring enabled and active. The Amazon EKS cluster runs an application that generated a number of Amazon EKS Runtime Monitoring findings.
While this application and the agent were actively running on the Amazon EKS cluster, a 4-second trace was performed using the top ebpf gadget. This trace noted that the Amazon EKS Runtime Monitoring agent was called 525 times and ran for just over 1 millisecond during that 4-second window. Note that workloads are unique, so results vary depending on the nature of your workload and the runtime events it causes. You can find guidance for using Inspektor Gadget in your environment here.
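The trace figures above can be turned into per-call and per-window numbers with some simple arithmetic. The sketch below treats the total runtime as exactly 1 millisecond; because the trace reported "just over" that, the results are approximate lower bounds.

```python
# Minimal sketch: derive averages from the 4-second trace described above.
calls = 525               # invocations observed in the trace
total_runtime_s = 0.001   # "just over 1 millisecond", taken as 1 ms here
window_s = 4.0            # length of the trace window

avg_per_call_us = total_runtime_s / calls * 1e6        # microseconds per call
cpu_time_share_pct = total_runtime_s / window_s * 100  # share of the window
calls_per_second = calls / window_s

print(f"average runtime per call: {avg_per_call_us:.2f} microseconds")
print(f"share of the trace window: {cpu_time_share_pct:.3f}%")
print(f"calls per second: {calls_per_second:.2f}")
```

Under that assumption, each call averages roughly 2 microseconds and the eBPF programs were running for about 0.025% of the trace window.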
Customers can also measure the CPU and memory usage of the agent by executing the top command on their Amazon EKS nodes. This command shows resource usage as a percentage of total CPU and memory on the node. The following screenshot was taken on a node within the same cluster discussed earlier after executing the top command.
Interpreting Amazon EKS Runtime Monitoring findings
Amazon EKS Runtime Monitoring analyzes runtime events in your protected clusters to generate security findings. These findings can be viewed within the Amazon GuardDuty console from the findings tab. Note how the Resource Type filter criterion was set to EKSCluster, which displays only findings related to your Amazon EKS clusters.
Each finding contains the relevant data for addressing the potential threat, which can be viewed within the console by selecting the finding you are interested in. From there, you can quickly see an overview of the finding, including its severity, the impacted Amazon EKS cluster, and when it was last detected. To learn more about the finding and how to remediate it, you can select Info next to the finding summary. To address findings automatically, you can use Amazon EventBridge or the custom actions feature of AWS Security Hub. A full list of the findings generated by Amazon EKS Runtime Monitoring is available on this page in the Amazon GuardDuty User Guide.
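As one illustration of the Amazon EventBridge route, the sketch below builds an event pattern that selects Amazon GuardDuty findings whose resource type is EKSCluster. The source and detail-type values are the ones GuardDuty uses when publishing findings to EventBridge; the matches helper is a hypothetical, simplified local check (exact-value matching only, not the full EventBridge matching rules), and the sample events are made-up, trimmed examples rather than real findings.

```python
import json

# Event pattern selecting GuardDuty findings for Amazon EKS clusters.
EVENT_PATTERN = {
    "source": ["aws.guardduty"],
    "detail-type": ["GuardDuty Finding"],
    "detail": {"resource": {"resourceType": ["EKSCluster"]}},
}

def matches(pattern: dict, event: dict) -> bool:
    """Hypothetical local matcher supporting exact-value patterns only."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: recurse into the corresponding event object.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            # Leaf pattern: the event value must be one of the listed values.
            return False
    return True

# Made-up, heavily trimmed events used only to exercise the pattern.
eks_event = {
    "source": "aws.guardduty",
    "detail-type": "GuardDuty Finding",
    "detail": {"severity": 5, "resource": {"resourceType": "EKSCluster"}},
}
ec2_event = {
    "source": "aws.guardduty",
    "detail-type": "GuardDuty Finding",
    "detail": {"severity": 5, "resource": {"resourceType": "Instance"}},
}

print(json.dumps(EVENT_PATTERN, indent=2))
print(matches(EVENT_PATTERN, eks_event), matches(EVENT_PATTERN, ec2_event))
```

The printed JSON could serve as the event pattern of an EventBridge rule whose target (for example, a notification topic or a remediation function) is chosen to suit your own response process.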
Amazon EKS Runtime Monitoring provides detailed threat information with minimal administration and cluster performance impact
Amazon EKS Runtime Monitoring gives security and platform teams detailed, runtime-event-level information about their workloads. Impact to workloads is minimal and can be observed using Inspektor Gadget, as discussed previously. Getting started with Amazon EKS Runtime Monitoring takes just a few choices within the Amazon GuardDuty console.
There is a 30-day free trial for new Amazon GuardDuty users. If you are enabling Amazon GuardDuty for the first time, then Amazon EKS Runtime Monitoring isn’t enabled by default and needs to be enabled as described here. If you are an existing Amazon GuardDuty user, then you can still use Amazon EKS Runtime Monitoring for 30 days at no additional charge. During your trial period, you can see the estimated cost of Amazon EKS Runtime Monitoring on the usage tab of the Amazon GuardDuty console. When evaluating the cost of Amazon EKS Runtime Monitoring, note that clusters covered by this protection won’t incur Amazon GuardDuty VPC Flow Log analysis charges, which may result in material overall cost savings. This is because Amazon GuardDuty findings that previously relied on VPC Flow Logs for detection can use the Amazon GuardDuty agent instead. To learn more, see the Amazon GuardDuty pricing page.
For more information, see the Amazon GuardDuty User Guide or reach out to your usual AWS support contacts.