What is Bottlerocket and how to use it in EKS?
If you are in the middle of a compliance audit, or just beginning that difficult journey, you might see a lot of red alerts for the EC2 instances you are spinning up. You may also start to question whether the default Amazon Linux images are right for your use case, and wonder whether there is an AMI that supports your containerized workloads, needs less patching and provides better security.
Bottlerocket is a purpose-built Linux-based operating system meticulously crafted to serve as an optimal container host. Designed to reduce unnecessary overhead, Bottlerocket emerges as a lean and efficient solution tailored for the modern demands of containerized applications.
Bottlerocket Core Values
Bottlerocket is built around three core values, described below.
Being Minimal
Why have hundreds of unwanted applications when all you want to run is a Docker image? Bottlerocket takes a minimalist approach and ships only the required packages in its base image, so that it stays light and easier to manage.
All the required packages are installed in a single Linux image, built for a specific combination of AWS service and platform/architecture. Each combination is called a variant.
For example, bottlerocket-aws-k8s-1.28-x86_64-1.16 is a variant of the Bottlerocket image for EKS version 1.28 on the x86_64 architecture. Another variant, for ECS, would be something like bottlerocket-aws-ecs-2-x86_64-1.16.
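To make the naming scheme concrete, here is a minimal Python sketch that splits a name of this form into its parts. The regular expression and the field names (`platform`, `flavor`, `arch`, `os_version`) are assumptions made for illustration; they are derived only from the two example names in this post, not from an official specification, so other variants may not match.

```python
import re
from typing import NamedTuple, Optional


class Variant(NamedTuple):
    platform: str    # e.g. "aws"
    flavor: str      # orchestrator and its version, e.g. "k8s-1.28" or "ecs-2"
    arch: str        # e.g. "x86_64"
    os_version: str  # Bottlerocket version, e.g. "1.16"


# Assumed pattern, inferred from the two example names in this post.
_PATTERN = re.compile(
    r"^bottlerocket-(?P<platform>[a-z0-9]+)"
    r"-(?P<flavor>[a-z0-9]+-[0-9.]+)"
    r"-(?P<arch>x86_64|aarch64)"
    r"-(?P<os_version>[0-9.]+)$"
)


def parse_variant(name: str) -> Optional[Variant]:
    """Return the parts of a variant name, or None if it does not match."""
    match = _PATTERN.match(name)
    return Variant(**match.groupdict()) if match else None


if __name__ == "__main__":
    for name in (
        "bottlerocket-aws-k8s-1.28-x86_64-1.16",
        "bottlerocket-aws-ecs-2-x86_64-1.16",
    ):
        print(parse_variant(name))
```

Reading the name this way makes it clear that the orchestrator version and the Bottlerocket version are two independent parts of the same image name.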
Safe updates
Bottlerocket is fundamentally built on the idea of packaging everything into a single AMI, and hence there is no package manager.
As everything is bound to an image, to update anything you simply update the image version, which contains the latest (stable and secure) tested versions of all the components. This removes the need to manage packages separately.
When a running instance has a newer version available, that version is downloaded onto the same instance but to a separate location. Once the download is complete, the instance reboots and switches to the new version as its primary boot source.
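The update flow described above can be pictured with a small toy model in Python. This is only a sketch of the idea (two locations, one active, swap on reboot); the class, the slot names "A" and "B" and the version numbers are hypothetical and do not mirror Bottlerocket's actual implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Instance:
    """Toy model: two image locations, only one of which is booted."""
    slots: Dict[str, Optional[str]] = field(
        default_factory=lambda: {"A": "1.16.0", "B": None}
    )
    active: str = "A"

    @property
    def inactive(self) -> str:
        return "B" if self.active == "A" else "A"

    def running_version(self) -> Optional[str]:
        return self.slots[self.active]

    def stage(self, version: str) -> None:
        # The new image is written next to the running one, never over it.
        self.slots[self.inactive] = version

    def reboot_into_staged(self) -> None:
        if self.slots[self.inactive] is None:
            raise RuntimeError("no update has been staged")
        # The reboot flips which location is treated as the primary one.
        self.active = self.inactive


if __name__ == "__main__":
    node = Instance()
    node.stage("1.17.0")
    print("before reboot:", node.running_version())
    node.reboot_into_staged()
    print("after reboot:", node.running_version())
```

Because the previous image is left untouched in the other location, the model also shows why switching back to it remains possible after an update.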
Security
Security is the topmost concern of Bottlerocket, and it does a few things to ensure it:
- The root filesystem is immutable: because the root filesystem cannot be changed, only the stable and tested versions are part of the instance image. dm-verity is used for transparent integrity checking of the filesystem against a root hash, using a Merkle tree.
- SELinux is enforced by default.
- No shell, so there is less chance of remote attacks.
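To illustrate how a single root hash can protect a whole filesystem, here is a minimal Merkle tree sketch in Python. It is a simplification for illustration only: it builds a binary tree over a few in-memory blocks with SHA-256, whereas the real dm-verity on-disk format, block sizes and tree fan-out are different. The function names are hypothetical.

```python
import hashlib
import hmac
from typing import List, Sequence


def _hash(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(blocks: Sequence[bytes]) -> bytes:
    """Hash every block, then hash pairs of hashes until one root remains."""
    if not blocks:
        raise ValueError("at least one block is required")
    level: List[bytes] = [_hash(block) for block in blocks]
    while len(level) > 1:
        if len(level) % 2:
            # Odd number of nodes: pair the last node with itself.
            level.append(level[-1])
        level = [_hash(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def verify(blocks: Sequence[bytes], trusted_root: bytes) -> bool:
    """True only if the blocks still hash to the trusted root."""
    return hmac.compare_digest(merkle_root(blocks), trusted_root)


if __name__ == "__main__":
    image = [b"block-0", b"block-1", b"block-2"]
    root = merkle_root(image)
    print("untouched image verifies:", verify(image, root))
    print("tampered image verifies:", verify([b"block-0", b"evil", b"block-2"], root))
```

Changing any single block changes the root, so only the root hash needs to be trusted in order to detect modifications anywhere in the image.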
Concepts
Below are a few high-level concepts that help in understanding Bottlerocket instances.
API Driven
As Bottlerocket instances do not have a shell, how would you query things like the image version and its updates, or perform basic operations and admin-level tasks? To solve this, Bottlerocket provides a well-defined HTTP API that covers all of these, while ensuring that only legitimate operations are performed on the instances, with precise steps for each operation.
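As a rough illustration of what "API driven" means, the Python sketch below reads settings over a local Unix socket using only the standard library. Everything specific here is an assumption that should be checked against the Bottlerocket API documentation for your version: the socket path `/run/api.sock`, the `/settings` endpoint and its `keys` query parameter. The helper names are hypothetical, and the code would only work from a place that has access to the API socket.

```python
import http.client
import json
import socket
from urllib.parse import urlencode

API_SOCKET = "/run/api.sock"  # assumption: path of the API socket


class UnixSocketConnection(http.client.HTTPConnection):
    """HTTP connection that talks to a Unix socket instead of TCP."""

    def __init__(self, socket_path: str):
        super().__init__("localhost")
        self._socket_path = socket_path

    def connect(self) -> None:
        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.sock.connect(self._socket_path)


def settings_request_path(*keys: str) -> str:
    """Build the request path; 'keys' is an assumed query parameter."""
    if not keys:
        return "/settings"
    return "/settings?" + urlencode({"keys": ",".join(keys)})


def get_settings(*keys: str, socket_path: str = API_SOCKET) -> dict:
    """Fetch settings as a dict; raises if the API does not answer 200."""
    conn = UnixSocketConnection(socket_path)
    try:
        conn.request("GET", settings_request_path(*keys))
        response = conn.getresponse()
        body = response.read()
        if response.status != 200:
            raise RuntimeError(f"API returned HTTP {response.status}")
        return json.loads(body)
    finally:
        conn.close()
```

A call such as `get_settings("settings.motd")` would then return the message of the day configured for the instance, assuming the endpoint behaves as described.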
Bootstrap containers
As we have already discussed, the root file system is immutable and is verified by dm-verity, while /etc is part of the mutable file system, backed by tmpfs.
Using bootstrap containers, you can enable certain programs or features that you want to set up on top of the root file system during instance boot. These are a set of containers that run on top of the container runtime, containerd. You can run multiple bootstrap containers, and instance boot finishes once all of the containers have exited successfully. You can apply certain exit conditions to these bootstrap containers. You can read more on this in the Bottlerocket documentation.
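As a sketch of how a bootstrap container might be declared, the Python helper below renders a TOML settings block that could be passed as user data. The setting names (`source`, `mode`, `essential`, `user-data`) and the allowed modes are assumptions based on the Bottlerocket settings reference and should be verified there; the container name and image URI in the example are hypothetical.

```python
from typing import Optional

ALLOWED_MODES = {"off", "once", "always"}  # assumed set of modes


def bootstrap_container_toml(
    name: str,
    source: str,
    mode: str = "once",
    essential: bool = False,
    user_data_b64: Optional[str] = None,
) -> str:
    """Render the TOML settings block for one bootstrap container."""
    if mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {sorted(ALLOWED_MODES)}")
    lines = [
        f"[settings.bootstrap-containers.{name}]",
        f'source = "{source}"',
        f'mode = "{mode}"',
        # essential = true is assumed to mean: fail the boot if this container fails
        f"essential = {'true' if essential else 'false'}",
    ]
    if user_data_b64 is not None:
        lines.append(f'user-data = "{user_data_b64}"')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical image URI, for illustration only.
    print(bootstrap_container_toml("setup", "example.com/setup:latest", essential=True))
```

The `essential` flag is one way to express the exit conditions mentioned above: it decides whether a failing bootstrap container should stop the boot process.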
Secure boot is also enabled by default, making sure that the right software is loaded by the UEFI firmware when the machine boots.
Components
Bottlerocket instances are specific to containerized workloads, and for this two separate containerd instances run. One of them runs your normal containers for your orchestrator agent, such as kubelet. The other runs an admin container, which can act as a pseudo shell for you to make API-driven HTTP calls using apiclient, a tool provided by Bottlerocket to run API requests, and to debug your instances. The admin container does not make the root file system mutable.
Using Bottlerocket instances in EKS
Using Bottlerocket instances for your EKS nodes is very simple. We just need to make sure that we pass the right AMIs and the correct labels to the nodes, so that the Bottlerocket update operator can check for image updates on these nodes and reboot them whenever an update is available. We will discuss the Bottlerocket update operator shortly.
In our current EKS deployment we deploy nodes in two forms:
- Terraform: spins up the initial node group for us. This initial node group is used to run the Karpenter pods, which then spin up further nodes as needed.
- Karpenter nodes: these nodes are spun up by the Karpenter controller whenever there is a pending workload.
Terraform EKS changes
To make the changes in our Terraform code for EKS, we pass the option eks_managed_node_groups, in which we add an additional node pool, something like this:
eks_managed_node_groups = {
  bottle = {
    enable_bootstrap_user_data = true
    platform                   = "bottlerocket"
    bootstrap_extra_args       = <<-EOT
      [settings]
      "motd" = "TrueFoundry: MLOps platform"
      [settings.kubernetes.node-labels]
      "bottlerocket.aws/updater-interface-version" = "2.0.0"
    EOT
    instance_types = local.env.user_input.tfy_control_plane.enabled == "True" ? ["c6a.xlarge", "m6a.xlarge", "c6i.xlarge", "r6a.xlarge"] : ["c6a.large", "m6a.large", "c6i.large", "r6a.xlarge"]
    capacity_type  = "SPOT"
    ami_type       = "BOTTLEROCKET_x86_64"
    # Not required nor used - avoid tagging two security groups with same tag as well
    create_security_group = false
    # Ensure enough capacity to run 2 Karpenter pods
    min_size     = 2
    max_size     = 2
    desired_size = 2
    labels = {
      "class.truefoundry.io" = "bottle"
    }
    tags = {
      # This will tag the launch template created for use by Karpenter
      "karpenter.sh/discovery" = local.env.cluster_name
    }
    block_device_mappings = {
      xvdb = {
        device_name = "/dev/xvdb"
        ebs = {
          volume_size           = 100
          volume_type           = "gp3"
          throughput            = 150
          delete_on_termination = true
        }
      }
    }
  }
}
There are a few important things to note in the above spec:
- We need to pass the label "bottlerocket.aws/updater-interface-version" = "2.0.0" so that the Bottlerocket update operator can interface with the node.
- ami_type = "BOTTLEROCKET_x86_64" selects the Bottlerocket AMI.
- The block device mapping uses /dev/xvdb, not /dev/xvda, because the Bottlerocket AMI uses /dev/xvda to store the root image of size 2GB.
Karpenter
Karpenter is a relatively newer way of autoscaling your workloads. Based on the compute required, it tries to bring up a right-sized node, while also accounting for the daemonsets so that all necessary workloads can run on the node.
We rely heavily on Karpenter to spin up compute and GPU workloads. Karpenter has a concept called Provisioner (now deprecated and renamed NodePool), which defines the allowed sizes of nodes, with the right labels and taints if required. Moreover, through AWSNodeTemplate (now deprecated and replaced by NodeClasses, EC2NodeClass on AWS) you can define the template of the node, giving the right security group, AMI family and root volume size.
Now you can see where we have to make changes in the Karpenter provisioner and awsnodetemplate to make sure Karpenter spins up Bottlerocket instances.
- We give the label "bottlerocket.aws/updater-interface-version" = "2.0.0" in the provisioner section.
- We set the volume device to /dev/xvdb and amiFamily to Bottlerocket in the awsnodetemplate.
With this, Karpenter is able to support Bottlerocket instances as well.
I am deliberately not including the provisioner and awsnodetemplate specs here, as they belong to older versions and are now deprecated by Karpenter.
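For the newer resource kinds, the two changes listed above might look roughly like the fragments below, written as Python dictionaries that can be dumped to JSON. This is a sketch under assumptions, not our actual spec: the field names (`spec.template.metadata.labels`, `amiFamily`, `blockDeviceMappings`) follow the NodePool and EC2NodeClass layout as I understand it and may differ between Karpenter versions, the volume size is only an example, and everything else a real manifest needs (apiVersion, selectors, roles, requirements) is left out.

```python
import json

UPDATER_LABEL = "bottlerocket.aws/updater-interface-version"


def nodepool_fragment(interface_version: str = "2.0.0") -> dict:
    """Label that lets brupop pick up nodes created from this NodePool."""
    return {
        "spec": {
            "template": {
                "metadata": {"labels": {UPDATER_LABEL: interface_version}}
            }
        }
    }


def nodeclass_fragment(volume_size: str = "100Gi") -> dict:
    """AMI family and /dev/xvdb volume for Bottlerocket nodes."""
    return {
        "spec": {
            "amiFamily": "Bottlerocket",
            "blockDeviceMappings": [
                {
                    "deviceName": "/dev/xvdb",
                    "ebs": {
                        "volumeSize": volume_size,  # example size only
                        "volumeType": "gp3",
                        "deleteOnTermination": True,
                    },
                }
            ],
        }
    }


if __name__ == "__main__":
    print(json.dumps(nodepool_fragment(), indent=2))
    print(json.dumps(nodeclass_fragment(), indent=2))
```

The fragments are meant to be merged into complete NodePool and NodeClass manifests; on their own they are not valid resources.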
Bottlerocket Update operator (brupop)
The Bottlerocket update operator, or brupop, is a controller for keeping the Bottlerocket instances in an EKS cluster up to date.
Brupop design
It consists of three main components:
- Controller: the main brain, which checks for upstream image updates and orchestrates the entire update process.
- Agent: runs on each node and receives instructions from the controller to perform the operations.
- API server: authenticates the agent and the API calls it is trying to make.
Installing brupop
Brupop has two Helm charts which we need to install:
- bottlerocket-shadow, which consists of the bottlerocketshadow CRDs
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: bottlerocket-shadow
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  destination:
    namespace: brupop-bottlerocket-aws
    server: https://kubernetes.default.svc
  project: default
  source:
    chart: bottlerocket-shadow
    repoURL: https://bottlerocket-os.github.io/bottlerocket-update-operator
    targetRevision: 1.0.0
  syncPolicy:
    automated: {}
    syncOptions:
      - CreateNamespace=true
- bottlerocket-update-operator, which consists of the actual operator
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: bottlerocket-update-operator
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  destination:
    namespace: brupop-bottlerocket-aws
    server: https://kubernetes.default.svc
  project: default
  source:
    chart: bottlerocket-update-operator
    repoURL: https://bottlerocket-os.github.io/bottlerocket-update-operator
    targetRevision: 1.3.0
  syncPolicy:
    automated: {}
    syncOptions:
      - CreateNamespace=true
Make sure the destination namespace is always brupop-bottlerocket-aws, as it is fixed in their Helm chart.
The controller uses the bottlerocketshadow CRDs to manage your nodes. Earlier in this post we gave the nodes the label "bottlerocket.aws/updater-interface-version" = "2.0.0". This was done so that the controller can identify which Bottlerocket instances you want brupop to manage for you.
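If you want to double-check which nodes carry that label, a small Python sketch like the one below can filter the JSON produced by `kubectl get nodes -o json`. The function name and the sample node names are hypothetical; the only structure relied on is the standard `items[].metadata.labels` layout of a Kubernetes node list.

```python
import json
from typing import List

UPDATER_LABEL = "bottlerocket.aws/updater-interface-version"


def brupop_managed_nodes(node_list: dict, version: str = "2.0.0") -> List[str]:
    """Names of nodes whose updater-interface label has the given value."""
    return [
        item["metadata"]["name"]
        for item in node_list.get("items", [])
        if item["metadata"].get("labels", {}).get(UPDATER_LABEL) == version
    ]


if __name__ == "__main__":
    # Hypothetical sample, shaped like `kubectl get nodes -o json` output.
    sample = json.loads("""
    {"items": [
      {"metadata": {"name": "node-a",
                    "labels": {"bottlerocket.aws/updater-interface-version": "2.0.0"}}},
      {"metadata": {"name": "node-b", "labels": {}}}
    ]}
    """)
    print(brupop_managed_nodes(sample))
```

In practice you would pipe the real `kubectl` output into `json.load` instead of using the inline sample.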
You can simply check the status of your nodes by running
$ kubectl get brs -n brupop-bottlerocket-aws
NAME                               STATE   VERSION   TARGET STATE   TARGET VERSION   CRASH COUNT
brs-ip-xx-xx-1-243.ec2.internal    Idle    1.17.0    Idle           <no value>       0
brs-ip-xx-xx-14-136.ec2.internal   Idle    1.17.0    Idle           <no value>       0
brs-ip-xx-xx-31-78.ec2.internal    Idle    1.17.0    Idle           <no value>       0
Bottlerocket support for TrueFoundry
We at TrueFoundry support Bottlerocket instances for both CPU and GPU workloads, so that your entire MLOps journey can focus on developing the right code to solve the right problems, relieving you of the pressure of maintaining patches and security fixes.