Why We're Moving to ARM
AWS Graviton (ARM) instances are cheaper than equivalent x86 instances and, for a lot of workloads, deliver better performance per dollar. For anyone watching their EKS bill, migrating to ARM is one of the better levers you can pull.
The catch: your container images have to be built for the target architecture. An image built only for amd64 won't run on an arm64 node. So step one of any ARM migration is making your build pipelines produce multi-arch images.
The "Tiny" Pipeline Change That Wasn't So Tiny
Building multi-arch images is, on paper, a one-line change with docker buildx:
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  -t myrepo/app:tag \
  --push .
That single --platform linux/amd64,linux/arm64 tells the build to produce one tag backed by a manifest list that covers both architectures. Once it's pushed, the container runtime on each node automatically pulls the variant that matches the node's CPU. Beautiful.
But there's a cost: you're now building twice. And if your build host is x86, the arm64 half is built under emulation (QEMU), which can be dramatically slower. For most of our services this was fine. For the frontend, though, the build time ballooned - the extra arm64 build turned a quick pipeline into a slow one. So for now we kept the frontend image amd64-only, which means the frontend has to keep running on x86 nodes while everything else moves to ARM.
Attempt 1: Just Use a PriorityClass (Spoiler: Not Enough)
My first thought was a PriorityClass. The idea: make the frontend "more important" than other pods so it always gets a spot on the AMD (x86, amd64) nodes.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: frontend-high-priority
value: 1000000
globalDefault: false
description: "Frontend wins contention on the limited AMD nodes."
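This is useful - but it does not do what I first assumed. Here's the crucial distinction: a PriorityClass influences the order in which pending pods are scheduled and which pods can be preempted to make room. It has no say in which node, or which CPU architecture, a pod lands on.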
So a PriorityClass answers "who gets scheduled first?" - not "where does this pod run?" Those are two completely different questions, and I was conflating them.
The Real Fix: Three Pieces That Each Do One Job
Keeping the frontend reliably on AMD takes three mechanisms working together. Each solves a different part of the problem.
1. nodeSelector - Decides WHERE the pod can land
This is the piece that actually pins the frontend to x86. Kubernetes labels every node with its architecture automatically, so you just select for it:
spec:
  template:
    spec:
      priorityClassName: frontend-high-priority
      nodeSelector:
        kubernetes.io/arch: amd64
      containers:
        - name: frontend
          image: myrepo/frontend:tag # amd64-only is fine now
With kubernetes.io/arch: amd64, the scheduler will only ever place the frontend on an AMD node. This was the missing piece - PriorityClass could never have done this.
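The same constraint can also be written as node affinity, which is more verbose but allows richer expressions. A minimal sketch, assuming it replaces the nodeSelector in the same pod template; for this single-label case it should behave the same way:

```yaml
# Pod template fragment (belongs under spec.template.spec of the Deployment).
# Intended to have the same effect as nodeSelector kubernetes.io/arch: amd64.
affinity:
  nodeAffinity:
    # "required..." is a hard constraint at scheduling time, like nodeSelector
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/arch
              operator: In
              values:
                - amd64
```

For a plain "amd64 only" rule the nodeSelector is shorter and easier to read; affinity is worth the extra lines mainly when you need operators such as NotIn or several alternative terms.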
2. PriorityClass - Decides WHO wins when AMD nodes are full
Now that the frontend is restricted to AMD nodes, a new risk appears: those AMD nodes are now a scarce resource (we're shrinking them as we move to ARM). If other pods fill them up, the frontend could be stuck Pending.
This is exactly where the PriorityClass earns its keep. When the high-priority frontend can't fit, the scheduler will preempt lower-priority pods on the AMD nodes to make room - and those evicted pods get rescheduled elsewhere (for us, onto the plentiful ARM nodes). So nodeSelector gets the frontend to the AMD nodes, and PriorityClass makes sure it wins the space there.
3. Taints & Tolerations - Keep everyone else OFF the AMD nodes
Relying on preemption works, but it's reactive - pods get scheduled, then evicted, which causes churn. The cleaner approach is to stop other pods from landing on the AMD nodes in the first place. That's what taints do: they repel pods unless the pod explicitly tolerates the taint.
Taint the AMD nodes (or set it on the AMD node group / Karpenter NodePool):
kubectl taint nodes <amd-node> workload=frontend:NoSchedule
Then let only the frontend tolerate it:
tolerations:
  - key: "workload"
    operator: "Equal"
    value: "frontend"
    effect: "NoSchedule"
Now the AMD nodes are effectively reserved for the frontend: other pods are repelled by the taint, and the frontend lands there thanks to its toleration + nodeSelector. The PriorityClass becomes a safety net rather than the primary mechanism.
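If the AMD capacity is managed by Karpenter, the taint can live on the NodePool so that every node it launches is tainted from the start. A rough sketch, assuming the karpenter.sh/v1 API and an existing EC2NodeClass; the names frontend-amd64 and default are placeholders rather than our real configuration, and older Karpenter releases use a different apiVersion:

```yaml
apiVersion: karpenter.sh/v1        # assumption: Karpenter v1 API
kind: NodePool
metadata:
  name: frontend-amd64             # hypothetical name
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default              # hypothetical EC2NodeClass
      requirements:
        # Only launch x86 nodes from this pool
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
      taints:
        # Same taint as the kubectl command: workload=frontend:NoSchedule
        - key: workload
          value: frontend
          effect: NoSchedule
```

Declaring the taint on the pool avoids the window in which a freshly launched, not-yet-tainted node could accept other pods.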
- nodeSelector / affinity = where a pod is allowed to go (attraction)
- Taints / tolerations = which pods a node repels (reservation)
- PriorityClass = who gets scheduled first and who can evict whom (order)
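Put together, the three pieces sit side by side in the frontend's pod template. A minimal sketch of a complete Deployment, reusing the values shown earlier; the Deployment name, the app: frontend label, and the replica count are placeholders:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend                   # hypothetical name
spec:
  replicas: 2                      # hypothetical replica count
  selector:
    matchLabels:
      app: frontend                # hypothetical label
  template:
    metadata:
      labels:
        app: frontend
    spec:
      # Order: win contention when AMD capacity is tight
      priorityClassName: frontend-high-priority
      # Attraction: only amd64 nodes are eligible
      nodeSelector:
        kubernetes.io/arch: amd64
      # Reservation: allowed onto nodes tainted workload=frontend:NoSchedule
      tolerations:
        - key: "workload"
          operator: "Equal"
          value: "frontend"
          effect: "NoSchedule"
      containers:
        - name: frontend
          image: myrepo/frontend:tag
```

Note that the toleration only permits the pod onto the tainted nodes; it is the nodeSelector that keeps it from landing anywhere else.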
Gotchas Worth Knowing
- Don't taint your AMD nodes without checking system pods. DaemonSets and critical add-ons need to tolerate the taint or run elsewhere, or you'll break things like logging/monitoring agents on those nodes.
- Preemption can cause churn. If you lean on PriorityClass-driven preemption instead of taints, expect lower-priority pods to be evicted and rescheduled. Use preemptionPolicy: Never if you want priority ordering without evicting others.
- Keep priority values sane. Don't set your frontend above the reserved system priority classes (system-cluster-critical, system-node-critical) - you don't want app pods preempting core cluster components.
- This is a transition state. The end goal is still a native multi-arch frontend on ARM. Pinning to AMD is a bridge, not a destination.
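Two of these gotchas are easier to see as manifests. The sketch below shows a non-preempting variant of the PriorityClass, followed by the toleration a DaemonSet would need to keep running on the tainted AMD nodes. The class name is a placeholder, and the second document is only a pod-template fragment, not something to apply on its own:

```yaml
# Priority ordering without preemption: pods using this class move ahead
# in the scheduling queue but never evict running pods.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: frontend-high-priority-no-preempt   # hypothetical name
value: 1000000
preemptionPolicy: Never
globalDefault: false
description: "Scheduled ahead of lower-priority pods, but never preempts."
---
# Fragment for spec.template.spec of a logging/monitoring DaemonSet.
# Without it, the agent is repelled by workload=frontend:NoSchedule.
tolerations:
  - key: "workload"
    operator: "Equal"
    value: "frontend"
    effect: "NoSchedule"
```

With preemptionPolicy: Never the frontend can still end up Pending when the AMD nodes are full, so this variant fits best when taints already keep other pods off those nodes.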
Frequently Asked Questions
Does a PriorityClass control which node a pod runs on?
No. PriorityClass only affects scheduling order and preemption. To control which node or architecture a pod runs on, use a nodeSelector or node affinity.
How do I keep a pod on amd64 nodes during an ARM migration?
Add nodeSelector: kubernetes.io/arch: amd64 to pin it to x86 nodes. Add a high PriorityClass so it wins contention on the limited AMD capacity, and taint the AMD nodes with a matching toleration on the pod to reserve them.
What's the difference between nodeSelector, taints, and PriorityClass?
nodeSelector/affinity controls where a pod can go. Taints/tolerations repel pods from nodes to reserve them. PriorityClass controls scheduling order and preemption. Different jobs - usually used together.
Why not just build the frontend for ARM too?
Eventually you should. We paused it only because the multi-arch frontend build was too slow under emulation. With a native ARM build runner the build time drops and the frontend can move to ARM like everything else.
My Takeaway
The lesson from this migration was simple but easy to get wrong: scheduling priority and pod placement are not the same thing. A PriorityClass will never keep a pod on a particular architecture - it just decides who goes first. To actually pin our frontend to AMD nodes, the nodeSelector did the placement, taints reserved the capacity, and the PriorityClass was the safety net for contention.
If you're partway through an ARM (Graviton) migration and need certain workloads to stay on x86 for a while, reach for all three - and know which problem each one is solving.
For the official details, see the Kubernetes docs on Pod Priority and Preemption, assigning pods to nodes, and taints and tolerations.