Certified Kubernetes Application Developer (CKAD)
Overview
Once you get past the KCNA and KCSA exams, you should have your sights set on the CKAD exam. This was once considered an entry-level exam but due to its intense hands-on nature, it's better classified as intermediate. Despite this, it makes an excellent starting point for those looking to dive deeper into Kubernetes.
Read through the Certified Kubernetes Application Developer (CKAD) page for Domain and Competency details and information on how to register for the exam.
Some of the hands-on activities that you should be comfortable with are:
- Deploying applications and choosing the right workload resource type for the job (i.e., Deployment, StatefulSet, DaemonSet, Job, CronJob)
- Configuring applications using ConfigMaps and Secrets
- Creating and managing Persistent Volumes and Persistent Volume Claims
- Using ServiceAccounts to give applications the right permissions
- Using network policies to restrict traffic to applications
- Using Ingress resources to expose applications to the outside world
Deploying Applications
A Pod is the smallest deployable unit in Kubernetes. It is a logical host for one or more containers that run your application.
You would not typically create Pods directly, but rather use a workload resource to manage the pods for you. There are several different types of workload resources in Kubernetes, each with its own use case and knowing when to use each is important. The most common types of workload resources are:
- Deployment: A deployment is a declarative way to manage a set of replicas of a pod. It is the most common way to deploy applications in Kubernetes.
- ReplicaSet: A replica set is a workload resource that ensures a specified number of pod replicas are running at any given time. You will rarely create ReplicaSets directly; Deployments manage them for you.
- StatefulSet: A stateful set is a workload resource that is used to manage stateful applications. It is used for applications that require stable, unique network identifiers and stable storage.
- DaemonSet: A daemon set is a workload resource that ensures that a copy of a pod is running on all or some nodes in the cluster. It is used for applications that need to run on every node in the cluster, such as logging or monitoring agents.
- Job: A job is a workload resource that is used to run a batch job. It is used for applications that need to run to completion, such as data processing jobs.
- CronJob: A cron job is a workload resource that is used to run a batch job on a schedule. It is used for applications that need to run on a schedule, such as backups or report generation (see the sketch just after this list).
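As a sketch of how one of these is created imperatively (the nightly-report name and schedule here are made up purely for illustration), you can scaffold a CronJob manifest with kubectl:

kubectl create cronjob nightly-report \
  --image=busybox \
  --schedule="0 2 * * *" \
  --dry-run=client -o yaml \
  -- /bin/sh -c 'echo generating report'

This pattern of generating YAML with kubectl is covered in more depth below.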
The resources that you request are reconciled by various controllers in the Kubernetes control plane. For example, when you create a Deployment, the Deployment controller will create a ReplicaSet and the ReplicaSet controller will create the Pods. When you submit a resource through the Kubernetes API, the desired state is stored in etcd and controllers are responsible for ensuring that the actual state matches the desired state. This is known as the reconciliation loop.
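If you want to see the reconciliation chain for yourself, a throwaway Deployment works well (the demo name is arbitrary and not part of this workshop):

kubectl create deployment demo --image=nginx

# The ReplicaSet is owned by the Deployment
kubectl get rs -l app=demo -o jsonpath='{.items[0].metadata.ownerReferences[0].kind}{"\n"}'

# The Pod is owned by the ReplicaSet
kubectl get pods -l app=demo -o jsonpath='{.items[0].metadata.ownerReferences[0].kind}{"\n"}'

kubectl delete deployment demo

The first query should print Deployment and the second ReplicaSet, tracing the chain of controllers described above.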
Info
Kubernetes also supports custom controllers through its extension patterns which allow you to extend the functionality of Kubernetes.
Use kubectl to generate YAML
In the KCNA section of this workshop, you used kubectl imperative commands to deploy applications. That is a good way to quickly get started but it isn't the best way to deploy applications in production. In production, you should be using declarative configuration files in the form of YAML manifests to deploy applications. This allows you to version control application deployments and makes it easier to manage changes over time.
Getting started with YAML might seem daunting at first, but it can be made easy if you leverage a feature of kubectl and have it generate the YAML for you.
Run the following command to create a namespace.
kubectl create namespace n654
Let's deploy the application that you packaged up in the KCNA section of this workshop.
Run the following command to create a deployment, but this time use the --dry-run=client and -o yaml flags to generate the YAML for you and redirect the output to a file called myapp2.yaml.
kubectl create deployment myapp2 \
--namespace n654 \
--image=<PASTE_IN_YOUR_CONTAINER_IMAGE_NAME> \
--replicas=3 \
--port=3000 \
--dry-run=client \
-o yaml > myapp2.yaml
Warning
This assumes you have a container image that you built in the KCNA section of this workshop.
View the myapp2.yaml file.
cat myapp2.yaml
In the output, you'll see that the entire Deployment manifest has been generated for you. If needed, you can edit the file to make any changes that you want and then apply it.
Run the following command to apply the manifest.
kubectl apply --namespace n654 -f myapp2.yaml
The application should now be deployed. Run the following command to get the Pods.
kubectl get pods --namespace n654
Why stop there? We can add the Service and Ingress resources to the same file too.
In YAML manifests, you can have multiple resources in the same file as long as they are separated by three dashes. So first add the separator, then append the Service YAML to the file.
echo "---" >> myapp2.yaml
Then run the following command to generate the Service YAML and append it to the file.
kubectl expose deployment myapp2 \
--namespace n654 \
--port=80 \
--target-port=3000 \
--dry-run=client \
-o yaml >> myapp2.yaml
Do the same to create an Ingress that exposes the application externally.
echo "---" >> myapp2.yaml
Then run the following command to generate the ingress YAML and append it to the file.
kubectl create ingress myapp2 \
--namespace n654 \
--class=nginx \
--rule="myapp2.example.com/*=myapp2:80" \
--dry-run=client \
-o yaml >> myapp2.yaml
Now, view the myapp2.yaml file and notice how it has all three resources in it.
cat myapp2.yaml
Finally, apply the manifest.
kubectl apply --namespace n654 -f myapp2.yaml
Tip
You can use the --dry-run=client and -o yaml flags to generate the YAML for any resource type. Throughout the exam, you should leverage this technique as much as possible to save time. It is easier to use kubectl to generate most of the YAML for you than it is to write it from scratch.
If all went well, you should be able to access the application from your host machine using the following command.
curl http://control -H "Host: myapp2.example.com"
Use kubectl for reference
kubectl is also a great tool for reference. You can use it to get information about resources in the cluster.
To see all the available resources in the cluster, run the following command.
kubectl api-resources
The output will show you all the resources that are available in the cluster, including the short name, full name, and whether each resource is namespaced. The short name in particular is useful for quickly referencing resources in kubectl commands. Remember, time is of the essence, so you save a few seconds by using the short name. For example, you can use po instead of pods. Combine this with command aliases and you can save even more time.
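For example, a common exam setup is an alias for kubectl plus a shell variable for the dry-run flags (the k alias and $do variable are conventions, not built-ins):

alias k=kubectl
export do="--dry-run=client -o yaml"

# Now generating a manifest is just a few keystrokes
k create deploy web --image=nginx $do > web.yaml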
You can run something like this to get information about an Ingress resource.
k get ing -n n654
You can also use the explain command to get detailed information about a resource spec. For example, if you want to know more about the Ingress resource, you can run the following command.
kubectl explain ingress
To dig deeper into the spec, you can run the following command.
kubectl explain ingress.spec
You can also use the --recursive flag to get information about nested fields, but it won't give you the full details of each field.
kubectl explain ingress.spec --recursive
The --help flag is also incredibly useful. You can use it to get information about the flags that are available for a command. It will even give you examples of how to use the command.
Run this command and see what you get.
kubectl create configmap --help
Tip
kubectl is your friend. Use it to get information you need to get a task done. It is faster than looking up the documentation.
Configuring Applications
All applications need to be configured in some way. In Kubernetes, there are two main ways to configure applications: using ConfigMaps and Secrets.
ConfigMaps are used to store non-sensitive configuration data in key-value pairs. They can be used to store configuration files, command-line arguments, environment variables, and more. ConfigMaps can be mounted as volumes or exposed as environment variables in pods.
Secrets are used to store sensitive information, such as passwords, OAuth tokens, and SSH keys. Secrets are stored in etcd and are base64 encoded, not encrypted, which is a key consideration when using them. Anyone with access to the etcd database can read the secrets, so it is important to use encryption at rest for etcd data.
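You can see this for yourself by decoding a Secret directly. A quick throwaway example (the demo-secret name is arbitrary):

kubectl create secret generic demo-secret --from-literal=password=hunter2
kubectl get secret demo-secret -o jsonpath='{.data.password}' | base64 -d

The second command prints hunter2, no etcd access required beyond the API itself. Clean up with kubectl delete secret demo-secret.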
Applications may also need to be configured to use persistent storage. In Kubernetes, persistent storage is managed using Persistent Volumes (PV) and Persistent Volume Claims (PVC). A PV is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes. A PVC is a request for storage by a user.
Using PVs and PVCs
PVs and PVCs are used to manage persistent storage in Kubernetes. To use persistent storage, you create a PV to represent a chunk of storage and a PVC to request an allocation of that storage. There are several different types of PVs, including hostPath, NFS, iSCSI, and cloud provider-specific volumes.
Note
Since we are working with a local Kubernetes cluster and all we have are local disks, we will use the hostPath type of PV. The hostPath type of PV uses a file or directory on the host to store the data. This is useful for testing and development purposes but isn't recommended for production use.
Run the following command to create a PV on the host's /tmp directory. We don't have much storage space to work with so we'll only carve out 20Mi of storage.
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
spec:
  capacity:
    storage: 20Mi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/tmp"
EOF
Note
The PV is a cluster resource that represents a piece of storage in the cluster so a namespace isn't needed.
Run the following command to view the PV. You should see that the PV is available. This is because the PV hasn't been claimed by a PVC yet.
kubectl get pv
Now create a namespace and a PVC that requests the 20Mi of storage.
kubectl create ns n239
kubectl apply -n n239 -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Mi
EOF
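One caveat worth knowing: if your cluster has a default StorageClass, a PVC with no storageClassName may be dynamically provisioned rather than binding to my-pv. If that happens, you can pin the claim to the static PV explicitly. A sketch of the same PVC with the binding pinned:

kubectl apply -n n239 -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  volumeName: my-pv       # Bind explicitly to the static PV
  storageClassName: ""    # Opt out of dynamic provisioning
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Mi
EOF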
If you run the command to view the PV again, you'll see that the PV is bound to the PVC.
kubectl get pv
Additionally, if you run the following command to view the PVC, you'll see that the PVC is linked to the PV.
kubectl get pvc -n n239
PVCs are then used in workload resources and mounted as volumes in the Pod spec. Run the following command to create a Pod that uses the PVC.
kubectl apply -n n239 -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
  - name: my-pod
    image: busybox
    command: ["sleep", "3600"]
    volumeMounts:              # Define the volume mounts for the container
    - name: data               # Name of the volume
      mountPath: /data         # Path to mount the volume in the container
  volumes:                     # Define the volumes for the Pod
  - name: data                 # Name of the volume
    persistentVolumeClaim:     # Source type of the volume
      claimName: my-pvc        # Name of the PVC
EOF
If you run the following command, you should be able to view the contents of the mounted volume in the Pod.
kubectl exec -it my-pod -n n239 -- ls /data
Since we used the host's /tmp directory for the PV, the Pod now has access to the host's /tmp directory and all the files within it.
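You can prove the write-through behavior by touching a file from inside the Pod and looking for it on the node (this assumes you can open a shell on the node the Pod was scheduled to):

kubectl exec my-pod -n n239 -- touch /data/hello-from-pod

# Then, on the node that is running the Pod:
ls /tmp/hello-from-pod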
Info
This is a security risk and should be avoided in production. Another concern of using hostPath is that the data is limited to the host that the Pod was scheduled on. If the Pod is rescheduled to another host, the data will not be available.
Using ConfigMaps
There are a few ways to use ConfigMaps in Kubernetes. The most common way is to use them as environment variables in a pod. This allows you to pass configuration data to your application without hardcoding it into the application itself.
In this example, you'll create a ConfigMap and use it to configure an application.
Run the following command to create a temporary file to store plugin configuration for RabbitMQ.
cat << EOF | tee /tmp/rabbitmq_enabled_plugins
[rabbitmq_management,rabbitmq_prometheus,rabbitmq_amqp1_0].
EOF
Create a namespace and a ConfigMap from the RabbitMQ plugins file.
kubectl create ns n754
kubectl create configmap rabbitmq-enabled-plugins -n n754 --from-file=/tmp/rabbitmq_enabled_plugins
Next, run the following command to use the ConfigMap as a file mounted in the RabbitMQ Pod to enable the plugins.
kubectl apply -n n754 -f - <<EOF
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: rabbitmq
spec:
  serviceName: rabbitmq
  replicas: 1
  selector:
    matchLabels:
      app: rabbitmq
  template:
    metadata:
      labels:
        app: rabbitmq
    spec:
      nodeSelector:
        "kubernetes.io/os": linux
      containers:
      - name: rabbitmq
        image: rabbitmq:3.10-management-alpine
        ports:
        - containerPort: 5672
          name: rabbitmq-amqp
        - containerPort: 15672
          name: rabbitmq-http
        env:
        - name: RABBITMQ_DEFAULT_USER
          value: "username"
        - name: RABBITMQ_DEFAULT_PASS
          value: "password"
        resources:
          requests:
            cpu: 10m
            memory: 128Mi
          limits:
            cpu: 250m
            memory: 256Mi
        volumeMounts:                                 # Define the volume mounts for the container
        - name: rabbitmq-enabled-plugins              # Name of the volume
          mountPath: /etc/rabbitmq/enabled_plugins    # Path to mount the volume in the container
          subPath: enabled_plugins                    # Mount as a file in the container
      volumes:                                        # Define the volume for the Pod
      - name: rabbitmq-enabled-plugins                # Name of the volume
        configMap:                                    # Source type of the volume
          name: rabbitmq-enabled-plugins              # Name of the ConfigMap
          items:                                      # Specify the items to mount
          - key: rabbitmq_enabled_plugins             # Name of the key in the ConfigMap
            path: enabled_plugins                     # Name of the file in the Pod
EOF
Based on the comments in the YAML, you can see that we're mounting the ConfigMap as a file in the container. The subPath field is used to specify the name of the file in the container. This allows us to mount a single key from the ConfigMap as a file in the container.
If you run the following command, you should be able to view the contents of the plugins file in the RabbitMQ Pod.
kubectl exec -it rabbitmq-0 -n n754 -- cat /etc/rabbitmq/enabled_plugins | grep rabbitmq_amqp1_0
The other way to use ConfigMaps is to expose them as environment variables, so be sure to check out this guide on how to do that.
Reference: https://kubernetes.io/docs/tutorials/configuration/updating-configuration-via-a-configmap/
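As a minimal sketch of the environment variable approach (the app-settings ConfigMap and env-demo Pod names here are made up for illustration):

kubectl create configmap app-settings -n n754 --from-literal=LOG_LEVEL=debug

kubectl apply -n n754 -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: env-demo
spec:
  restartPolicy: Never
  containers:
  - name: env-demo
    image: busybox
    command: ["env"]          # Print the environment and exit
    envFrom:                  # Load every key in the ConfigMap as an environment variable
    - configMapRef:
        name: app-settings
EOF

kubectl logs env-demo -n n754 | grep LOG_LEVEL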
Using Secrets
Secrets are used to store "sensitive" information. I put that in quotes because secrets aren't really secret. They are base64 encoded and anyone with access to the etcd database can read them. In terms of usage, they are similar to ConfigMaps. You can use them as environment variables or mount them as volumes.
As you may have noticed in the previous example, the username and password for RabbitMQ were hardcoded in the YAML. This isn't a good practice. Instead, you should use a Secret to store the username and password.
Run the following command to create a Secret containing the RabbitMQ username and password.
kubectl create secret generic rabbitmq-secret -n n754 --from-literal=RABBITMQ_DEFAULT_USER=username --from-literal=RABBITMQ_DEFAULT_PASS=password
Run the following command to use the Secret as an environment variable in the RabbitMQ Pod to set the admin user and password, and expose the RabbitMQ management interface on port 15672.
kubectl apply -n n754 -f - <<EOF
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: rabbitmq
spec:
  serviceName: rabbitmq
  replicas: 1
  selector:
    matchLabels:
      app: rabbitmq
  template:
    metadata:
      labels:
        app: rabbitmq
    spec:
      nodeSelector:
        "kubernetes.io/os": linux
      containers:
      - name: rabbitmq
        image: rabbitmq:3.10-management-alpine
        ports:
        - containerPort: 5672
          name: rabbitmq-amqp
        - containerPort: 15672
          name: rabbitmq-http
        envFrom:                    # Use envFrom to load all keys from the secret as environment variables
        - secretRef:                # Reference the secret
            name: rabbitmq-secret   # Name of the secret
        resources:
          requests:
            cpu: 10m
            memory: 128Mi
          limits:
            cpu: 250m
            memory: 256Mi
EOF
If you run the following command, you'll see that the username and password are now being set as environment variables in the RabbitMQ Pod.
kubectl describe pod rabbitmq-0 -n n754 | grep "Environment Variables from" -A 1
Exposing Applications
Understanding the different types of services is also important. There are four types of services in Kubernetes: ClusterIP, NodePort, LoadBalancer, and ExternalName.
The most common type of service is ClusterIP, which exposes the service on a cluster-internal IP. This means that the service is only accessible from within the cluster. NodePort exposes the service on each node's IP at a static port. This means that the service is accessible from outside the cluster by requesting the node IP on the node port. LoadBalancer exposes the service externally using a cloud provider's load balancer, which ultimately gives you a public or private IP address. ExternalName maps the service to the contents of the externalName field (e.g., foo.bar.example.com), by returning a CNAME record with its value.
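To see how the type shows up in a manifest, you can generate a NodePort variant of a service with the same dry-run technique used earlier (the myapp2-nodeport name is just for illustration; this assumes the myapp2 deployment from earlier still exists):

kubectl expose deployment myapp2 \
  --namespace n654 \
  --name=myapp2-nodeport \
  --type=NodePort \
  --port=80 \
  --target-port=3000 \
  --dry-run=client -o yaml

Compare the spec.type field in the output with the ClusterIP service you generated earlier.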
It is also important to remember that a service uses its selector to find the pods that it should route traffic to. This is done using labels. Labels are key-value pairs that are attached to resources in Kubernetes.
Using Services
Run the following command to expose the RabbitMQ StatefulSet that you just created.
kubectl apply -n n754 -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: rabbitmq
  labels:
    app: rabbitmq
spec:
  ports:
  - name: rabbitmq-amqp
    port: 5672
    targetPort: 5672
  - name: rabbitmq-http
    port: 15672
    targetPort: 15672
  selector:
    app: rabbitmq    # The selector is used to find the pods that the service should route traffic to
EOF
With the service created, you can access the RabbitMQ application from within the cluster using the service name. Run the following command to start a temporary curl pod and request the management interface through the service.
kubectl run mycurl -n n754 --image=curlimages/curl -it --rm --restart=Never -- curl rabbitmq:15672
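If that curl ever fails, the first thing to check is whether the Service's selector actually matched any Pods; the Endpoints object lists the Pod IPs behind the Service:

kubectl get endpoints rabbitmq -n n754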
Using Ingress
The default service type is ClusterIP, which means that the service is only accessible from within the cluster. We demonstrated this with the curl command above: with the curl pod running in the same cluster, you were able to access the RabbitMQ service using the service name. But services, especially HTTP-based services, are often meant to be accessed from outside the cluster. To do this, you can use an Ingress resource.
Think of an Ingress as a reverse proxy that routes traffic to the appropriate service based on the request URL. It is designed for HTTP and HTTPS traffic, though some ingress controllers can also proxy TCP and UDP traffic through controller-specific configuration.
Run the following command to create an Ingress resource for the RabbitMQ service.
kubectl apply -n n754 -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: rabbitmq
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /    # Rewrite the URL to remove the path prefix
spec:
  ingressClassName: nginx
  rules:
  - host: rabbitmq.example.com
    http:
      paths:
      - backend:
          service:
            name: rabbitmq
            port:
              number: 15672
        path: /manage    # Request path to the RabbitMQ management interface
        pathType: Prefix
      - backend:
          service:
            name: rabbitmq
            port:
              number: 5672
        path: /amqp      # Request path to the RabbitMQ AMQP interface
        pathType: Prefix
EOF
Note
Notice that the Ingress resource is using path-based routing to route traffic to the appropriate service based on the request URL. The path /manage is used to route traffic to the RabbitMQ management interface, while the path /amqp is used to route traffic to the RabbitMQ AMQP interface. But these paths mean nothing to the actual RabbitMQ service, so you're using the nginx.ingress.kubernetes.io/rewrite-target annotation to rewrite the URL and remove the path prefix before the request is sent to the RabbitMQ service.
Now you can access the RabbitMQ management interface from your host machine. Run the following command to access the RabbitMQ management interface.
curl http://control/manage -H "Host: rabbitmq.example.com"
Warning
Make sure you run the command from your host machine and not from the control node.
Securing network traffic
Network policies are extremely important to understand for all of the hands-on exams. In the previous section, we were introduced to NetworkPolicy but it is worth revisiting it here as it is a major part of all the Kubernetes exams and tends to be a stumbling block for many test takers.
In Kubernetes, there are two important behaviors to understand regarding network policies:
- By default, with no NetworkPolicies applied, all pods can communicate with all other pods
- Once you apply any NetworkPolicy with a podSelector to a namespace, the selected pods become isolated and only accept traffic explicitly allowed by NetworkPolicies.
Deny all ingress traffic
Run the following command to create a deny-all NetworkPolicy in the pets namespace.
kubectl apply -n pets -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all
spec:
  podSelector: {}
  policyTypes:
  - Ingress
EOF
Warning
This assumes you deployed the AKS Store Demo microservices app in the previous section of this workshop. If you haven't already, please go back and deploy it before you proceed.
The deny-all policy you just created (with podSelector: {}) selects all pods in the namespace and defines no ingress rules, effectively blocking all incoming traffic to every pod.
Let's test the network policies to confirm all ingress traffic is blocked.
Run the following command to get the pod name of the store-front application.
POD_NAME=$(kubectl get pod --namespace pets -l app=store-front -o jsonpath='{.items[0].metadata.name}')
Connection attempts from store-front to product-service should time out.
kubectl exec -it $POD_NAME --namespace pets -- wget -qO- http://product-service:3002/health --timeout=1
Connection attempts from store-front to order-service should also time out.
kubectl exec -it $POD_NAME --namespace pets -- wget -qO- http://order-service:3000/health --timeout=1
Finally, get the pod name of the order-service and confirm the connection from order-service to RabbitMQ also times out.
POD_NAME=$(kubectl get pod --namespace pets -l app=order-service -o jsonpath='{.items[0].metadata.name}')
kubectl exec -it $POD_NAME --namespace pets -- wget -qO- http://rabbitmq:15672/ --timeout=1
Configure specific traffic patterns
All traffic is blocked at the moment. Let's open up ingress traffic to the product-service from the store-front application. This is a common use case for network policies. You want to restrict access to a service to only allow traffic from specific applications.
Run the following command to allow ingress on the product-service only from the store-front application.
kubectl apply -n pets -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-storefront-to-product-service
spec:
  podSelector:
    matchLabels:
      app: product-service
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: store-front
  policyTypes:
  - Ingress
EOF
Similarly, the order-service should also only be accessible from the store-front application. So let's create a network policy for that as well.
kubectl apply -n pets -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-storefront-to-order-service
spec:
  podSelector:
    matchLabels:
      app: order-service
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: store-front
  policyTypes:
  - Ingress
EOF
Finally, the RabbitMQ service should only be accessible from the order-service application. So let's create a network policy for that as well.
kubectl apply -n pets -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-order-service-to-rabbitmq
spec:
  podSelector:
    matchLabels:
      app: rabbitmq
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: order-service
  policyTypes:
  - Ingress
EOF
Run the following command to get the pod name of the store-front application.
POD_NAME=$(kubectl get pod --namespace pets -l app=store-front -o jsonpath='{.items[0].metadata.name}')
Connection from store-front to product-service should now be successful.
kubectl exec -it $POD_NAME --namespace pets -- wget -qO- http://product-service:3002/health --timeout=1
Connection from store-front to order-service should now be successful as well.
kubectl exec -it $POD_NAME --namespace pets -- wget -qO- http://order-service:3000/health --timeout=1
Finally, get the pod name of the order-service and confirm that the connection from order-service to RabbitMQ is now successful as well.
POD_NAME=$(kubectl get pod --namespace pets -l app=order-service -o jsonpath='{.items[0].metadata.name}')
kubectl exec -it $POD_NAME --namespace pets -- wget -qO- http://rabbitmq:15672/ --timeout=1
Network policies are a powerful tool for securing your Kubernetes cluster. It is important to understand how they work and how to use them effectively. It is equally important to be able to test your network policies quickly during the exam to confirm they are working as expected.
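One quick, hedged pattern for such testing: a throwaway pod with no labels. Because it doesn't carry the app=store-front label, the request below should time out, confirming the allow rule really is scoped to the store-front pods:

kubectl run np-test -n pets --rm -it --restart=Never --image=busybox -- \
  wget -qO- --timeout=1 http://product-service:3002/health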
For more NetworkPolicy practice, check out the NetworkPolicy Editor by Isovalent for an interactive experience.
Additional Resources
There is a lot more to cover for the CKAD exam. Here are some additional resources to help you prepare: