About two years ago, I worked on a Serverless Kubernetes showcase using Azure Kubernetes Service (AKS), Azure Container Instances (ACI) as the Virtual Kubelet (VK) provider, and Azure Functions. The showcase is based on two Cloud Native Computing Foundation (CNCF) projects: Virtual Kubelet and KEDA.
Naturally, I’m curious about the latest developments in those projects (credit also goes to the Seattle area’s mischievous weather), so I started to play with them again and recaptured my samples in another, more open-source-centric GitHub repository called Cloud-native Serverless. You can think of this repository as an open-source edition of the one I started two years ago. In this blog, I will share the findings from my recent playthrough.
For people who are familiar with Kubernetes cluster architecture, the kubelet agent is the captain of the worker node. The official documentation describes the kubelet as the primary “node agent” that runs on each node. Its main responsibility is to take a set of PodSpecs and ensure that the containers described in those PodSpecs are running and healthy. It also registers the worker node with the apiserver, which matches it against the metadata.name field of the Node object. A node has to be healthy and available for scheduling before a workload can be scheduled on it.
As an open-source project, Virtual Kubelet was accepted into the CNCF in April 2018 and is at the Sandbox maturity level (to learn more about CNCF maturity levels, see the CNCF website). Virtual Kubelet is an implementation of the Kubernetes kubelet whose purpose is to connect a Kubernetes cluster to other APIs. It can therefore be used to extend the Kubernetes API so that it works with serverless container platforms, whether in the cloud or in an air-gapped environment.
KEDA was created in May 2019 as a partnership between Microsoft and Red Hat and joined the CNCF Sandbox in March 2020. In August 2021, it was accepted as an incubating project. Kubernetes Event-Driven Autoscaling (KEDA) is an event-driven autoscaler that focuses on scaling applications in Kubernetes based on the events waiting to be processed.
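To make the event-driven idea a bit more concrete, here is a minimal sketch of the replica arithmetic behind queue-based scaling. It follows the Horizontal Pod Autoscaler rule for average-value metrics (desired replicas = ceil(total metric / target per replica)), which KEDA builds on. The function name, its parameters, and the numbers are my own assumptions for illustration and are not part of KEDA’s API.

```python
import math


def desired_replicas(queue_length: int, target_per_replica: int,
                     min_replicas: int = 0, max_replicas: int = 10) -> int:
    """Hypothetical helper: replicas needed to work through a backlog of events."""
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    if queue_length <= 0:
        # No pending events: fall back to the floor (zero if scale-to-zero is allowed).
        return min_replicas
    # One replica per `target_per_replica` pending events, rounded up.
    wanted = math.ceil(queue_length / target_per_replica)
    # Clamp the result to the configured bounds.
    return max(min_replicas, min(wanted, max_replicas))
```

With a target of 5 messages per replica, a backlog of 12 messages asks for 3 replicas, and an empty queue falls back to the minimum.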
Capabilities & features
At a high level, Virtual Kubelet, awesome as the technology is, needs to deliver on the following aspects to be meaningful for real-life business use cases.
- Provisioning & Scalability
Compared to a worker node, which is very often a virtual machine or part of a pool of virtual machines sharing common resources (such as VMSS in Azure), the value of a virtual node is its ability to spin up a new instance in a matter of seconds (reality may not be quite there yet, but this is the goal). Combined with KEDA, which scales containerized applications based on incoming events, these technologies are extremely helpful for organizations such as retailers facing a surge of requests on Black Friday (a typical cloud-bursting scenario). Imagine how quickly a user would give up if the website were not responsive enough when adding items to the cart or, worse, threw exceptions during the payment process. Such failures have a high impact on user experience and business continuity.
VK opens up the possibility of dedicating a virtual node to on-demand workloads, offering them compute resources tailored to the workload requirements. This is often accomplished with Kubernetes scheduling features such as node selectors, node affinity or anti-affinity, and taints and tolerations. Once the job is done, we can easily reclaim the virtual node’s resources.
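As a rough illustration of that scheduling setup, the sketch below builds a Pod manifest that combines a node selector with a toleration so the workload is steered to a virtual node. The label `type: virtual-kubelet` and the taint key `virtual-kubelet.io/provider` are assumptions based on common Virtual Kubelet setups, so check the labels and taints on your own virtual node before reusing them. The helper name and defaults are hypothetical.

```python
import json


def virtual_node_pod(name: str, image: str, cpu: str = "1", memory: str = "1Gi") -> dict:
    """Build a Pod manifest intended to land on a virtual node."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{
                "name": name,
                "image": image,
                # Requests drive how much compute the provider allocates.
                "resources": {"requests": {"cpu": cpu, "memory": memory}},
            }],
            # Assumed label: use whatever label your virtual node actually carries.
            "nodeSelector": {"type": "virtual-kubelet"},
            # Assumed taint key: tolerate the taint that keeps ordinary pods away.
            "tolerations": [{
                "key": "virtual-kubelet.io/provider",
                "operator": "Exists",
            }],
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON manifests, so the output can be piped to `kubectl apply -f -`.
    print(json.dumps(virtual_node_pod("burst-job", "nginx:1.25"), indent=2))
```

The node selector pulls the pod toward the virtual node, while the toleration lets it past the taint that normally repels regular workloads. Both are needed for a dedicated virtual node.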
Ensuring that workloads and jobs execute effectively is challenging, as it can be affected by many other factors such as network throughput and latency. Effectiveness also shows in how well autoscaling behaves while other workloads are up and running.
- Availability & reliability
When workloads and jobs rely on on-demand infrastructure, reliability counts: it directly affects whether the job gets done. Availability is just as important: it determines whether the workload can be consumed, accessed, or monitored.
- Security & compliance
Despite the on-demand nature, security matters. Because a virtual node is part of the Kubernetes cluster, we need to consider whether it fits into the big picture of the overall security and compliance requirements. A possible use case for VK is to use a virtual node to execute external source code and thus minimize the security risk to the other workloads in the same cluster. We need to ensure there are no blind spots during execution, since an attack could happen at any point in time. Effective network policies, together with security and governance policies, help a lot in this scenario.
From a manageability point of view, end-to-end, full-stack monitoring is needed when running VK or KEDA. It helps us understand resource usage and performance: how many instances are needed in real time while demand is still high, how long it takes to spin up a new instance or pod, and whether we are hitting limits such as resource limits (CPU, memory), networking limits (IP addresses), and storage limits (IOPS). A clear definition of the Pod Disruption Budget (PDB) is important in this scenario, and it helps to define monitoring KPIs.
You no longer need to reserve a compute instance, so there is no maintenance, OS patching, or other administration effort. Workloads are scheduled on demand when you need them and deallocated when they are no longer needed. Since billing is pay-per-use, you only pay for the vCPU and memory resources your containerized application requests, which makes for an optimized and flexible billing model.
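Here is a small back-of-the-envelope sketch of what pay-per-use means in practice, assuming per-second billing on requested vCPU and memory. The rates below are placeholders that I made up for the arithmetic and are not real prices from any provider. The function name is hypothetical as well.

```python
def pay_per_use_cost(vcpus: float, memory_gb: float, seconds: float,
                     vcpu_rate: float, gb_rate: float) -> float:
    """Cost of one container instance billed per second of requested resources."""
    if min(vcpus, memory_gb, seconds, vcpu_rate, gb_rate) < 0:
        raise ValueError("inputs must be non-negative")
    # Each second costs (vCPUs * vCPU rate) + (GB of memory * GB rate).
    return seconds * (vcpus * vcpu_rate + memory_gb * gb_rate)


# Placeholder rates, NOT real prices.
VCPU_RATE = 0.00001   # currency units per vCPU-second (assumed)
GB_RATE = 0.000001    # currency units per GB-second (assumed)

# A 10-minute burst job requesting 2 vCPUs and 4 GB of memory.
burst_cost = pay_per_use_cost(vcpus=2, memory_gb=4, seconds=600,
                              vcpu_rate=VCPU_RATE, gb_rate=GB_RATE)
print(f"Estimated cost of the burst job: {burst_cost:.4f}")
```

The point of the exercise is that the cost scales with the duration of the job and the resources it requests, and not with the time a pre-provisioned node sits idle.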
Challenges & Resolutions
There are numerous Virtual Kubelet providers on the market. Providers from Microsoft, such as Azure Container Instances (ACI) and Azure Batch, are traditional VK providers that handle on-demand workloads by spinning up more instances in a matter of seconds. Azure Batch has proved its effectiveness in data analytics and machine learning scenarios, and it will become more significant as cloud-native data patterns play an increasingly important role in the cloud-native world. Both are also backed by other Azure services, including the Azure serverless platform. AWS Fargate works with Amazon ECS to run containers without managing clusters of Amazon EC2 instances, and it similarly has great potential for integration with other AWS services.
On the open-source side, VK providers such as Elotl Kip support AWS and GCP and are actively working on Azure support. HashiCorp Nomad is more of an alternative to Kubernetes, but it can be used through VK as well. Others, such as Admiralty Multi-Cluster Scheduler and Liqo, focus more on Kubernetes cluster federation use cases.
For a complete, up-to-date list of Virtual Kubelet providers, see the Virtual Kubelet project documentation.
Event-driven serverless functions
Function as a service (FaaS) is a subcategory of serverless. Compared to Platform as a service (PaaS), FaaS is more about an individual “function”: a code snippet containing a piece of business logic. It’s lightweight and flexible, and it takes care of infrastructure, dependencies, and configuration, laying the groundwork so that developers can get started on their real job and focus on the programming that matters most to them. We may have heard a lot about Azure Functions, AWS Lambda, and Google Cloud Functions. Those are serverless functions running in the public cloud, and they are often very powerful because they’re backed by major cloud providers: event-driven integration with other cloud services, a rich set of language runtimes, and various development tools are all part of the ecosystem.
While all public cloud providers offer serverless functions as a service in the cloud, it’s nice to see some interesting open-source, Kubernetes-native serverless function offerings such as Fission, OpenFaaS, and Knative.
Take Fission as an example of serverless functions on Kubernetes: it supports Java, Node.js, Go, and Python for writing short-lived functions, and it supports HTTP, timer, and a few other triggers. It’s not as rich as what is already available from the major cloud providers.
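To give a feel for how small such a short-lived function can be, here is a minimal sketch of a Python function in the style Fission’s Python environment expects. I’m assuming the default convention where the environment calls a module-level function named `main` and returns its result as the HTTP response body. The greeting text is just an illustration.

```python
from datetime import datetime, timezone


def main() -> str:
    """Entry point; assumed to be invoked by the function runtime on each request."""
    # Stamp the response so each invocation is visibly a fresh, short-lived call.
    now = datetime.now(timezone.utc).isoformat(timespec="seconds")
    return f"Hello from a short-lived function at {now}\n"
```

There is no server, routing, or container definition in the file. In this model the platform is assumed to supply all of that, and the function is then wired to an HTTP or timer trigger.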
The more sophisticated serverless offerings on Kubernetes, OpenFaaS and Knative, not only come in handy for deploying event-driven functions and microservices, but also have effective mechanisms for autoscaling, traffic splitting, and better observability. Their main advantage is the ability to run across platforms. Some may work better than others, for sure, but technically this makes it possible to run in an air-gapped environment while taking advantage of Kubernetes for autoscaling, which is a great plus.
Speaking of which, it’s worth mentioning that last year Microsoft brought Azure Functions to Azure Arc-enabled Kubernetes. We can now run Azure Functions in a custom location, which could be on-premises or an air-gapped environment, with all their richness in language runtimes and other integration benefits. I’m really excited about these positive developments in the cloud-native space. The next generation of applications will go above and beyond thanks to these innovations.
A few important limitations are worth mentioning here:
- Most VK providers don’t support persistent volumes, which makes stateful workloads even less likely to be supported.
- Although they support mounting ConfigMaps and Secrets as volumes, most providers don’t propagate updates to ConfigMaps and Secrets.
- Due to the Downward API limitations of Virtual Kubelet, fields such as PodIP are not supported, which can affect networking in some important scenarios.
- When it comes to SecurityContext for managing workloads across the cluster as a whole, VK providers such as Kip don’t support PodSecurityContext fields such as FSGroup and RunAsNonRoot.
Therefore I went ahead and built a showcase on my YouTube channel about serverless functions on Kubernetes with Fission. Keep an eye on the CloudMelonVision channel; more videos about serverless functions on Kubernetes are coming soon.
All in all, there were lots of interesting findings from my recent playthrough. I hope that one day VK becomes a better technology than traditional worker nodes or node pools, drives more innovation, and makes a difference with Kubernetes. Serverless functions will bring true freedom to developers by allowing them to write code in any language, on any platform, anywhere. I’m very positive about the future of the cloud-native space. And you? Feel free to share your thoughts in the comments below. Let’s stay tuned!