Distributed locks in Kubernetes with Scala
Protect critical sections across nodes using Kubernetes leases
Did you know that Kubernetes offers a Lease API? I didn’t until quite recently. When focusing on deployment and infrastructure, it’s easy to forget that applications running inside the famous application container also gain ubiquitous access to its control plane. As it turns out, this can be used in various ways to solve distributed computing puzzles.
In this series of blog articles, I will illustrate how we can solve increasingly complex distributed computing problems in Scala using the Kubernetes API, from a simple lock to protect a shared resource, to leader-election, map-reduce and even forming a full stateful cluster.
Unmesh Joshi includes the lease in his list of fundamental patterns for distributed systems: “Use time-bound leases for cluster nodes to coordinate their activities”. Let’s start by looking at how Kubernetes implements this concept.
Kube Leases
The official documentation states: “Distributed systems often need leases, which provide a mechanism to lock shared resources and coordinate activity between members of a set. In Kubernetes, the lease concept is represented by Lease objects in the coordination.k8s.io API Group”.
Kubernetes itself uses leases for kubelet heartbeats, leader election, and more. A Lease is like any other object in Kubernetes: it is managed via the standard HTTP verbs POST, PUT, PATCH, DELETE, GET. Properties of leases are specified in a LeaseSpec data structure. Its most relevant fields are:
name: an identifier that is unique per namespace, as for other Kubernetes resources (it lives in the object metadata rather than in the spec itself)
holderIdentity: identifier of the current lease holder
acquireTime and renewTime: when the lease was first acquired, and when it was last renewed, respectively
leaseDurationSeconds: the minimum amount of time candidates should wait, measured from the last observed renewTime, before attempting to take over the lease
resourceVersion: a string managed by the server (also part of the object metadata) that changes every time the lease is modified; it marks the current state, both for concurrency control and for tracking updates from a specific point in time
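To make these fields concrete, here is a minimal sketch of how they could be modeled and how a candidate might decide that a lease is free to take. The `LeaseModel` object, the `Lease` case class and `isExpired` are illustrative names chosen for this sketch; they are not the types exposed by kubernetes-client or leases4s.

```scala
import java.time.Instant

object LeaseModel {
  // Simplified, hypothetical mirror of the fields listed above
  final case class Lease(
      name: String,
      holderIdentity: Option[String],
      acquireTime: Option[Instant],
      renewTime: Option[Instant],
      leaseDurationSeconds: Option[Int],
      resourceVersion: String
  )

  // A lease can be taken over when nobody holds it, or when the holder
  // has not renewed it within leaseDurationSeconds of the last renewTime.
  def isExpired(lease: Lease, now: Instant): Boolean =
    (lease.holderIdentity, lease.renewTime, lease.leaseDurationSeconds) match {
      case (Some(_), Some(renewed), Some(seconds)) =>
        now.isAfter(renewed.plusSeconds(seconds.toLong))
      case _ =>
        true // assumption of this sketch: missing holder or timing data means "free"
    }
}
```

Note that the check relies on the candidate's own clock, so a real implementation would probably want to leave some margin for clock skew between nodes.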
These fields cover what is needed to implement acquisition, renewal, and revocation behaviors. There isn’t any active process in Kubernetes managing leases: it’s a simple CRUD API. But it’s equipped with the semantics required for various coordination scenarios when combined with watch notifications and optimistic locking based on resourceVersion.
Coordination scenarios
The most intuitive use case for a lease is to represent temporary exclusive access to some shared resource, i.e. a distributed locking and synchronization mechanism. The following sequence diagram illustrates a typical interaction between two nodes competing for lease acquisition:
Of course, while nodes are alive and well, things are simple: we can rely on active lease ownership management. However, nodes can disappear abruptly and leave owned leases stale. For this reason, the lease is time-bound with a defined expiry duration, requiring the owner to renew its commitment periodically. Failing to renew in time allows other nodes to take over, as illustrated in the following scenario:
When multiple nodes compete for the claim, resource versioning and name uniqueness guarantee a unique successor:
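The two guarantees can be sketched with a toy in-memory stand-in for the API server. Everything here (`OptimisticLeaseStore`, `Store`, `create`, `replace`, `Outcome`) is hypothetical and only mimics the behavior: creating a lease whose name already exists is rejected, and replacing a lease succeeds only if the caller presents the latest resource version. The real API treats resourceVersion as an opaque string and signals both failures with an HTTP 409; a counter is used here for brevity.

```scala
object OptimisticLeaseStore {
  final case class Lease(name: String, holder: String, resourceVersion: Long)

  sealed trait Outcome
  case object Applied extends Outcome
  case object Conflict extends Outcome // stands in for an HTTP 409 response

  // Toy stand-in for the API server: a single map guarded by a monitor
  final class Store {
    private var leases = Map.empty[String, Lease]

    def get(name: String): Option[Lease] = synchronized(leases.get(name))

    // Like POST: rejected if a lease with this name already exists
    def create(name: String, holder: String): Outcome = synchronized {
      if (leases.contains(name)) Conflict
      else {
        leases = leases.updated(name, Lease(name, holder, 1L))
        Applied
      }
    }

    // Like PUT: accepted only if the caller saw the latest resourceVersion
    def replace(seen: Lease, newHolder: String): Outcome = synchronized {
      leases.get(seen.name) match {
        case Some(current) if current.resourceVersion == seen.resourceVersion =>
          leases = leases.updated(seen.name, Lease(seen.name, newHolder, current.resourceVersion + 1))
          Applied
        case _ => Conflict
      }
    }
  }
}
```

If two candidates read the same stale lease and both call `replace` with the version they observed, the first call bumps the version and the second gets `Conflict`, so exactly one of them becomes the new holder.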
Scala abstractions
Thanks to the great work of contributors of the kubernetes-client project, interacting with the Kubernetes API Server is a smooth experience in Scala. Built on top of cats-effect, http4s, and fs2, this library exposes pure-functional abstractions and uses code generation to match API evolutions.
Backed by this foundational layer, we have defined functional abstractions for leases, which we published in the leases4s library. leases4s-core defines core abstractions, namely Lease and LeaseRepository. The repo also publishes leases4s-patterns, packaging ready-to-use coordination constructs based on leases.
Example application
The library features a small example application illustrating these concepts: a book repository where texts can be uploaded and are accessible for download, sorted alphabetically or by word count. The web application is backed by an S3 bucket acting as a static website (a scalable and cost-effective option). This example is fully functional and can be deployed locally using LocalStack and a small Pulumi program written with Besom.
The upload form calls a Scala service that stores the file in S3 and updates the various versions of the index.html file, one for each sorting order. For write scalability, and for the sake of this example, the service is replicated and upload requests are load-balanced. Updates to the index files therefore need to be atomic.
Guarded updates
You guessed it, a lease will represent exclusive write access to the index files. A call to guard is all we need to protect this critical section from concurrent execution:
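As a rough, library-independent idea of what such a guard does, the sketch below acquires a named lock, runs the critical section, and releases the lock even if the section fails. This is not the leases4s API: `GuardSketch`, its `guard` signature, and the in-memory `holders` map are hypothetical stand-ins. A lease-backed version would presumably talk to the Kubernetes API, wait for the lease instead of giving up, and keep renewing it while the section runs.

```scala
import scala.collection.concurrent.TrieMap

object GuardSketch {
  // Hypothetical in-memory stand-in for the lease API:
  // at most one holder is registered per lease name
  private val holders = TrieMap.empty[String, String]

  // Runs `critical` only while holding the named lease.
  // Returns None without running it if another holder owns the lease.
  def guard[A](leaseName: String, holder: String)(critical: => A): Option[A] =
    if (holders.putIfAbsent(leaseName, holder).isEmpty) {
      try Some(critical)
      finally {
        // Release only our own claim, even if the critical section threw
        holders.remove(leaseName, holder)
        ()
      }
    } else None
}
```

In the example application, the critical section would be the code that rewrites the index files, so that two replicas handling uploads at the same time never interleave their writes.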
Further
The leases4s Scala library provides a lightweight distributed lock for functional applications running in Kubernetes. Future articles in this series will show how we can solve more elaborate distributed computing challenges on top of this foundation, such as map-reduce and cluster formation. Stay tuned!