Distributed locks in Kubernetes with Scala

Protect critical sections across nodes using Kubernetes leases

Jonas Chapuis
Aug 7, 2024

Did you know that Kubernetes offers a Lease API? I didn’t until quite recently. When focusing on deployment and infrastructure, it’s easy to forget that applications running inside the famous container orchestrator also gain ubiquitous access to its control plane. As it turns out, this can be used in various ways to solve distributed computing puzzles.

In this series of blog articles, I will illustrate how we can solve increasingly complex distributed computing problems in Scala using the Kubernetes API, from a simple lock to protect a shared resource, to leader election, map-reduce, and even forming a full stateful cluster.

Unmesh Joshi includes the lease in his list of fundamental patterns for distributed systems: “Use time-bound leases for cluster nodes to coordinate their activities”. Let’s start by looking at how Kubernetes implements this concept.

Kube Leases

The official documentation states: “Distributed systems often need leases, which provide a mechanism to lock shared resources and coordinate activity between members of a set. In Kubernetes, the lease concept is represented by Lease objects in the coordination.k8s.io API Group”.

Kubernetes itself uses leases for kubelet heartbeats, leader election, and more. A Lease is like any other object in Kubernetes: it is managed via the standard HTTP verbs POST, PUT, PATCH, DELETE, GET. Properties of leases are specified in a LeaseSpec data structure. Its most relevant fields are listed below, followed by a simplified model:

  • name: the name is an identifier that is unique per namespace (same as for other kube resources)
  • holderIdentity: identifier of the current lease holder
  • acquireTime and renewTime: when the lease was first acquired, and when it was last renewed, respectively
  • leaseDurationSeconds: the minimum amount of time candidates should wait before attempting lease acquisition
  • resourceVersion: a string managed by the server that changes every time the lease is modified, serving as a marker for the current state for both concurrency control and to track updates from a specific point in time
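
To make these fields concrete, here is a simplified, illustrative Scala model (the actual classes are generated from the Kubernetes API definitions and differ in shape):

```scala
import java.time.ZonedDateTime

// Simplified, illustrative model of a Lease object; the generated classes
// used in practice (e.g. io.k8s.api.coordination.v1.Lease) differ in shape.
final case class LeaseMetadata(
  name: String,                   // unique per namespace
  namespace: String,
  resourceVersion: Option[String] // server-managed, changes on every write
)

final case class LeaseSpec(
  holderIdentity: Option[String],     // identifier of the current holder
  acquireTime: Option[ZonedDateTime], // when first acquired
  renewTime: Option[ZonedDateTime],   // when last renewed
  leaseDurationSeconds: Option[Int]   // expiry window before takeover is allowed
)

final case class Lease(metadata: LeaseMetadata, spec: LeaseSpec)
```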

These fields cover what is needed to implement acquisition, renewal, and revocation behaviors. There isn’t any active process in Kubernetes managing leases: it’s a simple CRUD API. But it’s equipped with the semantics required for various coordination scenarios when combined with watch notifications and optimistic locking based on resourceVersion.

Coordination scenarios

The most intuitive use case for a lease is to represent temporary exclusive access to some shared resource, i.e. a distributed locking and synchronization mechanism. The following sequence diagram illustrates a typical interaction between two nodes competing for lease acquisition:

Nodes Alice and Bob are competing to acquire the lease. Alice and Bob are subscribed to notifications via the watch API (represented in orange). Alice manages to create the lease first. The subsequent attempt of Bob fails as a lease already exists under this name. Alice periodically renews the lease to indicate liveliness, thus forcing Bob to remain patient. When Alice releases the lease (by deleting it), Bob can finally gain entry.

Of course, while nodes are alive and well, things are simple: we can rely on active lease ownership management. However, nodes can disappear abruptly and leave owned leases stale. For this reason, the lease is time-bound with a defined expiry duration, requiring the owner to renew its commitment periodically. Failing to renew in time permits other nodes to take over, as illustrated in the following scenario:

Alice has acquired the lease and has been holding on to it while Bob waits for his turn. However, Alice suddenly crashes. If the lease duration elapses without renewal notification, Bob can consider the lease expired and is allowed to recreate the lease for himself. (Note that Bob could also have patched all the relevant lease fields instead of recreating it. However, the resulting deletion and creation notifications make it clear that the lease was released and reacquired by a different owner.)

When multiple nodes compete for the claim, resource version and name uniqueness guarantee a unique successor:

Same scenario as above, with node Alex also in the picture. Just as Bob does, Alex initiates lease take-over once the timer locally expires. However, its DELETE request reaches the control plane just a bit later (but before it is notified of Bob’s deletion). The API server therefore rejects it with a 409 Conflict, forcing Alex to resume its wait.
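
A minimal sketch of this acquisition protocol in cats-effect style, written against a hypothetical `LeaseApi` interface (an assumption for illustration, not the actual client API):

```scala
import cats.effect.IO

// Hypothetical minimal interface, for illustration only.
case object Conflict
trait LeaseApi {
  // Attempts to create the lease; the API server answers 409 Conflict if a
  // lease with this name already exists (name uniqueness).
  def tryCreate(name: String, holder: String): IO[Either[Conflict.type, Unit]]
  // Semantically blocks until a watch notification reports the lease deleted
  // (released by its owner, or removed after expiry by a competitor).
  def awaitDeletion(name: String): IO[Unit]
}

// Acquisition loop: creation either succeeds atomically, making us the single
// winner, or we wait for the next deletion notification and try again.
def acquire(api: LeaseApi, name: String, holder: String): IO[Unit] =
  api.tryCreate(name, holder).flatMap {
    case Right(())      => IO.unit // we now own the lease
    case Left(Conflict) => api.awaitDeletion(name) >> acquire(api, name, holder)
  }
```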

Scala abstractions

Thanks to the great work of contributors of the kubernetes-client project, interacting with the Kubernetes API Server is a smooth experience in Scala. Built on top of cats-effect, http4s, and fs2, this library exposes pure-functional abstractions and uses code generation to match API evolutions.

Backed by this foundational layer, we have defined functional abstractions for leases, which we published in the leases4s library. leases4s-core defines core abstractions, namely Lease and LeaseRepository. The repo also publishes leases4s-patterns, packaging ready-to-use coordination constructs based on leases.

Abstractions are expressed with a higher-kinded F effect type (tagless style). The Lease trait represents a lease and its various properties. The identifier must be unique: it’s the Kube resource name. The holder ID tracks the current owner of the lease. Metadata such as labels and annotations are available. A boolean property indicates expiry status, and a stream can be subscribed to track expiry.
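
A rough sketch of the trait described here (names follow the description above and may differ from the published leases4s code):

```scala
import fs2.Stream

final case class LeaseID(value: String)
final case class HolderID(value: String)

// Illustrative sketch of the lease abstraction; not the exact leases4s API.
trait Lease[F[_]] {
  def id: LeaseID                         // unique Kube resource name
  def holder: F[HolderID]                 // current owner of the lease
  def labels: Map[String, String]         // kube labels
  def annotations: F[Map[String, String]] // kube annotations
  def isExpired: F[Boolean]               // current expiry status
  def expired: Stream[F, Unit]            // emits when the lease expires
}
```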
Leases are acquired from a repository, which operates on leases with a certain label combination. Labels act as a filter for leases existing in the Kube namespace: multiple applications can co-exist in the same namespace if they don’t use the same labels. The `acquire` method is the entry point to obtain a lease with a certain ID: this call will semantically block until the lease is acquired, returning an instance of `HeldLease` (see the definition below). Once acquired, the lease is automatically renewed by a fiber supervised by the resource, according to the passed parameters. Subscription to changes in the repository is available via the `watcher` stream: this enables scenarios of node awareness (which we will cover in future articles).
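
Along the same lines, an illustrative sketch of the repository (again, the signatures are approximations, not the published ones):

```scala
import cats.effect.Resource
import fs2.Stream

// Illustrative sketch of the repository abstraction. Acquisition is modeled
// as a Resource: a supervised fiber renews the lease while it is held, and
// the lease is released when the resource is finalized.
trait LeaseRepository[F[_]] {
  def labels: Map[String, String] // label combination filtering the leases
  def acquire(id: LeaseID, holder: HolderID): Resource[F, HeldLease[F]]
  def watcher: Stream[F, LeaseEvent] // change notifications from the namespace
}

// Illustrative repository events, enabling node-awareness scenarios.
sealed trait LeaseEvent
object LeaseEvent {
  final case class Acquired(id: LeaseID, holder: HolderID) extends LeaseEvent
  final case class Released(id: LeaseID) extends LeaseEvent
}
```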
A held lease offers the `guard` utility method. It runs the given `fa` action (an effect on a value of type `A`) guarded by the lease. This implements a distributed critical section: exclusivity of lease ownership is maintained during execution. In the normal scenario, the lease is still held after completion of the action, and the outcome is `Succeeded`. If, for some reason such as an unexpected node disconnection, the lease expires in the meantime, a best-effort cancellation is attempted and the outcome is `Canceled`. If the action fails with an exception, the outcome is `Errored`.
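
Completing the picture, a sketch of the held lease and the possible outcomes of `guard` (illustrative, following the description above):

```scala
// Illustrative sketch: a held lease adds the guard combinator, which runs an
// action within a distributed critical section delimited by the lease.
trait HeldLease[F[_]] extends Lease[F] {
  def guard[A](fa: F[A]): F[GuardOutcome[A]]
}

sealed trait GuardOutcome[+A]
object GuardOutcome {
  // The action completed while the lease was still held.
  final case class Succeeded[A](value: A) extends GuardOutcome[A]
  // The lease expired mid-flight; a best-effort cancellation was attempted.
  case object Canceled extends GuardOutcome[Nothing]
  // The action failed with an exception.
  final case class Errored(error: Throwable) extends GuardOutcome[Nothing]
}
```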

Example application

The library features a small example application illustrating these concepts: a book repository where texts can be uploaded and are accessible for download, sorted alphabetically or by word count. The web application is backed by an S3 bucket acting as a static website (a scalable and cost-effective option). This example is fully functional and can be deployed locally using LocalStack and a small Pulumi program written with Besom.

The upload form calls a Scala service that stores the file in S3 and updates the various versions of the index.html file, one for each sorting order. For write scalability, and for the sake of this example, the service is replicated and upload requests are load-balanced. Index updates therefore require atomicity.

Index files of the example application, a book repository filled with Jules Verne’s works. The left screenshot is a list of books sorted by name, and the right screenshot shows the same content sorted by word count. These two HTML pages are stored statically in S3 along with the texts and updated atomically upon uploading a new file.

Guarded updates

You guessed it, a lease will represent exclusive write access to the index files. A call to `guard` is all we need to protect this critical section from concurrent execution:

Updating the index files with various sort orders while maintaining consistency is done by acquiring a `file-uploader` lease and running the `updateIndexFiles` function under guard.
Updating the indices consists of parsing the main page for existing entries, rendering the new pages with the different sort orders, and uploading them in parallel for efficiency. Thanks to the lease guard, we know that no other replica of the upload service will be parsing or editing the files at the same time (which would lead to corrupted indices).
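
Using the interfaces sketched earlier, the guarded update could look roughly like this (`podName`, `updateIndexFiles`, and the way the repository is obtained are assumptions for illustration):

```scala
import cats.effect.IO

// Hypothetical wiring for the upload handler; names are illustrative.
def handleUpload(leases: LeaseRepository[IO], podName: String)(
    updateIndexFiles: IO[Unit]
): IO[Unit] =
  leases
    .acquire(LeaseID("file-uploader"), HolderID(podName))
    .use { heldLease =>
      heldLease.guard(updateIndexFiles).flatMap {
        case GuardOutcome.Succeeded(_) => IO.unit
        case GuardOutcome.Canceled =>
          IO.raiseError(new RuntimeException("lost the lease during index update"))
        case GuardOutcome.Errored(error) => IO.raiseError(error)
      }
    }
```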

Further

The leases4s Scala library provides a lightweight distributed lock for functional applications running in Kubernetes. Future articles in this series will show how we can solve more elaborate distributed computing challenges on top of this foundation, such as map-reduce and cluster formation. Stay tuned!
