Durable execution versus session backends

There are two up-and-coming architectural paradigms for app development in the web world: durable execution and session backends.

In this post I’m going to lay out my understanding of the similarities and differences between them.

This was prompted by me discussing Cloudflare’s Durable Objects product in the Changelog Zulip, and someone thinking I was talking about durable execution, when I was actually trying to talk about session backends.

There is some overlap between products which can be used to build either architecture, so I want to try to clarify the situation.

I have been waiting for somebody to clone Durable Objects so I could build session backends without so much vendor lock-in, so I have a particular interest in the landscape around these concepts.

Durable execution

Durable execution is described by Temporal as:

Durable execution systems run our code in a way that persists each step the code takes. If the process or container running the code dies, the code automatically continues running in another process with all state intact, including call stack and local variables.

https://temporal.io/blog/building-reliable-distributed-systems-in-node

Durable execution frameworks, also sometimes called workflows-as-code, evolved from “workflow engine” systems which expressed sequences and branching logic as state machines using GUIs or configuration languages like XML or JSON.

Examples of durable execution products include Temporal, Azure Durable Functions, and Restate among many others. In this post from last year, Chris Riccomini gives an overview of the durable execution startup landscape.

Temporal’s example apps include order fulfilment, money transfers, lambda function orchestration for stock trading, and background checks.

A hallmark of durable execution systems is that a single “function” or “workflow” can be very long-running compared to usual web request handlers or queue jobs.

The Temporal post above gives this example of waiting for some time to send a “rate your delivery” message:

await sleep('1 hour');
await sendPushNotification('✍️ Rate your meal');
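
To make “persists each step” a bit more concrete, here is a toy sketch of one way it could work, assuming a replay-based design where the result of every step is written to a journal. This is not Temporal’s API: `Journal`, `runWorkflow`, `step` and `orderWorkflow` are hypothetical names, and the journal lives in memory where a real system would use durable storage.

```javascript
// Hypothetical stand-in for the durable store a real engine would write to.
class Journal {
  constructor() {
    this.entries = []; // one { name, result } record per completed step
  }
}

// Runs `workflow` with a `step` helper. Steps already in the journal are
// replayed from their recorded result; only new steps actually execute.
function runWorkflow(journal, workflow) {
  let cursor = 0;
  const step = (name, fn) => {
    if (cursor < journal.entries.length) {
      const entry = journal.entries[cursor++];
      if (entry.name !== name) {
        throw new Error(`replay mismatch: expected ${entry.name}, got ${name}`);
      }
      return entry.result; // replayed, the side effect is not repeated
    }
    const result = fn();
    journal.entries.push({ name, result });
    cursor++;
    return result;
  };
  return workflow(step);
}

// Demo: the "process" crashes once after the second step.
const sideEffects = [];
let crashOnce = true;

function orderWorkflow(step) {
  const orderId = step("createOrder", () => {
    sideEffects.push("createOrder");
    return 42;
  });
  step("chargeCard", () => {
    sideEffects.push("chargeCard");
    return "paid";
  });
  if (crashOnce) {
    crashOnce = false;
    throw new Error("simulated process crash");
  }
  return step("notify", () => {
    sideEffects.push("notify");
    return `order ${orderId} delivered`;
  });
}

const journal = new Journal();
try {
  runWorkflow(journal, orderWorkflow);
} catch (err) {
  // The first run died part-way through; the journal survives.
}
const result = runWorkflow(journal, orderWorkflow);
```

On the second run the first two steps are answered from the journal, so the card is only charged once, and the workflow carries on from where it stopped. The catch is that the workflow function has to make the same steps in the same order every time it is replayed.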

Session backends

Jamsocket describes a session backend as:

Spin up a dedicated process on the server for each user, and maintain state in that process. Give the frontend a persistent connection to its dedicated process, and shut the process down when the user closes the app.

https://jamsocket.com/blog/session-backends

This is in contrast to the usual “stateless” approach to building web apps where “every request sent from the frontend to the backend contains all the information needed to fulfill it, or at least, contains enough context for the server to locate that information in a database.”

It is also different to a heavily stateful client, like an editor or game. This is the territory of single-page-app frontend frameworks and state management libraries. (But session backends usually go hand-in-hand with a heavily stateful client.)

When building realtime apps, the database becomes the bottleneck, as described in this excellent post:

Keeping our request handlers stateless, does NOT really solve the scalability problem; instead – it merely pushes it to the database.

Sure, if we’re working in a classical Huge-Company environment, we (as app developers) can say “it is not our problem anymore” (washing our hands of the matter Pilate-style). However, if our aim is not only to cover our ***es to keep our current job while the project goes down, but rather want (in a true DevOps spirit) to make sure that the whole system succeeds – we need to think a bit further.

http://ithare.com/scaling-stateful-objects/

(This post contains an example of trying to shoehorn this kind of realtime app into AWS: CloudFlare’s durable multiplayer moat.)

Figma, in their engineering blog, describes in some detail how they built their own “session backend” for their multiplayer editor:

We use a client/server architecture where Figma clients are web pages that talk with a cluster of servers over WebSockets. Our servers currently spin up a separate process for each multiplayer document which everyone editing that document connects to. If you’re interested in learning more, this article talks about how we scale our production multiplayer servers.

https://www.figma.com/blog/how-figmas-multiplayer-technology-works/

Session backend products include Jamsocket and their Plane project, Cloudflare Durable Objects, and maybe Fly Machines if you built some more coordination on top of them?

Cloudflare’s Durable Objects documentation describes the following use-cases for session backends:

Use Durable Objects to build collaborative editing tools, interactive chat, multiplayer games and applications that need coordination among multiple clients, without requiring you to build serialization and coordination primitives on your own.

A hallmark of session backend systems is the ability for multiple clients to open realtime connections (e.g. WebSockets) to a unique “room” or “object” and coordinate or collaborate.

Here’s an example of broadcasting a message to all connected clients from Cloudflare’s chat example app:

// Iterate over all the sessions sending them messages.
let quitters = [];
this.sessions.forEach((session, webSocket) => {
  if (session.name) {
    try {
      webSocket.send(message);
    } catch (err) {
      quitters.push(session);
    }
  } else {
    // This session hasn't sent the initial user info message yet, so we're not sending them
    // messages yet (no secret lurking!). Queue the message to be sent later.
    session.blockedMessages.push(message);
  }
});

Having in-memory state for things like the list of connected clients and their WebSocket endpoint makes realtime coordination code easier to write.
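
As a rough illustration of why in-memory state helps, here is a minimal sketch of a “room” that keeps its connected clients and message history in plain variables. Everything here is hypothetical (`Room`, `fakeSocket`, the message strings), and the sockets are stand-in objects with a `send` method so the sketch runs without a network.

```javascript
// One Room instance per document or chat, holding all its state in memory.
class Room {
  constructor(id) {
    this.id = id;
    this.sessions = new Map(); // socket -> { name }
    this.history = [];         // every message broadcast so far
  }

  join(socket, name) {
    this.sessions.set(socket, { name });
    // Catch the newcomer up straight from memory, no database round trip.
    for (const message of this.history) socket.send(message);
    this.broadcast(`${name} joined`);
  }

  broadcast(message) {
    this.history.push(message);
    for (const [socket] of this.sessions) {
      try {
        socket.send(message);
      } catch {
        // A failed send is treated as a disconnect.
        this.sessions.delete(socket);
      }
    }
  }
}

// Stand-in for a WebSocket: records what it was sent.
const fakeSocket = () => {
  const received = [];
  return { received, send: (message) => received.push(message) };
};

const alice = fakeSocket();
const bob = fakeSocket();
const room = new Room("doc-1");

room.join(alice, "alice");
room.broadcast("hello");
room.join(bob, "bob"); // bob gets the history, then the join notice
```

Both clients end up with the same three messages, and nothing had to be looked up in a shared database to make that happen.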

Comparison

Where it gets confusing is that some products overlap these concepts.

Cloudflare Durable Objects are both a uniquely addressable process and a persistent database unique to each instance. You could use this per-object database for durable execution - and in fact, this is what Cloudflare themselves did, building Workflows on top of Durable Objects.

Going in the other direction, Rivet built what they refer to as an “actor” framework for session backends on top of their durable execution engine.

It seems to me that the similarity is in the “private state” which allows a “session” to be identified persistently even when it isn’t currently running.

This allows a “session” to correspond 1-to-1 with some other persistent entity or concept like a “document”, “project”, “chat room” or “game”. Durable execution needs that same “private storage” to remember the execution log of each unique invocation of a workflow.
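
Here is a small sketch of that idea, under the assumption that each session has a stable ID with private storage keyed by it. The names (`storage`, `running`, `Counter`, `getOrWake`, `hibernate`) are hypothetical, and `storage` is an in-memory map standing in for something durable.

```javascript
const storage = new Map(); // hypothetical durable storage, keyed by session ID
const running = new Map(); // instances that are currently in memory

// The per-session logic; its state can be restored from a snapshot.
class Counter {
  constructor(state) {
    this.count = state ? state.count : 0;
  }
  increment() {
    this.count += 1;
    return this.count;
  }
  snapshot() {
    return { count: this.count };
  }
}

// Returns the running instance for `id`, waking it from storage if needed.
function getOrWake(id) {
  let instance = running.get(id);
  if (!instance) {
    instance = new Counter(storage.get(id));
    running.set(id, instance);
  }
  return instance;
}

// Saves the instance's state under its ID and drops it from memory.
function hibernate(id) {
  const instance = running.get(id);
  if (!instance) return;
  storage.set(id, instance.snapshot());
  running.delete(id);
}

getOrWake("doc-1").increment();
getOrWake("doc-1").increment();
hibernate("doc-1"); // "doc-1" is no longer running
const afterWake = getOrWake("doc-1").increment(); // picks up where it left off
const otherDoc = getOrWake("doc-2").increment();  // separate ID, separate state
```

The ID keeps meaning something while nothing is running for it, which is what lets “doc-1” stand for a document, room or workflow invocation rather than for a particular process.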

There are some players in this space I can’t place due to unfamiliarity. For example, Microsoft Orleans seems to have lots of similarities to Durable Objects, but I can’t see any documentation about connecting to an instance of an actor/grain. Akka’s cluster routing seems to provide some similar capabilities.

Bonus: Erlang

When either of these concepts is discussed, Erlang/Elixir users like to drop in and say something to the effect of

OTP does all of this already, thank goodness I don’t have to work with these complicated products!

My experience with Erlang is restricted to learning me some Erlang, and I have had a tiny bit of Elixir experience on a basic web app. I wish I could better evaluate claims like this with some real experience.

From what I have been able to glean, it would still take some work to have as turnkey an experience as Cloudflare’s globally distributed objects. Of course, it’s the global routing that’s the hard part, not so much the management of the “session” process once a request has arrived at it.

Bonus 2: actors

Speaking of Erlang, where do “actors” fit into all this?

The “Actor Model” seems to be a fairly broad term, but I think it fits the “session backend” concept more closely than “durable execution”, though with the connotation that it’s used more for “business logic” than for, e.g., a video game instance with high-performance needs.

The concept of unique identity doesn’t seem to be an integral part of the actor model. But it sort of needs to be in the current session backend architectural paradigm, because session-actors are exposed to outside client communications using a unique identifier.