Optimisation as vertical integration
An idea I keep returning to: one kind of performance optimisation consists of vertically integrating the source code that implements a particular piece of functionality.
Often, we deal with code that is horizontally integrated, with generic interfaces between layers which are designed to facilitate flexibility, separation of concerns, developer experience, and communication between teams.
For example, in a typical tiered web application,
- The browser talks HTTP to the API layer - a protocol that can express nearly anything, and which typically carries data serialised as, for example, JSON
- The API layer typically involves some kind of framework with cross-cutting middlewares to perform tasks like authentication and authorization across all entrypoints into the app
- The API layer typically builds queries as SQL strings: high-level data-fetching programs that are serialised and sent across another network boundary to the database server, which evaluates them and sends serialised results back to the API layer
All this layering gives software teams incredible leverage. Each layer can be reused for any kind of software. HTTP requests and responses can carry any kind of data imaginable, from poetry to stock tickers to multimedia. SQL databases can be used to create any entity model you desire. The browser, thanks to the marvel that is REST, can act as a client to any kind of service you need.
But of course, all this layering brings costs: serialisation and deserialisation over and over, inter-process communication (not to mention network latency), the care and feeding of query planners, distributed consensus and consistency.
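To make the layering concrete, here is a minimal sketch of one request passing through such a stack. Everything in it is hypothetical: the `users` table, the `require_token` middleware and the `"secret"` token are made up for illustration, and an in-memory SQLite database stands in for a database server, so there is no real network hop.

```python
import json
import sqlite3


def make_db() -> sqlite3.Connection:
    """Stand-in for the database tier (in-process here, so no real network hop)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    db.execute("INSERT INTO users (id, name) VALUES (1, 'Ada')")
    return db


def require_token(handler):
    """Cross-cutting middleware: the same check wraps every entrypoint."""
    def wrapped(db, request_body: bytes) -> bytes:
        request = json.loads(request_body)  # deserialise the request
        if request.get("token") != "secret":  # hypothetical auth rule
            return json.dumps({"error": "unauthorised"}).encode("utf-8")
        return handler(db, request)
    return wrapped


@require_token
def get_user(db, request: dict) -> bytes:
    # The query travels as text that the database has to parse and plan.
    row = db.execute(
        "SELECT id, name FROM users WHERE id = ?", (request["user_id"],)
    ).fetchone()
    if row is None:
        return json.dumps({"error": "not found"}).encode("utf-8")
    # Serialise the result for the trip back to the client.
    return json.dumps({"id": row[0], "name": row[1]}).encode("utf-8")


db = make_db()
body = json.dumps({"token": "secret", "user_id": 1}).encode("utf-8")
response = json.loads(get_user(db, body))  # the client deserialises again
```

Even in this toy version, one lookup by primary key involves two JSON encodes, two JSON decodes and one SQL parse. Each step is generic and reusable, and each one is work that has nothing to do with the lookup itself.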
TigerBeetle
An example of a vertically integrated network service is TigerBeetle, a database for financial transactions.
https://github.com/tigerbeetle/tigerbeetle/blob/main/docs/internals/ARCHITECTURE.md
Instead of providing a generic horizontal interface, TigerBeetle explicitly caters to a single use case: durably recording debits and credits. It can do this incredibly fast because the software is heavily tailored to this problem and only this problem. From the network API down to the on-disk file format, its developers have eliminated dynamic memory allocation, serialisation, query planning, and so on.
This means you can’t use TigerBeetle for anything except storing financial transactions. But at that, it excels.
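As a rough illustration of what "eliminating serialisation" can mean, the sketch below compares a generic JSON encoding of a transfer with a fixed-size binary record. The three-field layout is an assumption made up for this example. It is not TigerBeetle's actual wire or disk format, which is documented in the architecture notes linked above.

```python
import json
import struct

# Hypothetical fixed layout: debit account id, credit account id, amount,
# each an unsigned 64-bit little-endian integer. 24 bytes, always.
TRANSFER = struct.Struct("<QQQ")


def encode_generic(debit: int, credit: int, amount: int) -> bytes:
    """Generic interface: self-describing text, variable length."""
    record = {"debit_account_id": debit, "credit_account_id": credit, "amount": amount}
    return json.dumps(record).encode("utf-8")


def decode_generic(payload: bytes) -> tuple:
    record = json.loads(payload)  # tokenise, parse, build a dict
    return record["debit_account_id"], record["credit_account_id"], record["amount"]


def encode_fixed(debit: int, credit: int, amount: int) -> bytes:
    """Specialised interface: the bytes are the record."""
    return TRANSFER.pack(debit, credit, amount)


def decode_batch(buffer: bytes) -> list:
    # Records sit at fixed offsets, so a batch is just a stride over the buffer.
    return list(TRANSFER.iter_unpack(buffer))


batch = encode_fixed(1, 2, 500) + encode_fixed(2, 3, 125)
transfers = decode_batch(batch)
```

The fixed layout gives up everything that makes JSON flexible: no optional fields, no nesting, no new entity types. In exchange, the reader knows exactly where every field of every record is without parsing anything.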
Simple code, high performance
An example from the video game world is Casey Muratori’s “Simple code, high performance”.
https://www.youtube.com/watch?v=Ge3aKEmZcqY
In this long talk, he describes how he transformed a very slow editor tool for The Witness into a very fast tool.
Very simply put, the tool was doing large numbers of vertical raycasts into the game world. Each raycast would go through a huge amount of very generic framework code, written to be as flexible and useful as possible whenever any developer had a need to cast a ray into the game world. By defining the exact problem he was trying to solve, and writing just the code to do exactly that, he reduced the operation time from 30 seconds to fractions of a second. This turned a “batch” operation that breaks an artist’s flow into a real-time experience.
Rather than relying on this “generic” interface of raycasting, Casey breaks through the layers from problem definition to implementation.
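The shape of that change can be sketched with a toy example. This is not the code from the talk: the `Box` scene, the grid size and the function names are all hypothetical, and the "specialised" version assumes the only question ever asked is "how high is the ground under this point?".

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned box; a stand-in for arbitrary scene geometry."""
    min_x: float
    min_y: float
    min_z: float
    max_x: float
    max_y: float
    max_z: float


def raycast_generic(origin, direction, boxes):
    """Any ray against every box (slab test). Returns nearest hit distance or None."""
    best = None
    for b in boxes:
        t_near, t_far = 0.0, float("inf")
        lows = (b.min_x, b.min_y, b.min_z)
        highs = (b.max_x, b.max_y, b.max_z)
        for o, d, lo, hi in zip(origin, direction, lows, highs):
            if d == 0.0:
                if o < lo or o > hi:
                    t_near, t_far = 1.0, 0.0  # parallel and outside the slab: miss
                    break
                continue
            t0, t1 = (lo - o) / d, (hi - o) / d
            if t0 > t1:
                t0, t1 = t1, t0
            t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near <= t_far and (best is None or t_near < best):
            best = t_near
    return best


def build_height_grid(boxes, size, cell):
    """Precompute the top surface under each cell centre, once."""
    grid = [[None] * size for _ in range(size)]
    for ix in range(size):
        for iy in range(size):
            x, y = (ix + 0.5) * cell, (iy + 0.5) * cell
            tops = [b.max_z for b in boxes
                    if b.min_x <= x <= b.max_x and b.min_y <= y <= b.max_y]
            grid[ix][iy] = max(tops) if tops else None
    return grid


def ground_height(grid, cell, x, y):
    """The only query the tool needs: a table lookup, no ray at all."""
    return grid[int(x // cell)][int(y // cell)]


boxes = [Box(0.0, 0.0, 0.0, 4.0, 4.0, 1.0), Box(1.0, 1.0, 0.0, 2.0, 2.0, 3.0)]
grid = build_height_grid(boxes, size=4, cell=1.0)
```

The generic function answers any ray query against any geometry. The grid answers one question, at cell resolution, and nothing else. That narrowing is what makes the fast path possible.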
Data-oriented design
More broadly, the data-oriented design community (of which Mike Acton is a champion) eschews the “horizontal” integration features of high-level programming languages, typically object-oriented ones. Object orientation tends to argue for things like encapsulation and programming to interfaces - as Alan Kay put it, “late binding of all things”. But when you need to be fast and efficient, late binding doesn’t cut it.
https://dataorienteddesign.com/dodmain/node17.html
Data-oriented design advocates talk about knowing the details of the platforms where code will actually be running, in order to be able to optimise for them. Even if you don’t know the exact platform, as Mike Acton puts it in this Q&A, “we often do optimise for one platform, but we also optimise for a finite set of platforms … the range is not 6502 to Cell, right.”
This is the start of vertically optimising your software in the very broadest sense, beginning with the platform it will run on.
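The contrast between late binding and a data-oriented layout can be sketched as follows. The `Shape` hierarchy and the function names are hypothetical, chosen only for illustration. Python is used here for readability: it will not show the cache and branch-prediction effects that motivate this style in C or C++, so read it as a picture of the two code shapes rather than a benchmark.

```python
import math
from array import array


class Shape:
    def area(self) -> float:
        raise NotImplementedError


class Rect(Shape):
    def __init__(self, width: float, height: float):
        self.width, self.height = width, height

    def area(self) -> float:
        return self.width * self.height


class Circle(Shape):
    def __init__(self, radius: float):
        self.radius = radius

    def area(self) -> float:
        return math.pi * self.radius ** 2


def total_area_late_bound(shapes) -> float:
    # Which area() runs is decided per element, at call time.
    return sum(shape.area() for shape in shapes)


def total_area_by_kind(rect_widths, rect_heights, circle_radii) -> float:
    # One homogeneous array per field, one tight loop per kind, no dispatch.
    total = 0.0
    for w, h in zip(rect_widths, rect_heights):
        total += w * h
    for r in circle_radii:
        total += math.pi * r ** 2
    return total


late_bound = total_area_late_bound([Rect(2.0, 3.0), Circle(1.0), Rect(4.0, 5.0)])
by_kind = total_area_by_kind(
    array("d", [2.0, 4.0]), array("d", [3.0, 5.0]), array("d", [1.0])
)
```

The second version only works because the full set of shape kinds is known up front. Adding a new kind means touching `total_area_by_kind`, which is exactly the flexibility the late-bound version was designed to preserve.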
Oxide
An honourable mention: the Oxide Computer Company is vertically integrating server hardware and software. As they describe it, this is more about control than performance.
https://changelog.com/podcast/592#transcript-219
That is another benefit of vertically integrating: being the owner of the entire pipeline or stack. It is, of course, a drawback too - maybe you don’t want to own everything down to the assembly language.