Aiming for 1 second deploys

Fast delivery of quality software depends heavily on applying the agile methodology to some degree. A core pillar of this methodology is rapid iteration and communication between stakeholders. It is a core Offworld Nexus belief that treating "environments" as a fluid, ephemeral notion greatly contributes to this objective.

However, this author, being CTO of WITH, a web agency, observes that despite best efforts to make deployments fast with a streamlined CI/CD pipeline, the average deployment time is 11 mins 3.597838 secs (measured over the past 7 days). This is a very long time, too long to have a deployment happen during a synchronous conversation. You can't have all stakeholders around a table testing and fixing issues in real time, even though that would be such a relief during the high-pressure phases of a project.

In 2003, it usually took a few seconds to deploy your website over a 56k modem using FileZilla. Just drop a few PHP files over FTP and done. Obviously things have evolved and we now have a lot more safety, yet it's impossible not to wonder what we could accomplish with all the extra transistors we have gained since. Could the time to deploy a modern application drop below 1 second?

Goals

Let's scope our work. Our target is a Vercel-type application with persistence such as databases and buckets provided as a commodity, with some front-end and back-end code that you write and deploy through a Git-based CI/CD.

Given that environments need to come and go as required, our deployments must include the following:

  • Creating the underlying infrastructure: ingress, VMs, routing, etc.
  • Fetching the project's dependencies
  • Setting up and migrating the database
  • Starting and routing the servers

All this must fit within 1 second.

😱 "But Rémy, you're crazy!" (the reader, right now)

Yes, I am. DevOps is not for the sane. Let's make a quick back-of-envelope reality check:

  • Most elements of the infrastructure should be SaaS-like. There is no reason why it would take more than a handful of milliseconds to configure the ingress. A microVM can start in 150ms, a container less than 1ms. No worries there, especially if parallelized.
  • Modern dependency managers such as uv are extremely fast. It's often faster to install the virtualenv from scratch than to fetch it from the GHA cache. Now downloading all those packages might take more than 1s, but installing them is rarely more than 100ms. Given that dependencies stay stable between deployments, proper caching strategies can help keep our promise most of the time.
  • Your typical CI/CD build will create a complete Docker image, including the whole operating system. This obviously cannot work within the budget: you will have to find a way to transfer only the source code and build artifacts. But this in itself can be done very quickly.
  • While back-end applications have pretty much no build step at all, front-end usually involves compiling a set of JS and CSS files into differently-organized JS and CSS, which can take some time. Can this be solved by caching, faster JS tooling or other options not imagined yet? Let's see when we get there.
  • Setting up the database is more of a challenge. On non-production environments obviously the amount of data will be reduced to some non-scary numbers, so maybe some dump/restore might do the trick. However if you have control of the DB server, why not imagine some filesystem-level COW magic for instant cloning? Typically you would copy your main dev DB for each feature branch.
  • DB migrations are typically done by the app itself. They do not happen often. If the system has a way to know whether the migrations command needs to run, you can probably skip the step altogether most of the time. For the rest of it, well, it might exceed the budget.
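
One way the system could "know when to run the migrations command", as the last bullet above suggests, is to fingerprint the migration files and compare that fingerprint with the one recorded at the previous deploy. The following is a minimal sketch under stated assumptions, not how any particular framework does it: the directory layout, the state file and all function names are hypothetical, and the state file is assumed to live outside the migrations directory.

```python
import hashlib
from pathlib import Path


def migrations_fingerprint(migrations_dir: Path) -> str:
    """Hash the relative path and content of every file under migrations_dir."""
    digest = hashlib.sha256()
    # Sort the files so the fingerprint does not depend on directory listing order.
    for path in sorted(p for p in migrations_dir.rglob("*") if p.is_file()):
        digest.update(path.relative_to(migrations_dir).as_posix().encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def needs_migration(migrations_dir: Path, state_file: Path) -> bool:
    """Return True when the migration files differ from the last recorded deploy."""
    current = migrations_fingerprint(migrations_dir)
    # A missing state file means nothing was recorded yet, so migrations must run.
    previous = state_file.read_text().strip() if state_file.exists() else None
    return current != previous


def record_migration(migrations_dir: Path, state_file: Path) -> None:
    """Store the fingerprint once the migration command has succeeded."""
    state_file.write_text(migrations_fingerprint(migrations_dir))
```

A deploy step would call `needs_migration(...)` first, run the app's own migration command only when it returns `True`, and then call `record_migration(...)`. With a filesystem-level clone per feature branch, the state would probably have to travel with the cloned database rather than sit in a shared file; that part is left open here.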

As you can see, there is good and bad news for the 1s goal. Maybe we'll have to dilute our wine and set this as an average rather than a maximum. But overall, there is good reason to believe that most deployments can indeed fit within one second.
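
To make the back-of-envelope check a bit more concrete, here is a small sketch that computes the critical path of a deployment when independent steps run in parallel. Only the 150 ms microVM start and the 100 ms dependency install come from the estimates above; every other duration, the step names and the dependency graph are placeholder assumptions, not measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    duration_ms: float
    depends_on: tuple[str, ...] = ()


def critical_path_ms(steps: list[Step]) -> float:
    """Finish time of the slowest dependency chain, with independent steps in parallel.

    Assumes the dependency graph has no cycles.
    """
    by_name = {step.name: step for step in steps}
    finish: dict[str, float] = {}

    def finish_time(name: str) -> float:
        if name not in finish:
            step = by_name[name]
            # A step starts once its slowest dependency has finished.
            start = max((finish_time(dep) for dep in step.depends_on), default=0.0)
            finish[name] = start + step.duration_ms
        return finish[name]

    return max(finish_time(step.name) for step in steps)


STEPS = [
    Step("configure_ingress", 10),      # placeholder for "a handful of milliseconds"
    Step("start_microvm", 150),         # figure quoted in the text
    Step("install_dependencies", 100),  # upper bound quoted in the text, warm cache
    Step("clone_database", 50),         # placeholder assumption
    Step("transfer_artifacts", 200),    # placeholder assumption
    Step(
        "start_server",
        100,                            # placeholder assumption
        depends_on=(
            "start_microvm",
            "install_dependencies",
            "clone_database",
            "transfer_artifacts",
        ),
    ),
]

if __name__ == "__main__":
    sequential = sum(step.duration_ms for step in STEPS)
    print(f"sequential: {sequential} ms, parallel: {critical_path_ms(STEPS)} ms")
```

The point of the sketch is the shape of the computation rather than the numbers: the budget is consumed by the longest chain of dependent steps, not by the sum of all steps, which is why parallelization matters so much for this goal.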

Methodology

This quest for 1-sec deploys will unfold in an open-ended series of articles, where we will progressively try to reach the goal, first deploying a static web page on already-existing infrastructure and then widening the scope each time a step is complete.

What we're looking for, essentially, is:

  • Second 0: git push
  • Second 1: download a file served by the newly deployed app

At second 1, the file should match the expectation.
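
The measurement itself could be sketched as follows: start a clock, run `git push`, then poll the deployed URL until the downloaded file matches the expectation. This is only a rough harness under stated assumptions: the URL and expected content are supplied by the caller, the function names are hypothetical, and the polling interval adds a small error to the measured time.

```python
import subprocess
import time
import urllib.request
from typing import Callable


def fetch_url(url: str) -> bytes:
    """Download url; network and HTTP errors propagate as OSError subclasses."""
    with urllib.request.urlopen(url, timeout=1.0) as response:
        return response.read()


def time_until_served(
    fetch: Callable[[], bytes],
    expected: bytes,
    timeout_s: float = 10.0,
    poll_interval_s: float = 0.01,
) -> float:
    """Poll fetch() until it returns expected; return the elapsed seconds."""
    start = time.monotonic()
    while True:
        try:
            if fetch() == expected:
                return time.monotonic() - start
        except OSError:
            pass  # server not reachable yet; keep polling
        if time.monotonic() - start > timeout_s:
            raise TimeoutError("expected content was not served in time")
        time.sleep(poll_interval_s)


def measure_deploy(url: str, expected: bytes) -> float:
    """Seconds from the start of `git push` until url serves expected."""
    start = time.monotonic()
    subprocess.run(["git", "push"], check=True)
    time_until_served(lambda: fetch_url(url), expected)
    return time.monotonic() - start
```

Passing the fetch function as a parameter keeps the polling loop testable without a network. In a real run, the expected content should change with every commit (for example by embedding the commit hash in the served file), otherwise the previous deployment would satisfy the check immediately.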

We are obviously not talking in absolutes here. This will not work if your machine is far away from the Git repo and the deployed server. This will also not work if your connection is flaky. This is a general idea. The target is 1 second; we want to know how much we can fit in there.

In order to achieve this, we'll first try tools from the market, and as the challenge unfolds we'll try to replace missing parts with PoCs if needed. Given that a GitHub Actions worker takes more than 1 second to start, we suspect those PoCs will be needed sooner rather than later.

Coming up next

Upcoming articles on this blog will relate the attempts at 1s deployments. Subscribe if you want to know how it ends!
