A gentle introduction to async/await

19 minute read Published: 2025-06-07

If you've ever wondered “What even is this for?”, then this article is for you.

Introduction

One of the most unintuitive parts of Rust is undoubtedly its async/await syntax. Endless articles have been written litigating and relitigating its usefulness, uselessness, simplicity, complexity, and everything in between.

Figuring this out as an on-looker who was unfamiliar with the problem was… not very easy. I think the biggest issue was how difficult it was to figure out just what, exactly, the problem even was. What are we even trying to solve?

Typically, an introduction to this would look as follows:

The async/await mechanism was introduced to Rust in order to solve the problem of servers having to serve lots of clients at the same time, for which threads are not light-weight enough. They represent the creation of a lightweight thread, or a state machine inside the program –a perfectly-sized stack, basically–, that when polled can return either Pending or Ready with a value—

… and that's about when my eyes would glaze over. Who cares about all this? An entire language feature just to make faster servers? And what is a “state machine” anyway? OK, I actually know what it is, but what does it have to do with the issue at hand? Can't you explain this using human language?1

After a very long time, having finally figured it out, I thought I'd share my findings with the world. As per this article's title, I'll aim for this introduction to be gentle—so gentle, in fact, that the first half will not contain any direct references to computers or programming. Our main goal is to communicate just two things: what the problem is, and how async/await solves it.

If there's just one thing you internalise from this entire article, please make it the following:

Summary of the entire article

async/await is how you tell the computer “This is going to take a while, during which time you'll have nothing to do. Feel free to go do something else in the mean-time.”

When you have too much work

Before anything else, let us explain what async/await is not like, so as to later understand exactly what it is like.

Imagine that you have to paint the walls of a hotel. This is going to take several days, of course. But the reason it's going to take so much is because there's just so much work to do. Our algorithm would look as follows:

  1. Ensure you have all necessary tools at hand
  2. Paint the room in which you are located ← Very slow!
  3. Move to the next room
  4. If the hotel has not yet been fully painted, repeat instructions from the beginning.

What does one do, if one wants to speed this up?

Parallelism

The solution is obvious: Gather more people. Our algorithm will then look as follows:

  1. Select first unpainted room
  2. Select first idle person ← Might need to wait for someone to finish.
  3. Give this person the following instructions:
    1. Ensure you have all necessary tools at hand
    2. Move to the room I tell you
    3. Paint that room ← Very slow!
    4. Return for further instructions
  4. If there are unpainted rooms, repeat instructions from the beginning.
  5. Wait until all people are idle.

Easy enough, for now. If the problem is too much work, then the solution is more workers. That lets us speed things up, in the best case, by a factor of how many workers there are.

When you have too little work

Let us now imagine a different situation.

Let us say you want to apply for a job to several companies. For whatever reason, you need to do this via surface-mail: Pen and paper.

First you send an application. Then the company might reply (within 3-5 business days) looking for more information, which you will then provide. Eventually you will get a final reply back, either telling you you've been rejected or telling you a wage they're willing to pay you. If it is larger than some amount X, you accept immediately; else, you wait until everyone has answered and pick the largest wage.

The simplest possible way to go about this would be as follows:

  1. Write and send an introductory letter to the selected company
  2. Spend the next few days near your mail-box waiting for a reply like Hachikō
  3. Send the information they want, if any
  4. Spend the next few days near your mail-box waiting for a reply like Hachikō
  5. (and so on)
  6. If the wage is immediately acceptable, send acceptance letter. Else, write down the wage, select the next company, and repeat instructions from the beginning.

Like in the previous example, this is going to take several days. In this case, however, the reasons for the delay are completely different! Whereas previously we took a long while because there was just so much work to do, in here we take a long while because we have to wait for something independent of us to finish. The majority of our time is not spent working, but rather busy-waiting2.

How do we go about speeding this up?

Parallelism (redux)

The first possibility would be to use the same solution we used earlier, and just use more workers.

  1. Select first company you have yet to contact
  2. Select first idle person
  3. Give this person the following instructions:
    1. Write and send an introductory letter to the selected company
    2. Spend the next few days near your mail-box waiting for a reply like Hachikō
    3. Read response and send the information they want, if any
    4. Spend the next few days near your mail-box waiting for a reply like Hachikō
    5. (and so on)
    6. If the wage is immediately acceptable, inform everyone else to stop. Else, write down the wage and return for further instructions.
  4. If there are uncontacted companies, repeat instructions from the beginning
  5. Identify selected company and send acceptance letter.

This does, in fact, lead to a significant speed-up. However, the root problem hasn't really been addressed: The supermajority of the time of each worker is spent just waiting. Worst of all, this solution does not scale: if we had 100 companies, it'd need 100 workers, which is very far from a reasonable proposal.

What other possibilities are there?

Preëmptive multi-tasking

As a first attempt, let us adapt the “parallelism” example above, such that it can work with just one worker: yourself. Aside from that, the mode of operation will be as similar as possible.

To achieve that, each company's correspondence will be considered a different task; you, the worker, will attempt to schedule each task in the most suitable manner possible. The simplest way, the one we will adopt here, is to split your time into “chunks” of a specific duration (10 minutes, for the sake of this example), work on one task for one chunk, then move to the next task. This means that we decide in advance, preëmptively, when each task will run; for this reason, this scheme is called “preëmptive multi-tasking”.

The algorithm would look as follows:

  1. Select first company you have yet to contact
  2. Write and send an introductory letter to the selected company
  3. Write down the following instructions, so that hereafter they shall comprise a unified task:
    1. Wait for this company's reply
    2. Read response and send the information they want, if any
    3. Wait for the reply
    4. (and so on)
    5. If the wage is immediately acceptable, skip ahead to no. 10. Else, write down the wage and mark this task as complete.
  4. If there are uncontacted companies, repeat instructions from the beginning
  5. Pick up next non-finished task
  6. Work on picked up task for 10 minutes
  7. Put task down
  8. If there are yet unfinished tasks, repeat instructions from no. 5
  9. Select the largest of the wages you have been offered
  10. Send an acceptance letter to the company you have selected

Not half bad! Most problems we mentioned earlier are solved: All the tasks run at the same time, and the time spent waiting for replies is not wasted. Furthermore, this uniquely protects us from the possibility of one company wasting our time: even if they attempt to stall us by asking us to write 1000 pages for our response, our communication with the rest of the companies will still keep going, so it's not a big issue. This problem is, well… preëmpted.

For all the benefits that this solution has, however, it also has one very important draw-back. Said draw-back gives rise to one question, that will keep haunting us for the entire duration of the execution. The question, that will need to be asked once every 10 minutes, is the following:

Now… where was I, again?

Switching contexts

The big draw-back we mentioned earlier is this: Preëmptive multi-tasking has no notion of “Wait a minute, I'm in the middle of something”. When the clock tells you to switch, you switch. It doesn't matter whether you're in the middle of reading your mail, or authoring a response; it doesn't matter if you're in the middle of a specific paragraph, or even a specific sentence. As soon as the 10 minutes pass, you drop whatever you were doing and pick up the next task.

So, in order to not get lost, you have to keep notes. After the clock chimes, you write down what you'll need to remember next time you pick up this task, and then go do something else. In real life, as in computer science, this is called “context switching”.

As you can probably tell, context switching is not free. Therefore, there is by necessity a delicate balance to be struck. If you switch contexts too rarely, you might end up with uselessly large chunks of time; if you switch contexts too frequently, the time it takes to switch will start to dominate. For 10 companies, this will not be much of an issue; for 100 companies, it absolutely will be.

So, how do we communicate the notion of “Wait until I finish this one thing, then we can switch tasks”?

Coöperative multi-tasking

The bare-metal C developers will immediately jump at this: “Wait, I know the answer! First, you create a data-type that can remember at which point we are in the correspondence. Then you instantiate one such data-type for each company, and then in a loop inside the program you—”

Forgive the interruption. Yes, this is a way to solve the problem. However, this solution bears the hall-mark of error-prone code: It mixes what you want to do with how you want to achieve it. After all, this example is simplified on purpose: later on we might want to insert restrictions such as “if a company takes too long for one answer, drop it” or “if the entire correspondence with a company has been taking too long, drop it”. We would prefer a way to achieve this whose complexity does not explode as the requirements increase. And anyway, such a solution would merely be a manual implementation of the things we will describe hereon.

Ideally, we would like the ability to write the following:

  1. Select first company you have yet to contact
  2. Write and send an introductory letter to the selected company
  3. Write down the following instructions, so that hereafter they shall comprise a unified task:
    1. Wait for this company's reply. Feel free to do something else while you wait.
    2. Read response and send the information they want, if any
    3. Wait for the reply. Feel free to do something else while you wait.
    4. (and so on)
    5. If the wage is immediately acceptable, jump ahead to no. 11. Else, write down the wage and mark this task as complete.
  4. If there are uncontacted companies, repeat instructions from the beginning
  5. If you have new mail, select one, and pick up the corresponding task
  6. Work on picked up task until you reach a point at which you are permitted to wait
  7. Put task down
  8. If there is still more mail to reply to, repeat instructions from no. 5
  9. If there are still incomplete tasks, but no new mail, then do something else until tomorrow and repeat instructions from no. 5
  10. Select the largest of the wages you have been offered
  11. Send an acceptance letter to the company you have selected

This high-lights the biggest difference between this scheme and the immediately prior one: in this scheme, the task can only be interrupted if it declares itself to be interruptible, in the places it declares itself to be interruptible. Therefore, no task can be interrupted without its consent, so to speak; this is why this scheme is called coöperative multi-tasking.

Because of this, if we want to use this scheme, we must take care to ensure that all tasks are in fact coöperative—we are offered no defense against malicious tasks, that might try to waste our time. The up-shot, however, is that context switching is now dirt-cheap! Coöperative multi-tasking makes it possible to analyse the tasks before execution, separate them into pieces, and figure out which information needs to be saved between pieces and which can be discarded. Therefore, when switching tasks, we only save and retrieve what's absolutely necessary, eliminating most of the overhead needed. Furthermore, because we only switch when we have to wait for something else anyway, putting the task down usually means we don't have to concern ourselves with it for a while.
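The “separate them into pieces” idea can be sketched as a hand-rolled state machine, which is, in essence, what the compiler generates from an async function. All the names below are hypothetical, and the mail protocol is drastically simplified:

```rust
// A sketch of the "pieces" idea: each enum variant records exactly what
// must survive between two wait points, and nothing more. All names here
// are made up for illustration.

#[derive(Debug, PartialEq)]
enum Correspondence {
    AwaitingFirstReply,
    AwaitingFinalReply,
    Done { offered_wage: u32 },
}

// One "piece" of the task: consume the mail that arrived, run until the
// next wait point, and return the new state to be put down.
fn step(state: Correspondence, incoming_wage: u32) -> Correspondence {
    match state {
        Correspondence::AwaitingFirstReply => {
            // ... here we would send the information they asked for ...
            Correspondence::AwaitingFinalReply
        }
        Correspondence::AwaitingFinalReply => {
            Correspondence::Done { offered_wage: incoming_wage }
        }
        done => done, // already finished: nothing left to do
    }
}

fn main() {
    let mut state = Correspondence::AwaitingFirstReply;
    state = step(state, 0); // their first reply arrives; we answer it
    state = step(state, 60); // their final reply arrives with an offer
    assert_eq!(state, Correspondence::Done { offered_wage: 60 });
}
```

Each variant is the “note” saved at a wait point; switching away costs only as much as storing that one value.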

“Is it really that simple to use?”

Well… yes and no. If you already have in place the infrastructure you will need –say, tokio– then yes, all you need to write is the thing we wrote above. If you do not have this infrastructure, however, you will need to supply it yourself.

The necessary infrastructure for coöperative multi-tasking

The real-life analogies have already started to break down. Thus, henceforth we will mainly use programming terminology.

As you have probably guessed already, the “using more workers” approach was an analogy to using more CPU cores to do the same thing, and the preëmptive multi-tasking approach was about using lots of threads in the same core. As we have hopefully clarified by now, coöperative multi-tasking is neither of those things. As far as the produced machine code is concerned, it is in essence a way to permit the CPU to jump back-and-forth within the program, in-between pre-defined points, in whichever way it deems best during execution, in order to avoid having to idle.

Now: as we mentioned in the article's tl;dr, the thing that says “Feel free to do something else while you wait” in Rust is .await—that's all the word means, really. But what about async?

When a function contains even just one .await inside it, that means it will probably have to wait for something else to complete before it resumes doing the rest of the things it has to do. This means that, when we call it, its immediate reply will probably be “Not yet, I'm still waiting for thing A”. What we would want here is a way to check in on the result more than once if necessary, so that when it's done it changes its message to “right, thing A is finished, you can resume executing me” for our convenience.

(We are making a simplifying assumption here. See the Appendix for clarification.)

We therefore have a need to disambiguate between those two circumstances. Like everywhere else, Rust disambiguates them using enums—“either-or” types, essentially. Thus, any function that contains even just one .await inside it, instead of returning an ordinary data-type, returns a Future that contains it. Said Future can be asked whether it has completed (or “polled”, in Rust parlance), and its response can be either “No, I'm still waiting” (ie Poll::Pending) or “I'm finished, here you go” (ie Poll::Ready(the thing we want)). Thus, async translates to “I might have to wait for something, but in the meantime you can ask me if I'm ready and I'll respond appropriately”.

At this point, the reader might have a picture in mind, that of an adult driving a car and an impatient child in the back seat. The child can .poll() the driver, using the phrase “Are we there yet?”, to which the driver can respond either “No, not yet” or “Yes, we've arrived”. This picture is doubly useful: Once because that's in fact the simplest way to ask async functions if they've finished, and once because it immediately illustrates how wasteful such a thing would be.
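To make this concrete, here is a minimal sketch of both halves: a hand-written Future that answers “not yet” a couple of times before arriving, and the wasteful are-we-there-yet loop that polls it. The AreWeThereYet type and its poll count are made up for illustration; the Waker boilerplate is there only because poll() demands a Context.

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// A toy future: it answers "not yet" until polled enough times.
struct AreWeThereYet {
    polls_left: u32,
}

impl Future for AreWeThereYet {
    type Output = &'static str;
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        let this = self.get_mut();
        if this.polls_left == 0 {
            Poll::Ready("we've arrived")
        } else {
            this.polls_left -= 1;
            Poll::Pending // "No, not yet"
        }
    }
}

// A do-nothing Waker, so we can poll by hand. (Boilerplate, not the
// interesting part; newer Rust ships Waker::noop() for exactly this.)
fn noop_waker() -> Waker {
    const RAW: RawWaker = RawWaker::new(std::ptr::null(), &VTABLE);
    unsafe fn clone(_: *const ()) -> RawWaker { RAW }
    unsafe fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    unsafe { Waker::from_raw(RAW) }
}

fn main() {
    let mut fut = Box::pin(AreWeThereYet { polls_left: 2 });
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    let mut answers = Vec::new();
    loop {
        // The impatient child in the back seat: ask, ask, ask again.
        match fut.as_mut().poll(&mut cx) {
            Poll::Pending => answers.push("not yet"),
            Poll::Ready(msg) => {
                answers.push(msg);
                break;
            }
        }
    }
    assert_eq!(answers, ["not yet", "not yet", "we've arrived"]);
}
```

The loop works, but notice that it burns CPU asking the same question over and over, which is precisely the wastefulness described above.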

No, instead of repeatedly asking every async function we have if it's completed, it would be better to give it a mechanism to tell us how long to wait for, and to notify us once it's finished or needs our input to continue. After that, if we have more than one function ready, we can use something to select which function to proceed with.

At this point, we have explained everything one needs in order to implement async/await in a language. Those are, in order:

  1. A way to save and load contexts when we have to switch tasks
  2. A way to decide/schedule when to .poll() each function we need
  3. A way for each function to notify us if our attention is required

Taking these three things together, and adding helper functions that the user might need, is how one creates an async runtime.
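As a sketch of points 2 and 3, here is roughly the smallest executor one can write: it drives a single future, goes idle (parks its thread) whenever the future is Pending, and supplies a Waker that unparks the thread when attention is required. This is a one-task toy under heavy simplifications, not how a real runtime like tokio works.

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread;

// Point 3: a Waker that says "your attention is required" by unparking
// the executor's thread.
struct ThreadWaker(thread::Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

// Point 2: the simplest possible "schedule": poll, and if the future is
// not ready, sleep until woken instead of asking "are we there yet".
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            Poll::Pending => thread::park(), // nothing to do: go idle
        }
    }
}

fn main() {
    // async blocks are futures too; this one happens to be ready at once.
    let answer = block_on(async { 21 * 2 });
    assert_eq!(answer, 42);
}
```

Point 1, saving and loading contexts, is absent here because, as the next section explains, the compiler handles it for us.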

Saving and loading contexts

In Rust, when using async and await, the user needs to provide nothing of the sort. It comes prepackaged with the language via Rust Magic™. Other languages might each have different opinions; there's no one-size-fits-all solution.

Scheduling the polling of functions

This is undertaken by a piece of software called an “Executor”. A round-robin loop is the simplest possible executor, but it runs into the “are we there yet” problem mentioned earlier. More sophisticated ones can instruct the CPU to go idle if necessary, and manage the spawning of threads so as to avail themselves of both preëmptive and coöperative multi-tasking.

Notifying us for attention

This is undertaken by a piece of software called a “Waker”. A common implementation is to have a queue of tasks which are definitely ready: Wakers push tasks onto this queue, and executors pop and execute them.
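That hand-off can be sketched with nothing but a channel. In this hypothetical fragment, tasks are reduced to bare IDs:

```rust
use std::sync::mpsc;

fn main() {
    // The "ready queue": wakers push task IDs, the executor pops them.
    let (ready_tx, ready_rx) = mpsc::channel::<usize>();

    // A real runtime stores a clone of ready_tx inside each task's Waker;
    // calling wake() for, say, task no. 3 then boils down to:
    ready_tx.send(3).unwrap();

    // The executor's loop pops whichever task became ready first...
    let task_id = ready_rx.recv().unwrap();
    assert_eq!(task_id, 3); // ...and would now poll task no. 3.
}
```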

“This all sounds pretty neat, but where exactly is it useful?”

Generally, it is useful whenever you want to juggle lots of things at once, each of which might have to wait for something different. The canonical example is a server connected to clients: it's not just the clients' responses that have to be waited for, but also eg data-base look-ups.

That said, another area that has adopted async/await rather enthusiastically is embedded programming. Embedded cores usually have to wait for their environment to change before they reäct accordingly: user input, timers, even some memory accesses can be slow enough to warrant this approach. Hence, the evolution of crates like embassy.

(Side-note: Web-dev and embedded are basically as unlike programming disciplines as you can get, so anything that is useful for both of them is probably very important.)

That's all for now!

In the future I think I'd like to implement a simple async run-time for very simple microcontrollers such as AVRs, but that's very far beyond this article's scope. Interested readers will have to wait indefinitely, or read this very helpful article on which my own was heavily based.

Appendix

We mentioned that an async function can either be “waiting for thing A” or “finished”. This was a simplification, in order to explain the mechanisms further.

In real code, functions will likely have multiple possible .await points. This means that, at each of them, the function might pause execution until the thing it wants becomes available; once it becomes indeed available, the CPU will need to pick it up in order to resume execution until either the next .await point or the conclusion of the function.

Furthermore, in the general case, .await points are not reached serially. For instance, time-outs are a very common use-case, that are implemented by running a “do this” async function and a “wait for that much time” async function in parallel and seeing which completes first.
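The “see which completes first” shape can be hand-rolled in a few lines. Below is a sketch with a made-up Race combinator standing in for real ones such as futures::future::select or tokio::select!; the Never future plays the part of a server that never answers.

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// A hand-rolled "whichever finishes first" combinator (made up for
// illustration). Ok means the task won; Err means the time-out won.
struct Race<A, B>(Pin<Box<A>>, Pin<Box<B>>);

impl<A: Future, B: Future> Future for Race<A, B> {
    type Output = Result<A::Output, B::Output>;
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let this = self.get_mut();
        if let Poll::Ready(v) = this.0.as_mut().poll(cx) {
            return Poll::Ready(Ok(v));
        }
        if let Poll::Ready(v) = this.1.as_mut().poll(cx) {
            return Poll::Ready(Err(v));
        }
        Poll::Pending // neither is done: wait for whichever wakes us first
    }
}

// A future that never completes, standing in for a stalling server.
struct Never;
impl Future for Never {
    type Output = ();
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        Poll::Pending
    }
}

// A do-nothing Waker so we can poll by hand (boilerplate).
fn noop_waker() -> Waker {
    const RAW: RawWaker = RawWaker::new(std::ptr::null(), &VTABLE);
    unsafe fn clone(_: *const ()) -> RawWaker { RAW }
    unsafe fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    unsafe { Waker::from_raw(RAW) }
}

fn main() {
    let mut race = Box::pin(Race(Box::pin(Never), Box::pin(async { "timed out" })));
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    // The "do this" side never answers, so the "time-out" side wins:
    assert_eq!(race.as_mut().poll(&mut cx), Poll::Ready(Err("timed out")));
}
```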

But the neatest thing is when this happens without any extra book-keeping on your part. For instance, if you want to print a file, you'll need access to both the file and the printer. In this case, a “join” combinator (such as futures::join!) will request both of those things at once, and proceed once both have been acquired. Thus, instead of waiting for the sum of the two times, it merely waits for the longer of the two.
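That “wait for both at once” behaviour can also be sketched by hand. The Join combinator below is made up for illustration, standing in for futures::join! and friends: each side is polled, finished results are stashed until the other side catches up, and the whole thing completes only when both are in hand.

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// A hand-rolled "wait for both" combinator (made up for illustration).
// The Unpin bounds exist only to keep this sketch simple.
struct Join<A: Future, B: Future> {
    a: Pin<Box<A>>,
    b: Pin<Box<B>>,
    a_out: Option<A::Output>,
    b_out: Option<B::Output>,
}

impl<A: Future, B: Future> Future for Join<A, B>
where
    A::Output: Unpin,
    B::Output: Unpin,
{
    type Output = (A::Output, B::Output);
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let this = self.get_mut();
        if this.a_out.is_none() {
            if let Poll::Ready(v) = this.a.as_mut().poll(cx) {
                this.a_out = Some(v); // stash the early finisher
            }
        }
        if this.b_out.is_none() {
            if let Poll::Ready(v) = this.b.as_mut().poll(cx) {
                this.b_out = Some(v);
            }
        }
        match (this.a_out.take(), this.b_out.take()) {
            (Some(a), Some(b)) => Poll::Ready((a, b)), // both acquired
            (a, b) => {
                this.a_out = a;
                this.b_out = b;
                Poll::Pending
            }
        }
    }
}

// A do-nothing Waker so we can poll by hand (boilerplate).
fn noop_waker() -> Waker {
    const RAW: RawWaker = RawWaker::new(std::ptr::null(), &VTABLE);
    unsafe fn clone(_: *const ()) -> RawWaker { RAW }
    unsafe fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    unsafe { Waker::from_raw(RAW) }
}

fn main() {
    let mut both = Box::pin(Join {
        a: Box::pin(async { "the file" }),
        b: Box::pin(async { "the printer" }),
        a_out: None,
        b_out: None,
    });
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    assert_eq!(
        both.as_mut().poll(&mut cx),
        Poll::Ready(("the file", "the printer"))
    );
}
```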


═════════════════════════════════════════════════════════════════════════════════

1

The quote above was not verbatim. The following quote, however, is presented word-for-word:

A Rust async fn is an explicit state machine that you can manipulate and pass around, that happens to be phrased using normal Rust syntax instead of tables and match statements. It generates a hidden type implementing the Future trait. The code that calls an async fn (or uses any Future, for that matter) has ultimate control over that Future, and can decide when it runs or doesn’t run, and can even discard it before it completes. — Cliff Biffle

With all due respect to Mr Biffle, whom I hold in the highest esteem… this is not accessible as an introduction to async/await. It is, at best, an important clarification for people who already have an inkling but aren't 100% certain.

2

“Busy-waiting” is a computer-science term that, aptly enough, describes the process of waiting for something to finish without the possibility of managing to do anything else in the mean-time.