Understanding Kotlin Coroutines with this mental model

For the majority of developers, the concept of a coroutine is pretty unfamiliar, although the first ideas date back to the early sixties. This is because most of us learned software development with programming languages like Java, in which the concept of a coroutine simply does not exist. What we did learn, however, is how basic concepts like classes, functions, and interfaces work. Over time, we developed a fundamental understanding of these essential, yet abstract concepts.

Every Android developer will sooner or later get in touch with the concept of “coroutines” for the Kotlin programming language. When developers go through online resources to learn about coroutines, what they most often get are oversimplifications like “coroutines are like light-weight threads” or code snippets that implement some “Hello World” behavior. In the real world, where requirements are often more complex, this insufficient understanding of coroutines keeps them from developing efficient and correct concurrent software.

This blog post will help you form a solid mental model of this emerging concept in modern software development. It defines five important building blocks on which your understanding will be based. This fundamental understanding will help you develop your concurrent applications successfully.

This blog post is most helpful for developers who already have some experience with coroutines but still aren’t able to completely wrap their heads around them.

ℹ️ You can also check out my new open source project, in which I developed 16 sample implementations of the most common use cases of Kotlin Coroutines for Android Development. I think you will find it helpful.

What exactly is a "mental model"? 🧠

Mental models are how we understand the world.

Mental models are how we simplify complexity, […]

A mental model is simply a representation of how something works.

Source: Mental Models by Farnam Street

Most of us (me included) don’t even understand basic things in life to their full extent. We know, for instance, that gravity pulls things towards the earth, but don’t really know the exact physical reasons for this. We don’t have to know all the details. We just need a sufficient understanding of gravity in order to survive or make practical decisions, like not jumping from a diving platform into an empty pool.

The same is true for programming. We don’t need to know much about electricity, how exactly the CPU works, or every detail of what compilers do. We can write software without detailed knowledge because we rely on abstractions and high-level programming languages that simplify complexity. However, it is still useful to have a mental representation of how these underlying components work.

To successfully develop applications with coroutines, it is not necessary to know all the source code of the coroutines library or the exact workings of the compiler. However, a basic mental model of the coroutines machinery is essential to understand how code in coroutines is executed in contrast to regular code.

Routines and Coroutines

Let’s start our journey with an analysis of the term itself:

It consists of CO and ROUTINE. Every developer is familiar with ordinary routines. They are also called subroutines or procedures, but in Java and Kotlin they are known as functions or methods. Routines are the basic building blocks of every codebase.

A routine, according to Wikipedia, is a

sequence of program instructions that performs a specific task, packaged as a unit. This unit can then be used in programs wherever that particular task should be performed.

The following simple code snippet shows the control flow of regular routines:

fun main() {
    println("main starts")
    routine(1, 500)
    routine(2, 300)
    println("main ends")
}

fun routine(number: Int, delay: Long) {
    println("Routine $number starts work")
    Thread.sleep(delay)
    println("Routine $number has finished")
}

// ======================================================
// print output:

// main starts
// Routine 1 starts work
// Routine 1 has finished
// Routine 2 starts work
// Routine 2 has finished
// main ends
// ======================================================

An important and very intuitive characteristic of a regular routine is that when it is invoked, its body is executed completely, from top to bottom, and once the routine’s task is done, the control flow returns to the call site. You can see this in the print output, where routine 1 finishes its work before routine 2 starts its own work. For simplicity, Thread.sleep() is used here to represent either some kind of work or the time needed to wait for a response from an external system. The following diagram illustrates the control flow:

Now that we know the characteristics of a regular routine, let’s have a closer look at coroutines. Co stands for cooperative. What follows is a similar example that uses coroutines instead of regular routines:

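import kotlinx.coroutines.async
import kotlinx.coroutines.delay
import kotlinx.coroutines.joinAll
import kotlinx.coroutines.runBlocking

fun main() = runBlocking {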
    println("main starts")
    joinAll(
        async { coroutine(1, 500) },
        async { coroutine(2, 300) }
    )
    println("main ends")
}

suspend fun coroutine(number: Int, delay: Long) {
    println("Coroutine $number starts work")
    delay(delay)
    println("Coroutine $number has finished")
}

// ======================================================
// print output:

// main starts
// Coroutine 1 starts work
// Coroutine 2 starts work
// Coroutine 2 has finished
// Coroutine 1 has finished
// main ends
// ======================================================

Here, we use the suspend function delay() to represent some kind of work. As you can see in the print output, the two coroutines are not executed completely before returning the control flow, as regular routines are. Instead, they are only executed partially, then get suspended and hand the control flow over to other coroutines in the middle of their execution. That’s why they are called cooperative routines: they can pass execution back and forth between each other.

You can verify this characteristic by investigating the print output: Coroutine 2 was started even before Coroutine 1 finished! The following diagram now illustrates the control flow of coroutines:

Coroutines can be suspended at every “suspension point”. You might wonder what exactly a “suspension point” is. It is basically every point where a coroutine calls a suspend function. In Android Studio, each call to a suspend function is marked by the following symbol in the gutter on the left:

In our example coroutine, the suspend function delay() is a suspension point and therefore Android Studio shows this special icon in the gutter:

🧱 Mental Model Building Block Number 1

With coroutines, it is possible to define blocks of code that are executed only partially before returning the control flow back to the call site. This works because coroutines can be suspended at suspension points and then resumed at a later point in time.
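
To make this building block tangible, here is a minimal sketch that uses only the low-level kotlin.coroutines API from the standard library, without kotlinx.coroutines. The names Pauser, pause() and runDemo() are made up for this illustration. The sketch starts a coroutine, lets it run until a suspension point, continues with the caller’s code, and only then resumes the coroutine:

```kotlin
import kotlin.coroutines.Continuation
import kotlin.coroutines.EmptyCoroutineContext
import kotlin.coroutines.createCoroutine
import kotlin.coroutines.resume
import kotlin.coroutines.suspendCoroutine

// Hypothetical helper: remembers the continuation of a suspended coroutine.
class Pauser {
    var pending: Continuation<Unit>? = null

    // Suspension point: stores the continuation and suspends the coroutine.
    suspend fun pause(): Unit = suspendCoroutine { continuation -> pending = continuation }
}

fun runDemo(): List<String> {
    val events = mutableListOf<String>()
    val pauser = Pauser()

    val body: suspend () -> Unit = {
        events += "coroutine: first part"
        pauser.pause()
        events += "coroutine: second part"
    }

    // The completion continuation is invoked once the whole body has finished.
    val coroutine = body.createCoroutine(
        Continuation<Unit>(EmptyCoroutineContext) { events += "coroutine: completed" }
    )

    coroutine.resume(Unit)            // runs the body up to pause()
    events += "caller: has control again"
    pauser.pending?.resume(Unit)      // continues the body after pause()
    events += "caller: done"
    return events
}

fun main() {
    runDemo().forEach(::println)
    // coroutine: first part
    // caller: has control again
    // coroutine: second part
    // coroutine: completed
    // caller: done
}
```

In everyday code you would not store continuations by hand like this. Libraries such as kotlinx.coroutines do it for you, but the principle of suspending and resuming is the same.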

Coroutines and Threads 🧵

Well, you might say that you can achieve the same behavior as in the last example with routines by starting a new thread every time you call the routine. This is true, as the following code example illustrates:

import kotlin.concurrent.thread

fun main() {
    println("main starts")
    threadRoutine(1, 500)
    threadRoutine(2, 300)
    Thread.sleep(1000)
    println("main ends")
}

fun threadRoutine(number: Int, delay: Long) {
    thread {
        println("Routine $number starts work")
        Thread.sleep(delay)
        println("Routine $number has finished")
    }
}

// ======================================================
// Print output:

// main starts
// Routine 1 starts work
// Routine 2 starts work
// Routine 2 has finished
// Routine 1 has finished
// main ends
// ======================================================

As you can see, we get the same print output as in the coroutine example. However, when using coroutines, you don’t need to create threads to get concurrent behavior! All code in the coroutine example is executed on the main thread. Each thread consumes a considerable amount of memory (1-2 MB per thread), and switching between threads is quite expensive.

🧱 Mental Model Building Block Number 2

Coroutines allow you to achieve concurrent behavior without switching threads, which results in more efficient code. Therefore, coroutines are often called “lightweight threads”.
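
As a rough check of this building block, the following sketch starts 10,000 coroutines inside runBlocking and records the threads they run on. The function name countThreadsUsed() is made up for this illustration. It compares Thread objects instead of thread names, because the coroutines debug mode can append the coroutine name to the thread name:

```kotlin
import kotlinx.coroutines.delay
import kotlinx.coroutines.joinAll
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical helper: starts many concurrent coroutines and counts the threads they ran on.
fun countThreadsUsed(coroutineCount: Int): Int = runBlocking {
    // Only touched from the runBlocking thread, so a plain set is sufficient here.
    val threadsUsed = mutableSetOf<Thread>()
    val jobs = List(coroutineCount) {
        launch {
            delay(100) // all coroutines wait concurrently
            threadsUsed += Thread.currentThread()
        }
    }
    jobs.joinAll()
    threadsUsed.size
}

fun main() {
    println("Threads used: ${countThreadsUsed(10_000)}") // Threads used: 1
}
```

Trying the same with 10,000 real threads would be far more expensive, which is the point of this building block.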

A Coroutine can perform different operations on different threads

For some reason, I don’t really like the definition “coroutines are light-weight threads”. It just does not feel complete. With coroutines, it is possible to achieve concurrency without creating expensive threads, yes. But sometimes, you actually want to run some code on other threads. For instance, you don’t want to perform long-running operations on the main thread, as they will freeze the UI and make your app unresponsive. With coroutines, we can define blocks of code that can each be executed on a different thread. Therefore, I see coroutines as abstractions on top of threads. The following code example shows this feature in action by using the withContext() construct to switch to another dispatcher:

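import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.async
import kotlinx.coroutines.delay
import kotlinx.coroutines.joinAll
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext

fun main() = runBlocking {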
    println("main starts")
    joinAll(
        async { threadSwitchingCoroutine(1, 500) },
        async { threadSwitchingCoroutine(2, 300) }
    )
    println("main ends")
}

suspend fun threadSwitchingCoroutine(number: Int, delay: Long) {
    println("Coroutine $number starts work on ${Thread.currentThread().name}")
    delay(delay)
    withContext(Dispatchers.IO) {
        println("Coroutine $number has finished on ${Thread.currentThread().name}")
    }
}

// ======================================================
// print output:

// main starts
// Coroutine 1 starts work on main
// Coroutine 2 starts work on main
// Coroutine 2 has finished on DefaultDispatcher-worker-1
// Coroutine 1 has finished on DefaultDispatcher-worker-1
// main ends
// ======================================================

As you can see, the execution of each coroutine switched to another thread in the middle of its execution. The first string was printed on the main thread, whereas the second one was printed on DefaultDispatcher-worker-1.

🧱 Mental Model Building Block Number 3

Coroutines can be seen as abstractions on top of threads. Different code blocks within the same coroutine can be executed in different threads.
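
The switch also works in the other direction: once a withContext() block has finished, the coroutine continues in its original context. The following sketch illustrates this. The function name ranOnCallerThread() is made up for this illustration, and the expected result assumes that it is called from an ordinary thread like main and not from a dispatcher worker thread:

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext

// Hypothetical helper: reports for each step whether it ran on the thread that called runBlocking.
fun ranOnCallerThread(): List<Boolean> = runBlocking {
    val callerThread = Thread.currentThread()

    val before = Thread.currentThread() === callerThread
    val inside = withContext(Dispatchers.Default) {
        // This block runs on a worker thread of the default dispatcher.
        Thread.currentThread() === callerThread
    }
    // Back in the context of runBlocking, which runs on the caller thread.
    val after = Thread.currentThread() === callerThread

    listOf(before, inside, after)
}

fun main() {
    println(ranOnCallerThread()) // [true, false, true] when called from the main thread
}
```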

No magic involved 🧙‍♀️

It’s difficult to understand how it is possible that coroutines allow us to write concurrent code on a single thread. You might think that JetBrains uses some kind of black JVM magic to make them work. But in fact, everything becomes clear as soon as you understand what the Kotlin compiler is doing when coroutine-based code is compiled to byte code. It uses a technique called Continuation Passing Style. The following suspend function:

suspend fun coroutine(number: Int, delay: Long) {
    println("Coroutine $number starts work")
    delay(delay)
    println("Coroutine $number has finished")
}

is transformed into something like this (as of Kotlin 1.3; the generated code may look different in future versions):

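// Simplified pseudocode of the compiler output; it is not meant to compile.
fun coroutine(number: Int?, delay: Long?, continuation: Continuation<Any?>): Unit {
    when (continuation.label) {
        0 -> {
            // First part: runs until the suspension point
            println("Coroutine $number starts work")
            delay(delay)
        }
        1 -> {
            // Second part: runs after the coroutine has been resumed
            println("Coroutine $number has finished")
            continuation.resume(Unit)
        }
    }
}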

I simplified the code quite a lot so that you can see what’s going on without getting lost in details.

What happened? The suspend modifier got removed, so in the byte code, our function is just a regular one. The compiler also added an extra parameter of type Continuation to the function signature. The body now contains a when expression. Depending on the label property of the continuation object, either the first string is printed and delay() is called, or the second string is printed and continuation.resume(Unit) is invoked to communicate back to the caller of this function that it has completed. This mechanism allows suspend functions to be executed only partially.

The continuation, on the one hand, acts as a callback in the sense that it notifies the caller when the function has completed by invoking continuation.resume(Unit). If our suspend function returned a value, like a string, the following code would be called instead: continuation.resume(someString). On the other hand, the continuation also acts as a state machine, because some state of the execution (like the label) is stored in it.

🧱 Mental Model Building Block Number 4

The compiler transforms suspend functions into regular functions that receive an additional continuation. Depending on the state of the continuation, different code of the suspend function is executed. That’s how a coroutine can be split up into separately running code blocks.
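
To see the idea of such a state machine in runnable form, here is a hand-written sketch that uses no coroutines at all. It is not what the compiler actually generates. The class State, the schedule parameter and the function runStateMachines() are made up for this illustration, and a simple queue stands in for the waiting that delay() would do:

```kotlin
// Hypothetical stand-in for the continuation: it only remembers where to continue.
class State(val number: Int) {
    var label = 0
}

// Hand-written version of a function with one suspension point.
// Each call executes one block and returns; `schedule` plays the role of delay().
fun coroutine(state: State, log: MutableList<String>, schedule: (() -> Unit) -> Unit) {
    when (state.label) {
        0 -> {
            log += "Coroutine ${state.number} starts work"
            state.label = 1                              // remember where to continue
            schedule { coroutine(state, log, schedule) } // "suspend": run the rest later
        }
        1 -> {
            log += "Coroutine ${state.number} has finished"
        }
    }
}

fun runStateMachines(): List<String> {
    val log = mutableListOf<String>()
    val queue = ArrayDeque<() -> Unit>() // pending resumptions, first in, first out
    val schedule: (() -> Unit) -> Unit = { task -> queue.addLast(task) }

    coroutine(State(1), log, schedule)   // runs only the first block
    coroutine(State(2), log, schedule)
    while (queue.isNotEmpty()) {
        queue.removeFirst().invoke()     // resumes each function at its stored label
    }
    return log
}

fun main() {
    runStateMachines().forEach(::println)
    // Coroutine 1 starts work
    // Coroutine 2 starts work
    // Coroutine 1 has finished
    // Coroutine 2 has finished
}
```

Because the queue does not model time, coroutine 1 finishes before coroutine 2 here, unlike in the delay()-based examples. The important part is that both functions start before either of them finishes, all on a single thread.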

Coroutines are non-blocking

Now that you have a basic understanding of how the Kotlin compiler transforms suspend functions, you should also be able to understand why coroutines are non-blocking. This can be best explained by taking a look at the differences between Thread.sleep() and delay().

Thread.sleep() blocks its host thread. This means that no other work can happen on this specific thread.

In contrast, if you call delay(), the underlying thread is not blocked, so it can execute other code until the specified time for the delay has passed. How is this possible? I wasn’t able to understand this until I dug into the actual source code of the coroutines library to find out how delay() works. I found out that the actual delaying is delegated to the dispatcher. So I checked out Dispatchers.Main of the library org.jetbrains.kotlinx:kotlinx-coroutines-android, which contains the implementation of the Android main dispatcher. What follows is a simplified version of the delay() function of this dispatcher:

override fun delay(timeMillis: Long, continuation: CancellableContinuation<Unit>) {
    val block = Runnable {
        with(continuation) { resumeUndispatched(Unit) }
    }
    handler.postDelayed(block, timeMillis.coerceAtMost(MAX_DELAY))
    ...
}

So all that delay() does is create a runnable and pass it to handler.postDelayed(). After the specified amount of time, the runnable calls resumeUndispatched(Unit) to tell the coroutines machinery to continue the execution of the coroutine. handler is just an ordinary Android handler with Looper.getMainLooper() as its looper. Again, no magic at all 🧙‍♀️.

🧱 Mental Model Building Block Number 5

delay() does not block its host thread because, depending on the dispatcher, it uses mechanisms like handler.postDelayed(). It basically just schedules all subsequent operations of the coroutine to a later point in time.
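
The same idea can be sketched outside of Android. In the following example, a ScheduledExecutorService takes the role of the handler. The functions scheduledDelay() and runTwoDelays() are made up for this illustration and are not part of the coroutines library. Unlike the real delay(), this sketch does not support cancellation:

```kotlin
import java.util.concurrent.Executors
import java.util.concurrent.ScheduledExecutorService
import java.util.concurrent.TimeUnit
import kotlin.coroutines.resume
import kotlin.coroutines.suspendCoroutine
import kotlinx.coroutines.joinAll
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical delay(): suspends the coroutine and schedules its resumption.
// No thread is blocked while the coroutine waits.
suspend fun scheduledDelay(scheduler: ScheduledExecutorService, timeMillis: Long): Unit =
    suspendCoroutine { continuation ->
        scheduler.schedule(Runnable { continuation.resume(Unit) }, timeMillis, TimeUnit.MILLISECONDS)
    }

fun runTwoDelays(): List<String> {
    val scheduler = Executors.newSingleThreadScheduledExecutor()
    try {
        return runBlocking {
            val log = mutableListOf<String>()
            val jobs = listOf(500L, 300L).mapIndexed { index, millis ->
                launch {
                    log += "Coroutine ${index + 1} starts work"
                    scheduledDelay(scheduler, millis)
                    // Resumed via the runBlocking dispatcher, so we are back on its thread.
                    log += "Coroutine ${index + 1} has finished"
                }
            }
            jobs.joinAll()
            log.toList()
        }
    } finally {
        scheduler.shutdown() // otherwise the scheduler thread keeps the JVM alive
    }
}

fun main() {
    runTwoDelays().forEach(::println)
}
```

With delays of 500 and 300 milliseconds, both coroutines start before either one finishes, and coroutine 2 finishes first, just like in the delay() example from the beginning.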

Summary

Let’s repeat the five mental model building blocks that we worked out:

🧱 With coroutines, it is possible to define blocks of code that are executed only partially before returning the control flow back to the call site. This works because coroutines can be suspended at suspension points and then resumed at a later point in time.

🧱 Coroutines allow you to achieve concurrent behavior without switching threads, which results in more efficient code. Therefore, coroutines are often called “lightweight threads”.

🧱 Coroutines can be seen as abstractions on top of threads. Different code blocks within the same coroutine can be executed in different threads.

🧱 The compiler transforms suspend functions into regular functions that receive an additional continuation. Depending on the state of the continuation, different code of the suspend function is executed. That’s how a coroutine can be split up into separately running code blocks.

🧱 delay() does not block its host thread because, depending on the dispatcher, it uses mechanisms like handler.postDelayed(). It basically just schedules all subsequent operations of the coroutine to a later point in time.

These building blocks have helped me a lot to finally wrap my head around coroutines and write better concurrent code.
