Dan's Musings

The Down Sides of Go's Goroutines

I recently wrote a three-part series entitled "Which Programming Language Should I choose?". After a few more experiences, I have had second thoughts about some of the conclusions I drew there.

In general, I painted Go as the be-all and end-all. However, I have recently found drawbacks in the very design decision that I originally thought made Go great: its implicit event loop and its abstraction over the difference between cooperative and OS-level threads.

There are two key drawbacks to this otherwise interesting and useful decision. First, Go can't have exceptions. Second, Go does not have the ability to synchronize tasks in real (wall clock) time. Both of these drawbacks stem from Go's emphasis on coroutines.

With this new information in mind, my view of the programming language landscape shifted. All of a sudden, older languages like Java that lacked this feature didn't seem like they were just worse; they simply chose different trade-offs. They regained credit in my eyes.

Broken Callstacks

Go is famous for its cheap, easy-to-spin-up goroutines. Each goroutine manages its own independent callstack, giving programmers easy, ergonomic access to cheap callstacks and, in some sense, the ability to manipulate them.

In a variation of Greenspun's Tenth Rule, we find that when the question "which language had this feature first?" is asked, the answer is very often "Lisp".

In this case, we can look at Scheme's continuations, presented to the world in 1975 and predating Go by about 30 years.

Scheme allowed the programmer to clone the callstack, pass it around, and return from it as a primitive. It called this primitive a "continuation". Using this primitive, it is trivial to construct "transparent" coroutines and event loops in something that closely approximates that "clean" goroutine feeling. This feature of continuations is one treasured by the Scheme community in much the same way as Go's community reveres their colorless functions.

Therefore, we may look at the history of Scheme as a language to see if there are any problems with this rather powerful feature.

It turns out there is a well-documented trade-off: If the programmer clones the callstack, hands the callstack off to a coroutine, and then both callstacks have e.g. a reference to an open file handle in one of their stack frames, it means the programmer cannot safely unwind the stack. (If they did, the unwinding would close the file handle and the other coroutine might then return to that stack frame, only to find the file handle already closed.) Thus, granting programmers the ability to pass callstacks around precludes the ability to reason about resources and their lifetimes.

Sidestepping these issues, goroutines don't have a shared call history with the functions that spawned them; each goroutine has its own stack. Since goroutine stacks are thus made disparate -- goroutines do not "share" common "ancestor" stack frames like Scheme's continuations do -- they can unwind their own stacks. However, this also means that when a goroutine is spawned, it has no memory of its parent, nor the parent for the child. This has already been noticed by other thinkers as a bad thing.

Because of it, goroutines can't guarantee lifetimes of the resources to which they have access, in a similar way that Scheme can't guarantee them.

Consider the following scenario:

  1. Goroutine a opens a file handle, then spawns an anonymous func() as goroutine b.
  2. As a consequence of b being a closure, it inherits a reference to the open file handle and uses it.
  3. The a goroutine returns and closes the file handle.
  4. Goroutine b, still running, is left holding a reference to a closed file handle.
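The first scenario can be sketched in Go roughly as follows. This is a minimal illustration, not code from any real project: the function name useAfterClose, the temp-file pattern, and the release channel (used only to force b to run after a has returned) are all assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// useAfterClose plays out the scenario: "goroutine a" (the inner function)
// opens a file, hands it to goroutine b through a closure, and closes the
// file when it returns. The error that b sees on its late write is returned.
func useAfterClose() error {
	release := make(chan struct{}) // hypothetical gate: holds b back until a is done
	result := make(chan error, 1)

	func() { // stands in for goroutine a
		f, err := os.CreateTemp("", "demo-*.txt")
		if err != nil {
			result <- err
			return
		}
		defer os.Remove(f.Name())
		defer f.Close() // a cleans up when it returns

		go func() { // goroutine b captures f from the enclosing scope
			<-release
			_, werr := f.WriteString("hello")
			result <- werr
		}()
	}()

	close(release) // a has returned and closed the file; now let b run
	return <-result
}

func main() {
	err := useAfterClose()
	// Reports whether b's write failed because the file was already closed.
	fmt.Println(errors.Is(err, os.ErrClosed))
}
```

Nothing in the language ties the lifetime of f to the lifetime of b; the write simply fails at run time.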

Consider something of a converse scenario:

  1. Goroutine a spawns a goroutine b, without using an anonymous function this time. No closure, just a simple function spawn.
  2. Goroutine a opens a database connection.
  3. Goroutine b panics, crashing the program.
  4. The database connection is then left open as a zombie TCP connection.

This problem of a broken parent-child link is not exactly the same problem as Scheme's, but it creates that same unmoored feeling. All of a sudden, a goroutine can't make guarantees about the lifetime of its objects or opened resources.

Further, goroutines' lack of stack ancestry means they can't natively give out the nice stacktraces found in other languages, though there are a lot of workarounds, to be sure.

Because of this stack disconnection of parents and children, it really becomes impossible to have exceptions. Not just won't, as Rob Pike would have you believe. Can't. Rather than an explicit up-front design choice, getting rid of exceptions now appears to me to be a price the Go designers paid to have the power of cheap goroutines. Rather than an innovation in efficiency, goroutines are an innovative trade-off.

Let us compare the database example with Java. There, a database connection is typically opened and closed in the same function. If the handle is shared with another thread, an uncaught exception in that thread unwinds only that thread's stack and does not bring down the program, so the function that opened the handle still gets to close it. This is different from Go's defer. A deferred call can recover from a panic, but only from a panic raised in its own goroutine. By contrast, the creator of a thread in Java can choose to set an uncaught-exception handler for it. In Go, only deferred calls made by the offending goroutine can recover from a panic, while in Java, the parent that creates the new thread can itself set up a recovery mechanism.
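To make the Go side of this comparison concrete, here is a small sketch of the usual workaround. The names runChild, fakeConn, and parent are hypothetical, and fakeConn only stands in for a real database connection. The point it illustrates is that the recover has to be placed inside the child goroutine; a deferred recover in the parent would never see the child's panic.

```go
package main

import "fmt"

// fakeConn is a hypothetical stand-in for a database connection.
type fakeConn struct{ open bool }

func (c *fakeConn) Close() { c.open = false }

// runChild starts fn in a new goroutine and turns a panic in that goroutine
// into an error for the caller. The recover lives in the child itself.
func runChild(fn func()) error {
	done := make(chan error, 1)
	go func() {
		defer func() {
			if r := recover(); r != nil {
				done <- fmt.Errorf("child panicked: %v", r)
				return
			}
			done <- nil
		}()
		fn()
	}()
	return <-done
}

// parent opens a connection, runs a panicking child, and still gets to
// clean up, but only because the child was wrapped by runChild.
func parent() (*fakeConn, error) {
	conn := &fakeConn{open: true}
	defer conn.Close()
	err := runChild(func() { panic("boom") })
	return conn, err
}

func main() {
	conn, err := parent()
	fmt.Println(err)       // child panicked: boom
	fmt.Println(conn.open) // false
}
```

If the child were started with a bare go statement instead, the panic would terminate the whole process and the parent's deferred Close would never run.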

As an aside, my Dad works on IBM DB2 mainframe databases. This contrived example of a dropped database connection is real. He says the mainframers at his company hate UnixODBC for this reason: it tends to leave zombie connections behind. (Go uses UnixODBC, which my Dad thought was sad, because it means he couldn't use the language against the database in a practical manner at work.) The one the admins love, though, is Java's JDBC. Huh, Java is a language that has exceptions, with full child-to-parent stack unwinding. It also has visibility into OS-level threads, thus allowing the programmer to handle exceptions between different threads. Go figure.

Go's ability to make these cheap goroutines effortlessly is its greatest strength, but it is also its greatest weakness.

Gone Colorblind

Go is famous for being the first language to have colorless functions, with its implicit event loop and its refusal to acknowledge the difference between an OS thread and an event loop task. We will quickly see that this decision comes with pain.

Consider the problem of a task scheduler cancel button. The program must schedule a task to be run on an agent. At any given time, however, the task must be able to be cancelled.

This can be safely managed in Java and friends, but not in Go.

In Go, goroutines are on the event loop, and may (will) be cooperatively scheduled with other goroutines. Thus, goroutines can only be stopped at specific points. The Go community is painfully aware of this.
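A rough sketch of what "stopped at specific points" looks like in practice, using the standard context package. The names runTask and cancelMidTask are made up for this example, and having the second step call cancel itself is just a deterministic way to simulate the user pressing the cancel button while a step is running.

```go
package main

import (
	"context"
	"fmt"
)

// runTask runs the steps in order in its own goroutine and checks for
// cancellation only between steps. It returns how many steps completed.
func runTask(ctx context.Context, steps []func()) int {
	done := make(chan int, 1)
	go func() {
		completed := 0
		for _, step := range steps {
			select {
			case <-ctx.Done(): // the only place this task can be stopped
				done <- completed
				return
			default:
			}
			step() // once started, nothing outside can interrupt this call
			completed++
		}
		done <- completed
	}()
	return <-done
}

// cancelMidTask cancels during the second of four steps.
func cancelMidTask() int {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	noop := func() {}
	steps := []func(){noop, func() { cancel() }, noop, noop}
	return runTask(ctx, steps)
}

func main() {
	fmt.Println(cancelMidTask()) // 2
}
```

The step that was running when the cancel arrived still finishes; the task only stops at the next checkpoint. If a step never returns, or the author of the task never wrote a checkpoint, the cancel button does nothing.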

This highlights a key difference between cooperative scheduling and OS-level scheduling: OS-level threads can be stopped at any time, while cooperatively scheduled coroutines cannot. Glossing over this difference means that we lose the power that would otherwise be available to us with OS-level threads when it comes to scheduling.

Just as with the first consequence, the inability to have exceptions, the golang community blusters, saying this is actually a good thing, while publishing a long list of workarounds.

In Java, it is child's play to interrupt a running thread. Interrupts are implemented using -- surprise, surprise! -- exceptions. Lovely, stack-unwinding, cleans-up-all-the-open-handles exceptions. Threads can be stopped immediately.

The problem with Go's famous colorless functions is that the "color", or the fact that coroutines and CPS and OS threads are being used, is still there under the covers. It's just hidden from the programmer. This means the programmer doesn't have to deal with the complexity, but it also prevents the programmer from telling them apart when it matters. Go is not so much colorless as it is colorblind.

Other Languages Are Still OK

OS threads provide some very nice constructs for programmers, and are hardened, battle-tested tools. Sometimes cooperative scheduling is warranted, but for many, many tasks a programmer can get along without learning about or using it. In those languages where the difference is made explicit, cooperative scheduling is, yes, more painful. In these situations, we can take comfort that our pain at least buys us other nice things, like scheduling primitives and the guarantees of clean-up and lifetimes that come with exceptions.