Go Language Goroutines

From NovaOrdis Knowledge Base

Internal

Overview

Goroutines are fundamental components of the concurrent programming model in Go. Goroutines and how they fit into Go concurrency are explained briefly in Go Concurrency, and in more detail in this article.

Unlike programming languages that chose to reify O/S threads and expose them as part of the concurrency programming model, Go uses a different fundamental concurrency primitive, the goroutine. A goroutine can be thought of as a function that executes concurrently with many other functions in the program. Threads are not exposed in the language.

A goroutine is started by prefixing a function invocation with the go keyword:

func somefunc() {
  ...
}

go somefunc()

Once invoked this way, the function is treated as an independent unit of work that runs concurrently with other functions. During program execution, the Go runtime transparently maps goroutines onto O/S threads and executes them, possibly in parallel.

Sharing data between concurrent goroutines must be approached with care, as in any other concurrent programming language. Go provides memory access and execution synchronization primitives as part of the sync package, but their use is recommended only in very specific cases, such as guarding the internal state of a struct in a small lexical scope. The Go language designers advise programmers to share data by communicating instead of communicating by sharing data; the latter is what the sync package helps with. A much more robust, scalable and composable communication mechanism between goroutines is provided by channels. Channels are an implementation of the communicating sequential processes (CSP) model, and the entire Go concurrency model has been built around it.
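The "share data by communicating" advice can be illustrated with a minimal sketch. The sumSquares helper is hypothetical and not part of any library: each goroutine sends its result over a channel, and only the receiving goroutine accumulates the total, so no lock is needed.

```go
package main

import "fmt"

// sumSquares is a hypothetical helper. It starts one goroutine per input
// value; each goroutine sends its result over a channel instead of writing
// to a shared variable.
func sumSquares(values []int) int {
	results := make(chan int)

	for _, v := range values {
		go func(n int) {
			results <- n * n // communicate the result; no shared state
		}(v)
	}

	// Only this goroutine touches total, so no locking is required.
	total := 0
	for range values {
		total += <-results
	}
	return total
}

func main() {
	fmt.Println(sumSquares([]int{1, 2, 3, 4})) // prints 30
}
```

The results may arrive in any order, which does not matter here because addition is commutative.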

This approach allows programmers to directly map concurrent problems onto natural concurrent constructs, the goroutines, instead of dealing with the minutiae of starting and managing threads.

Goroutines are very inexpensive to create and manage, which is not the case with the operating system threads exposed by other languages. Thousands or tens of thousands of goroutines can be created in the same address space with no adverse side effects. Internally, a goroutine is a structure managed by the Go runtime, with a very small initial memory footprint (a few kilobytes). The runtime grows or shrinks the memory allocated for the goroutine's stack automatically. The CPU overhead averages about three instructions per function call.

Many goroutines can execute within a single O/S thread; from the O/S point of view, only that thread is scheduled. Goroutine scheduling is done by the Go runtime scheduler, which uses logical processors. The goroutines scheduled on a single logical processor execute concurrently, not in parallel. However, it is possible to have more than one logical processor (since Go 1.5, the default is one per available CPU). Each logical processor can be mapped onto an O/S thread, and those threads may be scheduled on different cores. In this case, goroutines execute in parallel.
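To make the cost and scheduling claims more concrete, the following sketch queries the runtime for the number of logical processors and briefly keeps a large number of goroutines alive. The parkGoroutines helper is hypothetical; runtime.GOMAXPROCS, runtime.NumCPU and runtime.NumGoroutine are standard library functions.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// parkGoroutines is a hypothetical helper. It starts n goroutines that block
// on a channel, measures how many goroutines were added, then releases them
// and waits for them to finish.
func parkGoroutines(n int) int {
	before := runtime.NumGoroutine()
	release := make(chan struct{})
	var wg sync.WaitGroup

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-release // block until the channel is closed
		}()
	}

	during := runtime.NumGoroutine()
	close(release) // wake all blocked goroutines at once
	wg.Wait()
	return during - before
}

func main() {
	// GOMAXPROCS(0) reports the current setting without changing it.
	fmt.Println("logical processors:", runtime.GOMAXPROCS(0))
	fmt.Println("CPUs visible:", runtime.NumCPU())
	fmt.Println("goroutines alive at once:", parkGoroutines(10000))
}
```

The printed values depend on the machine, so no expected output is given.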




A goroutine is always created automatically, to run the main() function.

Sharing mutable variables between goroutines is discouraged. If it has to be done, mutual exclusion mechanisms like sync.Mutex are available. Communication between goroutines should preferably be done with channels. This pattern comes from a paradigm called "communicating sequential processes" (CSP). CSP is a message-passing model that works by sending data between goroutines instead of locking data for mutual exclusion or synchronized access ("Do not communicate by sharing memory; instead, share memory by communicating").
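For the case where a mutable variable does have to be shared, a minimal sketch with sync.Mutex might look like the following. The Counter type and the countConcurrently helper are hypothetical names introduced only for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// Counter is a hypothetical type whose internal state is guarded by a mutex.
type Counter struct {
	mu sync.Mutex
	n  int
}

// Inc increments the counter while holding the lock.
func (c *Counter) Inc() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n++
}

// Value reads the counter while holding the lock.
func (c *Counter) Value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

// countConcurrently increments a single Counter from several goroutines.
func countConcurrently(goroutines, perGoroutine int) int {
	var c Counter
	var wg sync.WaitGroup
	for g := 0; g < goroutines; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perGoroutine; i++ {
				c.Inc()
			}
		}()
	}
	wg.Wait()
	return c.Value()
}

func main() {
	fmt.Println(countConcurrently(8, 1000)) // prints 8000
}
```

Without the mutex, the increments would race and the final value would be unpredictable.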

The go statement gives access to concurrency. When a function is executed as a goroutine, it is treated as an independent unit of work that gets scheduled and then executed on a logical processor.

Goroutines free us from having to think about our problem space in terms of parallelism and instead allow us to model problems closer to their natural level of concurrency: functions. The benefit of this more natural mapping between problem space and Go code is that a larger part of the problem space is likely to be modeled in a concurrent manner. Because the problems we work on as developers are naturally concurrent more often than not, we will naturally write concurrent code at a finer level of granularity than we perhaps would in other languages.

Goroutines are lightweight, and we normally won't have to worry about creating one. There are appropriate times to consider how many goroutines are running in your system, but doing so upfront is a premature optimization.

Creation and Invocation

To explicitly create a goroutine and schedule it, use the go keyword followed by a function invocation.

func somefunc(i int) {
  ...
}

...

go somefunc(10)

The go statement returns immediately and execution continues with the next line, while the invoked function executes concurrently in a new goroutine, which the runtime may or may not place on a different thread.

Any function or method invocation can be run as a goroutine:

go fmt.Printf("something")

Anonymous functions (lambdas) can also be executed as goroutines:

...
go func(s string) {
  ...
  fmt.Println(s)
  ...
}("test")

Note that this syntax only schedules a goroutine. When the goroutine will actually be executed is not determined.

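What happens with the result of the function? If the function has return values, they are discarded when the function completes. To get a result back from a goroutine, it has to be communicated explicitly, typically over a channel.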
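One common way to get a value back from a goroutine is to send it over a channel. The sketch below assumes two hypothetical functions, square and squareAsync, which are not part of any library.

```go
package main

import "fmt"

// square is a hypothetical function with a return value.
func square(n int) int { return n * n }

// squareAsync runs square in a goroutine and delivers the result on a channel.
func squareAsync(n int) <-chan int {
	out := make(chan int, 1) // buffered so the goroutine never blocks on send
	go func() {
		out <- square(n)
	}()
	return out
}

func main() {
	go square(3) // legal, but the return value is discarded

	result := squareAsync(7)
	fmt.Println(<-result) // prints 49
}
```

The receive on the channel also acts as a synchronization point: main() blocks until the goroutine has produced its value.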

Goroutines and main()

If the invocation is done from main(), the scheduler usually appears to continue executing main(); it does not necessarily preempt main() to execute the new goroutine, and the language makes no guarantee either way. Also, when main() exits, all other goroutines are forcibly terminated, so unless there is a mechanism that ensures they get to run, they might not be executed at all. It is bad practice to use time.Sleep() to delay main(): this makes assumptions about timing that can be wrong, and it also assumes that the scheduler will run the other goroutine while the main goroutine sleeps. Use an explicit synchronization mechanism, such as sync.WaitGroup or a channel, instead.
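A sketch of one way to make main() wait for its goroutines without relying on time.Sleep() is shown here, using sync.WaitGroup from the standard library. The runWorkers helper and its output strings are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// runWorkers is a hypothetical helper. It starts n goroutines and blocks
// until every one of them has finished.
func runWorkers(n int) []string {
	var wg sync.WaitGroup
	out := make([]string, n)

	for i := 0; i < n; i++ {
		wg.Add(1) // register the goroutine before starting it
		go func(id int) {
			defer wg.Done() // signal completion, even on early return
			out[id] = fmt.Sprintf("worker %d done", id)
		}(i)
	}

	wg.Wait() // block until all registered goroutines have called Done
	return out
}

func main() {
	for _, line := range runWorkers(3) {
		fmt.Println(line)
	}
}
```

Each goroutine writes to its own slice element, so the writes do not conflict, and wg.Wait() guarantees that they are visible to main() before the slice is read.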

Also see:

main()

Exiting

A goroutine exits when its function returns.

When the main goroutine is complete, all other goroutines are forced to exit. It is said that those goroutines exit early.

Pausing

time.Sleep()

Atomic Primitives

atomic package
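As a brief illustration of the sync/atomic package, the sketch below increments a shared counter from several goroutines without a mutex. The atomicCount helper is a hypothetical name; atomic.AddInt64 and atomic.LoadInt64 are standard library functions.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// atomicCount is a hypothetical helper. It increments one int64 from several
// goroutines using atomic operations instead of a lock.
func atomicCount(goroutines, perGoroutine int) int64 {
	var total int64
	var wg sync.WaitGroup

	for g := 0; g < goroutines; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perGoroutine; i++ {
				atomic.AddInt64(&total, 1) // indivisible read-modify-write
			}
		}()
	}

	wg.Wait()
	return atomic.LoadInt64(&total)
}

func main() {
	fmt.Println(atomicCount(4, 2500)) // prints 10000
}
```

Atomic operations suit simple counters and flags; for anything involving several related fields, a mutex or a channel is usually easier to reason about.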

Communication

Communication between goroutines is essential for concurrent programming. Data can be sent to goroutines by simply providing it as arguments to the functions that are invoked with the go keyword. This only works at the moment the goroutine is started, though. Another way to send data to goroutines is via channels. Goroutines can use channels to receive and send data to other goroutines.
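The two ways of passing data can be combined, as in the following sketch. The worker and scale functions are hypothetical: factor is passed as an argument when the goroutine starts, while the values to process are sent over a channel afterwards.

```go
package main

import "fmt"

// worker is a hypothetical goroutine body. It multiplies every value received
// on in by factor and sends the product on out.
func worker(factor int, in <-chan int, out chan<- int) {
	for v := range in {
		out <- v * factor
	}
	close(out) // no more results once the input is exhausted
}

// scale feeds values to a single worker and collects the results.
func scale(factor int, values []int) []int {
	in := make(chan int)
	out := make(chan int)

	go worker(factor, in, out) // factor is fixed at start time

	go func() {
		for _, v := range values {
			in <- v // data sent after the worker has been started
		}
		close(in)
	}()

	var scaled []int
	for v := range out {
		scaled = append(scaled, v)
	}
	return scaled
}

func main() {
	fmt.Println(scale(10, []int{1, 2, 3})) // prints [10 20 30]
}
```

Because there is a single worker and the channels are unbuffered, the results come back in the order the values were sent.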

Channels

Go Channels

Deadlock

The Go runtime detects a deadlock in which all goroutines are blocked, and terminates the program with "fatal error: all goroutines are asleep - deadlock!". It does not detect partial deadlocks, in which only a subset of goroutines is deadlocked.
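The undetected case can be approximated with a simplified sketch. Here a single goroutine blocks forever on a channel that nobody sends on, which stands in for a partial deadlock, while the rest of the program keeps running and the runtime reports nothing. The stillBlocked helper and the timeout value are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// stillBlocked is a hypothetical helper. It starts a goroutine that waits on
// a channel nobody ever sends on, and reports whether that goroutine is still
// blocked after the given timeout.
func stillBlocked(timeout time.Duration) bool {
	never := make(chan struct{})
	done := make(chan struct{})

	go func() {
		<-never // blocks forever: there is no sender
		close(done)
	}()

	select {
	case <-done:
		return false
	case <-time.After(timeout):
		// The goroutine is stuck, but the program as a whole is not, so the
		// runtime does not raise a fatal error.
		return true
	}
}

func main() {
	fmt.Println(stillBlocked(50 * time.Millisecond)) // prints true
}
```

The blocked goroutine is leaked until the program exits, which is why such situations have to be prevented by design rather than left to the runtime.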

Also see:

Concurrent (Parallel) Programming | Deadlock