Go Language Goroutines

From NovaOrdis Knowledge Base

Latest revision as of 03:19, 16 November 2024

Overview

Goroutines are fundamental components of the concurrent programming model in Go. Goroutines and how they fit into Go concurrency are briefly explained in Go Concurrency, and in more detail in this article.

Unlike other programming languages, which chose to reify O/S threads and expose them as part of the concurrency programming model, Go uses a different fundamental concurrency primitive, the goroutine. A goroutine can be thought of as a function that executes concurrently alongside many other functions in the program: a unit of work executed concurrently with other units of work. Threads are not exposed in the language.

A goroutine is declared by prepending a function invocation with the go keyword:

func somefunc() {
  ...
}

go somefunc()

Once declared as such, the function hosted by the goroutine is treated as an independent unit of work that runs concurrently with other functions. During the program execution, the function code is mapped transparently by the Go runtime onto O/S threads and executed, possibly in parallel.

Goroutines execute within the same address space they were created in. Sharing data between concurrent goroutines must be approached with care, as in any other concurrent programming language. Go provides memory access and execution synchronization primitives as part of the sync package, but sharing mutable variables between goroutines is discouraged, and the usage of the sync primitives is recommended only in very specific cases, such as guarding the internal state of a struct in a small lexical scope. Go language designers advise programmers to share data by communicating, instead of communicating by sharing data; the latter is the style the sync package supports. A much more robust, scalable and composable communication mechanism between goroutines is provided by channels. Channels are an implementation of the communicating sequential processes (CSP) model, and the entire Go concurrency model has been built around it.
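
As an illustration of sharing data by communicating, the following is a minimal sketch in which no variable is written by more than one goroutine. The function name sumSquares is hypothetical and chosen only for this example: each goroutine sends its result on a channel, and only the caller updates the running total.

```go
package main

import "fmt"

// sumSquares is a hypothetical example function. Each goroutine computes one
// square and sends it on a channel; only the caller touches the total.
func sumSquares(n int) int {
	results := make(chan int)

	for i := 1; i <= n; i++ {
		go func(v int) {
			results <- v * v // communicate the value instead of writing shared state
		}(i)
	}

	total := 0
	for i := 0; i < n; i++ {
		total += <-results // exactly n receives match the n sends
	}
	return total
}

func main() {
	fmt.Println(sumSquares(4)) // 1 + 4 + 9 + 16 = 30
}
```

No mutex is needed here because the channel is the only point of contact between the goroutines.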

This approach allows programmers to directly map concurrent problems onto natural concurrent constructs, the goroutines, instead of dealing with the minutiae of starting and managing threads. Goroutines free programmers from having to think about the problem space in terms of threads and parallelism and instead allow us to model problems closer to their natural level of concurrency: functions. The benefit of the more natural mapping between problem space and Go code is the likely increased amount of the problem space that is modeled in a concurrent manner. The problems we work on as developers are naturally concurrent more often than not, so we will naturally be writing concurrent code at a finer level of granularity than we perhaps would in other languages.

Goroutines are very inexpensive to create and manage, which is not the case with the operating system threads exposed by other languages. Thousands or tens of thousands of goroutines can be created in the same address space with no adverse side effects. There are appropriate times to consider how many goroutines are running in your system, but doing so upfront is a premature optimization. Internally, the goroutine is a structure managed by the Go runtime, with a very small initial memory footprint (a few kilobytes) for its stack, which the runtime grows or shrinks automatically. The CPU overhead averages about three instructions per function call.
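
The claim about creating thousands of goroutines can be tried with a short sketch like the one below. The helper name spawn and the count of 10,000 are arbitrary choices for illustration. Every goroutine parks on a channel until released, so they all exist at the moment runtime.NumGoroutine() is called.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// spawn is a hypothetical helper: it starts n goroutines that block until
// released, and reports how many goroutines existed while they were parked.
func spawn(n int) int {
	var wg sync.WaitGroup
	release := make(chan struct{})

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-release // park until the channel is closed
		}()
	}

	alive := runtime.NumGoroutine() // includes the calling goroutine
	close(release)                  // wake all parked goroutines at once
	wg.Wait()                       // wait for them to exit
	return alive
}

func main() {
	fmt.Println("goroutines alive at peak:", spawn(10000))
}
```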

Many goroutines can execute within a single O/S thread. From the O/S point of view, only the thread is scheduled; goroutine scheduling is done by the Go runtime scheduler. The Go runtime scheduler uses a logical processor. The goroutines scheduled on a single logical processor execute concurrently, not in parallel. However, it is possible to have more than one logical processor; each logical processor can be mapped onto an O/S thread, and those threads may be scheduled on different cores. In this case, goroutines execute in parallel.
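
The number of logical processors can be inspected from the runtime package. The sketch below is only illustrative and the helper name logicalProcessors is hypothetical. In current Go versions the value typically defaults to the number of CPUs available to the process, but the exact default depends on the Go version and the environment.

```go
package main

import (
	"fmt"
	"runtime"
)

// logicalProcessors is a hypothetical helper. Calling runtime.GOMAXPROCS with
// an argument below 1 only queries the current setting and does not change it.
func logicalProcessors() int {
	return runtime.GOMAXPROCS(0)
}

func main() {
	fmt.Println("CPUs visible to the process:", runtime.NumCPU())
	fmt.Println("logical processors:", logicalProcessors())
	fmt.Println("goroutines currently alive:", runtime.NumGoroutine())
}
```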

There is at least one goroutine, the main goroutine, that is always created for every program.

The Main Goroutine

The main goroutine is always automatically created by the runtime, to run the main() function.

When other goroutines are created directly from main(), the scheduler usually seems to continue executing main(); it does not preempt main() to execute the new goroutine. Also, because all other goroutines are forcibly terminated when main() exits, they might not be executed at all unless there is a mechanism that ensures main() waits for them. It is bad practice to use time.Sleep() to delay main(): doing so makes assumptions about timing that can be wrong, and it also assumes that the scheduler will run the other goroutine while the main goroutine sleeps.
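
One mechanism that avoids time.Sleep() is sync.WaitGroup, which lets the main goroutine block until the goroutines it started have finished. The following is a minimal sketch under that assumption; the helper name runWorkers is hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// runWorkers is a hypothetical helper: it starts n goroutines, blocks until
// all of them have finished, and returns how many ran to completion.
func runWorkers(n int) int {
	var wg sync.WaitGroup
	var mu sync.Mutex
	completed := 0

	for i := 0; i < n; i++ {
		wg.Add(1) // register the goroutine before starting it
		go func() {
			defer wg.Done() // signal completion when the function returns
			mu.Lock()
			completed++
			mu.Unlock()
		}()
	}

	wg.Wait() // block until every goroutine has called Done
	return completed
}

func main() {
	fmt.Println(runWorkers(5)) // prints 5
}
```

Because wg.Wait() returns only after every Done() call, no assumption about timing or scheduling order is needed.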

Also see:

main()

Start a Goroutine

Place the go keyword before a function invocation statement:

func somefunc(i int) {
  ...
}

...

go somefunc(10)

The go statement returns immediately, while the function it was invoked with will be executed concurrently. Note that this syntax only creates and schedules a goroutine. When the goroutine will actually be executed is not determined.

Any function invocation can be used with go:

go fmt.Printf("something")

Anonymous functions can also be executed as goroutines:

...
go func(s string) {
  ...
  fmt.Println(s)
  ...
}("test")

Closures can be executed as goroutines. For a discussion on how closed-over variables are handled in this case see:

Go Closures and Goroutines
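
The difference between a closed-over variable and an argument can be sketched as follows. The function name capture is hypothetical. An argument is evaluated when the go statement executes, while a closure reads the variable when the goroutine actually runs; the ready channel is used here only to make the goroutines run after the variable has been changed.

```go
package main

import (
	"fmt"
	"sync"
)

// capture is a hypothetical example function that reports what a closure and
// an argument-taking goroutine each observe for the same variable.
func capture() (fromClosure, fromArg string) {
	msg := "before"
	ready := make(chan struct{})
	var wg sync.WaitGroup
	wg.Add(2)

	go func() { // closure: shares the variable msg with the caller
		defer wg.Done()
		<-ready
		fromClosure = msg
	}()

	go func(s string) { // argument: msg is copied when the go statement runs
		defer wg.Done()
		<-ready
		fromArg = s
	}(msg)

	msg = "after" // change the variable after both goroutines were created
	close(ready)  // let both goroutines proceed
	wg.Wait()
	return fromClosure, fromArg
}

func main() {
	c, a := capture()
	fmt.Println("closure saw:", c)  // after
	fmt.Println("argument saw:", a) // before
}
```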

Exiting a Goroutine

A goroutine exits when the function it runs returns.

When the main goroutine is complete, all other goroutines are forced to exit. It is said that those goroutines exit early.

Never start a goroutine without knowing how it will stop. See https://dave.cheney.net/high-performance-go-workshop/sydney-2019.html#know_when_to_stop_a_goroutine.

Returns of Goroutines

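What happens with the return of the function when executed as a goroutine? According to the Go language specification, any return values of the function are discarded when the function completes. To obtain a result, the goroutine has to communicate it explicitly, for example over a channel.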
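
A minimal sketch of getting a result out of a goroutine is shown below. The names square and squareAsync are hypothetical; the wrapper runs the function in a goroutine and delivers its return value over a channel.

```go
package main

import "fmt"

// square is an ordinary function with a return value.
func square(x int) int { return x * x }

// squareAsync is a hypothetical wrapper that runs square in a goroutine and
// delivers the result on a channel.
func squareAsync(x int) <-chan int {
	out := make(chan int, 1) // buffered so the goroutine never blocks on send
	go func() {
		out <- square(x)
	}()
	return out
}

func main() {
	go square(3) // legal, but the returned value is discarded

	fmt.Println(<-squareAsync(3)) // prints 9
}
```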

Concurrent Programming Error Handling

Go Error Handling | Concurrent Programming Error Handling

Goroutines and the Go Runtime

Go Runtime | Goroutine Management

Pausing

time.Sleep()

Atomic Primitives

atomic package

Communication

Communication between goroutines is essential for concurrent programming. Data can be sent to goroutines by simply providing it as arguments to the functions that are invoked with the go keyword. This only works when the goroutine is started, though. Another way to send data to goroutines is via channels. Goroutines can use channels to receive and send data to other goroutines.
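
Both ways of passing data can be combined, as in the following sketch. The names shout and collect are hypothetical. The suffix is handed to the goroutine once, as an argument at start time, while the words arrive later over a channel and the results come back over another channel.

```go
package main

import (
	"fmt"
	"strings"
)

// shout is a hypothetical worker. The suffix is fixed at start time as an
// argument; the words to process arrive later over the in channel.
func shout(suffix string, in <-chan string, out chan<- string) {
	for w := range in {
		out <- strings.ToUpper(w) + suffix
	}
	close(out) // no more results once the input has been drained
}

// collect feeds words to a shout goroutine and gathers the results in order.
func collect(words []string) []string {
	in := make(chan string)
	out := make(chan string)
	go shout("!", in, out)

	go func() {
		for _, w := range words {
			in <- w
		}
		close(in) // tell the worker that no more words will come
	}()

	var res []string
	for s := range out {
		res = append(res, s)
	}
	return res
}

func main() {
	fmt.Println(collect([]string{"hello", "world"})) // [HELLO! WORLD!]
}
```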

Channels

Go Channels

Deadlock

The Go runtime detects a deadlock only when all goroutines are blocked, in which case the program is terminated with a fatal error. It does not detect partial deadlocks, where only a subset of goroutines are deadlocked.

Also see:

Concurrent (Parallel) Programming | Deadlock

Goroutine Patterns

Preventing Goroutines Leak

Goroutines are not garbage collected, and even if their memory footprint is small, we should strive to avoid leaving blocked goroutines that never exit behind.

A goroutine exits when:

  1. it completes its work
  2. it encounters an error and cannot continue its work
  3. it is told to stop

It is good programming practice to be able to programmatically cancel a goroutine when needed. If a goroutine is responsible for creating another goroutine, it is also responsible for ensuring it can stop it. The way to implement this in practice is to establish a signal between the parent goroutine and its children that allows the parent to signal cancellation to the children. A common pattern for programmatically stopping goroutines involves a read-only "done" channel. When the "done" channel is closed externally by the parent goroutine, the child goroutine exits:

Stopping Child Goroutine with a "done" Channel

An example of implementation of this pattern is available in:

Go Pipelines
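
The "done" channel pattern described above can be sketched as follows. This is an illustrative version, not the code from the linked pages; the names counter, firstN and stopped are hypothetical. The child selects between doing its work and observing the closed "done" channel, and the extra stopped channel lets the parent confirm that the child has really exited.

```go
package main

import "fmt"

// counter is a hypothetical child goroutine that emits 0, 1, 2, ... until
// the read-only done channel is closed by its parent.
func counter(done <-chan struct{}) (<-chan int, <-chan struct{}) {
	out := make(chan int)
	stopped := make(chan struct{})
	go func() {
		defer close(stopped) // runs last: signals that the goroutine exited
		defer close(out)
		for i := 0; ; i++ {
			select {
			case out <- i: // normal work
			case <-done: // the parent asked us to stop
				return
			}
		}
	}()
	return out, stopped
}

// firstN reads n values from the child, then cancels it and waits for it.
func firstN(n int) []int {
	done := make(chan struct{})
	out, stopped := counter(done)

	var got []int
	for i := 0; i < n; i++ {
		got = append(got, <-out)
	}

	close(done) // signal cancellation to the child
	<-stopped   // the child has exited, so nothing is leaked
	return got
}

func main() {
	fmt.Println(firstN(3)) // [0 1 2]
}
```

Without the done case in the select, the child would block forever on the send once the parent stops receiving, which is exactly the leak this pattern prevents.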

Healing Unhealthy Goroutines

Healing Unhealthy Goroutines

Goroutine and Testing

Calling Testify assert and require from Goroutines