Go Channels

From NovaOrdis Knowledge Base

External

Internal

TODO

Deplete Go_Channels_TODEPLETE into this.

Overview

Go channels provide a composable, concurrent-safe and type-safe way to communicate between concurrent processes. A channel serves as a conduit for a stream of information: values may be written on the channel and then read out downstream. For this reason, it's probably not a bad idea to end the channel variable names with the word "Stream". When using a channel, you pass a value into a chan variable and then somewhere else in your program read it off the channel. The program part that reads it does not require knowledge of the part that wrote it, only the channel variable.

Channels are typed: the data that goes through a channel has a specific type.

Because channels are composable with other channels, writing large systems becomes simpler. You can coordinate the input from multiple subsystems by composing their outputs together. You can combine input channels with timeouts, cancellations, or messages to other subsystems. The select statement is the complement to Go's channels: it is what makes the difficult parts of composing channels manageable. The select statement allows you to wait for events, select a message from competing channels in a uniform random way, continue on if there are no messages waiting, and more. Channels orchestrate, mutexes serialize.
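As a minimal sketch of the select capabilities described above, the following program combines a channel receive with a timeout and with a non-blocking default case. The function names receiveWithTimeout and tryReceive are hypothetical helpers introduced for illustration only.

```go
package main

import (
	"fmt"
	"time"
)

// receiveWithTimeout waits for a value on stream and gives up after d.
// Hypothetical helper, not part of any library.
func receiveWithTimeout(stream <-chan string, d time.Duration) (string, bool) {
	select {
	case v := <-stream:
		return v, true
	case <-time.After(d): // fires once d has elapsed
		return "", false
	}
}

// tryReceive never blocks: the default case runs when no message is waiting.
// Hypothetical helper, not part of any library.
func tryReceive(stream <-chan string) (string, bool) {
	select {
	case v := <-stream:
		return v, true
	default:
		return "", false
	}
}

func main() {
	resultStream := make(chan string, 1)

	_, ok := tryReceive(resultStream)
	fmt.Println("message waiting:", ok) // nothing has been sent yet

	resultStream <- "hello"
	v, ok := receiveWithTimeout(resultStream, 100*time.Millisecond)
	fmt.Println(v, ok)

	_, ok = receiveWithTimeout(resultStream, 10*time.Millisecond)
	fmt.Println("received before timeout:", ok)
}
```

The same select shape extends to cancellation by adding a case that receives from a "done" channel.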

Also see:

Go Concurrency Programming Models

R chan W

R <- chan <- W

The data flows between the channel and the variable in the direction the arrow points. See Sending on Channels and Receiving from Channels below.

Declaration and Instantiation

The channel variables are declared with the usual var keyword declaration syntax, where the channel type can be a Go built-in type or a user-defined type:

var c chan <channel_type>

The declaration shown above assigns a zero value bidirectional channel to the variable. However, because zero value channels cannot be used for anything useful, the actual channel instances must be created with make(). The short variable declaration syntax can also be used, but only inside functions.

var c chan int // creates a zero value channel
c = make(chan int) // assigns a valid channel instance to the channel variable
c2 := make(chan int) // short variable declaration

A channel variable holds a reference to the channel instance created by make(). Assigning a channel variable to another variable, or passing it to a function, copies the reference and not the channel, so several channel variables may refer to the same channel instance. Unidirectional channels can also be declared. For a discussion of unidirectional and bidirectional channels, see Bidirectional and Unidirectional Channels below.

nil Channels (Zero Value for Channel Type)

Declaring a channel variable without initializing it yields the zero value for the channel type, also known as a nil channel, which is represented with the predeclared identifier nil. A nil channel cannot be used for sending or receiving data.

Attempts to send on or receive from a nil channel block forever, irrespective of the presence of readers or writers.

An attempt to close a nil channel produces a panic.
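The nil channel behavior can be observed without hanging the program. The sketch below is an illustration, not a recommended pattern: it uses select with a default case to show that neither a send nor a receive on a nil channel is ever ready, and recover() to capture the panic raised by close(). The function name nilChannelBehavior is hypothetical.

```go
package main

import "fmt"

// nilChannelBehavior probes a nil channel without blocking forever.
// Hypothetical helper written for this illustration.
func nilChannelBehavior() (sent, received, closePanicked bool) {
	var c chan int // zero value: a nil channel

	select {
	case c <- 1:
		sent = true
	default: // a send on a nil channel is never ready
	}

	select {
	case <-c:
		received = true
	default: // a receive from a nil channel is never ready
	}

	func() {
		defer func() {
			if r := recover(); r != nil {
				closePanicked = true
			}
		}()
		close(c) // panics: close of nil channel
	}()

	return sent, received, closePanicked
}

func main() {
	sent, received, closePanicked := nilChannelBehavior()
	fmt.Println(sent, received, closePanicked) // false false true
}
```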

make()

Channel instances are created with the built-in function make():

c := make(chan string)

Invoking make() with only the channel type (the chan keyword and the payload type) makes an unbuffered channel: its capacity to hold objects in transit is 0. To make a buffered channel, specify an integer capacity as the second argument:

c := make(chan string, 3)

Unbuffered and Buffered Channels

Unbuffered Channels

An unbuffered channel cannot hold data in transit. This is the default mode for creating the channel instances. The sending operation blocks on an unbuffered channel until some other goroutine reads the data from the channel on the receiving end. For the same reason, the receiving operation blocks until some data is sent on the sending end.

Because of the blocking behavior, an unbuffered channel provides not only communication between goroutines, but also execution synchronization. The unbuffered channel can even be used as a synchronization mechanism only, with the data passing through the channel simply thrown away. The language syntax supports that by allowing receiving from a channel without storing the result in any variable, which means the result is discarded:

<- c

This syntax has synchronization "wait" semantics. Also see WaitGroup.Wait().
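A minimal sketch of this synchronization-only usage follows. The function name runAndWait and the done channel are assumptions made for the example. The element type struct{} is chosen because the value carries no information.

```go
package main

import "fmt"

// runAndWait starts a goroutine and blocks until it signals completion
// on an unbuffered channel. The received value is discarded.
// Hypothetical helper written for this illustration.
func runAndWait() []string {
	var events []string
	done := make(chan struct{}) // unbuffered: send and receive rendezvous

	go func() {
		events = append(events, "worker finished")
		done <- struct{}{} // blocks until the receiver is ready
	}()

	<-done // wait for the worker, discard the value
	events = append(events, "caller resumed")
	return events
}

func main() {
	for _, e := range runAndWait() {
		fmt.Println(e)
	}
}
```

The append in the goroutine happens before the send, and the send completes together with the receive, so the caller always observes "worker finished" first.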

Buffered Channels

A buffered channel can be configured to contain a limited number of objects in transit. The number of buffered elements is known as the channel's capacity. The capacity is specified as an argument of the make() function when the channel is initialized:

c := make(chan string, 3)

An unbuffered channel can be defined as a buffered channel with the capacity of 0. This is a valid declaration:

c := make(chan string, 0)

The sending operation on a buffered channel with capacity n succeeds up to n times even if nobody receives data from the channel, and only blocks if the buffer is full. The receiving operation blocks only if the buffer is empty. The main reason for buffering is to allow sender and receiver to operate at different speeds, at least from time to time. For unbuffered channels, the sender and the receiver work in lockstep, which reduces the concurrency of the code. For a buffered channel, the buffer can temporarily absorb some of the differences in speed between the producer and the consumer. It cannot do that forever, because its capacity is finite.
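The capacity rule can be checked with a short sketch. The helper fillUntilFull is a hypothetical name: it keeps sending without any receiver present and uses select with a default case to detect the moment a send would block.

```go
package main

import "fmt"

// fillUntilFull sends on c until a send would block and returns the
// number of sends that succeeded. Hypothetical helper for illustration.
func fillUntilFull(c chan<- int) int {
	n := 0
	for {
		select {
		case c <- n:
			n++
		default: // the buffer is full (or the channel is unbuffered)
			return n
		}
	}
}

func main() {
	buffered := make(chan int, 3)
	fmt.Println(fillUntilFull(buffered), len(buffered), cap(buffered)) // 3 3 3

	unbuffered := make(chan int)
	fmt.Println(fillUntilFull(unbuffered)) // 0: no receiver is waiting
}
```

The built-in functions len() and cap() report the number of queued elements and the capacity of the channel.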

The differentiation between a buffered and unbuffered channel is made when the channel is instantiated, and not when the channel is declared. This means that the goroutine that instantiates the channel controls whether it is buffered. This suggests that the creation of a channel should probably be tightly coupled to the goroutines that will be performing writes on it, so that we can reason about its behavior and performance more easily.

Bidirectional and Unidirectional Channels

When a channel is declared with:

var s chan <channel_type>

it is declared by default as a bidirectional channel, which means that data can be both read from it and written on it. A unidirectional channel is a channel that can be either read from or written on, but not both.

To declare a read-only channel, use the <- operator at the left of the chan keyword.

var roChan <- chan <channel_type>

To declare a write-only channel, use the <- operator at the right of the chan keyword.

var woChan chan <- <channel_type>

An attempt to read from a write-only channel or write on a read-only channel is rejected by the compiler:

invalid operation: cannot send to receive-only channel roC (variable of type <-chan int)

Converting Bidirectional Channels to Unidirectional Channels

Channels that are declared bidirectional can be converted to unidirectional channels when the logic of the program requires it. The compiler enforces the conversion by rejecting unsupported operations on the converted channels. Unidirectional channels are rarely instantiated as such, but they are commonly used as function parameter and return types.

var c chan int
c = make(chan int, 1) // buffered, so the send below does not block when run in a single goroutine

var roC <- chan int
var woC chan <- int
	
// these assignments are legal, the compiler will ensure that roC will only be used for reading
// and woC will only be used for writing
roC = c
woC = c
	
woC <- 10
i := <- roC
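The following sketch shows unidirectional channel types used as function parameters. The function names produce and sum are hypothetical. A bidirectional channel created by the caller is implicitly converted at each call site.

```go
package main

import "fmt"

// produce may only send on out. It closes the channel when done.
// Hypothetical helper written for this illustration.
func produce(out chan<- int, n int) {
	for i := 0; i < n; i++ {
		out <- i
	}
	close(out)
}

// sum may only receive from in. It reads until the channel is closed.
// Hypothetical helper written for this illustration.
func sum(in <-chan int) int {
	total := 0
	for v := range in {
		total += v
	}
	return total
}

func main() {
	c := make(chan int) // bidirectional
	go produce(c, 5)    // converted to chan<- int
	fmt.Println(sum(c)) // converted to <-chan int; prints 10
}
```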

Sending on Channels

To send data on a bidirectional or write-only channel, use the left arrow operator <- at the right of the channel variable. Note that "sending on a channel" and "writing on a channel" are used interchangeably.

c <- 10

Sending on a channel is blocking: if no goroutines are attempting to receive data from the unbuffered channel, or the buffered channel is full, the sending operation blocks until a goroutine actually receives data from the channel. Note that "full" or "empty" are a function of the channel's capacity: an unbuffered channel is always full and always empty, so operations on it block unless other goroutines send or receive data.

Sending on a closed channel causes the program to panic.

Summary of write operations on a channel:

Channel State | Result
nil | Block
Open and not full | Write value
Open and full | Block
Closed | panic
Read-only | Compilation error

Receiving from Channels

To receive data from a bidirectional or read-only channel, use the left arrow operator <- at the left of the channel variable. "Receiving from a channel" and "reading from a channel" are equivalent and they are used interchangeably.

i := <- c

Similarly to sending on a channel, receiving from a channel is blocking: if no goroutines are attempting to send data to the unbuffered channel, or the buffered channel is empty, the receiving operation blocks until data becomes available in the channel.

The receive operation can return two values: the first is the value read from the channel, and the second is a boolean that says whether the value read from the channel was generated by a write somewhere else (true), or is the zero value for the channel's type, generated because the channel is closed and has no more buffered values (false).

i, isChannelOpen := <- c
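A small sketch of the two-value receive follows. The type received and the function receiveN are hypothetical helpers. Note that a value sent before the channel was closed is still delivered with true as the second result.

```go
package main

import "fmt"

// received pairs a value with the second result of the receive operation.
// Hypothetical type written for this illustration.
type received struct {
	value int
	ok    bool
}

// receiveN performs n receives on c and records both results of each.
func receiveN(c <-chan int, n int) []received {
	results := make([]received, 0, n)
	for i := 0; i < n; i++ {
		v, ok := <-c
		results = append(results, received{value: v, ok: ok})
	}
	return results
}

func main() {
	c := make(chan int, 1)
	c <- 7
	close(c)

	for _, r := range receiveN(c, 3) {
		fmt.Println(r.value, r.ok)
	}
	// Prints:
	// 7 true
	// 0 false
	// 0 false
}
```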

If a buffered channel is empty and it has a blocked receiver, the buffer will be bypassed and the value will be passed directly from the sender to receiver when a sender sends data.

Summary of read operation results on a channel:

Channel State | Result
nil | Block
Open and not empty | Value
Open and empty | Block
Closed | (zero value for channel type, false)
Write-only | Compilation error

Receiving Concurrently from One Channel

If more than one goroutine receives from the same channel, only one of them gets each value sent on the channel. From this perspective, the behavior of a channel is similar to that of a JMS queue. However, if the channel is closed, all goroutines get the close notification: their receive operations return false as the second result value.
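The queue-like delivery can be sketched as follows, assuming a hypothetical function distribute that starts several receivers on one channel, sends n values, closes the channel, and then counts how many values were delivered in total and how many receivers observed the close.

```go
package main

import "fmt"

// distribute sends n values to a channel shared by several receivers.
// It returns the total number of values delivered and the number of
// receivers that saw the channel close. Hypothetical helper.
func distribute(workers, n int) (delivered, exited int) {
	work := make(chan int)
	counts := make(chan int, workers)

	for w := 0; w < workers; w++ {
		go func() {
			got := 0
			for {
				_, ok := <-work
				if !ok { // every receiver gets the close notification
					counts <- got
					return
				}
				got++ // this value went to this receiver only
			}
		}()
	}

	for i := 0; i < n; i++ {
		work <- i
	}
	close(work)

	for w := 0; w < workers; w++ {
		delivered += <-counts
		exited++
	}
	return delivered, exited
}

func main() {
	delivered, exited := distribute(3, 10)
	fmt.Println(delivered, exited) // 10 3
}
```

How the values are split among the receivers is not deterministic. Only the total is.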

Iterative Read from a Channel (Ranging over a Channel)

The range keyword can be used in conjunction with the for statement to loop over values arriving on a channel. The knowledge of whether the value comes from an open or closed channel, exposed as the second result of the channel read operation, enables the runtime to know when to get out of the loop, so the loop does not need an explicit exit criterion:

var c chan int
...
for i := range c {
  ...
}
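A complete, runnable version of the loop might look like the sketch below. The function name collect is hypothetical. The loop ends only because the sending goroutine closes the channel. Without the close, the range would block forever once the values run out.

```go
package main

import "fmt"

// collect ranges over c until it is closed and returns the values seen.
// Hypothetical helper written for this illustration.
func collect(c <-chan int) []int {
	var values []int
	for v := range c {
		values = append(values, v)
	}
	return values
}

func main() {
	c := make(chan int)
	go func() {
		defer close(c) // ends the range loop in collect
		for i := 1; i <= 3; i++ {
			c <- i
		}
	}()
	fmt.Println(collect(c)) // [1 2 3]
}
```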

Closed Channels

Closing a channel is used to communicate that no more values will be written on the channel:

var c chan int
...
close(c)

An attempt to send on a closed channel causes the program to panic. A read-only channel cannot be closed: an attempt to close it produces a compilation error.

An attempt to receive from a closed channel returns the zero value for the channel's type and sets the second return value to false (not received from an open channel). Reads can be performed indefinitely on a closed channel. This is to allow support for multiple downstream reads from a single upstream writer.

Closing a channel is a way to signal multiple goroutines simultaneously, similar in effect to the Broadcast() method of sync.Cond. If N goroutines are receiving on a single channel, instead of writing N "sentinel" values to signal each of the goroutines to exit, the channel can simply be closed.
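The broadcast effect of close can be sketched as below. The function name broadcastStop and the done channel are assumptions made for the example. sync.WaitGroup and sync.Mutex are used only to count the goroutines that were released.

```go
package main

import (
	"fmt"
	"sync"
)

// broadcastStop starts n goroutines that block on a shared channel and
// releases all of them with a single close. It returns how many goroutines
// were released. Hypothetical helper written for this illustration.
func broadcastStop(n int) int {
	done := make(chan struct{})
	var wg sync.WaitGroup
	var mu sync.Mutex
	stopped := 0

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-done // blocks until the channel is closed
			mu.Lock()
			stopped++
			mu.Unlock()
		}()
	}

	close(done) // one operation unblocks every waiting receiver
	wg.Wait()
	return stopped
}

func main() {
	fmt.Println(broadcastStop(5)) // 5
}
```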

Summary of close operations on a channel:

Channel State | Result
nil | panic
Open and not empty | Closes the channel; reads succeed until the channel is drained, then reads produce (zero value, false)
Open and empty | Closes the channel; reads produce (zero value, false)
Closed | panic
Read-only | Compilation error

The select Statement

Channel Patterns

Channel Ownership

The channel owner is responsible for closing the channel.
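One common way to express ownership, shown here as a sketch under the assumption that the owner is the goroutine that writes to the channel, is a function that creates the channel, starts the writer, closes the channel when the writer is done, and exposes only a read-only view to callers. The function name squares is hypothetical.

```go
package main

import "fmt"

// squares owns resultStream: it creates it, writes to it and closes it.
// Callers receive a read-only channel, so they can neither send on it
// nor close it. Hypothetical helper written for this illustration.
func squares(n int) <-chan int {
	resultStream := make(chan int)
	go func() {
		defer close(resultStream) // the owner closes the channel
		for i := 0; i < n; i++ {
			resultStream <- i * i
		}
	}()
	return resultStream
}

func main() {
	for v := range squares(4) {
		fmt.Println(v)
	}
	// Prints 0, 1, 4 and 9 on separate lines.
}
```

In this sketch the consumer must drain the channel, otherwise the writer goroutine stays blocked on its send.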

Pipelines

Transferring the Ownership of Data

If you have a bit of code that produces a result and wants to share that result with another bit of code, what you are really doing is transferring the ownership of that data. Data has an owner, and one way to make concurrent programs safe is to ensure only one concurrent context has ownership of data at a time. Channels are the recommended way to implement this pattern in Go, as they help us communicate this concept by encoding the intent into the channel's type. You can create buffered channels to implement a cheap in-memory queue and thus decouple the producer from the consumer. This pattern also makes your code composable with other concurrent code.

Coordinate Multiple Pieces of Logic

Channels are inherently composable and preferred when communicating between different parts of your object graph. Having locks scattered throughout the object graph is far worse. Having channels everywhere is expected and encouraged. Channels can be easily composed, which cannot be said of locks or methods that return values. It is much easier to control the emergent complexity that arises in your project if you use channels.

Timeout

Cancellation