Go Channels

From NovaOrdis Knowledge Base
Revision as of 18:48, 14 August 2024 by Ovidiu (talk | contribs) (→‎make())

External

Internal

Overview

Go channels provide a composable, concurrent-safe and type-safe way to communicate between concurrent processes. A channel serves as a conduit for a stream of information: values may be written on the channel and then read out downstream. For this reason, it's probably not a bad idea to end the channel variable names with the word "Stream". When using a channel, you pass a value or a pointer into a chan variable and then somewhere else in your program read it off the channel. The program part that reads it does not require knowledge of the part that wrote it, only the channel variable.

When values are passed between two parts of the program, the sender and the receiver do not need to synchronize access to the data. The ownership of the data changes from sender to receiver instead, and no two concurrent threads touch the same piece of data at the same time. Only one goroutine modifies the data at any time. Also see Transferring the Ownership of Data below.

When pointers to the data are exchanged, instead of data itself, each goroutine still needs to be synchronized if reads and writes are performed against the shared piece of data.

Channels are typed: the data that goes through a channel has a specific type.

Because channels are composable with other channels, writing large systems becomes simpler. You can coordinate the input from multiple subsystems by composing their outputs together. You can combine input channels with timeouts, cancellations, or messages to other subsystems. The select statement is the complement to Go's channels. It is what enables all the difficult parts of composing channels. The select statement allows you to wait for events, select a message from competing channels in a uniform random way, continue on if there are no messages waiting, and more. Channels orchestrate, mutexes serialize. Channels are the glue that binds the goroutines together.

Also see:

Go Concurrency Programming Models

R chan W

R <- chan <- W

The data flows between the channel and the variable in the direction the arrow points. See Sending on Channels and Receiving from Channels below.

Declaration and Instantiation

The channel variables are declared with the usual var keyword declaration syntax, where the channel type can be a Go built-in type or a user-defined type:

var c chan <channel_type>

The declaration shown above assigns a zero value bidirectional channel to the variable. However, because zero value channels cannot be used for anything useful, the actual channel instances must be created with make(). The short variable declaration syntax can also be used, but only inside functions.

var c chan int // creates a zero value channel
c = make(chan int) // assigns a valid channel instance to the channel variable
c2 := make(chan int) // short variable declaration

A channel variable holds a reference to the channel instance created by make(): assigning it to another variable or passing it to a function does not copy the channel, so several channel variables may refer to the same channel instance. Unidirectional channels can also be declared. For a discussion of unidirectional and bidirectional channels, see Bidirectional and Unidirectional Channels below.

nil Channels (Zero Value for Channel Type)

A channel variable declaration creates a zero value channel, also known as a nil channel. A nil channel cannot be used for sending or receiving data. Attempts to send or receive from such channel instances block irrespective of presence of readers or writers. An attempt to close the channel produces a panic.
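
The nil channel behavior described above can be probed without blocking the program. What follows is a minimal sketch, not a definitive test: the helper names nilChannelIsReady and closeNilPanics are illustrative and are not part of the original page. A select with a default clause is used so that the receive attempt does not block forever, and recover() is used only to observe the panic.

```go
package main

import "fmt"

// nilChannelIsReady reports whether a receive on a nil channel can proceed.
// The default clause keeps the select from blocking forever.
func nilChannelIsReady() bool {
	var c chan int // zero value: a nil channel
	select {
	case <-c:
		return true
	default:
		return false
	}
}

// closeNilPanics reports whether closing a nil channel panics.
// recover() is used here for demonstration purposes only.
func closeNilPanics() (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	var c chan int
	close(c)
	return false
}

func main() {
	fmt.Println("nil channel ready:", nilChannelIsReady()) // nil channel ready: false
	fmt.Println("close(nil) panics:", closeNilPanics())    // close(nil) panics: true
}
```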

Initialization with make()

Channel instances are created with the built-in function make():

c := make(chan string)

Invoking make() with only the channel type (the chan keyword and the payload type) makes an unbuffered channel: its capacity to hold objects in transit is 0. To make a buffered channel, specify an integer capacity as the second argument:

c := make(chan string, 3)

Slice of Channels

var cSlice []chan int

No parentheses are necessary.

Unbuffered and Buffered Channels

Unbuffered Channels

An unbuffered channel cannot hold data in transit. This is the default mode for creating the channel instances. An unbuffered channel is synchronous: both sides of the channel will block until the other side is ready. The sending operation blocks on an unbuffered channel until some other goroutine reads the data from the channel on the receiving end. Symmetrically, the receiving operation blocks until some data is sent on the sending end.

Because of the blocking behavior, an unbuffered channel provides communication between threads, but also execution synchronization: the unbuffered channel can be used as a synchronization mechanism only, with the data passing on the channel simply thrown away. The language syntax supports that by allowing receiving from a channel without storing the result in any variable, which means the result will simply be discarded:

<- c

The receive-and-discard syntax has synchronization "wait" semantics. Also see WaitGroup.Wait().

If data is exchanged, the unbuffered channel provides the guarantee that the exchange between two goroutines is performed at the instant the send and the receive take place.
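
As an illustration of using an unbuffered channel purely for synchronization, here is a minimal sketch. The function name runAndWait is hypothetical and is not part of the original page. The worker sends a value that carries no information, and the caller discards it on receive.

```go
package main

import "fmt"

// runAndWait starts a worker goroutine and blocks until the worker signals
// completion on an unbuffered channel. The received value is discarded.
func runAndWait(work func()) {
	done := make(chan struct{})
	go func() {
		work()
		done <- struct{}{} // blocks until the receiver is ready
	}()
	<-done // wait; the value itself is thrown away
}

func main() {
	result := 0
	runAndWait(func() { result = 42 })
	// The send happens before the receive completes, so the write is visible.
	fmt.Println(result) // 42
}
```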

Buffered Channels

A buffered channel can be configured to contain a limited number of values in transit. The number of buffered elements is known as the channel's capacity. The capacity is specified as an argument of the make() function when the channel is initialized:

c := make(chan string, 3)

An unbuffered channel can be defined as a buffered channel with the capacity of 0. This is a valid declaration:

c := make(chan string, 0)

This type of channel does not force goroutines to be ready at the same instant to perform the exchange. The sending operation on a buffered channel of capacity n succeeds up to n times even if nobody receives data from the channel, and only blocks if the buffer is full. The receiving operation blocks only if the buffer is empty. The main reason for buffering is to allow sender and receiver to operate at different speeds, at least from time to time. For unbuffered channels, the sender and the receiver will work in lockstep, which reduces the concurrency of the code. For a buffered channel, the buffer can temporarily absorb some of the differences in speed between the producer and the consumer. It cannot do that forever, for its capacity is finite.

The differentiation between a buffered and unbuffered channel is made when the channel is instantiated, and not when the channel is declared. This means that the goroutine that instantiates the channel controls whether it is buffered. This suggests that the creation of a channel should probably be tightly coupled to goroutines that will be performing writes on it, so that we can reason about its behavior and performance more easily.
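
The relationship between capacity and blocking can be sketched as follows. The helper fillWithoutReceiver is an illustrative name, not part of the original page. It uses a select with a default clause to detect the point where a send would block, instead of actually blocking.

```go
package main

import "fmt"

// fillWithoutReceiver sends on a channel of the given capacity, with no
// receiver present, until a send would block. It returns the number of
// sends that succeeded.
func fillWithoutReceiver(capacity int) int {
	c := make(chan int, capacity)
	sent := 0
	for {
		select {
		case c <- sent:
			sent++
		default:
			// The buffer is full or, for capacity 0, no receiver is waiting.
			return sent
		}
	}
}

func main() {
	fmt.Println(fillWithoutReceiver(3)) // 3
	fmt.Println(fillWithoutReceiver(0)) // 0
}
```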

Bidirectional and Unidirectional Channels

When a channel is declared with:

var s chan <channel_type>

it is declared by default as a bidirectional channel, which means that data can be both read from it and written on it. A unidirectional channel is a channel that can be either read from or written on, but not both.

To declare a read-only channel, use the <- operator at the left of the chan keyword.

var roChan <- chan <channel_type>

To declare a write-only channel, use the <- operator at the right of the chan keyword.

var woChan chan <- <channel_type>

An attempt to read from a write-only channel or write on a read-only channel is flagged by the compiler:

invalid operation: cannot send to receive-only channel roC (variable of type <-chan int)

Converting Bidirectional Channels to Unidirectional Channels

Channels that are declared bidirectional can be converted to unidirectional channels when the logic of the program requires it. The compiler will enforce the conversion by preventing unsupported operations on the channels thus converted. Unidirectional channels are rarely instantiated as such, but they are commonly used as function parameter and return types.

var c chan int
c = make(chan int, 1) // buffered, so the send below does not block waiting for a receiver

var roC <- chan int
var woC chan <- int
	
// these assignments are legal, the compiler will ensure that roC will only be used for reading
// and woC will only be used for writing
roC = c
woC = c
	
woC <- 10
i := <- roC
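
Since unidirectional channels mostly show up as function parameter and return types, a minimal sketch of that usage may help. The function names produce and sum are illustrative, not part of the original page. The bidirectional channel is converted implicitly at each call site.

```go
package main

import "fmt"

// produce can only send on out; the compiler rejects receives from it.
// Closing a send-only channel is allowed.
func produce(out chan<- int, n int) {
	for i := 0; i < n; i++ {
		out <- i
	}
	close(out)
}

// sum can only receive from in; the compiler rejects sends on it.
func sum(in <-chan int) int {
	total := 0
	for v := range in {
		total += v
	}
	return total
}

func main() {
	c := make(chan int) // bidirectional
	go produce(c, 5)    // implicitly converted to chan<- int
	fmt.Println(sum(c)) // implicitly converted to <-chan int; prints 10
}
```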

Sending on Channels

To send data on a bidirectional or write-only channel, use the left arrow operator <- at the right of the channel variable. Note that "sending on a channel" and "writing on a channel" are used interchangeably.

c <- 10

Sending on an unbuffered channel or a full buffered channel is blocking: if no goroutines are attempting to receive data from the unbuffered channel, or the buffered channel is full, the sending operation blocks until a goroutine actually receives data from the channel. Note that "full" or "empty" are a function of the channel's capacity: an unbuffered channel is always full and always empty, so operations on it block unless other goroutines send or receive data. For a buffered channel, sending will return immediately, unless the channel is full.

Sending on a closed channel causes the program to panic:

panic: send on closed channel

Summary of write operations on a channel:

Channel State        Result
nil                  Block
Open and not full    Write value
Open and full        Block
Closed               panic
Read-only            Compilation error
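
The "Closed" row of the summary above can be observed with a short sketch. The helper sendPanics is an illustrative name, not part of the original page, and recover() is used only to make the panic observable. Production code should avoid sending on a closed channel rather than recover from it.

```go
package main

import "fmt"

// sendPanics attempts a send and reports whether the send panicked.
func sendPanics(c chan<- int, v int) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	c <- v
	return false
}

func main() {
	c := make(chan int, 1)
	fmt.Println(sendPanics(c, 1)) // false: open and not full, the value is written
	close(c)
	fmt.Println(sendPanics(c, 2)) // true: send on closed channel
}
```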

Receiving from Channels

To receive data from a bidirectional or read-only channel, use the left arrow operator <- at the left of the channel variable. "Receiving from a channel" and "reading from a channel" are equivalent and they are used interchangeably.

i := <- c

This form, without a separating space, is also valid:

i :=<- c

The receive operation can return two values: the first is the value read from the channel, and the second is a boolean that says whether the value read from the channel was generated by a write somewhere else (true), or is a zero value for the channel's type, generated by a closed channel (false).

i, isChannelOpen := <- c

Similarly to sending on a channel, receiving from a channel is blocking: if no goroutines are attempting to send data to the unbuffered channel, or the buffered channel is empty, the receiving operation blocks until data becomes available in the channel, or the channel is closed. For buffered channels, the execution immediately returns the next value in the channel, if any, or blocks until a new value becomes available or the channel is closed.

If a buffered channel is empty and it has a blocked receiver, the buffer will be bypassed and the value will be passed directly from the sender to receiver when a sender sends data.

Summary of read operation results on a channel:

Channel State        Result
nil                  Block
Open and not empty   Value
Open and empty       Block
Closed               Remaining buffered values, if any, then (zero value for channel type, false)
Write-only           Compilation error
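
The two-value receive form and its behavior on a closed channel can be sketched as follows. The helper receiveN is an illustrative name, not part of the original page. Values still buffered at the time of the close are delivered first, with true as the second result.

```go
package main

import "fmt"

// receiveN performs n receives on c and records both results of each one.
func receiveN(c <-chan int, n int) (values []int, oks []bool) {
	for i := 0; i < n; i++ {
		v, ok := <-c
		values = append(values, v)
		oks = append(oks, ok)
	}
	return values, oks
}

func main() {
	c := make(chan int, 2)
	c <- 1
	c <- 2
	close(c)

	values, oks := receiveN(c, 4)
	fmt.Println(values) // [1 2 0 0]
	fmt.Println(oks)    // [true true false false]
}
```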

Ranging over a Channel - Iterative Read from a Channel

The range keyword can be used in conjunction with the for statement to loop over values arriving on a channel. The knowledge of whether the value comes from an open or closed channel, exposed as the second result of the channel read operation, enables the runtime to know when to get out of the loop, so the loop does not need an explicit exit criterion. When values are not available on the channel, range blocks until values become available or the channel is closed. If the channel is nil, the loop blocks forever.

var c chan int
...
for i := range c {
  ...
}

To make the loop preemptable use a Done channel.

Receiving Concurrently from One Channel

If more than one goroutine receives from the same channel, only one gets each value sent on the channel. From this perspective, the behavior of a channel is similar to that of a JMS queue. However, if the channel is closed, all goroutines get the close notification: their receive operations return false as the second result value.
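
A minimal sketch of several goroutines receiving from one channel follows. The function name fanOut and the use of a mutex-protected counter are assumptions made for illustration, not part of the original page. Every value is counted exactly once across all workers, and every worker leaves its loop when the channel is closed.

```go
package main

import (
	"fmt"
	"sync"
)

// fanOut starts the given number of workers that all range over the same
// channel, sends 0..n-1 on it, and returns the total number of values
// received by all workers together.
func fanOut(workers, n int) int {
	c := make(chan int)
	var wg sync.WaitGroup
	var mu sync.Mutex
	received := 0

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for range c { // exits in every worker once c is closed
				mu.Lock()
				received++
				mu.Unlock()
			}
		}()
	}
	for i := 0; i < n; i++ {
		c <- i
	}
	close(c)
	wg.Wait()
	return received
}

func main() {
	fmt.Println(fanOut(4, 100)) // 100: each value is delivered to exactly one worker
}
```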

Closed Channels

Closing a channel is used to communicate that no more values will be written on the channel:

var c chan int
...
close(c)

Closing the channel makes the blocked read operations return the zero value for the channel type and a false boolean, and makes the for loops that read from the channel with the range keyword exit.

Once the channel is closed, goroutines cannot send data on that channel anymore.

An attempt to send on a closed channel causes the program to panic. The send operations that are blocked on the channel will also panic, so it is important that the channel's owner, who conventionally is responsible for writing on the channel, also closes it. See Channel Ownership below.

A read-only channel cannot be closed: an attempt to close it produces a compilation error.

An attempt to receive from a closed channel returns the zero value for the channel's type and sets the second return value to false (not received from an open channel). Reads can be performed indefinitely on a closed channel. This is to allow support for multiple downstream reads from a single upstream writer.

Closing a channel is a way to signal multiple goroutines simultaneously, equivalent to sync.Cond#Broadcast(). If N goroutines are receiving on a single channel, instead of writing N "sentinel" values to signal each of the goroutines to exit, the channel can just be simply closed.
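
The broadcast effect of close() can be sketched as follows. The function name releaseAll is illustrative and not part of the original page. All goroutines block on the same channel, and a single close() releases every one of them without any sentinel values being sent.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// releaseAll starts n goroutines that block on a shared channel, closes the
// channel once, and returns how many goroutines were released by the close.
func releaseAll(n int) int {
	gate := make(chan struct{})
	var wg sync.WaitGroup
	var released int32

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-gate // blocks until gate is closed
			atomic.AddInt32(&released, 1)
		}()
	}
	close(gate) // one call signals all waiting goroutines
	wg.Wait()
	return int(atomic.LoadInt32(&released))
}

func main() {
	fmt.Println(releaseAll(5)) // 5
}
```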

Summary of close operations on a channel:

Channel State        Result
nil                  panic
Open and not empty   Closes the channel, reads succeed until the channel is drained, then reads produce (zero value, false)
Open and empty       Closes the channel, reads produce (zero value, false)
Closed               panic
Read-only            Compilation error

The select Statement

A select statement, which is one of the language's keywords and a new type of control structure particular to Go, is a multiway communication multiplexer consisting of a series of case clauses that guard channel read and write statements. All channel reads and writes are considered simultaneously to see if any of them are ready. Populated and closed channels are ready for reads, and channels that are not at capacity are ready for writes, so the corresponding cases become eligible for selection. If none of the channels are ready, the entire select statement blocks, unless there is a default clause, which is executed instead. When one or more channels become ready, one is chosen at random and the corresponding statement executes. Introducing randomness makes the program work "well" in the average case.

select {
  case someVar = <- c1:
     ...
  case someVar2 = <- c2:
     ...
  case c3 <- someVar3:
     ...
  case c4 <- someVar4:
     ...
  default:
     // execute this if none of the channels is ready
     ...
}

The select statement can be used to compose channels together in a program to form larger abstractions.
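
The random choice among ready cases can be observed with a small experiment. This is a sketch under stated assumptions: the function name pickCounts is illustrative and not part of the original page. The exact split between the two cases varies from run to run, so no specific counts are claimed here.

```go
package main

import "fmt"

// pickCounts makes two channels ready at the same time, lets select choose
// between them, and repeats this for the given number of rounds. It returns
// how many times each channel was chosen.
func pickCounts(rounds int) (fromA, fromB int) {
	a := make(chan int, 1)
	b := make(chan int, 1)
	for i := 0; i < rounds; i++ {
		a <- i
		b <- i
		select {
		case <-a:
			fromA++
			<-b // drain the other channel so both are empty for the next round
		case <-b:
			fromB++
			<-a
		}
	}
	return fromA, fromB
}

func main() {
	fromA, fromB := pickCounts(1000)
	fmt.Println("chosen a:", fromA, "chosen b:", fromB)
}
```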

A select statement with no case clauses will block forever:

select {}

Timing Out select

select {
  case ...:
     ...
  case ...:
     ...
  case <- time.After(1 * time.Second):
     // do this when timing out
     ...
}

Also see:

time Package | After(Duration)

The for/select Loop

for { // Either loop infinitely or range over something
  select {
     // Do work with channels
  }
}

The Done Channel

for and select can be used to implement a Done (Abort) channel, which will interpret anything sent on it, and also its closing, as "we're done", "drop processing and exit". The Done channel flows through the program and cancels all blocking concurrent operations that are reading from the channel. This pattern has been incorporated in the standard library since Go 1.7, in the form of the context package's Done channel. The context package implementation offers additional features, so when a Done channel is needed, consider first using the context package:

context Package Done Channel

The general pattern is:

done := make(chan interface{})
...
loop:
for {
  select {
    case <- done: 
      // exit loop
      break loop
    case e, stillOpen := <-someInputChannel:
      if ! stillOpen {
        // exit loop
        break loop
      }
      // use the element ...
    default: ...
  }
}

To abort the Done channel-driven goroutine, close the Done channel:

close(done)

The behavior described above can be encapsulated in a function known as the "OR-Done-Channel" pattern, which allows us to deploy a much simpler for/range syntax:

OR-Done-Channel Pattern

Usually, we want to write or close the Done channel and make all goroutines listening to it exit when we get out of the scope that created the Done channel. We can do this with defer, ensuring that we don't leak goroutines:

defer close(done)

If the goroutine is internally instantiated by a function, and the Done channel is created externally, as it should, by its owner, by convention the channel should be passed as the first argument of the function.

An example that shows how a parent goroutine puts itself in the position to be able to programmatically stop the child goroutine, follows:

func startDoneChannelControlledGoroutine(done <-chan interface{}) {
	go func() {
		for {
			select {
			case <-done:
				return // this ends the goroutine
			default:
				// some work
				time.Sleep(1 * time.Second)
			}
		}
	}()
}

done := make(chan interface{})
startDoneChannelControlledGoroutine(done)

go func() { // the owner goroutine stops the child goroutine after 5 seconds 
  time.Sleep(5 * time.Second)
  close(done)
}()

The context package offers a similar, but richer pattern:

Context

Loop Infinitely until Stopped

It is very common to create goroutines that loop infinitely until they're stopped with a Done channel:

for {
  select {
  case <- done:
    return
  default:
    // do non-preemptable work
  }
}

Convert an Iterable to Channel Content

for v := range ... {
  select {
  case <-done:
    return
  case c <- v:
  }
}
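
The loop above can be wrapped in a function that owns the output channel. This is a minimal sketch: the function name toChannel and the use of a variadic int parameter as the iterable are assumptions made for illustration, not part of the original page.

```go
package main

import "fmt"

// toChannel sends the given values on a new channel, one by one, and closes
// the channel when all values were sent or when done is closed.
func toChannel(done <-chan interface{}, values ...int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for _, v := range values {
			select {
			case <-done:
				return // preempted by the owner of the done channel
			case out <- v:
			}
		}
	}()
	return out
}

func main() {
	done := make(chan interface{})
	defer close(done)

	for v := range toChannel(done, 1, 2, 3) {
		fmt.Println(v)
	}
}
```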

Channels and Error Propagation

Go Error Handling | Concurrent Programming Error Handling

Channel Patterns

Channel Ownership

A channel's owner is the goroutine that instantiates, writes on and closes the channel. A channel that was declared as bidirectional can be later restricted declaratively to behave like a unidirectional channel (mostly read-only, or write-only, which is less common). These declarations are a tool that allows us to distinguish between goroutines that own channels and those that merely utilize them. Channel owners have a write-access view into the channel: chan or chan<-. Channel users only have a read-only view into the channel (<-chan).

The channel owner is responsible for:

  1. Instantiating the channel.
  2. Performing writes, or passing ownership to another goroutine.
  3. Closing the channel.
  4. Encapsulating and hiding the first three things in this list, and only exposing the channel as a read-only channel to its users.

// A channel owner: a goroutine that:
// 1. Instantiates the channel
// 2. Writes on the channel
// 3. Closes the channel when it's done with it
// 4. Exposes the channel externally as read-only

func makeChannelByOwner() <-chan int {
	var c chan int
	c = make(chan int, 10)
	go func() {
		defer close(c)
		for i := 0; i <= 10; i++ {
			c <- i
		}
	}()
	return c // by returning as "<-chan", the channel is externally converted to read-only
}

...

ro := makeChannelByOwner()
for i := range ro {
  fmt.Printf("received: %d\n", i)
}

Strive to keep the scope of channel ownership small. If you have a channel as a member variable of a struct with numerous methods on it, it's quickly going to become unclear how the channel will behave.

A channel consumer only has to worry about:

  1. Knowing when the channel is closed. The second result of a read operation indicates that, and range automatically exits the loop when the channel is closed.
  2. Responsibly handling blocking.

Transferring the Ownership of Data

If you have a bit of code that produces a result and wants to share that result with another bit of code, what you are really doing is transferring the ownership of that data. Data has an owner, and one way to make concurrent programs safe is to ensure only one concurrent context has ownership of data at a time. Channels are the recommended way to implement this pattern in Go, as they help us communicate this concept by encoding the intent into the channel's type. You can create buffered channels to implement a cheap in-memory queue and thus decouple the producer from the consumer. This pattern also makes your code composable with other concurrent code.

Coordinate Multiple Pieces of Logic

Channels are inherently composable and preferred when communicating between different parts of your object graph. Having locks scattered throughout the object graph is far worse. Having channels everywhere is expected and encouraged. Channels can be easily composed, which is not what can be said about locks or methods that return values. It is much easier to control the emergent complexity that arises in your project if you use channels.

Use a Channel to Signal Completion of Work

This pattern can be used by a goroutine to signal that it has done what it was supposed to do, so the upper layer can move on:

func startGoroutine() (iAmDone <-chan interface{}) {
	c := make(chan interface{})
	go func() {
		defer close(c) // close the channel on exit, signaling that this goroutine is done
		// do some work
		time.Sleep(5 * time.Second)
	}()
	return c // returned immediately, while the goroutine is still working
}

thatGoRoutineIsDone := startGoroutine()

// wait until that goroutine is done; the receive returns nil once the channel is closed
<-thatGoRoutineIsDone

// go on ...

Use a Channel to Time Out

The OR-Channel Pattern

OR-channel

The OR-Done-Channel Pattern

OR-done-channel

The tee-channel

tee-channel

The bridge-channel

bridge-channel

context.Context

context.Context

Cancellation

Programmatic Preemption (Cancellation)

Timeout

Programmatic Timeout
Timing-out select

Pipelines

Go Pipelines