Go Maps: Difference between revisions

From NovaOrdis Knowledge Base
Jump to navigation Jump to search
 
(65 intermediate revisions by the same user not shown)
Line 1: Line 1:
=External=
=External=
* https://go.dev/ref/spec#Map_types
* https://go.dev/ref/spec#Map_types
* https://go.dev/blog/maps


=Internal=
=Internal=
* [[Go_Language#Maps|Go Language]]
* [[Go_Language#Maps|Go Language]]
* [[Go_Slices#Overview|Go Slices]]
* [[Go_Slices#Overview|Go Slices]]
* [[Go_Sets#Overview|Go Sets]]


=Overview=
=Overview=
A map is an unordered collection of key/value pairs. The map implementation uses the key as an index and retrieves the associated value in O(1). In order to store a key/value pair in the map, the map hash function is applied to the key, which produces the index of a bucket. The purpose of the hash function is to generate the index that is evenly distributed across the bucket population. Once a bucket is selected, the map stores the key/value pair in the bucket. In order to retrieve the value corresponding to a key, the same hash function is applied to select the bucket, and then the map iterates over the keys stored in that bucket.
Only comparable types can be keys. For more details see [[#Key_Identity|Key Identity]] below.
Like slices, maps hold references to the underlying data structures. The map instances must be initialized before attempting to use into them, either by using the <code>[[#Initialization_with_make()|make()]]</code> function or a [[#Initialization_with_a_Composite_Literal|composite literal]], otherwise you will get a runtime error, because a zero value for a map is <tt>nil</tt>, which panics when it is written.
The Go documentation used to refer to maps as [[Go_Language#Reference_Type|reference types]], but not anymore. The "reference type" terminology was removed from Go documentation.
=Maps and Pass-by-Value=
=Maps and Pass-by-Value=
Go uses [[Go_Functions#Pass-by-value_vs._pass-by-pointer_vs._pass-by-reference|pass-by-value]], so when a map argument is passed to a function, the internal fields are copied across on the function stack, including the pointer to the underlying storage. Therefore, the underlying storage data structure is intrinsically shared, even in case of pass-by-value. Both the original map header and the copy of the header passed to the function describe the same underlying storage. If the function changes the storage via its copy of the map, the modified elements can be seen outside the function through the original map variable.
<syntaxhighlight lang='go'>
m := map[string]int{"A": 1} // map[A:1]
changeMap(m)
fmt.Println(m) // map[A:2]
func changeMap(m map[string]int) {
    m["A"] = 2
}
</syntaxhighlight>
The map variables are small, so their content can be copied fast. The fact that they include a pointer to the underlying storage prevents copying the storage, which has important performance implications.
Note that slices are not [[Variables,_Parameters,_Arguments#Reference_Variable|reference variables]], as Go does not support reference variables, so no two different maps variables may point to the same map instance.
=Key Identity=
The key can be of any comparable type: a type for which the equality operator <code>[[Go_Language#Equality_Operator|==]]</code> is defined: boolean, integer, floating point and complex number, string, pointer, interface (as long as the dynamic type supports equality), structs, arrays and channels. [[Go_Slices#Overview|Slices]], functions and struct types that contain slices cannot be used as map keys because equality is not defined on them. An attempt will generate a compiler error.
Structs can be used to key data by multiple dimensions. For an example see https://go.dev/blog/maps#key-types.
The identity of a key in the map is given by its value '''and''' its type. We can use the same underlying value, but if the type is different, the keys will count as different. Two different [[Go_Language#Type_Alias|aliases]] of the same type count as two different types.
<syntaxhighlight lang='go'>
type A int
type B int
m := map[interface{}]string{}
m[A(1)] = "red"
m[B(1)] = "blue"
fmt.Printf("%v", m)
</syntaxhighlight>
will display:
<font size=-2>
map[1:red 1:blue]
</font>
This behavior is used to implement a [[Go_Package_context#Restoring_Key/Value_Type_Safety_for_Request-Scoped_Data|technique that prevents key collisions]] when <code>[[Go_Package_context#Requests_and_Request-Scoped_Data|context.Context]]</code> is used to store request-scoped key/value pairs.
Also see: {{Internal|Go_Identity_Equality_Comparability#Overview|Identity, Equality, Comparability in Go}}
=Pointers to Maps=
=Pointers to Maps=
=<tt>nil</tt> and Empty Map=
==<tt>nil</tt> Map==
==Empty Map==
=Declaration and Initialization=
=Declaration and Initialization=
==Initialization with <tt>make()</tt>==
A map can be declared with the [[Go_Variables#Long_Variable_Declaration|long variable declaration]].
<syntaxhighlight lang='go'>
var mapVarName map[KeyType]ValueType
</syntaxhighlight>
<span id='Nil_Slice_Idiomatic'></span><syntaxhighlight lang='go'>
var mm [string]int
</syntaxhighlight>
This declaration creates a [[#nil|<code>nil</code> map]]. Because <code>nil</code> maps cannot be written, this form has a limited usefulness.
 
Inside functions, the [[Go_Variables#Short_Variable_Declaration|short variable declaration]] can be used with <code>[[#make|make()]]</code> or a [[#Composite_Literal|composite literal]].
 
==<span id='make()'></span>Initialization with <tt>make()</tt>==
<code>make()</code> allocates a the underlying data structures required by the map, and creates a new map header to describe it, all at once. Note that unlike <code>[[#new()|new()]]</code>, <code>make()</code> returns the map, not a pointer. The returned map is [[#Empty_Map|empty]], but ready to use.
<syntaxhighlight lang='go'>
mm := make([string]int)
</syntaxhighlight>
 
==<span id='Composite_Literal'></span>Initialization with a Composite Literal==
==<span id='Composite_Literal'></span>Initialization with a Composite Literal==
Composite literals can be used for map initialization.  
Composite literals can be used for map initialization.  
Line 25: Line 87:
<syntaxhighlight lang='go'>
<syntaxhighlight lang='go'>
mm := map[string]int{"A": 1, "B": 2}
mm := map[string]int{"A": 1, "B": 2}
</syntaxhighlight>
=<tt>nil</tt> and Empty Map=
==<tt>nil</tt> Map==
Declaring a map with the [[Go_Variables#Long_Variable_Declaration|long declaration]] and without initialization yields a <code>nil</code> map:
<syntaxhighlight lang='go'>
var mm map[string]int
</syntaxhighlight>
A <code>nil</code> map is also returned by the <code>new()</code> function, but <code>new()</code> returns the pointer to the map, not the map handle itself.
<syntaxhighlight lang='go'>
var mmp = new(map[string]int)
</syntaxhighlight>
<code>fmt.Printf("%p", mm)</code> prints <code>0x0</code>. The underlying data structures of the map handle are initialized with zero values.
A <code>nil</code> map behaves like an empty map when reading. An attempt to read a key with the indexing operator <code>[]</code> operator from a <code>nil</code> map will not panic, but it will return the zero value for the map type.
<syntaxhighlight lang='go'>
v, ok := mm["A"] // v is 0 and ok is false
</syntaxhighlight>
However, an attempt to '''write''' a <code>nil</code> map will cause a runtime panic:
<syntaxhighlight lang='go'>
m["A"] = 1
</syntaxhighlight>
will result in:
<font size=-1.5>
panic: assignment to entry in nil map
</font>
A non-<code>nil</code> map can be obtained by initializing the map with <code>[[#Initialization_with_make()|make()]]</code> or with a [[#Initialization_with_a_Composite_Literal|composite literal]].
===Test <tt>nil</tt> Map===
<syntaxhighlight lang='go'>
var mm map[string]int // nil map
if mm == nil {
  // nil map
}
</syntaxhighlight>
When we have a pointer to the map, the test becomes:
<syntaxhighlight lang='go'>
if *mm == nil {
  // nil map
}
</syntaxhighlight>
so for those methods that get a pointer receiver to a map, and that write the map, do this to avoid panic:
<syntaxhighlight lang='go'>
func (mm *<map_type>) SomeMethod(...) {
    if mm == nil || *mm == nil {
        return fmt.Errorf("nil pointer or nil map")
    }
    ...
}
</syntaxhighlight>
==Empty Map==
An empty map has zero elements, but it is different from a <code>nil</code> map in that the underlying storage is allocated. For an empty map, <code>fmt.Printf("%p", mm)</code> prints a non-zero pointer.
An empty map can be initialized with an empty [[#Composite_Literal|composite literal]]:
<syntaxhighlight lang='go'>
mm := map[string]int{}
</syntaxhighlight>
This declaration and initialization is equivalent with using <code>[[#make()|make()]]</code>:
<syntaxhighlight lang='go'>
mm := make(string]int)
</syntaxhighlight>
</syntaxhighlight>


Line 31: Line 156:
Values stored in a map can be referenced with the indexing operator <code>[[Go_Language#Indexing_Operator|[]]]</code>, using the <code>[key]</code> syntax.  
Values stored in a map can be referenced with the indexing operator <code>[[Go_Language#Indexing_Operator|[]]]</code>, using the <code>[key]</code> syntax.  


The indexing operator returns two values: a '''copy''' of the value associated with the key, or zero-value for the type if the key does not exist in map, and a boolean that says whether the key exists or not in the map.  
When used to read the map content, the indexing operator returns two values: a '''copy''' of the value associated with the key, or the zero-value for the map type if the key does not exist in map, and a boolean that says whether the key exists or not in the map. This is called the <span id='comma_ok'></span>"comma, ok" idiom. The second returned value can be syntactically omitted, but given the fact that the indexing operator returns the zero value for the type for missing keys, it is impossible to say without it whether the key is missing, or a zero-value was associated with the key.
 
The second returned value can be syntactically omitted, but given the fact that the indexing operator returns the zero value for the type and not <code>nil</code> on missing keys, it is impossible to say without it whether the key is missing, or a zero-value was associated with the key.
<syntaxhighlight lang='go'>
<syntaxhighlight lang='go'>
mm := map[string]int{"A": 1, "B": 2}
mm := map[string]int{"A": 1, "B": 2}
Line 40: Line 163:
</syntaxhighlight>
</syntaxhighlight>


If the key does not exist in map, the retuned value '''is not <code>nil</code>''', but the [[Go_Language#Zero_Value|zero value]] for the type.
Idiom:
<syntaxhighlight lang='go'>
if v, ok := mm["A"]; ok {
  // use the value
}
</syntaxhighlight>
 
If the key does not exist in map, the retuned value '''is not''' <code>nil</code>, but the [[Go_Language#Zero_Value|zero value]] for the type.


The indexing operator can be used to change the value associated with a key, or to add a new key/value pair:
The indexing operator can be used to change the value associated with a key, or to add a new key/value pair.
<syntaxhighlight lang='go'>
<syntaxhighlight lang='go'>
mm["A"] = 7
mm["A"] = 7
mm["C"] = 9
mm["C"] = 9
</syntaxhighlight>
</syntaxhighlight>
When used to write map content, the variable provided on the right side of the operator is passed by value: a copy of it is made and placed in the map so updating the variable after the assignment does not have an effect on the value in the map.


=Map Functions=
=Map Functions=
==<tt>delete()</tt>==
==<tt>delete()</tt>==
<code>delete()</code> is a [[Go_Functions#cap.28.29|built-in function]] that deletes a key/value pair from the map:
<syntaxhighlight lang='go'>
mm := map[string]int{"A": 1, "B": 2}
delete(mm, "A")
</syntaxhighlight>
If the key does not exist in the map, <code>delete()</code> is a no-op, and safe.
==<tt>len()</tt>==
==<tt>len()</tt>==
<code>[[Go Built-In Functions Length and Capacity#len.28.29|len()]]</code> returns the number of keys. Note that the built-in <code>cap()</code> function does not work on maps. Unlike [[Go_Slices#Overview|slices]], maps do not have the concept of [[Go_Slices#Capacity|capacity]].
<code>len()</code> is a [[Go_Functions#len.28.29|built-in function]] that returns the number of keys in the map. Note that the built-in <code>cap()</code> function does not work on maps. Unlike [[Go_Slices#Overview|slices]], maps do not have the concept of [[Go_Slices#Capacity|capacity]].


==<tt>make()</tt>==
==<tt>make()</tt>==
<code>make()</code> is a [[Go_Functions#make()|built-in function]] that creates ready-to-use map. See [[#Initialization_with_make()|Initialization with <tt>make()</tt>]] above.
==<tt>new()</tt>==
==<tt>new()</tt>==
<code>new()</code> will initialize a [[#nil_Map|<tt>nil</tt> map]], but it will return the pointer to the map, instead of the value.
A map returned by <code>new()</code> can be positively tested as <code>nil</code> (careful to test the map, not the pointer to it):
<syntaxhighlight lang='go'>
mm := new(map[string]int)
fmt.Println(*mm == nil) // true
</syntaxhighlight>
=Iterating over a Map=
=Iterating over a Map=
Use the <code>[[Go_Keyword_range#Iterating_over_Key.2FValues_in_a_Map|range]]</code> keyword.
<syntaxhighlight lang='go'>
for k, v := range mm {
    fmt.Printf("%s=%d\n", k, v)
}
</syntaxhighlight>
If we only want to iterate over keys and the values are not needed, this syntax is valid:
<syntaxhighlight lang='go'>
for k := range mm {
  fmt.Printf("key: %s\n", k)
}
</syntaxhighlight>
When iterating over a map with a <code>range</code> loop, the iteration order is not specified and it is not guaranteed to be the same from one iteration to the next.
=Maps of Maps=
There are situations when maps are hierarchical: a map has another map as values, and so on. An example of a two-level map is available below.
<syntaxhighlight lang='go'>
var twoLevelMap map[string]map[string]int
</syntaxhighlight>
<code>make()</code> makes a non-nil, but empty map, so the second-level map instances will also to be explicitly created with <code>make()</code>.
<syntaxhighlight lang='go'>
twoLevelMap = make(map[string]map[string]int)
secondLevelMap = twoLevelMap["something"] // secondLevelMap is a nil map.
</syntaxhighlight>


=TODO=
=Maps and Concurrency=
{{Internal|Go_Maps_TODEPLETE|Go_Maps_TODEPLETE}}
Maps are not safe for concurrent use. A map instance must not be concurrently accessed from multiple goroutines. If you need to read from and write to a map from concurrently executing goroutines, the access must be mediated by a synchronization mechanism like [[Go_Mutex_and_RWMutex#RWMutex|<code>sync.RWMutex</code>]].
=The <tt>maps</tt> Package=
{{Internal|Go_Package_maps#Overview|The <tt>maps</tt> Package}}

Latest revision as of 21:17, 30 October 2024

External

Internal

Overview

A map is an unordered collection of key/value pairs. The map implementation uses the key as an index and retrieves the associated value in O(1). In order to store a key/value pair in the map, the map hash function is applied to the key, which produces the index of a bucket. The purpose of the hash function is to generate the index that is evenly distributed across the bucket population. Once a bucket is selected, the map stores the key/value pair in the bucket. In order to retrieve the value corresponding to a key, the same hash function is applied to select the bucket, and then the map iterates over the keys stored in that bucket.

Only comparable types can be keys. For more details see Key Identity below.

Like slices, maps hold references to the underlying data structures. The map instances must be initialized before attempting to use into them, either by using the make() function or a composite literal, otherwise you will get a runtime error, because a zero value for a map is nil, which panics when it is written.

The Go documentation used to refer to maps as reference types, but not anymore. The "reference type" terminology was removed from Go documentation.

Maps and Pass-by-Value

Go uses pass-by-value, so when a map argument is passed to a function, the internal fields are copied across on the function stack, including the pointer to the underlying storage. Therefore, the underlying storage data structure is intrinsically shared, even in case of pass-by-value. Both the original map header and the copy of the header passed to the function describe the same underlying storage. If the function changes the storage via its copy of the map, the modified elements can be seen outside the function through the original map variable.

m := map[string]int{"A": 1} // map[A:1]
changeMap(m)
fmt.Println(m) // map[A:2]

func changeMap(m map[string]int) {
     m["A"] = 2
}

The map variables are small, so their content can be copied fast. The fact that they include a pointer to the underlying storage prevents copying the storage, which has important performance implications.

Note that slices are not reference variables, as Go does not support reference variables, so no two different maps variables may point to the same map instance.

Key Identity

The key can be of any comparable type: a type for which the equality operator == is defined: boolean, integer, floating point and complex number, string, pointer, interface (as long as the dynamic type supports equality), structs, arrays and channels. Slices, functions and struct types that contain slices cannot be used as map keys because equality is not defined on them. An attempt will generate a compiler error.

Structs can be used to key data by multiple dimensions. For an example see https://go.dev/blog/maps#key-types.

The identity of a key in the map is given by its value and its type. We can use the same underlying value, but if the type is different, the keys will count as different. Two different aliases of the same type count as two different types.

type A int
type B int

m := map[interface{}]string{}
m[A(1)] = "red"
m[B(1)] = "blue"
fmt.Printf("%v", m)

will display:

map[1:red 1:blue]

This behavior is used to implement a technique that prevents key collisions when context.Context is used to store request-scoped key/value pairs.

Also see:

Identity, Equality, Comparability in Go

Pointers to Maps

Declaration and Initialization

A map can be declared with the long variable declaration.

var mapVarName map[KeyType]ValueType

var mm [string]int

This declaration creates a nil map. Because nil maps cannot be written, this form has a limited usefulness.

Inside functions, the short variable declaration can be used with make() or a composite literal.

Initialization with make()

make() allocates a the underlying data structures required by the map, and creates a new map header to describe it, all at once. Note that unlike new(), make() returns the map, not a pointer. The returned map is empty, but ready to use.

mm := make([string]int)

Initialization with a Composite Literal

Composite literals can be used for map initialization.

Long variable declaration and initialization with type inference:

var mm = map[string]int{"A": 1, "B": 2}

Short variable declaration:

mm := map[string]int{"A": 1, "B": 2}

nil and Empty Map

nil Map

Declaring a map with the long declaration and without initialization yields a nil map:

var mm map[string]int

A nil map is also returned by the new() function, but new() returns the pointer to the map, not the map handle itself.

var mmp = new(map[string]int)

fmt.Printf("%p", mm) prints 0x0. The underlying data structures of the map handle are initialized with zero values.

A nil map behaves like an empty map when reading. An attempt to read a key with the indexing operator [] operator from a nil map will not panic, but it will return the zero value for the map type.

v, ok := mm["A"] // v is 0 and ok is false

However, an attempt to write a nil map will cause a runtime panic:

m["A"] = 1

will result in:

panic: assignment to entry in nil map

A non-nil map can be obtained by initializing the map with make() or with a composite literal.

Test nil Map

var mm map[string]int // nil map
if mm == nil {
  // nil map
}

When we have a pointer to the map, the test becomes:

if *mm == nil {
  // nil map
}

so for those methods that get a pointer receiver to a map, and that write the map, do this to avoid panic:

func (mm *<map_type>) SomeMethod(...) {
    if mm == nil || *mm == nil {
         return fmt.Errorf("nil pointer or nil map")
    }
    ...
}

Empty Map

An empty map has zero elements, but it is different from a nil map in that the underlying storage is allocated. For an empty map, fmt.Printf("%p", mm) prints a non-zero pointer.

An empty map can be initialized with an empty composite literal:

mm := map[string]int{}

This declaration and initialization is equivalent with using make():

mm := make(string]int)

Operators

Indexing Operator []

Values stored in a map can be referenced with the indexing operator [], using the [key] syntax.

When used to read the map content, the indexing operator returns two values: a copy of the value associated with the key, or the zero-value for the map type if the key does not exist in map, and a boolean that says whether the key exists or not in the map. This is called the "comma, ok" idiom. The second returned value can be syntactically omitted, but given the fact that the indexing operator returns the zero value for the type for missing keys, it is impossible to say without it whether the key is missing, or a zero-value was associated with the key.

mm := map[string]int{"A": 1, "B": 2}
v, ok := mm["A"]
fmt.Printf("%d, %t\n", v, ok) // will display 1, true

Idiom:

if v, ok := mm["A"]; ok {
  // use the value
}

If the key does not exist in map, the retuned value is not nil, but the zero value for the type.

The indexing operator can be used to change the value associated with a key, or to add a new key/value pair.

mm["A"] = 7
mm["C"] = 9

When used to write map content, the variable provided on the right side of the operator is passed by value: a copy of it is made and placed in the map so updating the variable after the assignment does not have an effect on the value in the map.

Map Functions

delete()

delete() is a built-in function that deletes a key/value pair from the map:

mm := map[string]int{"A": 1, "B": 2}
delete(mm, "A")

If the key does not exist in the map, delete() is a no-op, and safe.

len()

len() is a built-in function that returns the number of keys in the map. Note that the built-in cap() function does not work on maps. Unlike slices, maps do not have the concept of capacity.

make()

make() is a built-in function that creates ready-to-use map. See Initialization with make() above.

new()

new() will initialize a nil map, but it will return the pointer to the map, instead of the value.

A map returned by new() can be positively tested as nil (careful to test the map, not the pointer to it):

mm := new(map[string]int)
fmt.Println(*mm == nil) // true

Iterating over a Map

Use the range keyword.

for k, v := range mm {
    fmt.Printf("%s=%d\n", k, v)
}

If we only want to iterate over keys and the values are not needed, this syntax is valid:

for k := range mm {
  fmt.Printf("key: %s\n", k)
}

When iterating over a map with a range loop, the iteration order is not specified and it is not guaranteed to be the same from one iteration to the next.

Maps of Maps

There are situations when maps are hierarchical: a map has another map as values, and so on. An example of a two-level map is available below.

var twoLevelMap map[string]map[string]int

make() makes a non-nil, but empty map, so the second-level map instances will also to be explicitly created with make().

twoLevelMap = make(map[string]map[string]int)
secondLevelMap = twoLevelMap["something"] // secondLevelMap is a nil map.

Maps and Concurrency

Maps are not safe for concurrent use. A map instance must not be concurrently accessed from multiple goroutines. If you need to read from and write to a map from concurrently executing goroutines, the access must be mediated by a synchronization mechanism like sync.RWMutex.

The maps Package

The maps Package