Pointers in Go
Latest revision as of 22:28, 1 September 2024
External
- https://go.dev/ref/spec#Pointer_types
- https://go.dev/doc/faq#Pointers
Internal
- Go Language
- Pointers
Overview
A pointer is a data type that represents a virtual address in memory, usually the address of a memory location that is referred to by a variable.
A pointer can be declared as follows:
var aPtr *int // a pointer to an int
Inside functions, a pointer can also be implicitly declared using the short variable declaration and the referencing operator:
a := 10
aPtr := &a
aPtr is a variable that contains the address of the memory location associated with the variable a. Changing the value in memory through the pointer is reflected in the value of the variable:
*aPtr = 20
println(a) // will display 20
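The fragments above can be assembled into a small runnable program. This is only a sketch: the helper name setThroughPointer is an assumption introduced for illustration, not something defined elsewhere in this article.

```go
package main

import "fmt"

// setThroughPointer writes v into the int that p points to.
// The function name is illustrative, not part of any library.
func setThroughPointer(p *int, v int) {
	*p = v
}

func main() {
	var aPtr *int            // declared but not assigned: the zero value of a pointer is nil
	fmt.Println(aPtr == nil) // true

	a := 10
	aPtr = &a // aPtr now holds the address of a
	setThroughPointer(aPtr, 20)
	fmt.Println(a) // 20
}
```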
Pointer Variable Naming
A review of existing code shows that people do not use special variable names to indicate that a variable contains a pointer. someName seems to be perfectly fine, and someNamePtr does not seem to be required. This is in part because the compiler transparently handles the difference between values and pointers in some common cases. For example, a struct field is referred to with the selector operator .<field_name> regardless of whether the variable is a pointer to the struct or contains the struct value.
Also see: Variable Naming
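A minimal sketch of the selector behavior described above. The user type and the two helper functions are hypothetical and exist only for this illustration.

```go
package main

import "fmt"

// user is a hypothetical struct used only for illustration.
type user struct {
	name string
}

// nameOfValue reads the field through a struct value.
func nameOfValue(u user) string { return u.name }

// nameOfPointer reads the same field through a pointer; the compiler
// treats u.name as shorthand for (*u).name.
func nameOfPointer(u *user) string { return u.name }

func main() {
	bill := user{name: "Bill"}
	billPtr := &bill
	fmt.Println(nameOfValue(bill), nameOfPointer(billPtr)) // Bill Bill
}
```

Because the selector syntax is identical in both functions, a Ptr suffix on the variable name adds little information.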
Pointer Type
A pointer type denotes the set of all pointers to variables of a given type, called the base type of the pointer. The base type and the associated pointer type are two different types: a value of one cannot be assigned to a variable of the other. This is what the compiler reports when such an assignment is attempted:
./main.go:24: cannot use v2 (type *B) as type B in assignment
A pointer type is declared using the dereferencing operator * placed in front of the target type, which is the type of the stored value:
*int
The difference between a base type and its associated pointer type is also relevant when deciding whether the type and its pointer type implement an interface. For a discussion of this subject, see: When does a Type/Pointer Type Implement an Interface?
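The following sketch shows how to move between a base type and its pointer type explicitly, since direct assignment is rejected. The type B mirrors the name in the compiler message above; the helpers toValue and toPointer are assumptions made for this example.

```go
package main

import "fmt"

// B is a hypothetical base type; *B is its associated pointer type.
type B struct {
	n int
}

// toValue converts a *B into a B by dereferencing it (the struct is copied).
func toValue(p *B) B { return *p }

// toPointer converts a B into a *B by taking the address of the local copy.
func toPointer(v B) *B { return &v }

func main() {
	v1 := B{n: 1}
	v2 := &B{n: 2}

	// v1 = v2 would not compile: B and *B are different types.
	v1 = toValue(v2)
	fmt.Println(v1.n) // 2

	p := toPointer(v1)
	fmt.Println(p.n) // 2
}
```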
Pointer arithmetic is not allowed. Assuming ptr is a *int, we cannot do ptr + 1:
invalid operation: ptr + 1 (mismatched types *int and int)
and we cannot do ptr + ptr2:
invalid operation: ptr + ptr2 (operator + not defined on pointer)
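Where other languages would advance a pointer, Go code usually advances an index and takes the address of each element. The sketch below illustrates that idiom; the function name sumThroughPointers is hypothetical.

```go
package main

import "fmt"

// sumThroughPointers visits every element via a pointer obtained with
// &values[i]; the index moves, not the pointer.
func sumThroughPointers(values []int) int {
	total := 0
	for i := range values {
		ptr := &values[i] // a *int for the current element
		total += *ptr
	}
	return total
}

func main() {
	fmt.Println(sumThroughPointers([]int{1, 2, 3})) // 6
}
```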
Escape Analysis
Once a non-nil value is assigned to a pointer, the Go runtime guarantees that the thing being pointed to will remain valid for the lifetime of the pointer. This allows a pattern in which what looks like a stack variable is allocated inside a function and a pointer to it is returned from the function. The pointer remains valid even after the stack is unwound: the compiler arranges for the memory location holding the value of i to stay valid after the function returns. This is done with escape analysis, which is the compile-time process of determining whether a variable should be stored on the stack or on the heap:
func makeInt() *int {
i := 10
return &i
}
go build -gcflags="-m" cmd/acmd.go
[...]
cmd/acmd.go:4:2: moved to heap: i
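A runnable sketch built around the makeInt function above shows that the returned pointer stays usable and that each call gets its own variable. The main function is an addition for illustration only.

```go
package main

import "fmt"

// makeInt returns a pointer to a local variable. Because the address
// outlives the call, the compiler places i on the heap.
func makeInt() *int {
	i := 10
	return &i
}

func main() {
	p := makeInt()
	q := makeInt()
	*p = 11 // the memory is still valid after makeInt has returned

	fmt.Println(*p, *q) // 11 10
	fmt.Println(p != q) // true: each call allocates its own i
}
```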
How to Tell if a Variable is a Pointer
Use reflect.TypeOf() on the variable. If the variable is a pointer, the displayed result of reflect.TypeOf() starts with "*":
var b *int
fmt.Println(reflect.TypeOf(b)) // will print "*int"
Alternatively, use:
fmt.Printf("%#v\n", b) // will print "(*int)(nil)"
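For a programmatic check rather than a printed one, the kind of the reflected type can be compared against reflect.Ptr. This is a sketch under that assumption; the helper name isPointer is hypothetical.

```go
package main

import (
	"fmt"
	"reflect"
)

// isPointer reports whether v holds a pointer, by checking its reflect.Kind.
// The helper name is hypothetical.
func isPointer(v interface{}) bool {
	t := reflect.TypeOf(v)
	// reflect.TypeOf returns nil for a nil interface value.
	return t != nil && t.Kind() == reflect.Ptr
}

func main() {
	var b *int
	fmt.Println(isPointer(b))  // true
	fmt.Println(isPointer(10)) // false
}
```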
Displaying Pointers
To display the value at the memory address stored in the pointer, dereference the pointer:
fmt.Printf("%d\n", *aPtr)
To display the memory address stored in the pointer in hexadecimal notation, with the "0x" prefix, use %p or %v; they are equivalent for pointers:
fmt.Printf("%p\n", aPtr)
fmt.Printf("%v\n", aPtr) // same thing
This will print:
0xc000012080
For more details on the pointer, including the type of the data it points to, use:
fmt.Printf("%#v\n", aPtr)
This will print:
(*int)(0xc000012080)
Pointers can also be represented using the "%X" format specifier, which displays the pointer in base 16, with upper case characters and without the "0x" prefix:
fmt.Printf("%X\n", aPtr)
This will print:
C000094018
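The relationship between these verbs can be checked with fmt.Sprintf, without depending on a particular address, which varies from run to run. The sketch below assumes a hypothetical helper named describePointer.

```go
package main

import (
	"fmt"
	"strings"
)

// describePointer returns the textual forms of p produced by the verbs
// discussed above. The function name is hypothetical.
func describePointer(p *int) (withP, withV, withSharpV, withX string) {
	withP = fmt.Sprintf("%p", p)
	withV = fmt.Sprintf("%v", p)
	withSharpV = fmt.Sprintf("%#v", p)
	withX = fmt.Sprintf("%X", p)
	return
}

func main() {
	a := 10
	withP, withV, withSharpV, withX := describePointer(&a)

	// %X is the %p form without the "0x" prefix, in upper case.
	upperNoPrefix := strings.ToUpper(strings.TrimPrefix(withP, "0x"))

	fmt.Println(withP == withV)                    // true
	fmt.Println(withSharpV == "(*int)("+withP+")") // true
	fmt.Println(withX == upperNoPrefix)            // true
}
```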
Pointer Operators
The pointer data type comes with two operators: & (the referencing operator), and * (the dereferencing operator).
The Referencing Operator &
The referencing operator, also known as the ampersand operator, returns the address of a variable, also known as a "reference" to it. & should be read as "address of ...". The address is a value of a pointer type, and it points to the location in memory where the value of the "referenced" variable is stored.
&<variable_name>
color := "blue"
pointerToColor := &color
println(pointerToColor) // prints "0xc000058720"
The referencing operator works with variables and also with struct literals. The syntax &user{name:"Bill"}, where user is a struct, is legal.
However, it does not work with other literals, such as string or int. The following statement produces a compilation error:
s := &"something" // compilation error
To "inline" such a declaration, an anonymous function can be used:
s := func() *string { s := "something"; return &s }()
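The reason & works for a struct literal but not for a string literal is given in the Go specification: the operand of & must be addressable (a variable, for example) or a composite literal. A struct literal is a composite literal, while string and int literals are basic literals, which are neither addressable nor composite.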
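As an alternative to the anonymous function shown above, a small generic helper can produce a pointer to any literal. This is a sketch, not a standard library function: the name ptr is an assumption, and the code requires Go 1.18 or later for type parameters.

```go
package main

import "fmt"

// ptr returns a pointer to a copy of v. It is a hypothetical helper,
// not a standard library function, and needs Go 1.18+ for generics.
func ptr[T any](v T) *T {
	return &v
}

func main() {
	s := ptr("something") // *string
	n := ptr(42)          // *int
	fmt.Println(*s, *n)   // something 42
}
```

The helper works because v is a function parameter, which is an addressable variable, unlike the literal passed in.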
The Dereferencing Operator *
The dereferencing operator, also known as the star operator, takes a pointer and returns the value stored at the memory location the pointer refers to. The operand must be of a pointer type, otherwise the code will not compile. The value thus exposed can be read or written.
*<pointer_variable_name>
color := "blue"
pointerToColor := &color
println(*pointerToColor) // prints "blue"
*pointerToColor = "red"
println(color) // prints "red"
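Reading and writing through the dereferencing operator can be combined in a single statement, as in this sketch of a swap function. The function is written for illustration and is not taken from the article's sources.

```go
package main

import "fmt"

// swap exchanges the values that a and b point to.
func swap(a, b *string) {
	*a, *b = *b, *a // read through both pointers, then write through both
}

func main() {
	first, second := "blue", "red"
	swap(&first, &second)
	fmt.Println(first, second) // red blue
}
```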
When to Use Values and When to Use Pointers
If it makes sense for your use case, prefer using values and design your types so zero-values make logical sense and can be used by default.
However, there are some situations when pointers make sense.
Performance is not a good argument, most of the time. Passing pointers instead of values is not automatically faster and, for small values, can be slower, so performance is generally not a reason to use pointers. This is a consequence of Go being a garbage-collected language. When the address of a variable is taken and passed around, the compiler performs escape analysis to decide whether the variable can stay on the stack or must be stored on the heap. If the data is stored on the stack, no GC is needed, just push/pop operations. If a lot of data is stored on the heap, the GC has more work to do and GC times increase. The GC overhead matters less, in relative terms, when large amounts of data, such as large structs, would otherwise be copied around by pass-by-value.
Mutability. If an external struct needs to be mutated from inside a function, this may be a good argument for using a pointer. The default is pass-by-value: the entire struct is copied and the function modifies the copy. However, mutability can be problematic in concurrent situations. A function free of side effects is safer to use. The classic example of a function that does not modify the slice value passed to it but returns a new, modified value is append():
a := []int{1}
a = append(a, 2)
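The three options discussed above (mutating a copy, mutating through a pointer, returning a modified copy) can be compared side by side. The account type and the function names in this sketch are hypothetical.

```go
package main

import "fmt"

// account is a hypothetical struct used only for illustration.
type account struct {
	balance int
}

// depositByValue receives a copy; the caller's struct is unchanged.
func depositByValue(a account, amount int) {
	a.balance += amount
}

// depositByPointer mutates the caller's struct through the pointer.
func depositByPointer(a *account, amount int) {
	a.balance += amount
}

// deposited is the side-effect-free alternative: it returns a modified copy.
func deposited(a account, amount int) account {
	a.balance += amount
	return a
}

func main() {
	acct := account{balance: 100}

	depositByValue(acct, 50)
	fmt.Println(acct.balance) // 100

	depositByPointer(&acct, 50)
	fmt.Println(acct.balance) // 150

	acct = deposited(acct, 50)
	fmt.Println(acct.balance) // 200
}
```

The last form follows the same shape as append(): the caller decides whether to keep the result.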
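Pointer Receivers. If a type needs at least one pointer receiver, it is a good idea to use pointer receivers for all its methods. The Go compiler itself does not emit warnings, but static analysis tools such as linters and IDE inspections may flag a type that mixes value and pointer receivers. See: Mixing Value and Pointer Receiver Types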
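A minimal sketch of a type that uses pointer receivers consistently. The counter type is hypothetical; only Increment strictly needs the pointer receiver, and Value follows it for consistency.

```go
package main

import "fmt"

// counter is a hypothetical type whose methods all use pointer receivers.
type counter struct {
	n int
}

// Increment needs a pointer receiver because it mutates the counter.
func (c *counter) Increment() { c.n++ }

// Value does not mutate, but uses a pointer receiver for consistency.
func (c *counter) Value() int { return c.n }

func main() {
	var c counter // addressable variable: c.Increment() is shorthand for (&c).Increment()
	c.Increment()
	c.Increment()
	fmt.Println(c.Value()) // 2
}
```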
To model true absence. If values are passed around, true absence of a value cannot really be modeled, as a missing value will always be replaced by the zero value for the type. It is impossible to tell whether the zero value means a legitimate zero or absence. In this case, a nil pointer can represent true absence. The alternative to using a pointer is an additional boolean that provides "present" semantics.
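The difference between "absent" and "legitimately zero" can be sketched with a pointer field. The reading type and the describe function are assumptions introduced for this example.

```go
package main

import "fmt"

// reading is a hypothetical type: Temperature is nil when no measurement exists.
type reading struct {
	Temperature *int
}

// describe distinguishes a missing value from a legitimate zero.
func describe(r reading) string {
	if r.Temperature == nil {
		return "absent"
	}
	return fmt.Sprintf("%d", *r.Temperature)
}

func main() {
	zero := 0
	fmt.Println(describe(reading{}))                   // absent
	fmt.Println(describe(reading{Temperature: &zero})) // 0
}
```

With the boolean alternative, the struct would carry an int field plus a separate "present" flag instead of the pointer.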
Pointers Lead to Values, the Reciprocal is Not Always True
This fact is important for method sets. The method set of a pointer to a type includes the method set of the type.
Also see: Method Set for Type and Method Set for Pointer to Type
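The method set rule can be observed through interface assignment. In this sketch the speaker interface and the dog and cat types are hypothetical: dog declares its method on a pointer receiver, cat on a value receiver.

```go
package main

import "fmt"

// speaker is a hypothetical interface.
type speaker interface {
	Speak() string
}

// dog declares Speak on the pointer receiver, so only *dog has Speak
// in its method set.
type dog struct{ name string }

func (d *dog) Speak() string { return d.name + " says woof" }

// cat declares Speak on the value receiver, so both cat and *cat have it.
type cat struct{ name string }

func (c cat) Speak() string { return c.name + " says meow" }

func main() {
	var s speaker

	s = &dog{name: "Rex"} // OK: *dog has Speak
	fmt.Println(s.Speak())
	// s = dog{name: "Rex"} would not compile: the method set of dog lacks Speak.

	s = cat{name: "Tom"} // OK: cat has Speak
	fmt.Println(s.Speak())

	s = &cat{name: "Tom"} // OK: the method set of *cat includes that of cat
	fmt.Println(s.Speak())
}
```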
Pointers and Interfaces
TODO:
- Go Interfaces | Interface Values
- Go Interfaces | Interfaces as Function Parameters and Result Values