Go Packages: Difference between revisions
Line 157: | Line 157: | ||
The package name can be changed by the consuming code one a per-file basis by [[#Renaming_Imports|renaming the import]]. | The package name can be changed by the consuming code one a per-file basis by [[#Renaming_Imports|renaming the import]]. | ||
Also see: {{Internal|Go_Style#Naming|Go Style}} | |||
==Conventions to Improve Code Readability== | ==Conventions to Improve Code Readability== | ||
===Use the Same Name for Package and Package Directory=== | ===Use the Same Name for Package and Package Directory=== |
Revision as of 22:12, 5 July 2024
External
Internal
Overview
Go modularization is built upon the concept of package.
A Go package is a collection of constants, variables, functions and type definitions, such as structs and interfaces that are connected to each other and provide related functionality. These constants, variables, etc. are referred to as members, or features of the package. All elements of a package exist in source files stored in the same directory, and that are compiled together as a unit. No Go source code may exist outside a package.
The intention behind bundling program elements together is to make the design and the maintenance of large programs practical. Packages are units that can be easier understood and changed, and that can also evolve independently of other packages. These characteristics allow packages to be shared, distributed and reused by different and otherwise independent projects. As such, a package should contain code that has a single purpose.
Write small packages, with a focused API, aimed at performing a single task.
Packages provide a namespace for their members. They also provide an encapsulation mechanism for their code, by hiding implementation details and only exposing features, such as variables, types or functions that are meant to be publicly consumed. Packages are one of Go approaches to reusing code, according to the Don't Repeat Yourself principle. Keeping pre-compiled packages around speeds up the compilation process, because a package whose source code did not change does not need recompilation.
Packages can be used by themselves, and they also can be published as part of modules. Modules have been introduced in Go 1.11.
How to Organize Packages
Process this again: https://www.youtube.com/watch?v=MzTcsI6tn-0
If you are going to have multiple packages, it is probably a good idea to orient them around two categories:
- business domain types
- services
Use this organization rather than building around of accidents of implementation.
The domain types are types that model the business functionality and objects. Services are packages that operate on or with the domain types.
The packages that contain the domain types should also define the interfaces between your domain types and the rest of the world. These interfaces define the things you want to do with your domain types: Employee
, ProductService
, SupplierService
, etc.
The domain type package, or the root package of your application should not have any external dependencies. They only exist for the purpose of defining your types and their dependencies.
The implementation of your domain interfaces should be in separate packages (sub-packages), organized by dependency. Dependencies are external data sources, transport logic (http, RPC), etc. You should have one package per dependency.
Other ideas are available here:
- Standard Package Layout https://www.gobeyond.dev/standard-package-layout/
Import Path
Packages are consumed by importing them with the import
keyword. For details on the syntax, see the Import Statement section, below.
When importing a package, one uses an import path, not the package name, which is explained in detail below:
import "<import-path>"
Example:
import "a/b/c"
The import path is a file system path-like string that uniquely identifies a package. The language specification does not define what the string means, it is left to the tool that performs the import and compiles the source code to interpret it.
In case the package in question is not part of a module, the import path is the relative path of the local file system directory that contains the package source files. The path is relative to a src
directory whose parent is listed in GOPATH
. All source files that make the package must be stored in the same directory, referred to as package directory.
. ← the directory that must be listed in GOPATH to make the package visible to its consumers │ └─ src └─ a └─ b └─ c ├─ file1.go ├─ file2.go └─ file3.go
In the example above, file1.go
, file2.go
and file3.go
are all declared as part of package x
.
file1.go
may look like:
package x
var SomeValue = 10
// ...
To use the features exported by package x
, the consuming package must import the package x
using its import path "a/b/c", and must use its features by prefixing them with the package name:
package consumer
import "a/b/c"
func F() {
println(x.SomeValue)
}
As an example, the public variable SomeValue
exported by package x
is used by prefixing it with the package name. The resulting identifier x.SomeValue
is referred to as a qualified identifier.
However, the package name to use for qualified identifiers is not obvious, at least if one looks at the import line. The import path does not have to have anything in common with the package name. This makes this very generic, yet legal case slightly confusing, from a code readability point of view. The reader may ask: where does "x" come from? The package name is known to the compiler, of course, but the programmer must look into a package source file to figure out the name. The readability can be improved with naming conventions and syntax, as shown below.
For more details on declaring packages, see Declaring Packages section, below.
If the consumed package is part of a module, the package import path starts with the module path, followed by path of the directory containing the package source files, relative to the root of the module. Module path is explained here. The root of the module is the directory containing the go.mod
file.
. ├─ go.mod # declares that this is module "example.com" │ └─ a └─ b └─ c ├─ file1.go ├─ file2.go └─ file3.go
The same import and usage syntax applies, the only difference being that the package import path includes the module path:
package consumer
import "example.com/a/b/c"
func F() {
println(x.SomeValue)
}
As in the previous "package-only" example, the package name does not show anywhere in the import line. To make the code more readable, the following conversions are recommended:
- Use the same name for package and package directory
- Explicitly declare the package name in the import line.
Package Name
The package name is the identifier that follows the package
keyword, in all source files of the package:
package x // the package name
...
Package names are generally given lowercase, single-word names. They should be short, concise and evocative, but not cryptic. Avoid long package names. The package name must comply with the Go name requirements.
The contents of a package should be consistent with the name. If you start noticing that a package includes extra logic that has no relationship to the package name, consider exporting it as a separate one, or use a more descriptive package name. Generally, you will find that the easier it is to give a short and specific self-descriptive name to a package, the better your code composition is. Package names usually take the singular form. Standard library packages bytes
, errors
and strings
use the plural to avoid naming conflicts with the corresponding pre-declared types or keywords. Package names with underscores, hyphens or mixed caps should be avoided.
As explained above, the package name is not required to show up in the name of any of the file system elements associated with the package, such as package directory or package source files, though code readability can be improved by using the same name for the package and its hosting directory. The name of the package and its directory should match if possible, unless you are in one of these situations.
When imported into a consuming package, the package name becomes an accessor for its contents. The names exported by the imported packages are referred with qualified identifiers, where the first part is the name of of the imported package and the second is the public feature name itself:
import "quota"
...
quota.ComputeLimit(...)
Knowing this, the exported names in the package can be chosen in such a way to avoid repetition, as described in Try to Make Package Names and Exported Names Work Together, below.
The package name can be changed by the consuming code one a per-file basis by renaming the import.
Also see:
Conventions to Improve Code Readability
Use the Same Name for Package and Package Directory
If possible, use the same name for the package directory and package name. Using different names is legal, but using the same name will improve readability by turning this:
package consumer
import "a/b/c"
func F() {
println(x.SomeValue)
}
into this:
package consumer
import "a/b/x" // the reader gets a hint where "x" is coming from
func F() {
println(x.SomeValue)
}
The second version is slightly reader-friendlier, giving at least a hint where "x" is coming from.
However, there are situations when the name of the package directory must be different from the package name. There are at least three cases when this situation is justified.
- Packages defining an executable command. The name of the package has to be
main
, regardless of the name of the directory containing the command source files. If you can also name the package directory "main", that would be ideal, but that is not always possible. - Some files in the directory may have the suffix
_test
on their package name if the file name ends in_test.go
. Such a directory will define two packages: the normal one and an external test package. The_test
suffix signals togo test
that it must build both packages, and specifies which files belong to each packages. Also see Go Testing. - Some dependency management tool append version number suffixes to package import paths.
Declare the Package Name in Import Line
If using the same name for package and package directory is not an option, the readability can be improved by using a syntactic feature of the import
statement which allows explicitly specifying the package name in the import statement:
package consumer
import x "a/b/c" // unequivocally clarifies where 'x' is coming from
func F() {
println(x.SomeValue)
}
Try to Make Package Names and Exported Names Work Together
Since each reference to the member of an imported package uses the imported package name as qualifier, the package name will provide context for its content. As such, the burden of describing the package semantics is borne equally by the package name and the member name. Design the names to work together, like in yaml.Reader
. yaml.YAMLReader
would be redundant. Another example is a buffered reader type in the bufio
package, which is is called Reader
, not BufferedReader
because when imported, it will be referred to as bufio.Reader
, and not with the repetitive bufio.BufferedReader
.
This recommendations apply to constants, variables, function and type names.
However, this is just a guideline. In some cases, duplicating the package name in the name of the exported feature may make sense. Use your judgement.
Document Packages
For small packages, add a comment line above the package declaration describing its contents, as shown in the Declaring Packages section. This only works if the package has only one file. What to do if the package has more than one file? Where to place the documentation? For large packages, use a doc.go
file.
Modularization Considerations
Packages as Namespaces
Each package defines a distinct namespace that hosts all its identifiers. Within the package namespace, all identifiers must be unique.
When a public feature of a package, such as a function or a type is used outside of the package, its name must prefixed with the name of the package, with the intention of making the package and feature name pair unique in the universe block. Such a name is referred to as qualified identifier. For example, the Println()
function, exported by the fmt
package, is invoked with the fmt.Println
qualified identifier:
import "fmt"
...
fmt.Println("hello")
This mechanism allows us to chose short, succinct names for the members of a package, without creating conflicts with other parts of the program.
Packages as Encapsulation Mechanism
The package is the most important mechanism for encapsulation in Go programs. Packages control which names are visible outside the package, by a simple convention: all names declared in the package block or field names and method names that start with an uppercase letter are publicly visible. Publicly visible names are referred to as exported names. For more details on exporting package members, see the Exporting Package Members section below.
Encapsulation is useful because it allows the package maintainer to change the implementation of the hidden members without affecting the public interface of the package, thus allowing the package to evolve without breaking its dependents. Restricting variables' visibility constrains the package clients to access and update them only through exported functions that preserve internal invariants and enforce mutual exclusion in concurrent programs.
Dependencies
Each import declaration established a dependency from the importing package to the imported package. go build
detects dependency cycles:
package main imports a imports c imports b imports a: import cycle not allowed
Declaring Packages
Packages are defined by writing one or more source files that declare association with the package. Every source file that is part of the implementation of a package must start with the package
statement, which consists of the package
keyword followed by the name of the package the source file belongs to.
// Package color is a package that provides functionality around colors
package color
...
All files sharing the same package name form the implementation of the package. The identifiers declared in any of the files of a package are accessible from all other files of the same package.
All source files of a package inhabit the same file system directory, referred to as package directory.
A directory may not contain source files belonging to different packages. In such a situation, the compiler raises an error:
found packages colors (colors.go) and shapes (shapes.go) in .../src/colors
It is legal to declare packages that live in different package directories but use the same package name. They can even be used together, as long as the consuming package is aware they're different packages and makes that syntactically obvious by using different aliases:
. └─ src ├─ a │ └─ file1.go # declares a package named "x" ├─ b │ └─ file2.go # declares a different package, also named "x" └─ consumer └─ consumer.go
file1.go
:
package x
var SomeValue = 10
file2.go
:
package x
var SomeValue = 20
consumer.go
:
package consumer
import (
"a"
x2 "b"
)
func F() {
println(x.SomeValue)
println(x2.SomeValue)
}
F()
will print:
10 20
Exporting Package Members
A package identifier is made publicly visible outside of the package if:
- The name is declared in the package block or is a struct field or a method name.
- The name starts with uppercase letter.
Publicly visible names are referred to as exported names. Code that belongs to another package and that imports the package can access the exported names. Names that are not explicitly exported are not visible publicly. "Hidden" or "unexported" is also used in this situation. This is how encapsulation is implemented for packages. Note that identifiers, and not values, are exported or unexported. It is a good practice to only expose the package elements that we explicitly want other package to use, and hide everything else. This allows to change the hidden parts later, without affecting the package's consumers.
In the following example:
package color
var color string = "blue"
func Color() string {
return color
}
the color
variable is private to the package, but it can be accessed with the public Color()
function.
Just because an identifier is unexported, it does not mean other packages can't indirectly access it. A function can return the value of an unexported type and this value is accessible by any calling function, even if the calling function has been declared in a different package. This can be coupled with the short declaration operator :=
, which infers the type of the variable. A variable of that type cannot be explicitly declared with var
.
Embedded Type Export
Unexported Members
A package member that is not explicitly exported becomes unexported automatically.
Building Packages
Building with go build
Building with GoLand
Publishing Packages
When a package file changes, the entire package must be recompiled and all the packages that depend on it.
Publishing Locally
Publishing Remotely
Consuming Packages
Import Statement
Packages are consumed with an import statement that consists in the import
keyword, followed by one or more import paths. Consuming a package means accessing and using a package's exported identifiers. When a package is imported during compilation, the compiler searches the local filesystem according to the rules specified here:
Each package can be imported and used individually, so developers can import only the functionality they need. The import
statement syntax follows:
import "fmt"
import (
"bufio"
"fmt"
"os"
"strings"
)
Imported packages may be grouped introducing blank lines. By conventions, the names in a group are sorted alphabetically,
Once imported, the package names are used as qualifiers in the qualified identifiers of package's exported names. The package names are inferred automatically by the compiler, but they can be locally changed by renaming imports.
Renaming Imports
The package name is only the default name for imports. It need not be unique across all source code. In case of collision, the consuming package can choose to use a different local name for the imported package. A different qualifier can be declared, providing an alias for the package name. This is called renaming an import:
import myQualifier "a/b/c"
...
println(myQualifier.SomeValue)
This feature is useful when two packages have the same name when their import paths differ. As an example, math/rand
and crypto/rand
have the same package name rand
. In this situation, one of the imports must be renamed. Renaming only affects the importing file.
Blank Imports
It is a compilation error to import a package but not refer to it. However, there are situation when we want to import a package merely for the initialization side effects. This can be achieved using the blank identifier at import:
import _ "image/png" // this registers the PNG decoding upon import
This is known as a blank import.
Package Initialization
Package initialization begins by importing dependency packages, then initializing the package global state, which consists in all package-level constants and variables, in the order in which they are declared. This process happens recursively for each dependency package. If the package has multiple files, they are initialized in the order in which they are provided to the compiler. The go
tool sorts them by name before invoking the compiler. The initialization continues with the invocation of the init()
functions, if they exist.
init()
Any file may contain any number of functions with the following signature:
func init() {
...
}
Such init()
function cannot be called or referenced, but they are automatically executed in the order in which they are declared, as part of package initialization, when the package they're part of is imported, after the initialization of the package global constants and variables. Note that init()
will only run if the package is imported.
If multiple init()
functions exist in a file, they will be executed in the order in which they were declared in the file.
If there are multiple init()
function in a package, declared across multiple files, the files will be sorted in their alphabetical order and init()
functions will be executed in that order.
If multiple packages are imported, one package is initialized at a time, the dependencies are fully initialized before the dependents are initialized, so the init()
functions in dependencies are executed before the init()
functions declared in dependents. Each init()
function is executed only once.
The main
is the last to be initialized, where the init()
functions will be executed after global constant and variable initialization and before the main()
function.
Package Global State
A package's global state includes the values of all constants and variables declared in the package's lexical block. We should strive to maintain little to no mutable package global state. Explicit Component Dependencies section gives at least one reason why. Given the fact that the only purpose of the init()
function is to initialize package global state, using it is a serious red flag in almost any program.
Executables and the "main" Package
A binary executable is created when the Go build tool encounters a main
package among the packages provided as arguments to go build
or go install
. In both cases, the build logic invokes the linker that assembles a self-contained executable from the main
package and its dependencies. The linker uses the package as a start point and walks the package dependency graph when building the binary executable.
The main
package must declare a main()
function, otherwise the compiler raises an error:
runtime.main_main·f: function main is undeclared in the main package
The executable will be named differently, and placed in different locations depending on command line arguments and the value of certain environment variables when the build logic is executed. For more details see:
Single-Type Packages
Packages that expose one principal data type, plus its methods and often a New()
function to create instances are referred to as single-type packages. The enclosed type is always qualified with the package name when used, so the package names should be short.
Internal (Private) Packages
The module's code that should not be imported by the module's dependents should be placed in a directory named internal
, or in one of its recursive subdirectories. The packages declared under a directory named internal
are prevented by the compiler to be imported by packages that live outside the source subtree in which the internal packages resides. internal
is official, this pattern is enforced by the Go compiler. When the compiler sees an import of a package with internal
in its path, it verifies that the package doing this import is within the directory tree rooted at the parent of the internal
directory. Packages declared in internal
siblings directories can import.
In the following example, the package a
can import from package b
, because a
has the internal
parent as its ancestor:
. ├── go.mod └── level0 ├── internal │ └── b │ └── b.go └── level1-1 └── pkg └── a └── a.go
Normally, the projects use a less convoluted directory structure, where internal
and pkg
are siblings in the module root directory.
This is what happens in GoLang if a package tries to import from an internal
directory:
A typical project layout employs this pattern if it makes sense. See:
TODO:
- https://go.dev/doc/go1.4#internalpackages
- Addison-Wesley The Go Programming Language Section 10.7.5
- Microservices with Go. Chapter 2. "Private Packages" https://learning.oreilly.com/library/view/microservices-with-go/9781804617007/B18865_02.xhtml#:-:text=Private%20packages It seems "internal" does not work the way it is described there. I can use code declared in "internal" from outside of that directory.
Hierarchical (Nested) Packages
Hierarchical packages or nested packages.
When necessary, use a descriptive parent package and several children packages implementing functionality, like the standard library encoding
package:
./encoding/ ├── ascii85 // package ascii85 │ ├── ascii85.go │ └── ascii85_test.go ├── asn1 │ ├── asn1.go // package asn1 │ ├── asn1_test.go │ ├── common.go │ └─ ... ├── base32 │ └── ... ├── ... ├── xml │ └── ... │ └── encoding.go // package encoding
Vendoring
"Vendoring" is the act of making a local copy of a third party dependency package a project depends on. This copy is traditionally placed inside the project source tree and then saved in the project repository.
The associated directory structure is described here:
The code below a directory named "vendor" is importable only by code in the directory tree rooted at the parent of "vendor", and only using an import path that omits the prefix up to and including the vendor element. Code in vendor directories deeper in the source tree shadows code in higher directories. Within the subtree rooted at foo
, an import
of crash/bang
resolves to foo/vendor/crash/bang
, not the top-level crash/bang
. Code in vendor directories is not subject to import path checking. When go get
checks out or updates a Git repository, it now also updates submodules. Vendor directories do not affect the placement of new repositories being checked out for the first time by go get
. Those are always placed in the main GOPATH
, never in a vendor subtree.
Locating Third-Party Packages
Use:
to find packages with functionality you might find useful.
TODO