Go Packages: Difference between revisions

From NovaOrdis Knowledge Base
Jump to navigation Jump to search
Line 287: Line 287:


A binary executable is created when the [[Go_Tool#Overview|Go build tool]] encounters a <code>main</code> package among the packages provided as arguments to <code>[[Go_Tool#build|go build]]</code> or <code>[[Go_Tool#install|go install]]</code>. In both cases, the build logic invokes the linker that assembles a self-contained executable from the <code>main</code> package and its dependencies.
A binary executable is created when the [[Go_Tool#Overview|Go build tool]] encounters a <code>main</code> package among the packages provided as arguments to <code>[[Go_Tool#build|go build]]</code> or <code>[[Go_Tool#install|go install]]</code>. In both cases, the build logic invokes the linker that assembles a self-contained executable from the <code>main</code> package and its dependencies.
The <code>main()</code> function must be declared in the <code>main</code> package. If the <code>main()</code> function is missing, the compiler indicates an error:
<font size=-2>
runtime.main_main·f: function main is undeclared in the main package
</font>


=TO DEPLETE=
=TO DEPLETE=

Revision as of 19:07, 29 September 2023

External

Internal

Overview

Go modularization is built upon the concept of package.

A Go package is a collection of constants, variables, functions and type definitions, such as structs, and interfaces that are inter-related and provide coherent, unified functionality. These constants, variables, etc. are referred to as members, or features of the package. All elements of a package exist in source files stored in the same directory, and that are compiled together as a unit. Packages are mandatory, no Go source code may exist outside a package.

The intention behind bundling program elements together is to make the design and the maintenance of large programs practical. Packages are units that can be easier understood and changed, and that can also evolve independently of other packages. These characteristics allow packages to be shared, distributed and reused by different and otherwise independent projects. Packages provide a namespace for their members. They also provide an encapsulation mechanism for their code, by hiding implementation details and only exposing features, such as variables, types or functions that are meant to be publicly consumed. Packages are part of Go's way of reusing code, according to the Don't Repeat Yourself principle. Keeping pre-compiled packages around speeds up the compilation process, because a package whose source code did not change does not need recompilation.

Packages can be used by themselves, and they also can be published as part of modules. Modules have been introduced in Go 1.11.

Packages as Namespaces

Each package defines a distinct namespace that hosts all its identifiers. Within the package namespace, all identifiers must be unique.

When a public feature of a package, such as a function or a type is used outside of the package, its name must prefixed with the name of the package, with the intention of making the package and feature name pair unique in the universe block. Such a name is referred to as qualified identifier. For example, the Println() function, exported by the fmt package, is invoked with the fmt.Println qualified identifier:

import "fmt"
...
fmt.Println("hello")

This mechanism allows us to chose short, succinct names for the members of a package, without creating conflicts with other parts of the program.

Packages as Encapsulation Mechanism

The package is the most important mechanism for encapsulation in Go programs. Packages control which names are visible outside the package, by a simple convention: all names declared in the package block or field names and method names that start with an uppercase letter are publicly visible. Publicly visible names are referred to as exported names.

Encapsulation is useful because it allows the package maintainer to change the implementation of the hidden members without affecting the public interface of the package, thus allowing the package to evolve without breaking its dependents. Restricting variables' visibility constrains the package clients to access and update them only through exported functions that preserve internal invariants and enforce mutual exclusion in concurrent programs.

Dependencies

Each import declaration established a dependency from the importing package to the imported package. go build detects dependency cycles:

package main
	imports a
	imports c
	imports b
	imports a: import cycle not allowed

Idiomatic Package Conventions

Write small packages, with a focused API, aimed at performing a single task.

Package names are generally given lowercase, single-word names. They should be short, concise and evocative, but not cryptic. Avoid long package names. The contents of a package should be consistent with the name. If you start noticing that a package includes extra logic that has no relationship to the package name, consider exporting it as a separate one, or use a more descriptive package name. Generally, you will find that the easier it is to give a short and specific self-descriptive name to a package, the better your code composition is. Package names usually take the singular form. Standard library packages bytes, errors and strings use the plural to avoid naming conflicts with the corresponding pre-declared types or keywords. Package names with underscores, hyphens or mixed caps should be avoided.

The name of the package and the name of the directory hosting the package source files should match if possible, unless you are in one of these situations.

The package members intended to be exported must start with an uppercase character. Only expose the package elements that we explicitly want other package to use, and hide everything else.

Since each reference to the member of an imported package uses the imported package name as qualifier, the package name will provide context for its content. As such, the burden of describing the package maker is borne equally by the package name and the member name. Design the names to work together, like in yaml.Reader. yaml.YAMLReader would be redundant.

Every package should have a comment describing its contents.

Separate private code using an internal directory.

Also see:

Go Style

Typical Package Layout and Files

.
├── file1.go
├── file1_test.go
├── file2.go
├── file2_test.go
├── ...
├── main.go # The file containing the main() function.
├── doc.go # Package documentation. A separate file is not necessary for small packages.
├── README.md # A README file written in Markdown
├── LICENSE
└── CONTRIBUTING.md

Declaring Packages

Packages are defined by writing one or more source files that declare association with the package. Every source file that is part of the implementation of a package must start with the package statement, which consists of the package keyword followed by the name of the package the source file belongs to.

// color is a package that provides functionality around colors
package color

...

All files sharing the same package name form the implementation of the package. The identifiers declared in any of the files of a package are accessible from all other files of the same package. All source files of a package inhabit the same file system directory. A package's source files cannot be spread across multiple directories. A directory may not contain source files belonging to different packages. In such a situation, the compiler raises an error:

found packages colors (colors.go) and shapes (shapes.go) in .../src/colors

Source Files

Document source file naming conventions.

Package Name

The package name is given by the identifier that follows the package keyword. A package name must comply with the Go name requirements.

Conventionally, the name of the directory that hosts the package's source files should match the package name, but this is not a requirement, and it is not enforced. It is good practice though, otherwise the source code will contain confusing constructs like the one described below.

The package name provides the default identifier to use when the package members are accessed from an importing package and it is conventionally the last segment of the package import path. Alternative identifiers can be used, as shown in the "Renaming Imports" section below.

Import Path

Each package is identified by a unique string called package import path. The import path follows the import keyword in the import statement when consuming the package.

The language specification does not define what the string means, it is left to the tool that performs the import and compiles the source code to interpret it. In case of the go tool, the import path denotes a directory that contains the package's source files, relative to a GOPATH element, which is usually a src directory, a subdirectory of the workspace. The last segment of the path is usually the package name, but this is not enforced in any way, and some times it might not actually be the case.

.
└─ src
    └─ misc
        └─ shapes # the package directory name coincides with the package name
            └─ file1.go # the file belongs to the "shapes" package

In the above example, the import path is misc/shapes and the package name is shapes.

import "misc/shapes"
...
shapes.Apply()

The import paths should be globally unique for packages intended for sharing. That is why is good practice to start the import path with the name of the Internet domain of the organization that owns or hosts the package.

The fact that the last segment of the import path coincides with the package name is purely conventional. There could be situations where the name of the directory that hosts the package is different from the package name. There are at least three cases when the exceptions are justified. The first are the packages defining an executable command. The name of the package has to be main, regardless of the name of the directory containing the command source files. The second exception is that some files in the directory may have the suffix _test on their package name if the file name ends in _test.go. Such a directory may define two packages: the normal one and an external test package. The _test suffix signals to go test that it must build both packages, and specifies which files belong to each packages. Also see: Go Testing. The third exception is that some dependency management tool append version number suffixes to package import paths.

In all other cases, this situation should be avoided, because the import path is the filesystem path of the package, and the package name is the identifier that follows the package keyword in the package files, and we will find ourselves in this situation:

.
└─ src
    └─ weights # the import path is "weights"
        └─ file2.go # the file belongs to the "nuances" package

import "weights"
...
nuances.Apply() // this is confusing, so don't do it

Modules and Import Path

If a package belongs to a module, the package's import path is the module path joined with the package's subdirectory within the module. Packages in the standard library do not have a module path prefix.

Exporting Package Members

A package identifier is made publicly visible outside of the package if:

  1. The name is declared in the package block or is a struct field or a method name.
  2. The name starts with uppercase letter.

Publicly visible names are referred to as exported names. Code that belongs to another package and that imports the package can access the exported names. Names that are not explicitly exported are not visible publicly. "Hidden" or "unexported" is also used in this situation. This is how encapsulation is implemented for packages. Note that identifiers, and not values, are exported or unexported. It is a good practice to only expose the package elements that we explicitly want other package to use, and hide everything else. This allows to change the hidden parts later, without affecting the package's consumers.

In the following example:

package color

var color string = "blue"

func Color() string { 
  return color
}

the color variable is private to the package, but it can be accessed with the public Color() function.

Just because an identifier is unexported, it does not mean other packages can't indirectly access it. A function can return the value of an unexported type and this value is accessible by any calling function, even if the calling function has been declared in a different package. This can be coupled with the short declaration operator :=, which infers the type of the variable. A variable of that type cannot be explicitly declared with var.

Embedded Type Export

Structs | Embedded Type Export

Building Packages

Building with go build

Ensure that go tool is not in module-aware mode by setting GO111MODULE to "auto" and set the correct GOPATH value. Then execute go build:

go build -o <command-name> <package-1-import-path> <package-2-import-path> ...
go build -o ./color-tool main # colors and misc/shapes is implied

go build is capable of detecting dependencies, so not all package import paths should be specified in the command line. go build will build an executable, by invoking the linker, if the package main is among the compiled packages, otherwise it will just compile the packages and discard the result. For details on how the executable name is chosen, see the main package.

Building with GoLand

Building with GoLand

Publishing Packages

When a package file changes, the entire package must be recompiled and all the packages that depend on it.

Publishing Locally

This section documents compiling and installing packages locally with go install in absence of module support. For details on compiling and installing package within the context of a module, see:

Publishing Modules Locally

Single Package with No Dependencies

A single package is compiled, and its corresponding object file locally installed by executing:

go install <package-import-path>

We assume that while running the command, the GOPATH environment variable is set to point to a directory that contains the src/<package-import-path> directory. For a package whose import path is other/b, the corresponding directory layout is:

. ← This is the directory GOPATH points to
│ 
└── src
     └── other
          └── b
              └── b.go

where the b.go source file contains:

package b

...

We also assume that GO111MODULE environment variable is set to "off" or "auto".

The command to compile and install the b package, identified by its other/b import path is:

go install other/b

The command will compile the source code and install the package object file under ${GOPATH}/pkg/${GOOS}_${GOARCH}/<import-path>:

.
├── pkg
│    └── darwin_amd64
│         └── other
│              └── b.a
│ 
└── src
     └── other
          └── b
              └── b.go

If the pkg subdirectory does not exist, it will be created. The go install can be run from the ${GOPATH} directory, or from some any other directory. As long the GOPATH and GO111MODULE are correctly set, the results will be the same.

GOPATH Lists Multiple Directories

GOPATH may list multiple colon-separated directories. As long as the package to be installed is declared under one of those directories, it will be correctly located and go install will write the corresponding object files under the pkg subdirectory of the corresponding root.

If multiple GOPATH directories contain package with the same import path, only the first package will be processed.

Packages with Dependencies

If a package "a" depends on the package "other/b", the command

go install a

will correctly locate and compile the source files for both "a" and "other/b" packages, and if deeper dependencies exist, all source files in the transitive dependency tree. If any of the dependencies fails to compile, the installation process will fail. However, only the object file corresponding to the package "a" will be written in ${GOPATH}/pkg. To install the object files for the entire dependency chain, the packages have to be explicitly specified on the command line:

go install a other/b

Executables

An executable is produced if the linker detects a main package in the list of packages to be compiled and installed. go install compiles all dependencies, links them and installs the binary under the directory specified by the GOBIN environment variable or under the bin subdirectory of the GOPATH directory, depending on which environment variable is defined. If the bin directory does not exist, it is created.

Publishing Remotely

Consuming Packages

Import Statement

https://go.dev/ref/spec#Import_declarations

Packages are consumed with an import statement that consists in the import keyword, followed by one or more import paths. Consuming a package means accessing and using a package's exported identifiers. When a package is imported during compilation, the compiler searches the local filesystem according to the rules specified here:

Module-Aware or GOPATH Mode

Each package can be imported and used individually, so developers can import only the functionality they need. The import statement syntax follows:

import "fmt"
import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

Imported packages may be grouped introducing blank lines. By conventions, the names in a group are sorted alphabetically,

Package Name as Qualifier

The package name is used by default to qualify the members of the package when used from an importing package:

import "colors"
...
colors.Apply() // the package name is used

Renaming Imports

A different qualifier can be declared, providing an alias for the package name. This is called renaming an import:

import details "colors"
...
details.Apply()

This feature is useful when two packages have the same name when their import paths differ. As an example, math/rand and crypto/rand have the same package name rand. In this situation, one of the imports must be renamed. Renaming only affects the importing file.

Blank Imports

It is a compilation error to import a package but not refer to it. However, there are situation when we want to import a package merely for the initialization side effects. This can be achieved using the blank identifier at import:

import _ "image/png" // this registers the PNG decoding upon import

This is known as a blank import.

Executables and the "main" Package

A binary executable is created when the Go build tool encounters a main package among the packages provided as arguments to go build or go install. In both cases, the build logic invokes the linker that assembles a self-contained executable from the main package and its dependencies.

The main() function must be declared in the main package. If the main() function is missing, the compiler indicates an error:

runtime.main_main·f: function main is undeclared in the main package

TO DEPLETE

The entry point in an executable program is provided by the main() function, which must be declared in the main package. The name of the package must be main, regardless of the name of the directory containing the source code. The "main" literal is a signal to go build or go install that it must invoke the linker to make an executable file. The linker will use the package as a start point for walking the package dependency graph when building the binary executable. As result, the main package produces the executable as result of its compilation. The name of the executable is given by default by the name of the first source code file from the list of source files that make the main package. This behavior can be changed with the -o go tool option.

The executable is placed in a location depending on the values of GOPATH and GOBIN environment variables. For more details see Publishing executable packages and Publishing executable modules.

The following code, declared in a blue.go source file, placed in any directory, not necessarily in a main directory, produces an executable named blue:

package main

import "fmt"

func main() {
  fmt.Printf("hello\n")
}
go build ./blue.go
./blue
hello

Also see:

main()

Package Initialization

Package initialization begins by initializing package-level variables in the order in which they are declared, resolving the dependencies first. If the package has multiple files, they are initialized in the order in which they are provided to the compiler. The go tool sorts them by name before invoking the compiler.

The initialization continues with the invocation of the init() functions, if they exist. Any file may contain any number of functions with the following signature:

func init() {
  ...
}

Such init() function cannot be called or referenced, but they are automatically executed in the order in which they are declared, when the package they're part of is imported. One package is initialized at a time, the dependencies are fully initialized before the dependents are initialized, so the init() functions in dependencies are executed before the init() functions declared in dependents. The main is the last to be initialized.

Single-Type Packages

Packages that expose one principal data type, plus its methods and often a New() function to create instances are referred to as single-type packages. The enclosed type is always qualified with the package name when used, so the package names should be short.

Internal (Private) Packages

TODO:

Go Internal Package.png

Hierarchical Packages

Hierarchical packages or nested packages.

The "pkg" Package

MiGo Chapter 2 Public packages

Vendoring

"Vendoring" is the act of making a local copy of a third party dependency package a project depends on. This copy is traditionally placed inside the project source tree and then saved in the project repository.

The associated directory structure is described here:

Workspace

The code below a directory named "vendor" is importable only by code in the directory tree rooted at the parent of "vendor", and only using an import path that omits the prefix up to and including the vendor element. Code in vendor directories deeper in the source tree shadows code in higher directories. Within the subtree rooted at foo, an import of crash/bang resolves to foo/vendor/crash/bang, not the top-level crash/bang. Code in vendor directories is not subject to import path checking. When go get checks out or updates a Git repository, it now also updates submodules. Vendor directories do not affect the placement of new repositories being checked out for the first time by go get. Those are always placed in the main GOPATH, never in a vendor subtree.

TODO