Python Language Modularization: Difference between revisions

From NovaOrdis Knowledge Base


Packages can be run as if they were scripts if the package provides the top-level script <code>__main__.py</code>. The file contains the code of the "main" module, which is executed when the package is run with <code>python -m</code>; it is not executed when the package is merely imported. As such, the <code>__main__.py</code> file is used to provide a command-line interface for a package.


<syntaxhighlight lang='py'>
import somepkg.some_module_1 as some_module_1
def main():
    print('.')
if __name__ == '__main__':
    # Execute when the module is not initialized from an import statement.
    main()
</syntaxhighlight>
<font color=darkkhaki>TO PROCESS: Idiomatic usage: https://docs.python.org/3/library/__main__.html#id1</font>


</syntaxhighlight>
</syntaxhighlight>
When you browse [[Python_Language#Python_Package_Index_PyPI|PyPI]], what you see are distribution packages. On a given package index, like PyPI, distribution package names must be unique.
==<span id='Distribution_Name'></span>Distribution Package Name==
Distribution packages can use hyphens <code>-</code> or underscores <code>_</code>. They can also contain dots <code>.</code>, which is sometimes used for packaging a subpackage of a namespace package. For most purposes, they are insensitive to case and to <code>-</code> vs. <code>_</code> differences, ex., <code>pip install Some_Package</code> is the same as <code>pip install some-package</code>. The precise rules are provided here: {{External|https://packaging.python.org/en/latest/specifications/name-normalization/#name-normalization}}
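The normalization rule can be sketched in a few lines of Python. This is a minimal illustration of the rule described in the linked specification (lowercase the name and collapse runs of <code>-</code>, <code>_</code> and <code>.</code> into a single <code>-</code>); the function name <code>normalize</code> is chosen here for illustration only.

```python
import re


def normalize(name: str) -> str:
    """Lowercase the name and collapse runs of '-', '_' and '.' into one '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


# Both spellings from the text above normalize to the same distribution name.
print(normalize("Some_Package"))
print(normalize("some-package"))
```

Two names that normalize to the same string are treated as the same distribution package by the index.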
The distribution name of the package is specified as <code>[project].name</code> in the <code>[[Pyproject.toml#name|pyproject.toml]]</code> file.
===Relationship between the Distribution Package and the corresponding Import Package===
Most of the time, a distribution package provides one single import package, with the same name as the '''distribution name of the package''', or several import packages, though this is less common. For example, <code>pip install somepkg</code> lets you <code>import somepkg</code>. However, this is only a convention. PyPI and other package indices do not enforce any relationship between the distribution name of the package and the import packages it provides. A distribution package could provide an import package with a different name.
 
On a given package index, like PyPI, the distribution name of the package must be unique. On the other hand, import packages have no such requirement. Import packages with the same name can be provided by several distribution packages.
 
==Distribution Package Version==
 
The version of the distribution package is specified as <code>[project].version</code> in the <code>[[Pyproject.toml#version|pyproject.toml]]</code> file.


==Built Distribution==
A distribution format containing files and metadata that only need to be moved to the correct location on the target system, to be installed. [[#Wheel|Wheel]] is such a format. This format does not imply that Python files have to be precompiled. [[#Wheel|Wheel]] intentionally does not include compiled Python files.
==Source Distribution ("sdist")==
A distribution format, usually generated with <code>python -m build --sdist</code>, that provides the metadata and essential source files needed for installing by a tool like <code>pip</code> or for generating a [[#Built_Distribution|built distribution]].
==Build Frontend==
A tool that users might run that takes arbitrary source trees or source distributions and builds source distributions or [[#Wheel|Wheels]] from them. Examples of build frontends are <code>pip</code> and <code>build</code>. The actual build is delegated to a [[#Build_Backend|build backend]]. It is the <code>[[Pyproject.toml#Build_Backend_Configuration|pyproject.toml]]</code> file that tells the frontend tool which backend to use.
 
==Build Backend==
{{External|https://packaging.python.org/en/latest/tutorials/packaging-projects/#choosing-a-build-backend}}
A '''build backend''' is a library that takes a source tree or a source distribution and builds a source distribution or a [[#Wheel|Wheel]] from it. The build is delegated to the backend by a [[#Build_Frontend|frontend]]. The build backend determines how the project will specify its configuration, including its [[Pyproject.toml#Project_Metadata|metadata]]. All backends offer a standardized interface.  
 
Examples of build backends:
* [[Hatch#Hatchling|Hatchling]]
* [https://packaging.python.org/en/latest/key_projects/#flit flit-core]
* [https://packaging.python.org/en/latest/key_projects/#maturin Maturin]
* [https://packaging.python.org/en/latest/key_projects/#meson-python meson-python]
<Font color=darkkhaki>TODO: https://packaging.python.org/en/latest/discussions/package-formats/#package-formats</font>
===<span id='wheel'></span>Wheel===  
The standard [[#Built_Distribution|Built Distribution]] format. Also see: {{Internal|Publishing_a_Python_Distribution_Package_in_a_Repository#Built_Distribution|Publishing a Python Distribution Package in a Repository &#124; Built Distribution}}
 
===Egg===
{{External|https://packaging.python.org/en/latest/discussions/package-formats/#egg-format}}

Latest revision as of 21:28, 19 September 2024

External

Internal

Overview

Python code can be organized as standalone programs, modules and packages. When a module or a package is published, people refer to it as a library. In this context, the term library is simply a generic term for a body of code that was designed to be reused by many applications. It provides some generic functionality, usually in the form of functions and classes, that can be used by specific applications.

Modular programming refers to the process of breaking a large, unwieldy body of code into separate, smaller, more manageable modules. Individual modules can then be combined to create a larger application. More details about modular programming are available in the Designing Modular Systems article.

Standalone Program

A standalone program consists of one or more files of code that is read by the Python interpreter and executed. A typical way to interact with a Python program is command line arguments. For more details on handling command line arguments, see:

Command Line Argument Processing in Python

Python Script

A script is a module whose aim is to be executed. It has the same meaning as "program", "standalone program", or "application", but it is usually used to describe a simple, small program. It contains a stored set of instructions that can be handed over to the Python interpreter:

python3 ./my-script.py

Python scripts have .py extensions.

The python code can be specified in-line with a here-doc:

python3 <<EOF
print('hello')
print('hello2')
EOF

A stable and flexible way of executing Python programs in command line is invoking them from a thin bash wrapper that prepares the environment variables and other settings. This approach is described here:

Running a Python Program with a Bash Wrapper

Modules

In Python, a module is simply a file with the .py extension containing Python code. The nomenclature is the reverse of Go's, where the simplest modularization unit is a Go package.

There are three different ways to define a module in Python:

  • A module can be written in Python itself, with its code contained in one single code file.
  • A module can be written in C and loaded dynamically at runtime. This is the case of the regular expression re module.
  • A module can be intrinsically contained by the interpreter. This is called a built-in module.

The modules written in Python are the most common, and they are the type of modules Python developers usually write. They are exceedingly straightforward to build: just place Python code in a file with the .py extension. Modules are simple Python files: a module consists of just one file. The module can be imported inside another Python program or executed on its own. The module can define object instances such as strings or lists, which are assigned to variables, and also functions and classes. If intended to run on its own, the module will also include runnable code.

The file name is the name of the module with the suffix .py appended. Modules are loaded into Python by the process of importing, where the names of the objects present in one module are made available in the namespace of the calling layer.

Each module has its own global namespace, where all objects defined inside the module (strings, lists, functions, classes, etc.) are identified by unique names.

A module may exist as part of a package, and a package may contain many modules.

Module Name

A module name is the name of the file the module resides in, with the extension .py stripped.

Naming conventions are documented here

PEP 8 – Style Guide for Python Code, Package and Module Names

Modules should have short, all-lowercase names. Underscores can be used in the module name if it improves readability. The name of the module cannot contain dashes ('-'). The name of the directory the module is stored in can contain dashes.

Module Docstring

Contains documentation for the module. Added on the first line:

"""
This module does this and that.
"""

Built-in Modules

The built-in modules are contained by the Python interpreter.

Module Internal Representation and Introspection

The modules, once loaded, can be introspected for their attributes, as instances of the module class. For more details, see:

Module Internal Representation and Introspection

Importing

Importing means making the names defined in a module or a package accessible in the namespace associated with the code that invokes the import statement. The import operation binds the imported module's name into the namespace associated with the calling layer. All module-level code is executed the first time the module is imported; subsequent imports reuse the already-loaded module. In the case of functions, the function object is created, but its code is not executed until the function is called.
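A minimal sketch of this behavior, assuming a throwaway module written to a temporary directory (the module name demo_mod_exec and its contents are hypothetical, used only for this demonstration):

```python
import sys
import tempfile
from pathlib import Path

# Write a hypothetical module to a temporary directory and make it importable.
tmp_dir = tempfile.mkdtemp()
Path(tmp_dir, "demo_mod_exec.py").write_text(
    "events = ['module-level code ran']\n"
    "def greet():\n"
    "    events.append('greet() ran')\n"
)
sys.path.insert(0, tmp_dir)

import demo_mod_exec  # module-level statements run here, exactly once

events_after_import = list(demo_mod_exec.events)
demo_mod_exec.greet()  # the function body runs only now
events_after_call = list(demo_mod_exec.events)

print(events_after_import)
print(events_after_call)
```

The first list contains only the entry recorded by the module-level statement; the entry recorded by greet() appears only after the call.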

Importing a Module

The import statement, usually listed at the top of the file, followed by the name of the module, binds the name of the module into the caller's symbol table. This way, the name of the imported module becomes available in the caller's namespace. The name of the module is the name of the file containing the module code, without the .py extension.

import mymodule

A module is importable if the file corresponding to the module being imported is accessible to the Python interpreter. For more details on how the Python interpreter finds modules see Locating Module Files - Module Search Path.

The import mymodule statement does nothing except binding the specified module name into the current namespace, thus potentially enabling access to the imported module global namespace. The objects that are defined in the imported module remain in the module’s private symbol table.

print(globals())

[...]

'mymodule': <module 'mymodule' from '/Users/ovidiu/playground/pyhton/modules/mymodule.py'>

For more details on accessing a module global namespace see globals().

However, binding the module name into the current namespace allows the Python code from the current scope to use the module name to access objects contained by the module. Assuming that mymodule declares a function some_func, the function can be invoked by prefixing its name with the name of the module, using the dot notation. This is called qualifying the internal names of a module with the module's name. The function will be looked up in the mymodule's global namespace, which is made accessible to the calling layer as mymodule. The same applies to other objects declared by the module, such as variables, classes, etc.:

import mymodule

mymodule.some_func()

Once imported, the file associated with the module can be determined using the module object's __file__ attribute:

Module Internal Representation | __file__

Multiple comma-separated module names can be specified in the same import statement, but various static analysis programs flag this as a style violation:

import mymodule, mymodule2 # The style checker flags this as a style violation

In most cases, we use the absolute import syntax, where we specify the complete path to the module, function or class we want to import. The absolute import statement uses the period operator to separate the names of the packages or modules. The alternative is to use relative import syntax.

Importing a Module from a Function or Class

A module can be imported in the global namespace of another module, or inside a function or a class. You should consider importing in the global namespace if the imported code might be used in more than one place, and inside of the function or class if you know its use will be limited. Putting all imports at the top of the file makes all dependencies of the importing module code explicit.

While importing into a function's namespace, all that the import statement does is to bind the specified module's namespace into the local namespace of the function. The imported module's objects must be qualified with the name of the module to be accessed. Note that the import does not occur until the function is called:

def some_func():
  import mymodule
  [...]
  mymodule.my_func()

Python 3 does not allow indiscriminate import * from within a function.

Importing a Module with Another Name

The objects of an imported module are qualified with the name of the module to be used. The prefix can be changed, usually to shorten it, by using the as reserved word in the import statement. This syntax effectively renames the module being imported in the namespace. The same technique is useful if there are two modules with the same name.

import mymodule as m

[...]

m.my_func()
print(locals())

[...]
'm': <module 'mymodule' from '/Users/ovidiu/playground/pyhton/modules/mymodule.py'>

For more details on accessing a function's local namespace see locals().

Programmatic Import using the Module Name as String

So far, we used the import statement to import modules by their name, which is provided as part of the statement. Modules can also be imported dynamically in the program by using the module name as string, or the file the module exists in:

__import__()

__import__() is a built-in function. Because this function is meant for use by the Python interpreter and not for general use, it is better to use importlib.import_module() to programmatically import a module.

__import__('some_package.some_subpackage.some_module')

importlib.import_module()

Recommended idiom to import modules programmatically:

import importlib
import types

module = importlib.import_module('some_name')
assert isinstance(module, types.ModuleType)

When the module is loaded as part of a package, use:

module_name = "..."
module = importlib.import_module(f'.{module_name}', package.__name__)
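Both forms can be tried against the standard library. The sketch below uses the json package purely as a convenient, always-available example; any importable package would do.

```python
import importlib
import json
import types

# Absolute form: the dotted name is given in full.
decoder = importlib.import_module("json.decoder")
assert isinstance(decoder, types.ModuleType)

# Relative form: the leading dot is resolved against the anchor package name.
same_decoder = importlib.import_module(".decoder", json.__name__)

# Both calls return the same cached module object from sys.modules.
print(same_decoder is decoder)
```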

imp.load_source()

import imp
module = imp.load_source('some_package.some_subpackage.some_module', '.../some_package/some_subpackage/some_module.py')
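Note that the imp module has been deprecated since Python 3.4 and was removed in Python 3.12. On current interpreters, loading a module from an explicit file path can be sketched with importlib.util instead. The helper name load_source and the temporary file below are assumptions made for this illustration, not part of any library API.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path


def load_source(module_name, file_path):
    """Load a module from an explicit file path (hypothetical helper)."""
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    module = importlib.util.module_from_spec(spec)
    # Register the module before executing it, as the import system does.
    sys.modules[module_name] = module
    spec.loader.exec_module(module)
    return module


# Hypothetical module file used only for this demonstration.
path = Path(tempfile.mkdtemp(), "some_module.py")
path.write_text("ANSWER = 42\n")

mod = load_source("some_module_from_path", str(path))
print(mod.ANSWER)
```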

Locating Module Files - Module Search Path

The runtime looks at a list of directory names and ZIP files stored in the standard sys module as the variable path (sys.path). The initial value of sys.path is assembled from the following sources:

  • The directory containing the script being run. This is why an import works when the module file being imported is in the same directory as the script.
  • The current directory if the interpreter is run interactively.
  • The list of directories contained in the PYTHONPATH environment variable, if set.
  • An installation-dependent list of directories configured at the time Python is installed.

sys.path value can be accessed and modified:

import sys
for i in sys.path:
  print(i)

 /opt/brew/Cellar/python@3.9/3.9.9/Frameworks/Python.framework/Versions/3.9/lib/python39.zip
 /opt/brew/Cellar/python@3.9/3.9.9/Frameworks/Python.framework/Versions/3.9/lib/python3.9
 /opt/brew/Cellar/python@3.9/3.9.9/Frameworks/Python.framework/Versions/3.9/lib/python3.9/lib-dynload
 /Users/ovidiu/my-project/venv/lib/python3.9/site-packages

The initial blank output line is the empty string '', which stands for the current directory. The first match will be used. If a module with the same name as a module from standard library is encountered in the search path before the standard library, it will be used instead of the module coming from the standard library.

sys.path value can be modified as follows:

Update PYTHONPATH

Set the environment variable PYTHONPATH to a list of directories to search for imported modules, separated by colons on Unix-like systems (semicolons on Windows). Directories declared in PYTHONPATH are inserted into the sys.path list when Python starts up, after the script directory and ahead of the installation-dependent defaults.

Programmatically modify sys.path

Appending an absolute path works:

import sys
sys.path.append('/path/to/search')

Appending a relative path does not work, unless the script is executed from the directory the path is relative to.

import sys
sys.path.append('./my-module') # This does not work unless the Python script is executed from "my-module"'s parent.

To use a relative directory and make append() insensitive to the location the program is run from, use this pattern:

import os
import sys
sys.path.append(os.path.dirname(__file__) + "/my-module")
import my_module

PyCharm trick: If the module name and the parent directory have the same name, PyCharm will stop issuing static analysis error "No module named ..."

Programmatically modify sys.path with site.addsitedir()

site.addsitedir can be used to add a directory to sys.path. The difference between this and just plain appending is that when you use addsitedir, it also looks for .pth files within that directory and uses them to possibly add additional directories to sys.path based on the contents of the files.
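A minimal sketch of the difference, assuming a temporary directory that contains a .pth file naming a sibling subdirectory (the names extra.pth and vendored are hypothetical):

```python
import os
import site
import sys
import tempfile

# Hypothetical layout: site_dir/extra.pth names the subdirectory site_dir/vendored.
site_dir = tempfile.mkdtemp()
vendored = os.path.join(site_dir, "vendored")
os.mkdir(vendored)
with open(os.path.join(site_dir, "extra.pth"), "w") as f:
    # One path per line, relative to the directory that holds the .pth file.
    f.write("vendored\n")

site.addsitedir(site_dir)

# Both the directory itself and the directory listed in the .pth file were added.
print(site_dir in sys.path)
print(vendored in sys.path)
```

A plain sys.path.append(site_dir) would have added only the first of the two directories.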

Importing Specific Objects from a Module

An alternate import statement syntax allows binding of specific objects into the caller's symbol table. When the from/import combination is used, the specified imported module objects are inserted into the caller's symbol table and are available in the caller's namespace under their original name. No qualification with dot notation is needed to access them.

from mymodule import my_func, my_func_2
...
# invoke the function directly, without prefixing it with the name of the module
my_func()

The objects can keep their original names, as shown above, or they can be aliased, which means a custom name is inserted in the caller's symbol table.

from mymodule import my_func as m_f
...
m_f()

⚠️ Because this form of import places the object names directly into the caller’s symbol table, any objects that already exist with the same name will be overwritten.
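The overwrite is easy to reproduce. In this sketch, a local function that happens to be called sqrt (a hypothetical helper) is silently replaced by the one imported from math:

```python
def sqrt(x):
    """A local helper that happens to share its name with math.sqrt."""
    return "local sqrt"


result_before = sqrt(4)  # calls the local helper

from math import sqrt  # rebinds the name sqrt in this namespace

result_after = sqrt(4)  # now calls math.sqrt

print(result_before)
print(result_after)
```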

Importing * from a Module

from mymodule import *

This syntax places the names of all objects from the imported module into the local symbol table, with the exception of those whose name begins with an underscore (_). This technique is not recommended in large-scale production code. Unless you are confident there won’t be a conflict, there is a chance of overwriting an existing name inadvertently, so it should be used with caution. Doing this will unnecessarily clutter the namespace. Not doing it makes the code easier to read: when we explicitly import a class or a function with from x import y syntax, we can easily see where y comes from. However, if we use from x import * syntax, it takes a lot longer to find where y is located. In addition, most code editors are able to provide code completion, navigation to the definition of a class, or inline documentation if normal imports are used. The import * syntax usually removes these capabilities. Finally, import * syntax can bring unexpected objects into the target namespace, because it will import any classes or modules that were themselves imported in the file being imported.
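The underscore rule and the "unexpected objects" effect can both be observed with a throwaway module. The module name demo_star_mod and its contents are hypothetical, and the star import is run with exec into a dictionary only so that the resulting namespace can be inspected.

```python
import sys
import tempfile
from pathlib import Path

# Hypothetical module: one public name, one private name, and one import of its own.
tmp_dir = tempfile.mkdtemp()
Path(tmp_dir, "demo_star_mod.py").write_text(
    "import os\n"
    "public_value = 1\n"
    "_private_value = 2\n"
)
sys.path.insert(0, tmp_dir)

# Run the star import into a separate namespace dictionary.
namespace = {}
exec("from demo_star_mod import *", namespace)

print("public_value" in namespace)    # the public name is imported
print("_private_value" in namespace)  # the underscore-prefixed name is skipped
print("os" in namespace)              # a module imported by demo_star_mod comes along
```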

Explicit is better than implicit.

Recommended Style

https://google.github.io/styleguide/pyguide.html#22-imports

Use import statements for packages and modules only, not for individual classes or functions.

  • Use import x for importing packages and modules.
  • Use from x import y where x is the package prefix and y is the module name with no prefix.
  • Use from x import y as z if two modules named y are to be imported, if y conflicts with a top-level name defined in the current module, or if y is an inconveniently long name.
  • Use import y as z only when z is a standard abbreviation (e.g., np for numpy).
  • TO PROCESS: https://google.github.io/styleguide/pyguide.html#2241-exemptions

Handling Unsuccessful Imports

An unsuccessful import attempt, caused by the unavailability of the imported module, can be caught in a try block:

try:
  import inexistent_module
except ImportError:
  print("module not found!")

The unavailability of specific objects within an existent module can be detected the same way:

try:
  from mymodule import inexistent_object
except ImportError:
  print("object not found!")
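A common use of this pattern is falling back to an alternative when an optional dependency is missing. The sketch below assumes a module named hypothetical_fast_json that is not installed, so the standard json module is used instead.

```python
try:
    # Hypothetical optional dependency; assumed not to be installed.
    import hypothetical_fast_json as json_impl
except ImportError:
    # Fall back to the standard library implementation.
    import json as json_impl

print(json_impl.__name__)

# ModuleNotFoundError, raised when a module cannot be located, is a subclass
# of ImportError, so "except ImportError" covers both cases shown above.
print(issubclass(ModuleNotFoundError, ImportError))
```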

Reloading a Module

TO PROCESS: https://realpython.com/python-modules-packages/#reloading-a-module
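Until this section is written, here is a minimal sketch of reloading with importlib.reload(), assuming a throwaway module whose source file is edited between the import and the reload (the module name demo_reload_mod is hypothetical):

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Keep the demonstration independent of cached bytecode files.
sys.dont_write_bytecode = True

tmp_dir = tempfile.mkdtemp()
source = Path(tmp_dir, "demo_reload_mod.py")
source.write_text("VALUE = 1\n")
sys.path.insert(0, tmp_dir)

import demo_reload_mod

first = demo_reload_mod.VALUE

# Edit the file on disk, then re-execute it in the existing module object.
source.write_text("VALUE = 'changed'\n")
importlib.reload(demo_reload_mod)
second = demo_reload_mod.VALUE

print(first, second)
```

A plain second import statement would not pick up the change, because the module is already cached in sys.modules.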

Relative Imports

Relative import syntax is an alternative to absolute import. This syntax is useful when working with related modules inside a package that are stored in a known position relative to each other. A relative import is a way of saying: find a class, function or module as it is positioned relative to the current module.

In this situation:

 some_package
    ├─ module_1.py
    └─ module_2.py

where module_1 declares func_1(), func_1() can be imported in module_2 with a relative import:

from .module_1 import func_1

def func_2():
  func_1()

The period in front of "module_1" says to use the "module_1" module inside the current package. In this situation:

 some_package
    ├─ module_1.py
    └─ some_subpackage
             └─ module_2.py

in module_2 we can use:

from ..module_1 import func_1
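Both forms can be exercised end to end by generating a small package on disk. The package name demo_rel_pkg and the module module_3 are hypothetical; module_3 plays the role of the module nested in the subpackage.

```python
import sys
import tempfile
from pathlib import Path

# Build a hypothetical package with one subpackage in a temporary directory.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_rel_pkg"
sub = pkg / "some_subpackage"
sub.mkdir(parents=True)
(pkg / "__init__.py").write_text("")
(sub / "__init__.py").write_text("")
(pkg / "module_1.py").write_text("def func_1():\n    return 'func_1 called'\n")
# Sibling module: a single dot means "the current package".
(pkg / "module_2.py").write_text(
    "from .module_1 import func_1\n"
    "def func_2():\n"
    "    return func_1()\n"
)
# Module in the subpackage: two dots mean "the parent package".
(sub / "module_3.py").write_text(
    "from ..module_1 import func_1\n"
    "def func_3():\n"
    "    return func_1()\n"
)
sys.path.insert(0, str(root))

from demo_rel_pkg import module_2
from demo_rel_pkg.some_subpackage import module_3

print(module_2.func_2())
print(module_3.func_3())
```

Relative imports are resolved against the package of the importing module, so they work only when the module is imported as part of its package, not when the file is run directly as a script.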

Distinguishing between Importing and Executing a Module

https://docs.python.org/3/library/__main__.html

Note that if you want to maintain a module that can be imported and executed as a script, the module must not contain executable code outside functions and classes. If it does, that code will be executed when the module is imported, which is generally not what you want. To distinguish between the case when the file is loaded as a module and it is executed as a script, Python sets the __name__ variable to different values. When the module is imported, __name__ is set to the module name. When the module is executed as a script, the variable value is set to "__main__".

As such, this fact can be used to wrap the executable code in a function that is only executed when the module is executed as a script:

...

def main():
 # this code is executed when the module is executed as script
 print("...")

if __name__ == '__main__':
  main()

Also see:

__name__

How to Execute (Run) a Module?

Document the -m flag.

python -m <module-name> ....

Package (Import Package)

A package is Python code stored into multiple files, organized in a file hierarchy. The Python Packaging User Guide calls it an import package, to differentiate it from a distribution package, explained below. The canonical form of an import package is a directory containing modules, or recursively other packages and optionally an __init__.py file.

Since we cannot put modules inside modules, because a module is just a file and a file cannot contain other files, Python offers the package mechanism. A package is a collection of modules, and optionally subpackages, recursively, in a folder.

The name of the package is the name of the folder.

A package may contain multiple modules, each stored in its own file, either in the package root directory or recursively in subdirectories. The package root directory may also contain two optional files named __init__.py and __main__.py. Packages allow for a hierarchical structuring of the module namespace using dot notation. In the same way that modules avoid collisions between global variable names, packages avoid collision between module names. For example, the urllib package contains several modules: urllib.request, urllib.error, etc. A package also allows for subpackages.

some_dir
 └─ some_package_1
     ├─ __init__.py  # Optional
     ├─ __main__.py  # Optional
     ├─ some_module_1.py # Defines some_func_1()
     ├─ some_module_2.py # Defines some_func_2()
     └─ dir_1
         ├─ some_module_3.py # Defines some_func_3()
         └─ dir_2
             └─ some_module_4.py # Defines some_func_4()

Internally, a package is represented as an instance of the module class. The difference between a package instance and a simple module instance is that the package instance has an extra __path__ attribute. For more details on the internal representation of a package, see:

Module Internal Representation and Introspection
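The __path__ difference can be checked on any installed package. The sketch below uses json from the standard library, which is a package in CPython, and json.decoder, which is a plain module inside it. The choice of json is only for illustration.

```python
import types

import json          # a package from the standard library
import json.decoder  # a plain module inside that package

# Both objects are instances of the same module class.
both_are_modules = (
    isinstance(json, types.ModuleType)
    and isinstance(json.decoder, types.ModuleType)
)

# Only the package carries the extra __path__ attribute.
package_has_path = hasattr(json, "__path__")
module_has_path = hasattr(json.decoder, "__path__")

print(both_are_modules, package_has_path, module_has_path)
```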

Importing a Package

To import the modules of the package represented above, ensure that the directory some_dir, the parent of some_package_1, is in the module search path and use the following import statements, where a module is identified using dot notation relative to its package name:

import some_package_1.some_module_1
import some_package_1.some_module_2
import some_package_1.dir_1.some_module_3
import some_package_1.dir_1.dir_2.some_module_4

some_package_1.some_module_1.some_func_1()
some_package_1.some_module_2.some_func_2()
some_package_1.dir_1.some_module_3.some_func_3()
some_package_1.dir_1.dir_2.some_module_4.some_func_4()

A slightly more compact version is:

from some_package_1 import some_module_1
from some_package_1 import some_module_2
from some_package_1.dir_1 import some_module_3
from some_package_1.dir_1.dir_2 import some_module_4

some_module_1.some_func_1()
some_module_2.some_func_2()
some_module_3.some_func_3()
some_module_4.some_func_4()

Importing the package itself is syntactically correct, but unless there is an __init__.py, the import does not do anything useful. In particular, it does not place any of the component module names in the package in the local namespace:

import some_package_1
print(str(some_package_1))
<module 'some_package_1' (namespace)>
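The behavior can be reproduced with a throwaway copy of the layout. This is a sketch under stated assumptions: it recreates only some_package_1 and some_module_1.py in a temporary directory, without an __init__.py, and the return value of some_func_1() is made up for the example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as some_dir:
    pkg_dir = Path(some_dir, "some_package_1")
    pkg_dir.mkdir()
    # No __init__.py is created, so this is a namespace package.
    (pkg_dir / "some_module_1.py").write_text(
        "def some_func_1():\n    return 'some_func_1 called'\n"
    )

    sys.path.insert(0, some_dir)
    importlib.invalidate_caches()
    try:
        package = importlib.import_module("some_package_1")
        # Importing the package alone does not load its modules.
        before = hasattr(package, "some_module_1")

        # Importing the module explicitly binds it as a package attribute.
        importlib.import_module("some_package_1.some_module_1")
        after = hasattr(package, "some_module_1")
        result = package.some_module_1.some_func_1()
    finally:
        sys.path.remove(some_dir)
        for name in ("some_package_1.some_module_1", "some_package_1"):
            sys.modules.pop(name, None)

print(before, after, result)
```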

Package Name

See module name above.

Package Internal Representation and Introspection

A package, once loaded, is represented internally as an instance of the module class. For more details, see:

Module Internal Representation and Introspection

__init__.py

When a regular package is imported, this __init__.py file is implicitly executed, and the objects it defines are bound to names in the package’s namespace. The __init__.py file can contain the same Python code that any other module can contain, like variables, function and class declarations, and Python will add some additional attributes to the module when it is imported.

Much of the Python documentation states that the __init__.py file must be present in the package directory, even if as an empty file, for the package to be valid. This was once true. Python 3.3 introduced PEP 420 implicit namespace packages, which allow for the creation of a package without any __init__.py file.

Assuming that __init__.py is declared in some_package_1, as shown above, and has the following content:

# this is __init__.py
COLOR = 'blue'

then importing the package itself binds the COLOR in the package's namespace, making it accessible to the client program importing the package:

import some_package_1
assert 'blue' == some_package_1.COLOR

A module in the package can access the global variable by importing it in turn. In some_module_1.py:

from some_package_1 import COLOR

[...]

def print_color():
    print(f'color is {COLOR}')

__init__.py can also be used to automatically import the modules from the package, so the clients of the package won't have to import them individually, and the package's modules will be bound in the package namespace.

# this is __init__.py
import some_package_1.some_module_1
import some_package_1.some_module_2
import some_package_1.dir_1.some_module_3
import some_package_1.dir_1.dir_2.some_module_4

For a client of the package:

import some_package_1
some_package_1.some_module_1.some_func_1()
some_package_1.some_module_2.some_func_2()
some_package_1.dir_1.some_module_3.some_func_3()
some_package_1.dir_1.dir_2.some_module_4.some_func_4()
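Both uses of __init__.py, defining a package-level variable and importing the package's modules, can be combined in one runnable sketch. The names demo_package, demo_module and describe_color() are hypothetical and exist only in a temporary directory created by the example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as some_dir:
    pkg_dir = Path(some_dir, "demo_package")
    pkg_dir.mkdir()
    # COLOR is defined before the import, because demo_module reads it
    # while the package is still being initialized.
    (pkg_dir / "__init__.py").write_text(
        "COLOR = 'blue'\n"
        "import demo_package.demo_module\n"
    )
    (pkg_dir / "demo_module.py").write_text(
        "from demo_package import COLOR\n"
        "\n"
        "def describe_color():\n"
        "    return f'color is {COLOR}'\n"
    )

    sys.path.insert(0, some_dir)
    importlib.invalidate_caches()
    try:
        # The client imports only the package itself.
        package = importlib.import_module("demo_package")
        color = package.COLOR
        description = package.demo_module.describe_color()
    finally:
        sys.path.remove(some_dir)
        for name in ("demo_package.demo_module", "demo_package"):
            sys.modules.pop(name, None)

print(color, "|", description)
```

The order of statements in __init__.py matters here. If the import came before the COLOR assignment, demo_module would fail to import COLOR from the partially initialized package.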

It is not recommended to put much code in the __init__.py file. Programmers do not expect actual logic to happen in this file.

__main__.py

https://docs.python.org/3/library/__main__.html

Packages can be run as if they were scripts if the package provides the top-level script __main__.py. The file contains the code of the "main" module, which is executed when the package is run with python -m <package-name>. It is not executed when the package is merely imported. As such, the __main__.py file is used to provide a command-line interface for a package.

import somepkg.some_module_1 as some_module_1

def main():
    print('.')

if __name__ == '__main__':
    # Execute when the module is not initialized from an import statement.
    main()
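The difference between importing a package and running it can be sketched in-process with runpy.run_module(), the standard library function behind the -m switch. The package name demo_cli_package and its content are hypothetical and are created in a temporary directory for the example.

```python
import contextlib
import importlib
import io
import runpy
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as some_dir:
    pkg_dir = Path(some_dir, "demo_cli_package")
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text("")
    (pkg_dir / "__main__.py").write_text(
        "def main():\n"
        "    print('running as a script')\n"
        "\n"
        "if __name__ == '__main__':\n"
        "    main()\n"
    )

    sys.path.insert(0, some_dir)
    importlib.invalidate_caches()
    try:
        # Importing the package executes __init__.py only.
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            importlib.import_module("demo_cli_package")
        import_output = buffer.getvalue()

        # Running the package executes its __main__.py as "__main__".
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            runpy.run_module("demo_cli_package", run_name="__main__")
        run_output = buffer.getvalue()
    finally:
        sys.path.remove(some_dir)
        sys.modules.pop("demo_cli_package", None)

print(repr(import_output), repr(run_output))
```

From a shell, the equivalent of the second step would be python -m demo_cli_package, run with the parent directory on the module search path.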

TO PROCESS: Idiomatic usage: https://docs.python.org/3/library/__main__.html#id1

Namespace Package

A PEP 420 package which serves only as a container for subpackages. Namespace packages may have no physical representation, and have no __init__.py file.

Subpackages

https://realpython.com/python-modules-packages/#subpackages

A subpackage is a folder containing modules and optionally other subpackages, stored in a package.

Importing * from a Package

https://realpython.com/python-modules-packages/#importing-from-a-package

When from <package_name> import * is encountered, Python follows this convention: if the __init__.py file in the package directory contains a list named __all__, it is taken to be a list of modules that should be imported.
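A small sketch of the __all__ convention follows. The package demo_star_package and its modules mod_a and mod_b are hypothetical. The star import is executed with exec() into a dictionary so that the set of imported names can be inspected.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as some_dir:
    pkg_dir = Path(some_dir, "demo_star_package")
    pkg_dir.mkdir()
    # Only mod_a is listed in __all__; mod_b is deliberately left out.
    (pkg_dir / "__init__.py").write_text("__all__ = ['mod_a']\n")
    (pkg_dir / "mod_a.py").write_text("VALUE = 'a'\n")
    (pkg_dir / "mod_b.py").write_text("VALUE = 'b'\n")

    sys.path.insert(0, some_dir)
    importlib.invalidate_caches()
    try:
        # Run the star import in an isolated namespace.
        namespace = {}
        exec("from demo_star_package import *", namespace)
    finally:
        sys.path.remove(some_dir)
        for name in ("demo_star_package.mod_a", "demo_star_package"):
            sys.modules.pop(name, None)

# Ignore names such as __builtins__ that exec() adds on its own.
imported_names = sorted(n for n in namespace if not n.startswith("__"))
print(imported_names)
```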

Package Metadata

Name: pulumi
Version: 2.11.2
Summary: Pulumi's Python SDK
Home-page: https://github.com/pulumi/pulumi
Author:
Author-email:
License: Apache 2.0
Location: /Users/ovidiu/Library/Python/3.8/lib/python/site-packages
Requires: dill, grpcio, protobuf
Required-by: pulumi-aws, pulumi-kubernetes, pulumi-random, pulumi-tls
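Some of the fields in a listing of this shape can also be read from Python with importlib.metadata, available in the standard library since Python 3.8. The sketch below is an approximation only: summarize() is a hypothetical helper, it reports whatever distributions happen to be installed in the current environment, and it does not compute the Required-by field.

```python
from importlib import metadata


def summarize(dist):
    """Collect a few core metadata fields of an installed distribution."""
    return {
        "Name": dist.metadata.get("Name"),
        "Version": dist.metadata.get("Version"),
        "Summary": dist.metadata.get("Summary"),
        # requires is None when no dependencies are declared.
        "Requires": dist.requires or [],
    }


# Summarize every distribution visible to the running interpreter.
summaries = [summarize(dist) for dist in metadata.distributions()]

for summary in summaries[:3]:
    print(summary)
```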

Requires

Required-by

Package Example

A package example is available here:

https://github.com/ovidiuf/playground/tree/master/pyhton/packages/some_package_1

A program that consumes it is available here:

https://github.com/ovidiuf/playground/tree/master/pyhton/packages/consumer-of-packages

Distribution Package

https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/#distribution-package-vs-import-package
https://packaging.python.org/en/latest/glossary/#term-Distribution-Package

A distribution package is a project with a setup.py that uses setuptools, distutils or wheel to generate a distributable egg or wheel. In other words, a distribution package is a "built" distributable artifact, a piece of software you can install. In most cases it is synonymous with "project", as mentioned above. A published distribution package can be installed with:

pip install somepkg

The name of a published distribution package, in this case somepkg, can be declared in the project's requirements.txt:

somepkg==0.1.0

When you browse PyPI, what you see are distribution packages. On a given package index, like PyPI, distribution package names must be unique.

Distribution Package Name

Distribution package names can use hyphens - or underscores _. They can also contain dots ., which are sometimes used for packaging a subpackage of a namespace package. For most purposes, names are insensitive to case and to - vs. _ differences, e.g., pip install Some_Package is the same as pip install some-package. The precise rules are provided here:

https://packaging.python.org/en/latest/specifications/name-normalization/#name-normalization
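The normalization rule in that specification is short enough to sketch: runs of -, _ and . are replaced with a single - and the result is lowercased. The function name normalize() below is a choice made for this example.

```python
import re


def normalize(name: str) -> str:
    """Normalize a distribution name: collapse -, _ and . runs, lowercase."""
    return re.sub(r"[-_.]+", "-", name).lower()


# All three spellings refer to the same distribution on a package index.
print(normalize("Some_Package"))  # some-package
print(normalize("some-package"))  # some-package
print(normalize("some.package"))  # some-package
```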

The distribution name of the package is specified as [project].name in the pyproject.toml file.

Relationship between the Distribution Package and the corresponding Import Package

Most of the time, a distribution package provides a single import package with the same name as the distribution. Less commonly, it provides several import packages. For example, pip install somepkg lets you import somepkg. However, this is only a convention. PyPI and other package indices do not enforce any relationship between the distribution name of the package and the import packages it provides. A distribution package could provide an import package with a different name.

On a given package index, like PyPI, the distribution name of the package must be unique. On the other hand, import packages have no such requirement. Import packages with the same name can be provided by several distribution packages.

Distribution Package Version

The version of the distribution package is specified as [project].version in the pyproject.toml file.

Built Distribution

A distribution format containing files and metadata that only need to be moved to the correct location on the target system, to be installed. Wheel is such a format. This format does not imply that Python files have to be precompiled. Wheel intentionally does not include compiled Python files.

Source Distribution ("sdist")

A distribution format, usually generated with python -m build --sdist, that provides the metadata and essential source files needed for installing by a tool like pip or for generating a built distribution.

Build Frontend

A tool that users might run, which takes arbitrary source trees or source distributions and builds source distributions or wheels from them. Examples of build frontends are pip and build. The actual build is delegated to a build backend. It is the pyproject.toml file that tells the frontend tool which backend to use.

Build Backend

https://packaging.python.org/en/latest/tutorials/packaging-projects/#choosing-a-build-backend

A build backend is a library that takes a source tree or a source distribution and builds a source distribution or a Wheel from it. The build is delegated to the backend by a frontend. The build backend determines how the project will specify its configuration, including its metadata. All backends offer a standardized interface.

Example of build backends:

Distribution Package Formats

TODO: https://packaging.python.org/en/latest/discussions/package-formats/#package-formats

Wheel

The standard Built Distribution format. Also see:

Publishing a Python Distribution Package in a Repository | Built Distribution

Egg

https://packaging.python.org/en/latest/discussions/package-formats/#egg-format

Egg is a built distribution format introduced by setuptools, which was replaced by Wheel.

Publishing a Python Distribution Package in a Repository

Publishing a Python Distribution Package in a Repository

Python Standard Library

Python Language | Python Standard Library

site-packages

See

Python Versions | pip Relationship to Python Version