Python Language Modularization: Difference between revisions
(40 intermediate revisions by the same user not shown) | |||
Line 39: | Line 39: | ||
=<span id='Module'></span>Modules= | =<span id='Module'></span>Modules= | ||
In Python, a module is simply a file with the <code>.py</code> extension containing Python code. The nomenclature is completely opposite to Go's, where the simplest modularization unit is a [[Go_Packages#Overview|Go package]]. | |||
There are three different ways to define a module in Python: | There are three different ways to define a module in Python: | ||
* A module can be written in Python itself, with its code contained in one single code file. | * A module can be written in Python itself, with its code contained in one single code file. | ||
Line 44: | Line 46: | ||
* A module can be intrinsically contained by the interpreter. This is called a [[#Built-in_Modules|built-in module]]. | * A module can be intrinsically contained by the interpreter. This is called a [[#Built-in_Modules|built-in module]]. | ||
The modules written in Python are the most common, and they are the type of modules Python developer usually write. They are exceedingly straightforward to build: just place Python code in a file with the <code>py</code>. Modules are simple Python files: a module consists of '''just one''' file. The module can be [[#Importing|imported]] inside another Python program or executed on its own. The module can define object instances such as strings or lists, which are assigned to variables, and also functions and classes. If intended to run on its own, the module will also include runnable code. | The modules written in Python are the most common, and they are the type of modules Python developer usually write. They are exceedingly straightforward to build: just place Python code in a file with the <code>.py</code> extension. Modules are simple Python files: a module consists of '''just one''' file. The module can be [[#Importing|imported]] inside another Python program or executed on its own. The module can define object instances such as strings or lists, which are assigned to variables, and also functions and classes. If intended to run on its own, the module will also include runnable code. | ||
The file name is the | The file name is the name of the module with the suffix <code>.py</code> appended. Modules are loaded into Python by the process of [[#Import|importing]], where the name of the objects present in one module are made available in the [[Python_Language#Variables_Namespace_and_Scope|namespace]] of the calling layer. | ||
<span id='Global_Namespace'></span>Each module has its own [[Python_Language#Variables_Namespace_and_Scope|global namespace]], where all objects defined inside the module (strings, lists, functions, classes, etc.) are identified by unique names. | <span id='Global_Namespace'></span>Each module has its own [[Python_Language#Variables_Namespace_and_Scope|global namespace]], where all objects defined inside the module (strings, lists, functions, classes, etc.) are identified by unique names. | ||
Line 52: | Line 54: | ||
A module may exist as part of a [[#Package|package]], and a package may contain many modules. | A module may exist as part of a [[#Package|package]], and a package may contain many modules. | ||
==Module Name== | ==Module Name== | ||
A module name is the name of the file the module resides in, with the extension <code>.py</code> stripped. | |||
Naming conventions are documented here | Naming conventions are documented here | ||
Line 323: | Line 327: | ||
</syntaxhighlight> | </syntaxhighlight> | ||
=Package= | =<span id='Import_(Regular)_Package'></span><span id='Import_Package'></span><span id='Regular_Package'></span>Package (Import Package)= | ||
A package is Python code stored into multiple files, organized in a file hierarchy. The [https://packaging.python.org/en/latest/ Python Packaging User Guide] calls it an '''import package''', to differentiate it from a [[#Distribution_Package|distribution package]], explained below. | A '''package''' is Python code stored into multiple files, organized in a file hierarchy. The [https://packaging.python.org/en/latest/ Python Packaging User Guide] calls it an '''import package''', to differentiate it from a [[#Distribution_Package|distribution package]], explained below. The canonical form of an import package is a directory containing modules, or recursively other packages and optionally an <code>[[#init_.py|__init__.py]]</code> file. | ||
Since we cannot put modules inside modules, because a module is just a file, and a file can hold only one module after all, Python offers the package mechanism. A package is a collection of [[#Module|modules]], and optionally [[#Subpackage|subpackages]], recursively, in a folder. | Since we cannot put modules inside modules, because a module is just a file, and a file can hold only one module after all, Python offers the package mechanism. A package is a collection of [[#Module|modules]], and optionally [[#Subpackage|subpackages]], recursively, in a folder. | ||
Line 381: | Line 385: | ||
==Package Name== | ==Package Name== | ||
See [[#Module_Name|module name]] above. | See [[#Module_Name|module name]] above. | ||
== Package Internal Representation and Introspection== | ==Package Internal Representation and Introspection== | ||
A package, once loaded, is represented internally as an instance of the <code>module</code> class. For more details, see: | A package, once loaded, is represented internally as an instance of the <code>module</code> class. For more details, see: | ||
{{Internal|Python_Module_Internal_Representation_and_Introspection#Overview|Module Internal Representation and Introspection}} | {{Internal|Python_Module_Internal_Representation_and_Introspection#Overview|Module Internal Representation and Introspection}} | ||
==<tt>__init__.py</tt>== | |||
When a regular package is imported, this <code>__init__.py</code> file is implicitly executed, and the objects it defines are bound to names in the package’s namespace. The <code>__init__.py</code> file can contain the same Python code that any other module can contain, like variables, function and class declarations, and Python will add some additional attributes to the module when it is imported. | When a regular package is imported, this <code>__init__.py</code> file is implicitly executed, and the objects it defines are bound to names in the package’s namespace. The <code>__init__.py</code> file can contain the same Python code that any other module can contain, like variables, function and class declarations, and Python will add some additional attributes to the module when it is imported. | ||
Line 435: | Line 433: | ||
It is not recommended to put much code in the <code>__init__.py</code> file. Programmers do not expect actual logic to happen in this file. | It is not recommended to put much code in the <code>__init__.py</code> file. Programmers do not expect actual logic to happen in this file. | ||
==<tt>__main__.py</tt>== | |||
{{External|https://docs.python.org/3/library/__main__.html}} | {{External|https://docs.python.org/3/library/__main__.html}} | ||
Packages can be run as if they were scripts if the package provides the top-level script <code>__main__.py</code>. The file contains the code of the "main" module, which is executed when the package is run with <code>python -m</code> followed by the package name; importing the package does not execute it. As such, the <code>__main__.py</code> file is used to provide a command-line interface for a package. | Packages can be run as if they were scripts if the package provides the top-level script <code>__main__.py</code>. The file contains the code of the "main" module, which is executed when the package is run with <code>python -m</code> followed by the package name; importing the package does not execute it. As such, the <code>__main__.py</code> file is used to provide a command-line interface for a package. | ||
<syntaxhighlight lang='py'>
import somepkg.some_module_1 as some_module_1

def main():
    print('.')

if __name__ == '__main__':
    # Execute when the module is not initialized from an import statement.
    main()
</syntaxhighlight>
<font color=darkkhaki>TO PROCESS: Idiomatic usage: https://docs.python.org/3/library/__main__.html#id1</font> | <font color=darkkhaki>TO PROCESS: Idiomatic usage: https://docs.python.org/3/library/__main__.html#id1</font> | ||
==Namespace Package== | |||
A PEP 420 package which serves only as a container for subpackages. Namespace packages may have no physical representation, and have no <code>__init__.py</code> file. | A PEP 420 package which serves only as a container for subpackages. Namespace packages may have no physical representation, and have no <code>__init__.py</code> file. | ||
==<span id='Subpackage'></span>Subpackages== | ==<span id='Subpackage'></span>Subpackages== | ||
{{External|https://realpython.com/python-modules-packages/#subpackages}} | {{External|https://realpython.com/python-modules-packages/#subpackages}} | ||
Line 453: | Line 462: | ||
==Package Metadata== | ==Package Metadata== | ||
<syntaxhighlight lang='yaml'> | <syntaxhighlight lang='yaml'> | ||
Name: pulumi | Name: pulumi | ||
Line 472: | Line 480: | ||
A program that consumes it is available here: {{External|https://github.com/ovidiuf/playground/tree/master/pyhton/packages/consumer-of-packages}} | A program that consumes it is available here: {{External|https://github.com/ovidiuf/playground/tree/master/pyhton/packages/consumer-of-packages}} | ||
= | =<span id='Package_Kinds'></span><span id='Package_Types'></span>Distribution Package= | ||
< | {{External|https://packaging.python.org/en/latest/discussions/distribution-package-vs-import-package/#distribution-package-vs-import-package}} | ||
{{External|https://packaging.python.org/en/latest/glossary/#term-Distribution-Package}} | |||
A '''distribution package''' is a project with a <code>setup.py</code> that uses <code>setuptools</code> or <code>distutils</code> or <code>wheel</code> to generate a distributable [[#Egg|egg]] or [[#Wheel|wheel]]. In other words, a distribution package is a "built" distributable artifact, a piece of software you can install. In most cases it is synonymous with "project", as mentioned above. A published distribution package can be installed with:
* https://packaging.python.org/en/latest/ | <syntaxhighlight lang='bash'> | ||
</font> | pip install somepkg | ||
</syntaxhighlight> | |||
The name of a published distribution package, in this case <code>somepkg</code> can be declared in the project's <code>[[Requirements.txt|requirements.txt]]</code>: | |||
<syntaxhighlight lang='text'> | |||
somepkg==0.1.0 | |||
</syntaxhighlight> | |||
When you browse [[Python_Language#Python_Package_Index_PyPI|PyPI]], what you see are distribution packages. On a given package index, like PyPI, distribution package names must be unique. | |||
==<span id='Distribution_Name'></span>Distribution Package Name== | |||
Distribution packages can use hyphens <code>-</code> or underscores <code>_</code>. They can also contain dots <code>.</code>, which is sometimes used for packaging a subpackage of a namespace package. For most purposes, they are insensitive to case and to <code>-</code> vs. <code>_</code> differences, ex., <code>pip install Some_Package</code> is the same as <code>pip install some-package</code>. The precise rules are provided here: {{External|https://packaging.python.org/en/latest/specifications/name-normalization/#name-normalization}} | |||
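The normalization rule can be sketched in a few lines of Python. The helper name <code>normalize_name</code> is chosen here for illustration only; the regular expression is the one given in the name normalization specification linked above.

```python
import re


def normalize_name(name: str) -> str:
    """Collapse runs of '-', '_' and '.' into a single '-' and lowercase the result."""
    return re.sub(r"[-_.]+", "-", name).lower()


# Both spellings normalize to the same distribution name.
print(normalize_name("Some_Package"))
print(normalize_name("some-package"))
```

Two distribution names refer to the same project on an index when their normalized forms are equal.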
The distribution name of the package is specified as <code>[project].name</code> in the <code>[[Pyproject.toml#name|pyproject.toml]]</code> file. | |||
===Relationship between the Distribution Package and the corresponding Import Package=== | |||
Most of the time, a distribution package provides one single import package, with the same name as the '''distribution name of the package''', or several import packages, though this is less common. For example, <code>pip install somepkg</code> lets you <code>import somepkg</code>. However, this is only a convention. PyPI and other package indices do not enforce any relationship between the distribution name of the package and the import packages it provides. A distribution package could provide an import package with a different name. | |||
On a given package index, like PyPI, the distribution name of the package must be unique. On the other hand, import packages have no such requirement. Import packages with the same name can be provided by several distribution packages. | |||
==Distribution Package Version== | |||
The version of the distribution package is specified as <code>[project].version</code> in the <code>[[Pyproject.toml#version|pyproject.toml]]</code> file. | |||
==Built Distribution== | |||
A distribution format containing files and metadata that only need to be moved to the correct location on the target system, to be installed. [[#Wheel|Wheel]] is such a format. This format does not imply that Python files have to be precompiled. [[#Wheel|Wheel]] intentionally does not include compiled Python files. | |||
==Source Distribution ("sdist")== | |||
A distribution format, usually generated with <code>python -m build --sdist</code> that provides metadata and essential source files needed for installing by a tool like <code>pip</code> or for generating [[#Built_Distribution|built distribution]]. | |||
==Build Frontend== | |||
A tool that users might run that takes arbitrary source trees or source distributions and builds source distributions or [[#Wheel|wheels]] from them. Examples of build frontends are <code>pip</code> and <code>build</code>. The actual build is delegated to a [[#Build_Backend|build backend]]. It is the <code>[[Pyproject.toml#Build_Backend_Configuration|pyproject.toml]]</code> file that tells the frontend tool which backend to use.
==Build Backend== | |||
{{External|https://packaging.python.org/en/latest/tutorials/packaging-projects/#choosing-a-build-backend}} | |||
A '''build backend''' is a library that takes a source tree or a source distribution and builds a source distribution or a [[#Wheel|Wheel]] from it. The build is delegated to the backend by a [[#Build_Frontend|frontend]]. The build backend determines how the project will specify its configuration, including its [[Pyproject.toml#Project_Metadata|metadata]]. All backends offer a standardized interface. | |||
Example of build backends: | |||
* [[Hatch#Hatchling|Hatchling]] | |||
* [https://packaging.python.org/en/latest/key_projects/#flit flit-core] | |||
* [https://packaging.python.org/en/latest/key_projects/#maturin Maturin] | |||
* [https://packaging.python.org/en/latest/key_projects/#meson-python meson-python] | |||
* [https://packaging.python.org/en/latest/key_projects/#scikit-build-core scikit-build-core] | |||
* [https://packaging.python.org/en/latest/key_projects/#setuptools Setuptools]
==Distribution Package Formats== | |||
<Font color=darkkhaki>TODO: https://packaging.python.org/en/latest/discussions/package-formats/#package-formats</font> | |||
===<span id='wheel'></span>Wheel=== | |||
The standard [[#Built_Distribution|Built Distribution]] format. Also see: {{Internal|Publishing_a_Python_Distribution_Package_in_a_Repository#Built_Distribution|Publishing a Python Distribution Package in a Repository | Built Distribution}} | |||
===Egg=== | |||
{{External|https://packaging.python.org/en/latest/discussions/package-formats/#egg-format}} | |||
Egg is a built distribution format introduced by <code>setuptools</code>, which was replaced by [[#Wheel|Wheel]]. | |||
==<span id='Publishing_a_Python_Package_in_a_Repository'></span>Publishing a Python Distribution Package in a Repository== | |||
{{Internal|Publishing a Python Distribution Package in a Repository#Overview|Publishing a Python Distribution Package in a Repository}} | |||
=Python Standard Library= | =Python Standard Library= |
Latest revision as of 21:28, 19 September 2024
External
- https://realpython.com/python-modules-packages
- https://docs.python.org/3/tutorial/modules.html
- https://docs.python.org/3/tutorial/modules.html#tut-packages
- https://docs.python.org/3/reference/import.html
Internal
Overview
Python code can be organized as standalone programs, modules and packages. When a module or a package is published, people refer to it as a library. In this context, the term library is simply a generic term for a bunch of code that was designed with the aim of being reused by many applications. It provides some generic functionality, usually in form of functions and classes, that can be used by specific applications.
Modular programming refers to the process of breaking a large, unwieldy body of code into separate, smaller, more manageable modules. Individual modules can then be combined into a larger application. More details about modular programming are available in the Designing Modular Systems article.
Standalone Program
A standalone program consists of one or more files of code that is read by the Python interpreter and executed. A typical way to interact with a Python program is command line arguments. For more details on handling command line arguments, see:
Python Script
A script is a module whose aim is to be executed. It has the same meaning as "program", standalone program, or "application", but it is usually used to describe a simple, small program. It contains a stored set of instructions that can be handed over to the Python interpreter:
python3 ./my-script.py
Python scripts have the .py extension.
The python code can be specified in-line with a here-doc:
python3 <<EOF
print('hello')
print('hello2')
EOF
A stable and flexible way of executing Python programs in command line is invoking them from a thin bash wrapper that prepares the environment variables and other settings. This approach is described here:
Modules
In Python, a module is simply a file with the .py extension containing Python code. The nomenclature is completely opposite to Go's, where the simplest modularization unit is a Go package.
There are three different ways to define a module in Python:
- A module can be written in Python itself, with its code contained in one single code file.
- A module can be written in C and loaded dynamically at runtime. This is the case of the regular expression re module.
- A module can be intrinsically contained by the interpreter. This is called a built-in module.
The modules written in Python are the most common, and they are the type of modules Python developers usually write. They are exceedingly straightforward to build: just place Python code in a file with the .py extension. Modules are simple Python files: a module consists of just one file. The module can be imported inside another Python program or executed on its own. The module can define object instances such as strings or lists, which are assigned to variables, and also functions and classes. If intended to run on its own, the module will also include runnable code.
The file name is the name of the module with the suffix .py appended. Modules are loaded into Python by the process of importing, where the names of the objects present in one module are made available in the namespace of the calling layer.
Each module has its own global namespace, where all objects defined inside the module (strings, lists, functions, classes, etc.) are identified by unique names.
A module may exist as part of a package, and a package may contain many modules.
Module Name
A module name is the name of the file the module resides in, with the extension .py stripped.
Naming conventions are documented here
Modules should have short, all-lowercase names. Underscores can be used in the module name if it improves readability. The name of the module cannot contain dashes ('-'). The name of the directory the module is stored in can contain dashes.
Module Docstring
Contains documentation for the module. Added on the first line:
"""
This module does this and that.
"""
Built-in Modules
The built-in modules are contained by the Python interpreter.
Module Internal Representation and Introspection
The modules, once loaded, can be introspected for their attributes, as instances of the module class. For more details, see:
Importing
Importing means making the names available in a module or a package accessible in the namespace associated with the code that invokes the import statement. The import operation binds the imported module's name into the namespace associated with the calling layer. All module-level code is executed immediately at the time the module is imported. In the case of functions, the function will be created, but its code will not be executed until the function is called.
Importing a Module
The import statement, usually listed at the top of the file, followed by the name of the module, binds the name of the module into the caller's symbol table. This way, the name of the imported module becomes available in the caller's namespace. The name of the module is the name of the file containing the module code, without the .py extension.
import mymodule
A module is importable if the file corresponding to the module being imported is accessible to the Python interpreter. For more details on how the Python interpreter finds modules see Locating Module Files - Module Search Path.
The import mymodule statement does nothing except binding the specified module name into the current namespace, thus potentially enabling access to the imported module global namespace. The objects that are defined in the imported module remain in the module’s private symbol table.
print(globals())
[...]
'mymodule': <module 'mymodule' from '/Users/ovidiu/playground/pyhton/modules/mymodule.py'>
For more details on accessing a module global namespace see globals().
However, binding the module name into the current namespace allows the Python code from the current scope to use the module name to access objects contained by the module. Assuming that mymodule declares a function some_func, the function can be invoked by prefixing its name with the name of the module, using the dot notation. This is called qualifying the internal names of a module with the module's name. The function will be looked up in mymodule's global namespace, which is made accessible to the calling layer as mymodule. The same applies to other objects declared by the module, such as variables, classes, etc.:
import mymodule
mymodule.some_func()
Once imported, the file associated with the module can be determined using the module object's __file__ attribute:
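A minimal illustration, using the standard library module textwrap only because it is implemented in a single Python source file. Modules built into the interpreter, such as sys, have no __file__ attribute.

```python
import textwrap  # any module implemented in a Python source file would do

# __file__ holds the path the module was loaded from.
module_file = textwrap.__file__
print(module_file)
```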
Multiple comma-separated module names can be specified in the same import statement, but various static analysis programs flag this as a style violation:
import mymodule, mymodule2 # The style checker flags this as a style violation
In most cases, we use the absolute import syntax, where we specify the complete path to the module, function or class we want to import. The absolute import statement uses the period operator to separate the names of the packages or modules. The alternative is to use relative import syntax.
Importing a Module from a Function or Class
A module can be imported in the global namespace of another module, or inside a function or a class. You should consider importing in the global namespace if the imported code might be used in more than one place, and inside of the function or class if you know its use will be limited. Putting all imports at the top of the file makes all dependencies of the importing module code explicit.
While importing into a function's namespace, all that the import
statement does is to bind the specified module's namespace into the local namespace of the function. The imported module's objects must be qualified with the name of the module to be accessed. Note that the import does not occur until the function is called:
def some_func():
    import mymodule
    [...]
    mymodule.my_func()
Python 3 does not allow indiscriminate import *
from within a function.
Importing a Module with Another Name
The objects of an imported module are qualified with the name of the module to be used. The prefix can be changed, usually to shorten it, by using the as reserved word in the import statement. This syntax effectively renames the module being imported in the namespace. The same technique is useful if there are two modules with the same name.
import mymodule as m
[...]
m.my_func()
print(locals())
[...]
'm': <module 'mymodule' from '/Users/ovidiu/playground/pyhton/modules/mymodule.py'>
For more details on accessing a function's local namespace see locals().
Programmatic Import using the Module Name as String
So far, we used the import statement to import modules by their name, which is provided as part of the statement. Modules can also be imported dynamically in the program by using the module name as string, or the file the module exists in:
__import__()
__import__() is a built-in function. Because this function is meant for use by the Python interpreter and not for general use, it is better to use importlib.import_module() to programmatically import a module.
__import__('some_package.some_subpackage.some_module')
importlib.import_module()
Recommended idiom to import modules programmatically:
import importlib
import types

module = importlib.import_module('some_name')
assert isinstance(module, types.ModuleType)
When the module is loaded as part of a package, use:
module_name = "..."
module = importlib.import_module(f'.{module_name}', package.__name__)
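A runnable instance of the package-relative form, using the standard library json package as the anchor purely for illustration:

```python
import importlib

# The leading dot makes the name relative to the anchor package given as the second argument.
module = importlib.import_module(".decoder", "json")
print(module.__name__)
```

The second argument is required whenever the name starts with a dot, because the function has no other way of knowing which package the name is relative to.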
imp.load_source()
import imp
module = imp.load_source('some_package.some_subpackage.some_module', '.../some_package/some_subpackage/some_module.py')
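The imp module was deprecated in Python 3.4 and removed in Python 3.12, so imp.load_source() is not available on current interpreters. A minimal sketch of an equivalent built on importlib.util follows. The helper name load_source and the demo_loaded_module name are hypothetical, and the temporary file stands in for a real module file.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path


def load_source(module_name, file_path):
    """Load the Python source file at file_path as a module named module_name."""
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    module = importlib.util.module_from_spec(spec)
    # Register the module before executing it, as a regular import would.
    sys.modules[module_name] = module
    spec.loader.exec_module(module)
    return module


# Demonstration with a throwaway source file (an assumption made for this example).
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "demo_loaded_module.py"
    source.write_text("GREETING = 'hello'\n")
    loaded = load_source("demo_loaded_module", source)

print(loaded.GREETING)
```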
Locating Module Files - Module Search Path
The runtime looks at a list of directory names and ZIP files stored in the standard sys module as the variable path (sys.path). The initial value of sys.path is assembled from the following sources:
- The directory in which the code performing the import is located. This is why an import will always work if the module file being imported is in the same directory as the code doing the import.
- The current directory if the script is executed interactively.
- The list of directories contained in the PYTHONPATH environment variable, if set.
- An installation-dependent list of directories configured at the time Python is installed.
The sys.path value can be accessed and modified:
import sys
for i in sys.path:
    print(i)
/opt/brew/Cellar/python@3.9/3.9.9/Frameworks/Python.framework/Versions/3.9/lib/python39.zip
/opt/brew/Cellar/python@3.9/3.9.9/Frameworks/Python.framework/Versions/3.9/lib/python3.9
/opt/brew/Cellar/python@3.9/3.9.9/Frameworks/Python.framework/Versions/3.9/lib/python3.9/lib-dynload
/Users/ovidiu/my-project/venv/lib/python3.9/site-packages
The initial blank output line is the empty string '', which stands for the current directory. The first match will be used. If a module with the same name as a module from standard library is encountered in the search path before the standard library, it will be used instead of the module coming from the standard library.
The sys.path value can be modified as follows:
Update PYTHONPATH
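Set the environment variable PYTHONPATH to a list of directories to search for imported modules, separated by colons on Unix-like systems and by semicolons on Windows. A directory declared in PYTHONPATH is inserted near the beginning of the sys.path list when Python starts up, after the entry for the script's directory and before the installation-dependent defaults.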
Programmatically modify sys.path
Appending an absolute path works:
import sys
sys.path.append('/path/to/search')
Appending a relative path does not work, unless the script is executed from the directory the path is relative to.
import sys
sys.path.append('./my-module') # This does not work unless the Python script is executed from "my-module"'s parent.
To use a relative directory and make append() insensitive to the location the program is run from, use this pattern:
import os
import sys
sys.path.append(os.path.join(os.path.dirname(os.path.abspath(__file__)), "my-module"))
import my_module
PyCharm trick: If the module name and the parent directory have the same name, PyCharm will stop issuing static analysis error "No module named ..."
Programmatically modify sys.path with site.addsitedir
site.addsitedir can be used to add a directory to sys.path. The difference between this and just plain appending is that when you use addsitedir, it also looks for .pth files within that directory and uses them to possibly add additional directories to sys.path based on the contents of the files.
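A small sketch of this behavior. The temporary directory, the demo.pth file and the extra subdirectory are all made up for the example; only site.addsitedir() itself comes from the standard library.

```python
import os
import site
import sys
import tempfile

# Hypothetical layout: a directory holding a .pth file that names one extra directory.
site_dir = tempfile.mkdtemp()
extra_dir = os.path.join(site_dir, "extra")
os.mkdir(extra_dir)
with open(os.path.join(site_dir, "demo.pth"), "w") as pth_file:
    pth_file.write("extra\n")  # relative entries are resolved against site_dir

# Adds site_dir to sys.path and processes the .pth files found in it.
site.addsitedir(site_dir)

print(site_dir in sys.path)
print(extra_dir in sys.path)
```

Entries in a .pth file that name directories that do not exist are skipped, which is why the example creates the extra directory before calling addsitedir.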
Importing Specific Objects from a Module
An alternate import statement syntax allows binding of specific objects into the caller's symbol table. When the from/import combination is used, the specified imported module objects are inserted into the caller's symbol table and are available in the caller's namespace under their original name. No qualification with dot notation is needed to access them.
from mymodule import my_func, my_func_2
...
# invoke the function directly, without prefixing it with the name of the module
my_func()
The objects can keep their original names, as shown above, or they can be aliased, which means a custom name is inserted in the caller's symbol table.
from mymodule import my_func as m_f
...
m_f()
⚠️ Because this form of import places the object names directly into the caller’s symbol table, any objects that already exist with the same name will be overwritten.
Importing * from a Module
from mymodule import *
This syntax places the names of all objects from the imported module into the local symbol table, with the exception of those whose name begins with an underscore (_). This technique is generally not recommended in large-scale production code. Unless you are confident there won’t be a conflict, there is a chance of overwriting an existing name inadvertently, so it should be used with caution. It also clutters the namespace unnecessarily.
Avoiding it makes the code easier to read: when we explicitly import a class or a function with the from x import y syntax, we can easily see where y comes from. However, if we use the from x import * syntax, it takes a lot longer to find where y is located. In addition, most code editors are able to provide code completion, navigation to the definition of a class, or inline documentation if normal imports are used. The import * syntax usually removes these capabilities. Finally, the import * syntax can bring unexpected objects into the target namespace, because it will import any classes or modules that were themselves imported in the file being imported.
Explicit is better than implicit.
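The underscore rule can be observed with a throwaway module. In the sketch below, demo_star and its two variables are hypothetical names created only for the demonstration, and the star import runs inside a separate dictionary so the imported names can be inspected.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# demo_star is a hypothetical module with one public and one underscore-prefixed name.
root = Path(tempfile.mkdtemp())
(root / "demo_star.py").write_text("public_value = 1\n_private_value = 2\n")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Run the star import in a separate namespace so the imported names can be inspected.
namespace = {}
exec("from demo_star import *", namespace)

print("public_value" in namespace)
print("_private_value" in namespace)
```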
Recommended Style
Use import statements for packages and modules only, not for individual classes or functions.
- Use import x for importing packages and modules.
- Use from x import y where x is the package prefix and y is the module name with no prefix.
- Use from x import y as z if two modules named y are to be imported, if y conflicts with a top-level name defined in the current module, or if y is an inconveniently long name.
- Use import y as z only when z is a standard abbreviation (e.g., np for numpy).
- TO PROCESS: https://google.github.io/styleguide/pyguide.html#2241-exemptions
Handling Unsuccessful Imports
An unsuccessful import attempt, caused by the unavailability of the imported module, can be caught in a try
block:
try:
    import inexistent_module
except ImportError:
    print("module not found!")
The unavailability of specific objects within an existent module can be detected the same way:
try:
    from mymodule import inexistent_object
except ImportError:
    print("object not found!")
Reloading a Module
TO PROCESS: https://realpython.com/python-modules-packages/#reloading-a-module
Relative Imports
Relative import syntax is an alternative to absolute import. This syntax is useful when working with related modules inside a package that are stored in a known relative position to each other. A relative import is a way of saying: find a class, function or module as it is positioned relative to the current module.
In this situation:
some_package
├─ module_1.py
└─ module_2.py
where module_1 declares func_1(), func_1() can be imported in module_2 with a relative import:
from .module_1 import func_1
def func_2():
func_1()
The period in front of "module_1" says to sue the "module_1" module inside the current package. In this situation:
some_package
├─ module_1.py
└─ some_subpackage
    └─ module_2.py
in module_2 we can use:
from ..module_1 import func_1
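How the leading dots are resolved can be checked without creating any files, using importlib.util.resolve_name() from the standard library. This is only an illustration: the package names are the ones from the two layouts above, passed in by hand where a real import would use the importing module's __package__ value.

```python
from importlib.util import resolve_name

# Inside some_package/module_2.py, __package__ is "some_package".
same_package = resolve_name(".module_1", "some_package")

# Inside some_package/some_subpackage/module_2.py, __package__ is
# "some_package.some_subpackage"; the second dot climbs one level above it.
parent_package = resolve_name("..module_1", "some_package.some_subpackage")

print(same_package)
print(parent_package)
```

Both calls resolve to the absolute name some_package.module_1.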
Distinguishing between Importing and Executing a Module
Note that if you want to maintain a module that can be both imported and executed as a script, the module should not contain executable code outside functions and classes. If it does, that code is executed when the module is imported, which is generally not what you want. To distinguish between the file being loaded as a module and being executed as a script, Python sets the __name__ variable to different values. When the module is imported, __name__ is set to the module name. When the module is executed as a script, __name__ is set to "__main__".
This fact can be used to wrap the executable code in a function that is only called when the module is executed as a script:
...
def main():
    # this code is executed when the module is executed as a script
    print("...")
if __name__ == '__main__':
    main()
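The two values of __name__ can be observed directly. The sketch below writes a one-line module to a temporary directory, loads it once the way an import would and once the way a script run would, and records what the module saw. The file name name_demo.py and the variable SEEN_NAME are made up for this example.

```python
import importlib.util
import pathlib
import runpy
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = pathlib.Path(tmp) / "name_demo.py"  # hypothetical module
    path.write_text("SEEN_NAME = __name__\n")

    # Load the file as a module named "name_demo", as an import would.
    spec = importlib.util.spec_from_file_location("name_demo", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    imported_name = module.SEEN_NAME

    # Execute the same file with __name__ set to "__main__", as a script run would.
    script_globals = runpy.run_path(str(path), run_name="__main__")
    script_name = script_globals["SEEN_NAME"]

print(imported_name)
print(script_name)
```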
Also see:
How to Execute (Run) a Module?
Document the -m flag.
python -m <module-name> ....
Package (Import Package)
A package is Python code stored in multiple files, organized in a file hierarchy. The Python Packaging User Guide calls it an import package, to differentiate it from a distribution package, explained below. The canonical form of an import package is a directory containing modules, or recursively other packages, and optionally an __init__.py file.
Since modules cannot be nested inside other modules, because a module is just a single file, Python offers the package mechanism. A package is a collection of modules, and optionally subpackages, recursively, in a folder.
The name of the package is the name of the folder.
A package may contain multiple modules, each stored in its own file, either in the package root directory or recursively in subdirectories. The package root directory may also contain two optional files named __init__.py and __main__.py. Packages allow for a hierarchical structuring of the module namespace using dot notation. In the same way that modules avoid collisions between global variable names, packages avoid collisions between module names. For example, the urllib package contains several modules: urllib.request, urllib.error, etc. A package also allows for subpackages.
some_dir
└─ some_package_1
    ├─ __init__.py  # Optional
    ├─ __main__.py  # Optional
    ├─ some_module_1.py  # Defines some_func_1()
    ├─ some_module_2.py  # Defines some_func_2()
    └─ dir_1
        ├─ some_module_3.py  # Defines some_func_3()
        └─ dir_2
            └─ some_module_4.py  # Defines some_func_4()
Internally, a package is represented as an instance of the module class. The difference between a package instance and a simple module instance is that the package instance has an extra __path__ attribute. For more details on the internal representation of a package, see Package Internal Representation and Introspection below.
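The difference can be checked on a package from the standard library. This short sketch uses json, which is a package, and json.decoder, which is an ordinary module inside it; the choice of these two is only for illustration.

```python
import json          # a standard-library package
import json.decoder  # an ordinary module inside that package
import types

for obj in (json, json.decoder):
    # Both are instances of the same class; only the package has __path__.
    print(obj.__name__, type(obj) is types.ModuleType, hasattr(obj, "__path__"))
```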
Importing a Package
To import the modules of the package represented above, ensure that the directory some_dir, the parent of some_package_1, is in the module search path and use the following import statements, where a module is identified using dot notation relative to its package name:
import some_package_1.some_module_1
import some_package_1.some_module_2
import some_package_1.dir_1.some_module_3
import some_package_1.dir_1.dir_2.some_module_4
some_package_1.some_module_1.some_func_1()
some_package_1.some_module_2.some_func_2()
some_package_1.dir_1.some_module_3.some_func_3()
some_package_1.dir_1.dir_2.some_module_4.some_func_4()
A slightly more compact version is:
from some_package_1 import some_module_1
from some_package_1 import some_module_2
from some_package_1.dir_1 import some_module_3
from some_package_1.dir_1.dir_2 import some_module_4
some_module_1.some_func_1()
some_module_2.some_func_2()
some_module_3.some_func_3()
some_module_4.some_func_4()
Importing the package itself is syntactically correct, but unless there is an __init__.py, the import does not do anything useful. In particular, it does not place any of the component module names in the package in the local namespace:
import some_package_1
print(str(some_package_1))
<module 'some_package_1' (namespace)>
Package Name
See module name above.
Package Internal Representation and Introspection
A package, once loaded, is represented internally as an instance of the module
class. For more details, see:
__init__.py
When a regular package is imported, its __init__.py file is implicitly executed, and the objects it defines are bound to names in the package’s namespace. The __init__.py file can contain the same Python code that any other module can contain, like variables, function and class declarations, and Python will add some additional attributes to the module when it is imported.
Much of the Python documentation states that the __init__.py file must be present in the package directory, even if as an empty file, for the package to be valid. This was once true. Python 3.3 introduced implicit namespace packages (PEP 420), which allow the creation of a package without any __init__.py file.
Assuming that __init__.py is declared in some_package_1, as shown above, and has the following content:
# this is __init__.py
COLOR = 'blue'
then importing the package itself binds COLOR in the package's namespace, making it accessible to the client program importing the package:
import some_package_1
assert 'blue' == some_package_1.COLOR
A module in the package can access the global variable by importing it in turn. In some_module_1.py:
from some_package_1 import COLOR
[...]
def print_color():
    print(f'color is {COLOR}')
__init__.py can also be used to automatically import the modules from the package, so the clients of the package won't have to import them individually, and the objects from the package's modules will be bound to the package namespace.
# this is __init__.py
import some_package_1.some_module_1
import some_package_1.some_module_2
import some_package_1.dir_1.some_module_3
import some_package_1.dir_1.dir_2.some_module_4
For a client of the package:
import some_package_1
some_package_1.some_module_1.some_func_1()
some_package_1.some_module_2.some_func_2()
some_package_1.dir_1.some_module_3.some_func_3()
some_package_1.dir_1.dir_2.some_module_4.some_func_4()
It is not recommended to put much code in the __init__.py file. Programmers do not expect actual logic to happen in this file.
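The behavior described in this section can be tried end to end with a throwaway package. The sketch below is a self-contained illustration, not the layout of any real project: init_demo_pkg, some_module_1 and describe() are hypothetical names, and the package is created in a temporary directory that is put on the module search path for the duration of the import.

```python
import importlib
import pathlib
import sys
import tempfile

INIT_SOURCE = (
    "COLOR = 'blue'\n"
    # COLOR must be defined before the submodule import, because
    # some_module_1 reads it from the partially initialized package.
    "from . import some_module_1\n"
)
MODULE_SOURCE = (
    "from init_demo_pkg import COLOR\n"
    "def describe():\n"
    "    return f'color is {COLOR}'\n"
)

with tempfile.TemporaryDirectory() as tmp:
    pkg = pathlib.Path(tmp) / "init_demo_pkg"  # hypothetical package
    pkg.mkdir()
    (pkg / "__init__.py").write_text(INIT_SOURCE)
    (pkg / "some_module_1.py").write_text(MODULE_SOURCE)

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        # Importing the package runs __init__.py, which binds COLOR and
        # the some_module_1 submodule in the package namespace.
        init_demo_pkg = importlib.import_module("init_demo_pkg")
        color = init_demo_pkg.COLOR
        message = init_demo_pkg.some_module_1.describe()
    finally:
        sys.path.remove(tmp)

print(color)
print(message)
```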
__main__.py
Packages can be run as if they were scripts if the package provides the top-level script __main__.py. The file contains the code of the "main" module, which is executed automatically when the package is run with python -m <package-name>. It is not executed when the package is merely imported. As such, the __main__.py file is used to provide a command-line interface for a package.
import somepkg.some_module_1 as some_module_1
def main():
    print('.')
if __name__ == '__main__':
    # Execute when the module is not initialized from an import statement.
    main()
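To see __main__.py being picked up, the following sketch builds a throwaway package in a temporary directory and runs it with the -m flag in a child interpreter. The package name main_demo_pkg and the message it prints are invented for this example; the current working directory of the child process is the parent of the package, which is how the package is found.

```python
import pathlib
import subprocess
import sys
import tempfile

MAIN_SOURCE = (
    "def main():\n"
    "    print('hello from main_demo_pkg')\n"
    "if __name__ == '__main__':\n"
    "    main()\n"
)

with tempfile.TemporaryDirectory() as tmp:
    pkg = pathlib.Path(tmp) / "main_demo_pkg"  # hypothetical package
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "__main__.py").write_text(MAIN_SOURCE)

    # Equivalent to typing "python -m main_demo_pkg" in the parent directory.
    result = subprocess.run(
        [sys.executable, "-m", "main_demo_pkg"],
        cwd=tmp,
        capture_output=True,
        text=True,
        check=True,
    )

output = result.stdout.strip()
print(output)
```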
TO PROCESS: Idiomatic usage: https://docs.python.org/3/library/__main__.html#id1
Namespace Package
A PEP 420 package which serves only as a container for subpackages. Namespace packages may have no physical representation, and have no __init__.py file.
Subpackages
A subpackage is a folder containing modules and optionally other subpackages, stored in a package.
Importing * from a Package
When from <package_name> import * is encountered, Python follows this convention: if the __init__.py file in the package directory contains a list named __all__, it is taken to be the list of module names that should be imported.
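The convention can be observed with a small experiment. In this sketch, all_demo_pkg, mod_a and mod_b are hypothetical names created in a temporary directory; only mod_a is listed in __all__, so only mod_a should end up in the importing namespace.

```python
import importlib
import pathlib
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    pkg = pathlib.Path(tmp) / "all_demo_pkg"  # hypothetical package
    pkg.mkdir()
    (pkg / "__init__.py").write_text("__all__ = ['mod_a']\n")
    (pkg / "mod_a.py").write_text("NAME = 'a'\n")
    (pkg / "mod_b.py").write_text("NAME = 'b'\n")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        namespace = {}
        # "import *" is only allowed at module level, so it is run through
        # exec() with an explicit namespace that can be inspected afterwards.
        exec("from all_demo_pkg import *", namespace)
    finally:
        sys.path.remove(tmp)

imported = sorted(name for name in namespace if not name.startswith("__"))
print(imported)
```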
Package Metadata
Name: pulumi
Version: 2.11.2
Summary: Pulumi's Python SDK
Home-page: https://github.com/pulumi/pulumi
Author:
Author-email:
License: Apache 2.0
Location: /Users/ovidiu/Library/Python/3.8/lib/python/site-packages
Requires: dill, grpcio, protobuf
Required-by: pulumi-aws, pulumi-kubernetes, pulumi-random, pulumi-tls
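Similar fields can be read programmatically with importlib.metadata from the standard library (Python 3.8 and later). The helper below is a sketch: describe() is a hypothetical function written for this page, and the distribution name passed to it is assumed not to be installed, to show the not-found case.

```python
from importlib import metadata


def describe(dist_name):
    """Return selected metadata fields of an installed distribution, or None."""
    try:
        info = metadata.metadata(dist_name)
    except metadata.PackageNotFoundError:
        return None
    return {
        "Name": info["Name"],
        "Version": info["Version"],
        "Summary": info["Summary"],
        # requires() returns None when no dependencies are declared.
        "Requires": metadata.requires(dist_name) or [],
    }


# Hypothetical name, assumed not to be installed.
print(describe("surely-not-installed-distribution"))
```

Calling describe() with the name of a distribution that is installed in the current environment returns a dictionary with the corresponding values instead.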
Requires
Required-by
Package Example
A package example is available here:
A program that consumes it is available here:
Distribution Package
A distribution package is a project with a setup.py that uses setuptools, distutils or wheel to generate a distributable egg or wheel. In other words, a distribution package is a "built" distributable artifact, a piece of software you can install. In most cases it is synonymous with "project", as mentioned above. A published distribution package can be installed with:
pip install somepkg
The name of a published distribution package, in this case somepkg, can be declared in the project's requirements.txt:
somepkg==0.1.0
When you browse PyPI, what you see are distribution packages. On a given package index, like PyPI, distribution package names must be unique.
Distribution Package Name
Distribution package names can use hyphens - or underscores _. They can also contain dots ., which is sometimes used for packaging a subpackage of a namespace package. For most purposes, they are insensitive to case and to - vs. _ differences, e.g., pip install Some_Package is the same as pip install some-package. The precise rules are provided here:
The distribution name of the package is specified as [project].name in the pyproject.toml file.
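The case and separator insensitivity corresponds to the name normalization rule defined in PEP 503: runs of -, _ and . are collapsed into a single hyphen and the result is lowercased. A minimal sketch of that rule, with normalize() as a helper name chosen for this example:

```python
import re


def normalize(name):
    """Normalize a distribution name following the PEP 503 rule."""
    return re.sub(r"[-_.]+", "-", name).lower()


for candidate in ("Some_Package", "some-package", "Some.Package"):
    print(candidate, "->", normalize(candidate))
```

All three spellings normalize to some-package, which is why an installer treats them as the same distribution.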
Relationship between the Distribution Package and the corresponding Import Package
Most of the time, a distribution package provides a single import package with the same name as the distribution name of the package. Less commonly, it provides several import packages. For example, pip install somepkg lets you import somepkg. However, this is only a convention. PyPI and other package indices do not enforce any relationship between the distribution name of the package and the import packages it provides. A distribution package could provide an import package with a different name.
On a given package index, like PyPI, the distribution name of the package must be unique. On the other hand, import packages have no such requirement. Import packages with the same name can be provided by several distribution packages.
Distribution Package Version
The version of the distribution package is specified as [project].version in the pyproject.toml file.
Built Distribution
A distribution format containing files and metadata that only need to be moved to the correct location on the target system, to be installed. Wheel is such a format. This format does not imply that Python files have to be precompiled. Wheel intentionally does not include compiled Python files.
Source Distribution ("sdist")
A distribution format, usually generated with python -m build --sdist, that provides metadata and the essential source files needed for installing by a tool like pip or for generating a built distribution.
Build Frontend
A tool that users might run that takes arbitrary source trees or source distributions and builds source distributions or wheels from them. Examples of build frontends are pip and build. The actual build is delegated to a build backend. It is the pyproject.toml file that tells the frontend tool which backend to use.
Build Backend
A build backend is a library that takes a source tree or a source distribution and builds a source distribution or a Wheel from it. The build is delegated to the backend by a frontend. The build backend determines how the project will specify its configuration, including its metadata. All backends offer a standardized interface.
Example of build backends:
Distribution Package Formats
TODO: https://packaging.python.org/en/latest/discussions/package-formats/#package-formats
Wheel
The standard Built Distribution format. Also see:
Egg
Egg is a built distribution format introduced by setuptools, which was replaced by Wheel.
Publishing a Python Distribution Package in a Repository
Python Standard Library
site-packages
See