Python Module Internal Representation and Introspection

From NovaOrdis Knowledge Base
=Internal=
* [[Python_Introspection#Module_Internal_Representation_and_Introspection|Python Introspection]]
* [[Python_Language_Modularization#Module_Internal_Representation_and_Introspection|Python Language Modularization]]
=Overview=


=The <tt>module</tt> Class=


All [[Python_Language_Modularization#Module_Internal_Representation_and_Introspection|module]] and [[Python_Language_Modularization#Package_Internal_Representation_and_Introspection|package]] instances are represented internally as instances of the <code>module</code> class.
 
==Checking whether an Object Instance is a Module==
<syntaxhighlight lang='py'>
import types
import mymodule
 
assert isinstance(mymodule, types.ModuleType)
assert not isinstance('something', types.ModuleType)
</syntaxhighlight>
 
==Attributes==
 
All objects declared at the top level of the module (variables, functions, classes, etc.) become attributes of the module instance and can be listed with <code>[[Python_Introspection#Introspect_Members_of_an_Object_Instance|inspect.getmembers()]]</code>.
 
{{Internal|Python_Introspection#Introspect_Members_of_an_Object_Instance|<tt>inspect.getmembers()</tt>}}
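As a small illustration of the statement above, the sketch below builds a throwaway module in memory with <code>types.ModuleType</code> and <code>exec()</code>, then lists its members. The module name <code>demo</code> and everything declared in it are made up for the example.

<syntaxhighlight lang='python'>
import inspect
import types

# Hypothetical in-memory module; 'demo' and its contents are illustrative only.
demo = types.ModuleType('demo')
exec(
    "ANSWER = 42\n"
    "def greet():\n"
    "    return 'hello'\n"
    "class Widget:\n"
    "    pass\n",
    demo.__dict__,
)

# Everything declared in the module body is now an attribute of the module object.
members = dict(inspect.getmembers(demo))

# getmembers() accepts an optional predicate to narrow the listing.
classes = [name for name, _ in inspect.getmembers(demo, inspect.isclass)]
functions = [name for name, _ in inspect.getmembers(demo, inspect.isfunction)]
</syntaxhighlight>

Note that <code>members</code> also contains the special attributes described below, not only the names declared in the module body.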
 
Additionally, the following special attributes are present:
 
===<tt>__name__</tt>===
The <code>__name__</code> special variable contains the name under which the module was imported, as a string.
<syntaxhighlight lang='python'>
import mymodule
assert mymodule.__name__ == 'mymodule'
</syntaxhighlight>
When the module is executed directly with <code>python mymodule.py</code>, it is not imported: it runs as the top-level script, and <code>__name__</code> is set to the string <code>"__main__"</code>.
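The two cases can be compared side by side. The sketch below is only an approximation of what the interpreter does: it writes a hypothetical <code>mymodule.py</code> to a temporary directory, loads it once through <code>importlib</code> under the name <code>mymodule</code>, and runs it once with <code>runpy.run_path()</code>, which stands in for <code>python mymodule.py</code>.

<syntaxhighlight lang='python'>
import importlib.util
import runpy
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical module file; it records the __name__ it sees while executing.
    source = Path(tmp) / 'mymodule.py'
    source.write_text("seen_name = __name__\n")

    # Case 1: load it the way an import would, under the name 'mymodule'.
    spec = importlib.util.spec_from_file_location('mymodule', source)
    imported = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(imported)

    # Case 2: run it as a script; run_path() returns the resulting module globals.
    script_globals = runpy.run_path(str(source), run_name='__main__')

imported_name = imported.seen_name
script_name = script_globals['seen_name']
</syntaxhighlight>

This difference is what the common <code>if __name__ == '__main__':</code> guard relies on.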
 
===<tt>__file__</tt>===
Once the module is imported, the file it was loaded from can be read from the module object's <code>__file__</code> attribute, as a string:
<syntaxhighlight lang='python'>
import mymodule
[...]
print(mymodule.__file__)
</syntaxhighlight>
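For a top-level module, the directory portion of <code>__file__</code> is normally one of the directories in <code>[[Python_Language_Modularization#sys.path|sys.path]]</code>. For a module that lives inside a package, it is a subdirectory of one of them. Built-in modules such as <code>sys</code> have no <code>__file__</code> attribute.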
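A minimal sketch of the relationship between <code>__file__</code> and <code>sys.path</code>, assuming a hypothetical top-level module <code>file_demo_module</code> written to a temporary directory that is added to <code>sys.path</code> for the duration of the example:

<syntaxhighlight lang='python'>
import importlib
import os
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical top-level module written to a temporary directory.
    (Path(tmp) / 'file_demo_module.py').write_text("VALUE = 1\n")
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        module = importlib.import_module('file_demo_module')
        module_dir = os.path.dirname(module.__file__)
        # Check whether the module's directory is one of the sys.path entries.
        found_on_sys_path = any(
            os.path.isdir(entry) and os.path.samefile(entry, module_dir)
            for entry in sys.path
        )
    finally:
        # Undo the changes made for the example.
        sys.path.remove(tmp)
        sys.modules.pop('file_demo_module', None)

# The built-in sys module is compiled into the interpreter and has no __file__.
builtin_has_file = hasattr(sys, '__file__')
</syntaxhighlight>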
===<tt>__doc__</tt>===
The content of the [[Python_Language_Modularization#Module_Docstring|module docstring]], if declared, otherwise <code>None</code>.
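Both cases can be seen without any files on disk. In the sketch below, the two module objects are created directly with <code>types.ModuleType</code>, whose optional second argument is the docstring. The names <code>documented</code> and <code>undocumented</code> are illustrative only.

<syntaxhighlight lang='python'>
import types

# Hypothetical module objects, created in memory for the example.
documented = types.ModuleType('documented', 'Utilities for the demo.')
undocumented = types.ModuleType('undocumented')

doc_text = documented.__doc__      # the docstring, as a string
missing_doc = undocumented.__doc__  # None, because no docstring was given
</syntaxhighlight>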
 
===<tt>__cached__</tt>===
===<tt>__loader__</tt>===
===<tt>__spec__</tt>===
 
===<tt>__package__</tt>===
An empty string for a top-level module. For a package, it is the package's own name. For a module loaded as part of a package, it is the name of the containing package.
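The three cases can be checked against standard library modules. This is a sketch that assumes a regular CPython installation; <code>colorsys</code>, <code>json</code> and <code>json.decoder</code> were chosen only because they are a top-level module, a package, and a module inside a package, respectively.

<syntaxhighlight lang='python'>
import colorsys          # a top-level standard library module
import json              # a standard library package
import json.decoder      # a module inside that package

top_level = colorsys.__package__       # empty string
package = json.__package__             # the package's own name
submodule = json.decoder.__package__   # the name of the containing package

# The same information is available from the module spec.
submodule_parent = json.decoder.__spec__.parent
</syntaxhighlight>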
 
===<tt>__path__</tt>===
The <code>__path__</code> attribute exists only for <code>module</code> instances that represent [[Python_Language_Modularization#Package|packages]], not for those instances that represent ordinary [[Python_Language_Modularization#Modules|modules]].
 
<code>__path__</code> contains a '''list''' of the package root directories, where the component modules, [[Python_Language_Modularization#Subpackages|subpackages]], <code>[[Python_Language_Modularization#init_.py|__init__.py]]</code> and <code>[[Python_Language_Modularization#main_.py|__main__.py]]</code> live.
 
To check whether <code>__path__</code> exists, use <code>[[Python_Introspection#hasattr.28.29|hasattr()]]</code>.
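A minimal sketch of that check, using standard library modules as test subjects. The helper name <code>is_package</code> is hypothetical and not part of any library.

<syntaxhighlight lang='python'>
import colorsys
import json
import json.decoder
import types


def is_package(module: types.ModuleType) -> bool:
    """Hypothetical helper: a module object represents a package if it has __path__."""
    return hasattr(module, '__path__')


results = {
    'json': is_package(json),                  # a package
    'json.decoder': is_package(json.decoder),  # a module inside a package
    'colorsys': is_package(colorsys),          # a top-level module
}

# For a package, __path__ lists the directories searched for its submodules.
search_dirs = list(json.__path__)
</syntaxhighlight>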
 
=Dynamic Module Tree Traversal and Class Loading=
<syntaxhighlight lang='python'>
import importlib
import inspect
import types
from pathlib import Path
from typing import Callable, Optional


class IllegalStateError(RuntimeError):
    """Project-specific exception, defined here so that the example is self-contained."""


def find_class(module_or_package: types.ModuleType, predicate: Callable[[str], bool]) -> Optional[type]:
    """
    If given a module, look for a class whose name, as string, satisfies the predicate and return the first match.
    If given a package, recursively load the modules while descending in the package structure, look for a class whose name, as string, satisfies
    the predicate and return the first match.
    :param module_or_package: the module or the package instance. Must be imported by the calling layer.
    :param predicate: a callable that examines the class name, as string, and returns True if the class is acceptable, False otherwise.
    :return: the class, or None if no such class exists
    """
    if not isinstance(module_or_package, types.ModuleType):
        raise TypeError(f'invalid module: {module_or_package}')
    if not callable(predicate):
        raise TypeError(f'invalid predicate: {predicate}')
    if not hasattr(module_or_package, '__path__'):
        # module: getmembers() also returns classes the module imported from elsewhere
        module = module_or_package
        for name, value in inspect.getmembers(module):
            # a class, apply the predicate
            if isinstance(value, type) and predicate(name):
                return value
    else:
        # package
        package = module_or_package
        for p in package.__path__:
            path = Path(p)
            if not path.is_dir():
                raise IllegalStateError(f'package {package.__name__} path not a directory: {p}')
            file_names = []
            dir_names = []
            # sorted, so that the search order does not depend on the file system
            for f in sorted(path.iterdir()):
                # skip __init__.py, __main__.py, __pycache__ and similar
                if f.name.startswith('__'):
                    continue
                if f.is_file() and f.suffix == '.py':
                    file_names.append(f.stem)
                elif f.is_dir():
                    dir_names.append(f.name)
            # Process modules first, to favor classes declared closest to the top
            for name in file_names + dir_names:
                # module or sub-package: import it relative to the package and proceed recursively
                child = importlib.import_module(f'.{name}', package.__name__)
                cls = find_class(child, predicate)
                if cls:
                    return cls
    return None
</syntaxhighlight>
 
Usage:
<syntaxhighlight lang='python'>
def predicate(class_name: str) -> bool:
    return class_name == 'SomeClass'
 
import test_package
cls = find_class(test_package, predicate)
</syntaxhighlight>
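The standard library's <code>pkgutil.walk_packages()</code> performs a similar traversal and can serve as an alternative to walking the directories by hand. The sketch below is not the function above but a different technique. It assumes a hypothetical package <code>demo_pkg</code> created in a temporary directory, and the generator name <code>iter_classes</code> is made up for the example. Unlike <code>find_class()</code>, it skips classes that a module merely imported, and it does not examine the root package's own <code>__init__.py</code>.

<syntaxhighlight lang='python'>
import importlib
import inspect
import pkgutil
import sys
import tempfile
from pathlib import Path


def iter_classes(package):
    """Yield (module name, class) for the classes defined in the package's modules."""
    # walk_packages() imports sub-packages as it descends and yields every module found.
    for info in pkgutil.walk_packages(package.__path__, prefix=package.__name__ + '.'):
        module = importlib.import_module(info.name)
        for _, cls in inspect.getmembers(module, inspect.isclass):
            # Keep only classes defined in this module, not imported into it.
            if cls.__module__ == module.__name__:
                yield module.__name__, cls


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: demo_pkg/__init__.py, demo_pkg/a.py, demo_pkg/sub/b.py
    root = Path(tmp) / 'demo_pkg'
    (root / 'sub').mkdir(parents=True)
    (root / '__init__.py').write_text('')
    (root / 'a.py').write_text('class Alpha:\n    pass\n')
    (root / 'sub' / '__init__.py').write_text('')
    (root / 'sub' / 'b.py').write_text('class Beta:\n    pass\n')

    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        demo_pkg = importlib.import_module('demo_pkg')
        found = sorted((name, cls.__name__) for name, cls in iter_classes(demo_pkg))
    finally:
        # Undo the changes made for the example.
        sys.path.remove(tmp)
        for name in [n for n in sys.modules if n == 'demo_pkg' or n.startswith('demo_pkg.')]:
            del sys.modules[name]
</syntaxhighlight>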

Latest revision as of 21:17, 4 January 2023
