Python Language

From NovaOrdis Knowledge Base

External

Internal

TODO

Expand the following sections and subjects:

Overview

Python is a general-purpose, high-level, dynamically-typed language. Its design makes it very readable. Its relative terseness makes it possible to write a program that is much smaller than the equivalent static language program. However, if the program is CPU-bound, a program written in C, C++ or Java will generally run faster than its Python equivalent.

A Python installation contains the core language support (built-ins) and the standard libraries.

Python programs can be executed in two modes: interactively via an interpreter, also called a shell, or stored into a file with the usual, but optional .py extension and run by typing python followed by the file name.

In Python, everything is an object. This includes numbers, strings, tuples, lists, dictionaries, functions and programs. The ID of any object can be obtained with the built-in function id(). The definition of an object is called a class.
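A minimal sketch of the statement above, using only the built-ins id() and type(). The helper name describe and the sample list are illustrative assumptions, not part of the original page:

```python
def describe(obj):
    """Return the class name and the identity of any object."""
    return type(obj).__name__, id(obj)

# Numbers, strings, containers, functions and classes are all objects.
objects = [42, 'text', (1, 2), [1, 2], {'k': 'v'}, describe, int]
for obj in objects:
    type_name, identity = describe(obj)
    print(type_name, identity)

type_names = [describe(obj)[0] for obj in objects]
```

The identities printed differ from run to run; only the type names are stable.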

In Python, spacing matters: Python uses white space to define program structure. Sequential statements are indented at the same level, and such a group is known as an indented block.

Printing

Printing is done with the print() function in Python 3 (it used to be a statement in Python 2).

print('something')

Also see:

Printing to stdout in Python

Comments

# This is a comment
s = 0 # This is also a comment

Everything from # to the end of the line is a comment. Python does not have a multi-line comment.

Code Line Continuation

Python encourages short lines. PEP 8 recommends a maximum line length of 79 characters. To continue a line on the next line, use a backslash:

s = 'A' + \
 'B' + \
 'C'

Explicit line continuation is needed only if a Python expression spans multiple lines outside parentheses, brackets or braces; inside them, lines continue implicitly.
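An expression enclosed in parentheses, brackets or braces can span several lines without a backslash. A minimal sketch, with illustrative variable names:

```python
# Implicit continuation: no backslash is needed inside parentheses.
s = ('A' +
     'B' +
     'C')

# The same holds inside brackets and braces.
numbers = [
    1,
    2,
    3,
]
print(s, numbers)
```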

Reserved Words

Reserved words, or keywords, can only be used to mean the thing Python expects them to mean. They cannot be used as variable names, function names, class names, or identifiers.

try except finally raise assert
while for break continue in
False True None
if else elif
def return lambda
and or not
import from
with as
nonlocal yield
pass
is
del
class
global

Also see:

Keywords

The as Reserved Word

Used:

Literals

Literals have a type.

Integer Literals

String Literals

Strings | Quotes

F-String

F-String

Constants

Constants are fixed values, they do not change throughout the program. Constants can be boolean (True, False), numeric (integers or floating point numbers), or strings, which can be single quoted or double quoted, or even "the absence of a value" (None). Constants can be assigned to variables, can be arguments of functions. Constants have a type.

Constant Variable

Python does not have anything equivalent to final in Java, so any variable can be modified after assignment. You can use:

CONST_NAME = "Name"

for clarity, but nothing prevents CONST_NAME from being assigned another value later in the program.

Single Quoted, Double Quoted, Triple Single Quoted, Triple Double Quoted

TODO

Multi-Line Strings

TODO, see Python Language Functions#Docstring

Triple Double-Quoted Strings

Triple double-quoted strings are the recommended style for docstrings.

s = """
This is a
  multi-line
  string
"""

Variables

Variables are names associated with memory locations used to store values. Variables and identifiers are equivalent. Variables are declared and assigned a value through an assignment statement. The assignment does not copy a value, it just attaches the name to the object that contains the data. A variable may be associated with an object that has a type, and the same variable may later be assigned an object of a different type. A useful mental representation of a variable is a sticky note that can be attached to an object, and then re-attached to a different object, not necessarily of the same type.

a = 1
b = 'something'
print(a)
print(b)
a = 'something else'
print(a)
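The "sticky note" model can be observed directly when two names are attached to the same mutable object. A short sketch, with illustrative names:

```python
a = [1, 2, 3]
b = a                # b is attached to the same list object, no copy is made
b.append(4)
print(a)             # the change is visible through both names
same_object = a is b

a = 'now a string'   # a is re-attached to an object of a different type
print(b)             # b still refers to the list
```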

Also see:

Variables, Parameters, Arguments

Variable Naming Rules

Variable names are case sensitive. Variable names can start with a letter or an underscore ('_'), but a leading underscore should generally be avoided because Python treats names that begin with an underscore in special ways, and tends to use them for its internal purposes. The rest of the variable name can consist of lowercase letters (a through z), uppercase letters (A through Z), digits (0 through 9) and underscores. Python 3 also accepts non-ASCII letters, but no punctuation or other symbols are allowed. Python has a set of words, called reserved words, that cannot be used as variable names. Variable names should be sensible (mnemonic). Function names follow the same rules.

Variables declared in functions should be lowercase. For more details see:

Python Style

Leading Underscore Variable Names

See:

Two Underscore (__) Variables

Names that begin and end with two underscores (__) are reserved for use within Python, for built-in variables.

__name__

__name__ is a built-in variable which evaluates to the name of the current module and it is set differently depending on whether the module is imported or executed. For more details see:

Distinguishing between Importing and Executing a Module
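The common idiom built on __name__ can be sketched as follows. The function name main and its return value are hypothetical, chosen only for illustration:

```python
def main():
    """Hypothetical entry point, used only for illustration."""
    return 'running as a program'

# __name__ is '__main__' when the file is executed directly,
# and the module name when the file is imported.
if __name__ == '__main__':
    print(main())
```

When the file is imported, the guarded block is skipped and only the definitions become available to the importer.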

__package__

https://docs.python.org/3/reference/import.html#package__
https://www.python.org/dev/peps/pep-0366

__package__ is a module-level attribute, which must be set. Its value must be a string, but it can be the same value as its __name__. When the module is a package, its __package__ value should be set to its __name__. When the module is not a package, __package__ should be set to the empty string for top-level modules, or for submodules, to the parent package’s name. When it is present, relative imports will be based on this attribute rather than the module __name__ attribute.

__file__

Contains the path of the file the module was loaded from.

__doc__

__loader__

__spec__

__annotations__

__builtins__

Variables Namespace and Scope

A namespace is a collection of names that can be used to access various underlying Python objects, such as data instances, functions, classes, etc. Multiple namespaces can co-exist at a given time, but they are isolated from each other. Within a namespace, a particular name is unique. A Python program has three kinds of namespaces: built-in, global and local.

Python Namespaces.png

A module also has a global namespace.

A scope is the portion of a program where a namespace can be accessed directly without any prefix. At any given moment, there are at least three nested scopes: the scope of the current function, which has local names, the scope of the module, which has global names, and the outermost scope, which has built-in names. When a reference is made inside a function, the name is searched first in the local namespace, then in the global namespace and finally in the built-in namespace. If there is a function inside another function, the embedded function's local scope is nested in the upper level function's local scope.

The variables declared in enclosing scopes can be read without any prefix, but they cannot be rebound from an inner scope unless the name is declared there with global or nonlocal. Only the variables in the same scope can be assigned directly. If we try to assign to a variable from an enclosing scope without such a declaration, a local variable with the same name is created and that variable is written. For more details see Interaction between Local and Global Variables, below.
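For nested functions, the reserved word nonlocal plays the role that global plays for module-level names. A minimal sketch, in which make_counter, increment and peek are illustrative names:

```python
def make_counter():
    count = 0  # local to make_counter, enclosing scope for the inner functions

    def increment():
        nonlocal count  # rebind the name in the enclosing function scope
        count += 1
        return count

    def peek():
        return count    # reading an enclosing name needs no declaration

    return increment, peek

increment, peek = make_counter()
increment()
increment()
print(peek())
```

Without the nonlocal declaration, the statement count += 1 would make count local to increment and fail with UnboundLocalError.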

Built-in Namespace

The built-in namespace contains all the built-in Python runtime names, such as those of the built-in functions. This namespace is created when the Python interpreter is started and exists for as long as the interpreter runs. The built-in namespace is the reason that built-in functions like id() and print() are always available in any part of the program, such as a module, a function or a class.

Global Namespace and Variables

Each module creates its own global namespace for its variables, functions and classes. Since the namespaces are isolated, the same name may exist in different modules and it does not collide. The variables defined in the global namespace, outside any function or class, are known as global variables. Global variables can be used by everyone within the module, both inside and outside of functions. The global variables are maintained by Python in the module's global symbol table.

Global Symbol Table and globals()

The Python runtime maintains a global symbol table, which is a data structure that contains variable names and their association with objects in the global namespace of a module. The global symbol table contains all variables that are not associated with any class or function. The built-in function globals() returns a dictionary of the contents of the global symbol table. Assuming that the main program contains a some_func function declaration and a SomeClass class declaration, the content of the global symbol table of the corresponding module is similar to:

__name__        = '__main__'
__doc__         = None
__package__     = None
__loader__      = <_frozen_importlib_external.SourceFileLoader object at 0x1090d5fa0>
__spec__        = None
__annotations__ = {}
__builtins__    = <module 'builtins' (built-in)>
__file__        = './main.py'
__cached__      = None
some_func       = <function some_func at 0x101dd4310>
SomeClass       = <class '__main__.SomeClass'>

Local Namespaces and Variables

Each function and class defines its own namespace. A function's local namespace is created when the function is called, and a class namespace is created when the class definition is executed. Variables defined inside a function are distinct from the variables with the same name defined in the main program, and they are known as local variables. However, the global variables are accessible within functions. Reading a global variable from a function leaves it unchanged: it remains global and keeps its original value. Special rules apply when a local variable has the same name as an existing global variable. In this case we say that the local variable temporarily shadows the global variable, and this interaction is described in the Interaction between Local and Global Variables section.

Local Symbol Table and locals()

Each function and class contains a local symbol table, which maintains the names of the variables in the local namespace. The built-in function locals() returns a dictionary of the contents of the local namespace.

Get Names in a Namespace with dir()

The built-in function dir() returns a list of the names defined in a namespace. Without arguments, it produces an alphabetically sorted list of names in the current local symbol table. When given an argument that is the name of an imported module, dir() lists the names defined in the imported module.
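A short sketch of locals() and dir() in use. The function some_func and its variables are illustrative, and the math module is used only as a convenient standard-library example:

```python
import math

def some_func(x):
    y = x * 2
    # locals() returns a dictionary of the names in the local namespace
    return locals()

local_names = some_func(5)
print(local_names)

# dir() with a module argument lists the names defined in that module
print('sqrt' in dir(math))
```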

Interaction between Local and Global Variables

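If a global variable is read inside a function and the same function also assigns to that name, the program raises an UnboundLocalError. An assignment anywhere in the function body makes the name local for the whole function, so the earlier read refers to a local variable that has no value yet: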

a = 10

def some_func():
    print('a:', a)
    a = 11

some_func() # this throws UnboundLocalError
Traceback (most recent call last):
  File "./main.py", line 8, in <module>
    some_func()
  File "./main.py", line 5, in some_func
    print('a:', a)
UnboundLocalError: local variable 'a' referenced before assignment

If a variable with the same name as a global variable is modified without accessing the global variable first, this implicitly declares the local variable with the same name, and the program works fine:

a = 10

def some_func():
    a = 11
    print('a:', a)

some_func()

This will display:

a: 11

To explicitly indicate that we want to use the global variable inside the function, use the global reserved word. This will tell the interpreter that we want to use the global variable, which then can be modified inside the function:

a = 10

def some_func():
    global a
    print("global 'a' from within the function:", a)
    a = 11

some_func()
print("global 'a' after function call:", a)

If global is not explicitly declared inside a function, Python uses the local namespace and the variable is implicitly local. Local variables are created during the function execution and discarded after the function completes.

References

See:

Variables, Parameters, Arguments | Reference

Identifiers

An identifier is a name given to objects. Everything in Python is an object, so the name is the way to access the underlying object. An identifier and a variable are equivalent.

Type

Python is a dynamically typed language, in that a variable may contain instances of different types, at different moments in the execution of the program. Everything in Python is implemented as an object, and any object has a type. The type determines whether the data value of the object is mutable or immutable (constant). Python is strongly typed, which means that the type of an object does not change, even if the value is mutable.

The type of an object can be obtained with the built-in function type() applied to the variable or a constant the object is assigned to. In Python, "class" and "type" mean pretty much the same thing.
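As an illustration, type() can be applied to constants, variables and even classes. The variable names below are illustrative:

```python
print(type(1))        # <class 'int'>
print(type('a'))      # <class 'str'>

x = [1, 2]
type_before = type(x)
x.append(3)           # the value changes, the type does not
type_after = type(x)
print(type_before is type_after)

print(type(int))      # classes are themselves objects, of type 'type'
```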

Data Types

None

x = None
type(x)
<class 'NoneType'>

None is a special Python value that holds a place when there is nothing to say. It is not the same as False, although it looks false when it is evaluated as a boolean. None can be used with the is or is not operators:

if x is None:
  ...
if x is not None:
  ...

None is returned by a function that does not contain the return statement. None is useful to distinguish a missing value from an empty value. Zero-value integers or floats, empty strings (''), empty lists [], empty tuples (), empty dictionaries {} and empty sets set() all evaluate to False in a boolean context, but are not equal to None.
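The difference between "missing" and "empty" can be sketched as follows. The function find_user and the users dictionary are hypothetical, introduced only for this example:

```python
falsy_values = [0, 0.0, '', [], (), {}, set()]
for value in falsy_values:
    assert not value          # evaluates to False in a boolean context
    assert value is not None  # but it is not None

def find_user(name, users):
    """Hypothetical lookup: returns None when the name is missing."""
    return users.get(name)

users = {'alice': ''}
print(find_user('alice', users) is None)  # False: present, but empty
print(find_user('bob', users) is None)    # True: missing
```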

None is a singleton: every name bound to None refers to the same object, so the names compare equal:

a = None
b = None
assert a == b
assert None == None # this equality statement is singled out as static validation warning

Booleans

Python Boolean

Numbers

Integers

Python Integers

Floating Point Numbers

Numbers with a decimal point.

x = 98.6
type(x)
<class 'float'>

Floats can be used with operators (+, -, *, /, //, **, % and divmod() function).

NaN

NaN, standing for "Not a Number", is a special floating-point value that represents missing or undefined values in Python. NaN values can be tested with math.isnan() Python function or Pandas isnull() function. To handle NaN values, you can use Pandas fillna() function or dropna() function.
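A minimal sketch using only the standard library (the Pandas functions mentioned above are not shown). It illustrates why math.isnan() is needed: NaN does not compare equal to anything, including itself.

```python
import math

nan = float('nan')
print(nan == nan)        # False: NaN is not equal to anything, itself included
print(math.isnan(nan))   # True

values = [1.0, nan, 3.0]
clean = [v for v in values if not math.isnan(v)]
print(clean)
```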

Sequence Types

There are three kinds of sequence types: strings, tuples and lists. A string is a sequence of characters. Tuples and lists both contain zero or more elements, and in both cases the elements can be of different types. Strings and tuples are immutable. Lists are mutable. Mutability matters when the objects are stored in sets or as dictionary keys, because the collections are hashed in that case. If hashing a collection is not a concern, the rule of thumb is that fixed-size records of different objects are best represented as tuples, while variable-size collections of similar objects are best represented as lists.
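The hashing constraint can be seen when a tuple and a list are used as dictionary keys. A short sketch with illustrative names:

```python
point = (1, 2)                 # tuple: immutable, hashable
visited = {point: 'start'}
print(visited[(1, 2)])

error = None
try:
    visited[[1, 2]] = 'fails'  # list: mutable, not hashable
except TypeError as e:
    error = str(e)
    print(error)
```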

String

String

List

List

Tuple

Tuple

Dictionary

Dictionary

Set

Set

Dataclass

Dataclass

Function

Functions

Built-in Types that Aren't Directly Accessible as a Builtin

See types.py:

import types

print(types.FunctionType)

Type Conversions

Explicit Type Conversions

There are built-in functions that can be used for type conversion:

float()

>>> float('1.0e4')
10000.0

int()

int() can be called on a boolean, float or on a string. For a boolean, int(True) will return 1 and int(False) will return 0.

If the string int() is invoked on cannot be converted to an integer, the function invocation raises a ValueError.
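The conversions described above can be sketched as follows. The helper to_int and its default parameter are hypothetical, added only to show how the error raised for a non-numeric string can be handled:

```python
print(int(True), int(False))   # 1 0
print(int(98.6))               # 98: the fractional part is truncated
print(int('42'))               # 42

def to_int(text, default=None):
    """Hypothetical helper: return default when text is not an integer."""
    try:
        return int(text)
    except ValueError:
        return default

print(to_int('forty-two'))
```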

str()

Convert other Data Types to Strings with str()

Implicit Type Conversions

If numeric types are mixed, Python will automatically convert them:

>>> 4 + 7.0
11.0

Data Structures

A Python data structure is, for example, what you get when you parse a JSON-serialized text and you recreate the lists and the maps in memory.

Collections

Organizatorium: Python Module collections

Iterable Types

Iterable types include: string, list, tuple, set and dictionary.

How to Tell whether an Instance is Iterable or Not

Method 1:

if type(i) is str or type(i) is list or type(i) is tuple or type(i) is set:
    ...

Method 2:

try:
    iter(i)
except TypeError:
    print('not iterable')
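A third option, sketched here as an assumption rather than as part of the original page, is to test against the abstract base class collections.abc.Iterable. The helper name is_iterable is illustrative:

```python
from collections.abc import Iterable

def is_iterable(obj):
    """Return True if obj declares itself iterable via __iter__."""
    return isinstance(obj, Iterable)

print(is_iterable('abc'), is_iterable([1, 2]), is_iterable({'a': 1}))
print(is_iterable(5))
```

This check does not detect objects that are iterable only through __getitem__, so calling iter() as in Method 2 remains the most general test.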

Iterator

Python Iterators

Iterate Multiple Sequences with zip()

IPy Iterate Multiple Sequences with zip() Page 83.

Iterate over Code Structures with itertools

IPy Iterate over Code Structures with itertools Page 121.

Comprehensions

Comprehensions

Statements

In Python 2, print used to be a statement, while in Python 3, print() is a function.

Assignment Statement

The assignment statement assigns a value to a variable. The assignment does not copy the value, it just attaches the variable name to the object that contains the data.

x = 1

The assignment statement accepts expressions:

x = x + 1

Import Statement

See:

import Statement

Statement that Does Nothing

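pass is a statement that does nothing. It is used where a statement is syntactically required but no action is needed, for example as the body of a function that does nothing: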

def do_nothing():
  pass

Expressions

Numeric expressions. Order of evaluation takes into account operator precedence.

Operators

+ Addition For numbers, adds them together, for strings, it concatenates. + can be combined with the assignment operator: +=
- Subtraction - can be combined with the assignment operator: -=
* Multiplication * can be combined with the assignment operator: *=
/ Floating Point Division In Python 3 integer division converts to floating point (not the case in Python 2, which truncates). / can be combined with the assignment operator: /=
// Floor Division Divides and rounds the result down to the nearest whole number (toward negative infinity, so -7 // 2 is -4).
** Power (exponentiation)
% Remainder (modulo)
= Assignment Expression on the right side of = is calculated first, then assigned to the variable on the left side. See Assignment Statement.
< Less than
<= Less than or Equal to
== Equal to Applies to strings, also. It is the mathematical equality. Also see is, is not.
>= Greater than or Equal to
> Greater than
!= Not equal
is "is the same as" Returns True or False. It tests identity, meaning that both operands are the same object, so it is a stronger condition than "==", which compares values. Do not use "is" where "==" is intended. "is" is usually applied to True, False or None.
is not "is not the same as" Returns a True or a False
in The membership operator
:= The "walrus operator", introduced in Python 3.8. Assigns values to variables as part of a larger expression: if something_else := something == 'blue': [...]
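The difference between == and is can be sketched with two lists that hold the same values. The variable names are illustrative:

```python
a = [1, 2, 3]
b = [1, 2, 3]
c = a

print(a == b)   # True: same value
print(a is b)   # False: two distinct objects
print(a is c)   # True: two names for the same object
```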

Multiple Comparisons in a Single Expression

x = 10
y = 20
assert 0 < x < y < 30

Boolean (Logical) Operators

and Logical AND
or Logical OR
not Logical NOT

Membership with in

Membership in a collection can be checked with the in operator. In the case of a dictionary, in checks the existence of a key:

# the names str, list and dict are avoided because they would shadow built-ins
s = 'abc'
li = ['a', 'b', 'c']
tu = ('a', 'b', 'c')
d = {'a':'A', 'b':'B', 'c':'C'}
st = set()
st.add('a')
st.add('b')
assert 'a' in s
assert 'a' in li
assert 'a' in tu
assert 'a' in d
assert 'a' in st

The membership operator is commonly used in for loops.

Ternary Operator

https://book.pythontips.com/en/latest/ternary_operators.html
value_if_true if condition else value_if_false
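A minimal sketch of the conditional expression in use. The function name label is illustrative:

```python
def label(n):
    # value_if_true if condition else value_if_false
    return 'even' if n % 2 == 0 else 'odd'

print(label(4), label(7))
```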

Operator Precedence

The following rules apply, and they are specified in the order of their descending precedence:

  • Parentheses are always respected.
  • Exponentiation.
  • Multiplication, division and remainder.
  • Addition and subtraction.
  • For operators with the same precedence, proceed left to right.
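The rules above can be checked with a few small expressions. The last line shows one exception worth knowing: consecutive exponentiations group right to left.

```python
a = 2 + 3 * 4        # multiplication first: 2 + 12
b = (2 + 3) * 4      # parentheses first: 5 * 4
c = 2 * 3 ** 2       # exponentiation first: 2 * 9
d = 10 - 4 - 3       # same precedence, left to right: (10 - 4) - 3
e = 2 ** 3 ** 2      # exception: ** groups right to left: 2 ** 9
print(a, b, c, d, e)
```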

Control Flow

"We can solve problems in a way far more easily with clever data structures than with clever control flow. Control flow is obvious and data structures are subtle. So by making clever data structures, your control flow is simplified." (Dr. Charles Severance)

Sequential Steps

Sequential steps have the same indentation level. A block with the same indentation level (recommended 4 spaces) designates a set of steps that execute sequentially.

Conditional Steps

if, elif and else are Python statements that check whether a condition is True.

if expression:
  ...
elif expression:
  ...
else:
  ...

if x < 10:
  print('something')
if a == 1:
  print('something')
else:
  print('something else')

In the following case, once one of the alternatives is triggered, the corresponding block is the only one that is executed, and the control gets out of the if statement. else is optional.

if a < 0:
  print('m')
elif a < 10:
  print('n')
elif a < 20:
  print('p')
else:
  print('q')

Loops and Iterations

Indefinite Loops

while

while <condition>:
  code-block

n = 5
while n > 0:
  print(n)
  n = n - 1

x = 0
while x < 5:
  x = x + 1
  print(x)

Loops have iteration variables, which are initialized, checked and changed within the loop. If the iteration variable that matters does not change within the loop, the loop will run forever - an infinite loop.

for i in range(5):
  print(i)

Definite Loops

for

A for loop is finite, it goes through all elements of a collection: all the lines in a file, all the items in a list, all the characters in a string, all the keys in a dictionary, etc. As part of the for syntax, the iteration variable follows the reserved word for, which is followed by the reserved word and operator in, which is then followed by a collection, which can be declared in-line or using a previously declared variable. The iteration variable iterates through the sequence (ordered set) and takes, in order, each value in the sequence. The statements to be executed in the loop are part of an indented block. The body is executed once for each value in the sequence.

for var_name in <collection>:
  code-block

for i in [1, 2, 3, 4, 5]:
  print(i)

collection = ['a', 'b', 'c']
for i in collection:
    print(i)

⚠️ If collection is None, then the statement will fail with TypeError:

TypeError: 'NoneType' object is not iterable

In case of the dictionary, the iteration is done over the dictionary's keys (or its keys() function). For more details about iterating over a dictionary's keys, values and both, see:

Iterate over a Dictionary

for can also be used to iterate over multiple sequences at the same time, using the zip() function.
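A short sketch of zip() in a for loop. The lists days and temps are made-up sample data, and temps is deliberately one element longer to show that zip() stops at the shortest sequence:

```python
days = ['Mon', 'Tue', 'Wed']
temps = [20, 22, 19, 25]   # one extra element, ignored by zip()

pairs = []
for day, temp in zip(days, temps):
    pairs.append((day, temp))
    print(day, temp)
```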

Other Loop Statements

break

break is a reserved word that indicates a statement which breaks out of the loop. When encountered, the execution goes to the first statement after the loop.

Python has a syntactical oddity that allows checking whether a while or a for loop did not exit with break:

i = 0
while i < 5:
  if i == 7:
    break
  i += 1
else:
  print('we did not exit with break')

continue

continue is a reserved word that indicates a statement which skips the current iteration and starts the next iteration. The control goes to the top of the loop.
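As an illustration, the following sketch combines continue and break in one loop. The variable names and the thresholds are arbitrary:

```python
odd_numbers = []
for i in range(10):
    if i % 2 == 0:
        continue        # skip even numbers and go to the next iteration
    if i > 7:
        break           # leave the loop at the first odd number above 7
    odd_numbers.append(i)
print(odd_numbers)
```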

Loop Idioms

Access the Index and the Element at the Same Time with enumerate()

List | Access the Index and the Element at the Same Time with enumerate()

Iterate over a List in Reversed Order with reversed()

List | Iterate over a List in Reversed Order

Adding a Comma after All but Last Element

l = ['a', 'b', 'c']
s = ', '.join(l)
assert 'a, b, c' == s

If more complex processing is needed for each element, use a list comprehension.
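A sketch of join() combined with a comprehension, for the case where each element needs processing first. The sample lists are illustrative; note that join() requires strings, so numbers are converted with str():

```python
l = ['a', 'b', 'c']
s = ', '.join([e.upper() for e in l])
print(s)

numbers = [1, 2, 3]
n = ', '.join(str(x) for x in numbers)
print(n)
```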

Functions

Also handles notions like call stack, frame, etc.

Functions

Exceptions (try/except)

Exceptions

Generate Number Sequences with range()

IPy Generate Number Sequences with range() Page 83.

range(start, stop, step)

for i in range(4, 7):
  print(i)

prints:

4
5
6

To count backwards:

for i in range(7, 4, -1):
  print(i)

prints:

7
6
5

Generators

Generators

Decorators

Decorators

Coroutines

Coroutines

Traceback

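A traceback is the report Python prints when an unhandled exception stops the program. It lists the chain of calls that led to the error, with the most recent call last.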

Modularization

Discusses standalone programs, scripts, modules, packages, importing, package metadata:

Modularization

Python Standard Library

Python comes with a large standard library of modules that perform many useful tasks. This functionality is kept separate from the core language, to avoid bloating. However, the modules are shipped as part of the Python installation, so they are available locally wherever the Python runtime is installed. Because the modules are shipped as part of the Python Standard Library, they are sometimes referred to as "standard libraries". When you are about to write some code, it's often worthwhile to first check whether there is a standard module that already does what you want.

The authoritative documentation for the modules included in the standard library is available here:

The Python Standard Library: https://docs.python.org/3/library/

An introduction to the standard library is provided by this tutorial:

Brief Tour of the Standard Library: https://docs.python.org/3.3/tutorial/stdlib.html

More documentation on the standard library is available here:

Doug Hellmann's Python Module of the Week: https://pymotw.com/2/contents.html

Python Module Index

The index of the modules shipped as part of the standard library is available here:

Python Module Index https://docs.python.org/3/py-modindex.html

Notable Python Standard Library Modules

Python Package Index PyPI

The Python Package Index (PyPI) is a repository of software for the Python programming language. PyPI helps with finding and installing software developed and shared by the Python community.

https://pypi.org

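It can be searched from the web site. The pip search command no longer works against pypi.org, because the search API it relied on has been disabled.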

Notable Packages

Object-Oriented Programming

Classes and Objects

Virtual Environment

Python Virtual Environment

Python Enhancement Proposals (PEPs)

Special Names

__main__

https://docs.python.org/3/library/__main__.html

Also see:

Distinguishing between Importing and Executing a Module

Layout and Structure of a Python Project

Layout of a Python Project

Dunder

Generic appellative given to the names enclosed within "__", for example "__main__".

Threads and Concurrency

Threads and Concurrency

with and Context Manager

Context Manager

Python Version

This is how to check the interpreter version at runtime:

import sys

print(sys.version_info)
if sys.version_info >= (3, 7):
  ...

Protocol

https://peps.python.org/pep-0544/

Code Examples

Python Code Examples