NumPy ndarray

From NovaOrdis Knowledge Base

Internal

Overview

ndarray is NumPy's N-dimensional array object: a fast, flexible container for large datasets in Python. It is used to implement a Pandas Series. It allows mathematical operations on whole blocks of data using syntax similar to the equivalent operations between scalar elements. It also allows applying the same mathematical operation, or function, to all array elements without writing loops. Examples are provided in the Array Arithmetic section.
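As a minimal sketch of the whole-array operations described above (the array values and variable names are illustrative, not taken from the original page):

```python
import numpy as np

# Illustrative data; any numeric array behaves the same way.
a = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

# Scalar-like syntax applied to the whole array, no explicit loop.
doubled = a * 2
summed = a + a

# A function applied element-wise to every element.
roots = np.sqrt(a)

print(doubled)
print(roots)
```

Both `a * 2` and `a + a` produce a new array of the same shape as `a`; the original array is left unchanged.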

ndarrays are homogeneous: all elements of an ndarray instance have the same data type. The data type is exposed by the array's dtype attribute. The array's dimensions are exposed by the shape attribute.

ndarrays can be created by converting Python data structures, using generators, or initializing blocks of memory of specified shape with specified values. Once created, array sections can be selected with indexing and slicing syntax.

ndarray Geometry

The number of dimensions is reported by the ndim attribute.

Each array also has a shape tuple that indicates the size of each dimension. The length of the shape tuple is equal to ndim.

import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

a.ndim
2

a.shape
(2, 3)

ndarray Creation

Convert Python Data Structures with array() and asarray()

The np.array() function takes Python data structures, such as lists, lists of lists, tuples, and other sequence types, and generates an ndarray of the corresponding shape. By default, it copies the input data. For example, a two-dimensional 3 x 3 ndarray can be created by providing a list of 3 lists, each of the enclosed lists containing 3 elements:

import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
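The section title also names asarray(). A short sketch of how the two functions differ in copy behavior, assuming the input is already an ndarray (variable names are illustrative): np.array() copies by default, while np.asarray() returns the input itself when no conversion is needed.

```python
import numpy as np

source = np.array([1, 2, 3])

copied = np.array(source)      # copy=True by default: new, independent data
aliased = np.asarray(source)   # input is already an ndarray: no copy is made

source[0] = 100

print(copied[0])    # 1, the copy is unaffected
print(aliased[0])   # 100, same underlying array as source
```

When the input is a plain Python list, both functions have to build a new array, so the distinction only matters for inputs that are already arrays.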

The Python data structures provided as arguments to np.array() determine the array's geometry: nested sequences will be converted to multidimensional arrays. If the structure is irregular, for example nested lists of different lengths, recent NumPy versions raise a ValueError whose message mentions an "inhomogeneous" shape. Unless a data type is explicitly provided as an argument, array() tries to infer a good data type for the array it creates.
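A small sketch of how nesting depth maps to dimensions, and of what happens with irregular input. The sample values are illustrative, and the error behavior shown assumes NumPy 1.24 or later; older releases emitted a deprecation warning instead of raising.

```python
import numpy as np

# Nesting depth determines the number of dimensions.
one_d = np.array([1, 2, 3])
two_d = np.array([[1, 2, 3], [4, 5, 6]])
three_d = np.array([[[1], [2]], [[3], [4]]])

print(one_d.shape, two_d.shape, three_d.shape)   # (3,) (2, 3) (2, 2, 1)

# Irregular nesting: rows of different lengths.
# Assumption: NumPy 1.24 or later, where this raises ValueError;
# older versions issued a deprecation warning instead.
try:
    np.array([[1, 2, 3], [4, 5]])
except ValueError as e:
    print("rejected:", e)
```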

array(object, dtype=None, *, copy=True, order='K', subok=False, ndmin=0, like=None)

The data structure to generate the array from must be provided as the first argument.

To enforce a specific data type:

a = np.array(..., np.dtype('float64'))
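As a hedged illustration with a concrete (made-up) input list, the data type can be left to inference or passed either positionally, as above, or through the dtype keyword:

```python
import numpy as np

values = [1, 2, 3]   # illustrative input, not from the original page

inferred = np.array(values)                    # integer dtype inferred from the input
mixed = np.array([1, 2.5])                     # an int and a float are promoted to float64
forced = np.array(values, dtype=np.float64)    # dtype passed by keyword
same = np.array(values, np.dtype('float64'))   # dtype passed positionally, as in the text

print(forced)         # [1. 2. 3.]
print(forced.dtype)   # float64
```

The keyword form is usually easier to read, since dtype is the only positional parameter after the data itself.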

By Specifying Shape and Value

zeros()

To create an array of a specific shape filled with floating-point zeroes:

a = np.zeros((2, 3))

array([[0., 0., 0.],
       [0., 0., 0.]])

ones()

To create an array of a specific shape filled with floating-point ones:

a = np.ones((1, 2))

array([[1., 1.]])

empty()

numpy.empty() creates an array with the given shape without initializing the memory to any particular value. You should not rely on the values present in such an array, and you should only use the function if you intend to explicitly initialize the array.
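A minimal sketch of the intended usage pattern: allocate with empty(), then write every element before reading any of them. The shape and fill values are arbitrary choices for illustration.

```python
import numpy as np

# Allocate without initializing; the contents are arbitrary at this point.
buf = np.empty((2, 3))

# Explicitly initialize every element before reading anything back.
for i in range(buf.shape[0]):
    buf[i, :] = i + 1    # row 0 gets 1.0, row 1 gets 2.0

print(buf)
```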

arange()

The arange() function is modeled on the Python range() function. It returns a one-dimensional array populated with the same values that an equivalent range() call would produce.
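For instance, with integer arguments (the specific values are illustrative), arange() mirrors range() for both the single-argument and the start, stop, step forms:

```python
import numpy as np

a = np.arange(5)          # like range(5)
b = np.arange(2, 10, 3)   # like range(2, 10, 3): start, stop, step

print(a)   # [0 1 2 3 4]
print(b)   # [2 5 8]
```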

With Generators

Element Data Type

The dtype object describes the data type of the array. Since ndarrays are homogeneous, all elements have the same data type.

import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

a.dtype
dtype('int64')

A dtype instance representing a specific data type can be created with:

dt = np.dtype('float64')
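A brief sketch of what such a dtype instance exposes and how it can be reused when creating arrays; the zeros() call is only an illustrative consumer of the dtype.

```python
import numpy as np

dt = np.dtype('float64')

print(dt.name)       # float64
print(dt.itemsize)   # 8, bytes per element

# A dtype instance can be passed wherever a dtype argument is accepted.
z = np.zeros((2, 2), dtype=dt)
print(z.dtype == dt)   # True
```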

Array Indexing and Slicing

Array Arithmetic

Transposing Arrays

Swapping Axes

Universal Functions

Array-Oriented Programming

Conditional Logic as Array Operations

Mathematical and Statistical Operations

Sorting

Linear Algebra

File Input/Output with Arrays