NumPy Boolean Array Indexing

From NovaOrdis Knowledge Base

Overview

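Boolean indexing uses a boolean array to select elements from another array. In the simplest case the boolean array has the same shape as the array being indexed, and the elements at the positions marked True are returned.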

Boolean indexing for unidimensional arrays:

import numpy as np
a = np.array(['A', 'B', 'C', 'D'])
b = np.array([True, False, False, True])
a[b]

array(['A', 'D'], dtype='<U1')

Boolean indexing for two-dimensional arrays:

a = np.array([['A', 'B', 'C', 'D'], ['E', 'F', 'G', 'H']])
b = np.array([[True, False, True, False], [False, True, False, True]])

array([['A', 'B', 'C', 'D'],
       ['E', 'F', 'G', 'H']], dtype='<U1')
array([[ True, False,  True, False],
       [False,  True, False,  True]])

a[b]

array(['A', 'C', 'F', 'H'], dtype='<U1')

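Why did a two-dimensional array turn into a one-dimensional array? A boolean array with the same shape as the target selects individual elements, and the selected elements do not in general form a rectangular grid, so NumPy returns them as a flat array in row-major order.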
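
One way to see why the result is flat is to use a mask that picks a different number of elements from each row. The following is a minimal sketch; the mask and the variable names are illustrative and not part of the example above:

```python
import numpy as np

a = np.array([['A', 'B', 'C', 'D'], ['E', 'F', 'G', 'H']])
# Three elements are selected from the first row and one from the second,
# so the selection cannot be laid out as a rectangular 2-D array.
mask = np.array([[True, True, True, False], [False, False, False, True]])

selected = a[mask]
print(selected.shape)  # (4,)
print(selected)        # ['A' 'B' 'C' 'H']

# The elements come back in row-major order, the same order obtained
# by indexing the flattened array with the flattened mask.
same_as_flat = np.array_equal(selected, a.ravel()[mask.ravel()])
print(same_as_flat)    # True
```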

To invert the boolean array used for indexing, and so select the elements it marks False, apply the ~ operator:

a = np.array(['A', 'B', 'C', 'D'])
b = np.array([True, False, False, True])
a[~b]

array(['B', 'C'], dtype='<U1')

Boolean arrays can be combined element-wise with the & (and) and | (or) operators when indexing. Note that the Python keywords "and" and "or" do not work with boolean arrays:

a = np.array(['A', 'B', 'C', 'D'])
b = np.array([True, False, True, False])
b2 = np.array([False, False, False, True]) 
a[b | b2]

array(['A', 'C', 'D'], dtype='<U1')
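
The difference between the operators and the keywords can be sketched as follows. The array contents and variable names are illustrative; the keyword form fails because Python asks the whole array for a single truth value:

```python
import numpy as np

a = np.array(['A', 'B', 'C', 'D'])
b = np.array([True, False, True, False])
b2 = np.array([False, False, True, True])

both = a[b & b2]    # elements where both masks are True
either = a[b | b2]  # elements where at least one mask is True
print(both)         # ['C']
print(either)       # ['A' 'C' 'D']

try:
    a[b and b2]     # "and" needs one truth value for the whole array
    keyword_failed = False
except ValueError:
    # NumPy refuses: the truth value of a multi-element array is ambiguous
    keyword_failed = True
print(keyword_failed)  # True
```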

Selecting data from an array by boolean indexing always creates a copy of the data, even if the boolean array selects every element and the returned array has the same contents as the original.
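
The copy behavior can be checked with a small sketch. The all-True mask and the variable names are assumptions made for illustration:

```python
import numpy as np

a = np.array(['A', 'B', 'C', 'D'])
all_true = np.array([True, True, True, True])

c = a[all_true]   # selects every element, yet the result is still a copy
c[0] = 'Z'        # modify the copy only

print(a)          # ['A' 'B' 'C' 'D'], the original is untouched
print(c)          # ['Z' 'B' 'C' 'D']

shares = np.shares_memory(a, c)
print(shares)     # False
```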

If the boolean array does not match the shape of the target array along the dimensions it indexes, the operation raises an IndexError:

IndexError: boolean index did not match indexed array along dimension 0; dimension is 6 but corresponding boolean dimension is 5
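
A minimal way to reproduce this kind of error, assuming a six-element array and a five-element mask (both made up for illustration). The exact wording of the message may differ between NumPy versions, so the sketch only checks the exception type:

```python
import numpy as np

a = np.arange(6)                                 # 6 elements
b = np.array([True, False, True, False, True])   # only 5 elements

try:
    a[b]
    raised = False
except IndexError as e:
    raised = True
    print(e)  # message text depends on the NumPy version

print(raised)  # True
```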

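The boolean arrays used in boolean indexing can be generated with vectorized comparisons, which apply a comparison operator to every element of an array and return a boolean array of the same shape.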
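
As a hedged illustration, the sketch below builds masks from comparisons on a small numeric array. The values and variable names are made up for the example:

```python
import numpy as np

values = np.array([1, 5, 3, 8])

mask = values > 3      # compares every element against 3
print(mask)            # [False  True False  True]

large = values[mask]
print(large)           # [5 8]

# Comparisons can be combined with & and |; the parentheses are needed
# because & binds more tightly than the comparison operators.
in_range = values[(values > 1) & (values < 8)]
print(in_range)        # [5 3]
```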

Boolean arrays can be mixed with slices and indices when indexing (TODO).
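
One possible illustration of mixing, assuming a 3x4 integer array and a one-dimensional boolean array that selects rows (both hypothetical): the boolean array is applied to the first axis, while a slice or an integer index is applied to the second.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
# a is:
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]
rows = np.array([True, False, True])   # one flag per row

sliced = a[rows, 1:3]   # rows 0 and 2, columns 1 and 2
print(sliced)           # [[ 1  2]
                        #  [ 9 10]]

column = a[rows, 0]     # rows 0 and 2, column 0 only
print(column)           # [0 8]
```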