NumPy Boolean Array Indexing
Overview
Boolean indexing uses a boolean array to select elements from another array of the same shape: the elements at positions where the boolean array is True are returned.
Boolean indexing for one-dimensional arrays:
import numpy as np
a = np.array(['A', 'B', 'C', 'D'])
b = np.array([True, False, False, True])
a[b]
array(['A', 'D'], dtype='<U1')
Boolean indexing for two-dimensional arrays:
a = np.array([['A', 'B', 'C', 'D'], ['E', 'F', 'G', 'H']])
b = np.array([[True, False, True, False], [False, True, False, True]])
a[b]
array(['A', 'C', 'F', 'H'], dtype='<U1')
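Why did a two-dimensional array turn into a one-dimensional array? A mask can select a different number of elements from each row, so the result cannot in general keep a rectangular shape. NumPy therefore returns the selected elements as a one-dimensional array, in row-major order.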
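The flattening can be checked directly. The following is a minimal sketch with an illustrative mask that picks an uneven number of elements per row. The use of np.where with an empty-string filler is an assumption added here to show one way to keep the original shape, not something the text above prescribes.

```python
import numpy as np

a = np.array([['A', 'B', 'C', 'D'], ['E', 'F', 'G', 'H']])
# This mask picks three elements from the first row and one from the second,
# so the selection cannot form a rectangular 2-D array.
mask = np.array([[True, True, True, False], [False, False, False, True]])

selected = a[mask]          # flattened, in row-major order
print(selected.shape)       # (4,)

# np.where keeps the original shape by substituting a filler value
# (the empty string is an arbitrary choice for illustration).
same_shape = np.where(mask, a, '')
print(same_shape.shape)     # (2, 4)
```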
To invert a boolean array used for indexing, apply the ~ operator:
a = np.array(['A', 'B', 'C', 'D'])
b = np.array([True, False, False, True])
a[~b]
array(['B', 'C'], dtype='<U1')
Boolean arrays can be combined with the & or | operators when indexing (note that the Python keywords "and" and "or" do not work with boolean arrays):
a = np.array(['A', 'B', 'C', 'D'])
b = np.array([True, False, True, False])
b2 = np.array([False, False, False, True])
a[b | b2]
array(['A', 'C', 'D'], dtype='<U1')
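To make the difference between the operators and the keywords concrete, here is a small sketch that reuses the arrays above. It assumes a reasonably recent NumPy, where using the "and" keyword on multi-element arrays raises a ValueError about an ambiguous truth value.

```python
import numpy as np

a = np.array(['A', 'B', 'C', 'D'])
b = np.array([True, False, True, False])
b2 = np.array([False, False, False, True])

# Element-wise AND: no position is True in both masks, so nothing is selected.
both = a[b & b2]
print(both.size)            # 0

# The `and` keyword asks each whole array for a single truth value,
# which NumPy refuses to provide for arrays with more than one element.
try:
    a[b and b2]
    keyword_failed = False
except ValueError:
    keyword_failed = True
print(keyword_failed)       # True
```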
Selecting data from an array by boolean indexing always creates a copy of the data, even if the mask selects every element and the returned array has the same contents as the original.
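The copy behavior can be verified with np.shares_memory. This is a minimal sketch with illustrative values. The comparison with a basic slice is added here only for contrast.

```python
import numpy as np

a = np.array([10, 20, 30, 40])
all_true = np.array([True, True, True, True])

selected = a[all_true]                 # same values as a, but a new array
print(np.shares_memory(a, selected))   # False

selected[0] = -1                       # changes only the copy
print(a[0])                            # 10

# For contrast, a basic slice returns a view that shares memory with a.
view = a[:]
print(np.shares_memory(a, view))       # True
```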
If the boolean array and the target array do not have the same shape, the operation raises an IndexError:
IndexError: boolean index did not match indexed array along dimension 0; dimension is 6 but corresponding boolean dimension is 5
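A minimal sketch that reproduces this kind of failure, assuming a recent NumPy release. The exact wording of the message differs between versions, so the code only checks the exception type.

```python
import numpy as np

a = np.arange(6)                                    # 6 elements
mask = np.array([True, False, True, False, True])   # only 5 elements

try:
    a[mask]
    raised = False
except IndexError as exc:
    raised = True
    print(exc)              # message text varies between NumPy versions

# Comparing shapes up front avoids the exception.
shapes_match = mask.shape == a.shape
print(shapes_match)         # False
```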
The boolean arrays used in boolean indexing can be generated with vectorized comparison.
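As an illustration, a comparison applied to a whole array yields a boolean array of the same shape, which can then be used as the index. The values and thresholds below are arbitrary.

```python
import numpy as np

values = np.array([3, 8, 1, 9, 4])

mask = values > 4            # element-wise comparison produces a boolean array
print(mask.dtype)            # bool
print(values[mask])          # [8 9]

# Each comparison needs its own parentheses, because & and | bind
# more tightly than the comparison operators.
in_range = values[(values > 2) & (values < 9)]
print(in_range)              # [3 8 4]
```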
Boolean arrays can be mixed with slices and indices when indexing (TODO).
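One possible illustration of mixing, assuming a one-dimensional boolean array that has one entry per row of a two-dimensional array. The names grid and row_mask are hypothetical and chosen only for this sketch.

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]
row_mask = np.array([True, False, True])   # one entry per row

# Boolean mask on the rows, slice on the columns.
block = grid[row_mask, 1:3]
print(block.tolist())        # [[1, 2], [9, 10]]

# Boolean mask on the rows, integer index on the columns.
first_col = grid[row_mask, 0]
print(first_col.tolist())    # [0, 8]
```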