NumPy Boolean Array Indexing

From NovaOrdis Knowledge Base

Revision as of 16:56, 21 May 2024

Internal

Overview

Boolean indexing uses a boolean array to select elements from another array with the same shape. Only the elements whose corresponding mask value is True are returned.

Boolean indexing for unidimensional arrays:

import numpy as np

a = np.array(['A', 'B', 'C', 'D'])
b = np.array([True, False, False, True])
a[b]

array(['A', 'D'], dtype='<U1')

Boolean indexing for two-dimensional arrays:

a = np.array([['A', 'B', 'C', 'D'], ['E', 'F', 'G', 'H']])
b = np.array([[True, False, True, False], [False, True, False, True]])

array([['A', 'B', 'C', 'D'],
       ['E', 'F', 'G', 'H']], dtype='<U1')
array([[ True, False,  True, False],
       [False,  True, False,  True]])

a[b]

array(['A', 'C', 'F', 'H'], dtype='<U1')

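Why did a two-dimensional array turn into a one-dimensional array? The selected elements do not, in general, form a rectangular block, so NumPy returns them as a flat array, one entry per True value in the mask, in row-major order.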
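The flattening can be checked with a short sketch that reuses the arrays from the example above. The call to np.where and the '-' placeholder are illustrative choices, not part of the original page; they show one way to keep the two-dimensional shape when that is what is needed.

```python
import numpy as np

a = np.array([['A', 'B', 'C', 'D'], ['E', 'F', 'G', 'H']])
b = np.array([[True, False, True, False], [False, True, False, True]])

# Boolean indexing with a full-shape mask returns a 1-D array:
# one element per True in the mask, read in row-major order.
selected = a[b]
print(selected.shape)   # (4,)

# np.where keeps the original shape; '-' is an arbitrary placeholder
# substituted wherever the mask is False.
kept = np.where(b, a, '-')
print(kept)
```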

To invert a boolean array used for indexing, apply the ~ operator:

a = np.array(['A', 'B', 'C', 'D'])
b = np.array([True, False, False, True])
a[~b]

array(['B', 'C'], dtype='<U1')

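Boolean arrays can be combined with the & (element-wise and) or | (element-wise or) operators when indexing. Note that the Python keywords "and" and "or" do not work with boolean arrays: they raise a ValueError because the truth value of a multi-element array is ambiguous.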

a = np.array(['A', 'B', 'C', 'D'])
b = np.array([True, False, True, False])
b2 = np.array([False, False, False, True]) 
a[b | b2]

array(['A', 'C', 'D'], dtype='<U1')
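A minimal sketch of the difference between the operators and the keywords, using a hypothetical numeric array. The parentheses around each comparison are needed because & and | bind more tightly than comparison operators.

```python
import numpy as np

a = np.array([1, 5, 8, 12])

# Element-wise combination: parenthesize each comparison.
both = a[(a > 2) & (a < 10)]
either = a[(a < 2) | (a > 10)]
print(both)     # [5 8]
print(either)   # [ 1 12]

# The 'and' keyword asks for a single truth value of a whole array,
# which NumPy refuses to provide.
keyword_failed = False
try:
    a[(a > 2) and (a < 10)]
except ValueError as exc:
    keyword_failed = True
    print("ValueError:", exc)
```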

Selecting data from an array by boolean indexing and assigning the result to a new variable always creates a copy of the data, even if the returned array is unchanged.
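The copy behavior can be verified with a small sketch. The array values are made up for illustration. The last step shows the complementary case, assigning through a mask on the left-hand side, which modifies the original array in place.

```python
import numpy as np

a = np.array([10, 20, 30, 40])
mask = np.array([True, True, True, True])

# Even an all-True mask returns a copy, not a view.
c = a[mask]
c[0] = -1
print(a)                        # [10 20 30 40], the original is untouched
print(np.shares_memory(a, c))   # False

# Assignment through a mask is different: it writes into 'a' itself.
a[a > 25] = 0
print(a)                        # [10 20  0  0]
```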

If the boolean array and the target array do not have the same shape, the operation produces an IndexError:

IndexError: boolean index did not match indexed array along dimension 0; dimension is 6 but corresponding boolean dimension is 5
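A short sketch that reproduces the mismatch with a six-element array and a five-element mask. The exact wording of the message may differ between NumPy versions, so the example only relies on the exception type.

```python
import numpy as np

a = np.arange(6)                                          # 6 elements
short_mask = np.array([True, False, True, False, True])   # only 5 elements

mismatch_detected = False
try:
    a[short_mask]
except IndexError as exc:
    mismatch_detected = True
    print("IndexError:", exc)
```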

The boolean arrays used in boolean indexing can be generated with vectorized comparison.
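As an illustration, a comparison applied to an array yields a boolean array of the same shape, which can be used as a mask right away. The temperature values and names below are hypothetical.

```python
import numpy as np

temps = np.array([12.5, 18.0, 21.3, 9.8, 25.1])

# The comparison is evaluated element by element and returns a boolean array.
warm = temps > 18.0
print(warm)          # [False False  True False  True]
print(temps[warm])   # [21.3 25.1]

# The same works for string arrays; the mask is often built inline.
names = np.array(['A', 'B', 'C', 'D'])
not_b = names[names != 'B']
print(not_b)         # ['A' 'C' 'D']
```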

Boolean arrays can be mixed with slices and indices when indexing (TODO).
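Until this section is filled in, the following sketch shows one common form of mixing, assuming a two-dimensional array: a one-dimensional boolean array selects rows along the first axis, while a slice or an integer selects columns along the second. The array and mask are made up for illustration.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

rows = np.array([True, False, True])   # one flag per row

# Boolean mask on axis 0 combined with a slice on axis 1.
block = a[rows, 1:3]
print(block)         # [[ 1  2]
                     #  [ 9 10]]

# Boolean mask on axis 0 combined with an integer index on axis 1.
first_col = a[rows, 0]
print(first_col)     # [0 8]
```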