Java Arrays

From NovaOrdis Knowledge Base

External

Internal

Overview

Array instances are objects. They are created dynamically, can be assigned to a variable of type Object, and all methods of class Object may be invoked on them. In addition to the members inherited from Object, array instances have a public final field length, which gives the number of components of the array. More details about array creation are available in the Array Creation section below.

An array instance contains a number of variables, called array components. The array components have no names, and they are referenced by array access expressions that use non-negative integer indices. The number of components of an array is fixed: once the array instance is created, it cannot change. If the number of components of an array is zero, the array is said to be empty. If the array has n components, the array is said to have length n, and its components are referenced with zero-based indices from 0 to n-1. More details about accessing array components and elements are available in the Array Access section below.

The components can be of primitive types or reference types. All components of an array have the same type, called the component type of the array. Multidimensional arrays are a special case where all components have array types, recursively, up to the last components in the hierarchy that have non-array types.

For unidimensional arrays, the array components are also the array's elements.

For multidimensional arrays, if starting from the top-level instance, one considers its component type, which is an array type, and then the component type of that type, and so on, eventually one must reach a component type that is not an array type. That is called the element type of the original array and the components at that lowest level of the data structure are called the elements of the original array. The element type of an array type may be a primitive type or a reference type (but not an array type). In case the element type is a reference type, the elements of such an array may have as their value a null reference or an instance of the reference type.

int[] a;       // a unidimensional array whose components are 
               // ints and whose elements are also ints.
String[] s;    // a unidimensional array whose components are 
               // Strings and whose elements are also Strings.
int[][] b;     // a bi-dimensional array. b's components are int[], 
               // and the components of those components are ints
float[][][] c; // a tri-dimensional array. c's components are float[][], 
               // their components are float[] and their components are floats.
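
The distinction between component type and element type can be observed at runtime through reflection. The following is a minimal sketch; the class name ArrayObjectDemo and the helper elementTypeOf are illustrative names, not part of any library.

```java
class ArrayObjectDemo {

    // Walks down the component types until a non-array type is reached.
    // That type is the element type of the original array type.
    static Class<?> elementTypeOf(Class<?> arrayType) {
        Class<?> t = arrayType;
        while (t.isArray()) {
            t = t.getComponentType();
        }
        return t;
    }

    public static void main(String[] args) {
        int[][] b = new int[2][3];
        Object o = b; // an array can be assigned to a variable of type Object

        System.out.println(o.getClass().isArray());          // true
        System.out.println(b.getClass().getComponentType()); // class [I  (that is, int[])
        System.out.println(elementTypeOf(b.getClass()));     // int
        System.out.println(b.length);                        // 2
    }
}
```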

Array Types

An array type is a kind of reference type.

If the component type of an array is T, then the type of the array itself is written as the name of its component type followed by an empty pair of square brackets: T[]. For multidimensional arrays, if the element type of the array is T, then the type of the array is written as the name of the element type followed by a number of empty pairs of square brackets: T[]...[]. The number of bracket pairs gives the dimensionality of the array, which is the depth of array nesting.

Each bracket pair in the array type may be annotated by type annotations. The annotation applies to the bracket pair that follows it.

An array length, which can be accessed via the length variable name on the array instance, is not part of the type.

Array Variables

An array variable is a variable of array type that holds a reference to an array instance. Declaring an array variable does not automatically create the array object or allocate space for the array components. It creates only the variable itself, which can contain a reference to an array. To create the array instance, use an array creation expression or an initializer.

The array type of a variable depends on the bracket pairs that appear as part of the type at the beginning of the declaration, as part of the declarator for the variable, or both. Brackets are allowed in declarators as a nod to the tradition of C and C++, but this is not a recommended style. In the example below, a, b and c are all variables of the same type, int[][]:

int[][] a;
int b[][]; // "C-style" declaration, not recommended
int[] c[]; // "mixed notation", not recommended

Because the array length is not part of its type, a single variable of array type may contain references to arrays of different lengths.

If an array variable v has a type A[], where A is a reference type, then v can hold a reference to an instance of an array type B[], provided B can be assigned to A.
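
The behavior of array variables described above can be sketched as follows. ArrayVariableDemo and lengthOrMinusOne are hypothetical names used only for illustration.

```java
class ArrayVariableDemo {

    // Returns the length of the array the variable refers to, or -1 if it refers to nothing.
    static int lengthOrMinusOne(int[] v) {
        return v == null ? -1 : v.length;
    }

    public static void main(String[] args) {
        int[] v = null;                          // only the variable exists, no array instance yet
        System.out.println(lengthOrMinusOne(v)); // -1

        v = new int[3];                          // the same variable can refer to arrays
        System.out.println(lengthOrMinusOne(v)); // 3
        v = new int[7];                          // of different lengths
        System.out.println(lengthOrMinusOne(v)); // 7

        String[] strings = {"x", "y"};
        Object[] objects = strings;              // allowed because String is assignable to Object
        System.out.println(objects.getClass() == String[].class); // true
    }
}
```

Note that objects still refers to a String[] instance; only the static type of the variable is Object[].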

Array Creation

Arrays are created by an array creation expression or an array initializer.

Array Creation with an Expression

The array creation expression specifies the element type, the number of levels of nested arrays, and the length of the array for at least the top level of nesting. Lengths for the other levels of nesting may optionally be specified; see Multidimensional Arrays below. The array length is available as the final instance variable length.

int[] a = new int[10];
int[][] b = new int[10][]; // dimensions are not fully specified
int[][] c = new int[10][10]; // dimensions are fully specified
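
A small sketch of what these creation expressions produce. Components receive default values (0 for int, null for reference types), and when a dimension is left unspecified the corresponding subarrays do not exist yet. The names ArrayCreationDemo and partiallySpecified are illustrative only.

```java
class ArrayCreationDemo {

    // Creates only the top-level array; every component is a null reference.
    static int[][] partiallySpecified(int rows) {
        return new int[rows][];
    }

    public static void main(String[] args) {
        int[] a = new int[10];
        System.out.println(a.length + " " + a[0]); // 10 0

        int[][] b = partiallySpecified(10);
        System.out.println(b[0]);                  // null, no subarray has been created
        b[0] = new int[4];                         // subarrays are created individually
        System.out.println(b[0].length);           // 4

        int[][] c = new int[10][10];
        System.out.println(c[9].length);           // 10, all subarrays already exist

        String[] s = new String[2];
        System.out.println(s[1]);                  // null
    }
}
```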

Array Creation with an Array Initializer

An array initializer creates an array and provides initial values for all its components:

int[] a = {1, 2, 3, 4};
int[][] b = {{11, 22}, {33, 44}};

An array initializer can be used as part of an array creation expression:

int[] a = new int[] {1, 2, 3, 4};

The length of the array to be constructed is equal to the number of variable initializers immediately enclosed by the braces of the array initializer. A one-dimensional array of the specified length is created and each component of the array is initialized to its default value. The variable initializers immediately enclosed by the braces of the array initializer are then executed from left to right, in the textual order in which they occur in the source code. The nth variable initializer specifies the value of the (n-1)th array component. If execution of a variable initializer completes abruptly, then execution of the array initializer completes abruptly for the same reason. If all the variable initializer expressions complete normally, the array initializer completes normally, with the value of the newly initialized array. If the component type is an array type, then the variable initializer specifying a component may itself be an array initializer; that is, array initializers may be nested. In this case, execution of the nested array initializer constructs and initializes an array object by recursive application of the algorithm above, and assigns it to the component.
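
The left-to-right evaluation order and the abrupt completion rule can be made visible with initializers that have side effects. This is a sketch under that assumption; InitializerOrderDemo, LOG, record, build and fail are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class InitializerOrderDemo {

    // Records the order in which the variable initializers are executed.
    static final List<String> LOG = new ArrayList<>();

    static int record(String label, int value) {
        LOG.add(label);
        return value;
    }

    static int[] build() {
        LOG.clear();
        return new int[] { record("first", 10), record("second", 20), record("third", 30) };
    }

    static int fail() {
        throw new IllegalStateException("initializer failed");
    }

    public static void main(String[] args) {
        int[] a = build();
        System.out.println(a.length + " " + LOG); // 3 [first, second, third]

        try {
            // The second initializer throws, so the third one is never executed.
            int[] b = { record("before", 1), fail(), record("after", 3) };
        } catch (IllegalStateException e) {
            System.out.println(LOG); // [first, second, third, before]
        }
    }
}
```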

Array Access

An array component is accessed by an array access expression that consists of an expression whose value is an array reference followed by an indexing expression enclosed by [ and ]. Indices are 0-based. An array with length n can be accessed via indexes from 0 to n-1.

int[] a = {1, 2, 3, 4};
int[][] b = {{11, 22}, {33, 44}};
System.out.println(a[0]);
System.out.println(b[1][1]);

For more details on accessing multi-dimensional arrays, see Multidimensional Arrays below.

Arrays are indexed by int values. short, byte or char may also be used because they are subject to unary numeric promotion and become int values. An attempt to access an array component with a long index value results in a compile-time error.

For an array whose type is A[], where A is a reference type, an assignment to a component of the array is checked at runtime to ensure that the value being assigned is assignable to the component. If the type of the value being assigned is not assignment-compatible with the runtime component type, an ArrayStoreException is thrown.
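
The index promotion rule and the runtime store check can be illustrated as follows. ArrayAccessDemo and storeFails are names chosen for this sketch.

```java
class ArrayAccessDemo {

    // Attempts to store a value into the first component and reports
    // whether the runtime check rejected it.
    static boolean storeFails(Object[] target, Object value) {
        try {
            target[0] = value;
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int[] counts = new int[128];
        char c = 'A';
        counts[c]++;                    // the char index is promoted to the int 65
        System.out.println(counts[65]); // 1

        long big = 3L;
        // counts[big] = 1;             // does not compile: a long cannot be used as an index
        counts[(int) big] = 1;          // an explicit narrowing cast is required

        Object[] objects = new String[1]; // static type Object[], runtime type String[]
        System.out.println(storeFails(objects, "ok"));               // false
        System.out.println(storeFails(objects, Integer.valueOf(1))); // true
    }
}
```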

Multidimensional Arrays

The components of a multidimensional array are themselves arrays, all of the same array type: the top-level array holds references to subarrays, which in turn may hold references to their own subarrays, recursively. When a multidimensional array is created with an array creation expression, the leftmost dimension must be specified; it is the number of subarray references the top-level array contains. This length is required because the JVM must know how many components to allocate for the top-level array:

T[][]...[] a = new T[3][]...[];

Additional dimensions may optionally be specified, from left to right, providing more details on how to allocate memory for subarrays, and their subarrays.

The subarrays of the top level array are referenced by using the left-most index of the array access expression:

T[][]...[] a = new T[3][]...[];
T[]...[] subarray0 = a[0];
T[]...[] subarray1 = a[1];
T[]...[] subarray2 = a[2];

The components of subarray0, subarray1, etc. can be further accessed using additional array access expression indices, from left to right:

T[][]...[] a = new T[3][5][7]...[];
T[]...[] sa = a[2][4];
T[]...[] ssa = a[2][4][6];
// ...
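
For a concrete element type, the schematic above can be sketched with a fully specified int[3][5][7] array. SubarrayDemo and dimensions are illustrative names, and the helper assumes a non-empty, fully allocated array.

```java
class SubarrayDemo {

    // Reads the length at each nesting level; assumes a non-empty, fully allocated array.
    static int[] dimensions(int[][][] a) {
        return new int[] { a.length, a[0].length, a[0][0].length };
    }

    public static void main(String[] args) {
        int[][][] a = new int[3][5][7];

        int[][] sa = a[2];   // one index yields a 5 x 7 subarray
        int[] ssa = a[2][4]; // two indices yield a subarray of length 7

        ssa[6] = 42;         // the subarray is shared with a, not copied
        System.out.println(a[2][4][6]);                   // 42
        System.out.println(sa.length + " " + ssa.length); // 5 7
    }
}
```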

Bidimensional Arrays

 
  m[2][3] =
             col  col  col
              0    1    2      
          {            
  row 0 ──→ {"a", "b", "c"},
  row 1 ──→ {"d", "e", "f"}
          }

  m[0][0] == "a"
  m[0][1] == "b"
  m[0][2] == "c"
  m[1][0] == "d"
  m[1][1] == "e"
  m[1][2] == "f"

The most common use case for a bidimensional array is to represent a bidimensional matrix m x n, where m is the number of rows and n the number of columns:

[Image: Matrix.png]

The Java representation of such a matrix is a bidimensional array with constant-length sub-arrays. Each sub-array represents a matrix row and consists of a contiguous memory storage area for the values corresponding to columns 1 to n of that row. The top-level array stores, contiguously in memory, references to the sub-arrays representing the rows: it starts with the reference to row 1, continues with the reference to row 2 and so on, all the way to row m. The contents of different rows may be stored in non-contiguous areas of memory.

Because Java arrays are zero-based, the 1-based mathematical indices of the matrix elements must be decremented by 1. To access the matrix element a₂₃, Java code must use a[1][2].

a_(row,column) = a[row-1][column-1];

The following code example initializes and accesses a 3x4 matrix:

┌           ┐
│11 12 13 14│
│21 22 23 24│
│31 32 33 34│
└           ┘

 
// m x n matrix with m = 3 rows and n = 4 columns
int[][] a =  // same shape as new int[3][4], but with explicit values
            {
              {11, 12, 13, 14},
              {21, 22, 23, 24},
              {31, 32, 33, 34}
            };

System.out.println("number of rows: " + a.length); // will display m = 3
System.out.println("number of columns: " + a[0].length); // will display n = 4
System.out.println("a11=" + a[0][0] + " a12=" + a[0][1] + " a13=" + a[0][2] + " a14=" + a[0][3]);
System.out.println("a21=" + a[1][0] + " a22=" + a[1][1] + " a23=" + a[1][2] + " a24=" + a[1][3]);
System.out.println("a31=" + a[2][0] + " a32=" + a[2][1] + " a33=" + a[2][2] + " a34=" + a[2][3]);

Note that the sub-arrays need not all have the same length. When they differ, the new int[3][4] expression cannot be used; use new int[3][] instead and then initialize each sub-array individually with its own length.
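
A minimal sketch of such an array with sub-arrays of different lengths, here a lower-triangular shape. JaggedArrayDemo and triangle are hypothetical names.

```java
import java.util.Arrays;

class JaggedArrayDemo {

    // Builds an array in which row i has i + 1 components.
    static int[][] triangle(int rows) {
        int[][] t = new int[rows][]; // only the top-level array is created here
        for (int i = 0; i < rows; i++) {
            t[i] = new int[i + 1];   // each sub-array gets its own length
        }
        return t;
    }

    public static void main(String[] args) {
        int[][] t = triangle(3);
        for (int[] row : t) {
            System.out.println(row.length + " " + Arrays.toString(row));
        }
        // 1 [0]
        // 2 [0, 0]
        // 3 [0, 0, 0]
    }
}
```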

Tridimensional Arrays

A tridimensional array T[][][] is an array of bidimensional arrays, as described above. The base array contains references to components of type T[][].

java.util.Arrays

T[] a = ...;
Arrays.stream(a).forEach(i -> ...);
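
The schematic above can be made concrete with a few commonly used java.util.Arrays methods. This is a sketch, not a survey of the class; ArraysUtilityDemo and sum are illustrative names.

```java
import java.util.Arrays;

class ArraysUtilityDemo {

    // Sums the components using the IntStream returned by Arrays.stream(int[]).
    static int sum(int[] a) {
        return Arrays.stream(a).sum();
    }

    public static void main(String[] args) {
        String[] names = {"c", "a", "b"};
        Arrays.stream(names).forEach(n -> System.out.println(n)); // c, a, b on separate lines

        Arrays.sort(names);                                       // sorts in place
        System.out.println(Arrays.toString(names));               // [a, b, c]

        int[][] m = {{1, 2}, {3, 4}};
        System.out.println(Arrays.deepToString(m));               // [[1, 2], [3, 4]]

        System.out.println(Arrays.equals(new int[] {1, 2}, new int[] {1, 2})); // true
        System.out.println(sum(new int[] {1, 2, 3}));             // 6
    }
}
```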

Generic Arrays

TODO: https://www.baeldung.com/java-generic-array

(E[]) Array.newInstance(clazz, capacity);
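
The expression above can be wrapped in a generic factory method, sketched below. GenericArrayFactory and newArray are hypothetical names; Array is java.lang.reflect.Array. The cast is unchecked, and the sketch assumes clazz is a reference type (passing a primitive class such as int.class would produce a primitive array, and the cast would fail).

```java
import java.lang.reflect.Array;

class GenericArrayFactory {

    // Creates an array whose runtime component type is clazz.
    // The cast is unchecked because E is erased at runtime.
    @SuppressWarnings("unchecked")
    static <E> E[] newArray(Class<E> clazz, int capacity) {
        return (E[]) Array.newInstance(clazz, capacity);
    }

    public static void main(String[] args) {
        String[] s = newArray(String.class, 4);
        System.out.println(s.length);                     // 4
        System.out.println(s.getClass() == String[].class); // true
    }
}
```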