Maximum Subarray Problem: Difference between revisions

From NovaOrdis Knowledge Base
Jump to navigation Jump to search
No edit summary
Line 5: Line 5:
=Internal=
=Internal=


* [[Data Structures and Algorithms#Divide_and_Conquer|Data Structures and Algorithms]]
* [[Algorithms#Divide_and_Conquer|Algorithms]]


=Overview=
=Overview=

Revision as of 23:14, 28 May 2019

External

Internal

Overview

The maximum subarray problem is useful to solve the following problem: assuming we have access to stock prices of a company for a number of days, and the price does not change during a day, determine what would have been the best day to buy and the best day to sell one unit of stock to make the maximum profit, for the entire interval we have data for. For example, if we maintain the stock prices in an array, for 4 days (0-3) as follows {10, 8, 12, 11} then we would have made the biggest profit of 4 by buying on day 1 at 8 and selling on day 2 at 12.

The problem has an obvious O(n2) brute force approach solution, an O(n lg n) divide-and-conquer solution and an O(n) iterative solution.

Playground

https://github.com/NovaOrdis/playground/tree/master/data-structures-and-algorithms/maximum-subarray-problem/src/main/java/playground/dsa/maximumSubarrayProblem

O(n2) Brute Force Approach

The stock prices are maintained in an int[n] array with an array element for each of the n days. We loop over the prices, and then in an inner loop we calculate the difference in price between subsequent days and the current day, maintaining the maximum.

public void bruteForce(int[] price) {

  int maxProfit = Integer.MIN_VALUE;
  int buy = -1; // the index of the day we should buy
  int sell = -1; // the index of the day we should sell

  for(int i = 0; i < price.length; i ++) {

    for(int j = i + 1; j < price.length; j ++) {

      // we only compare with price in subsequent days
      int profit = price[j] - price[i];

      if (profit > maxProfit) {
        maxProfit = profit;
        buy = i;
        sell = j;
      }
    }
  }
}

Working code:

Playground MaxProfit.bruteForce()

The time complexity is given by the following sum: (n-1) + (n-2) + ... + 1, which is O(n2).

Divide-and-Conquer

To implement divide-and-conquer, we use the following idea: the profit for a day is given by the difference between the current price and the price in the previous day. As days pass between the buy and sell, the overall profit (or loss) is given by the sum of these daily differences The maximum profit is indicated by the contiguous subarray whose sum of elements returns the maximum possible value. This is the maximum subarray problem.

We need to convert the n-element array price[] into an n-1 element array diff[], which maintains the change in daily price. diff[0] contains the difference in price between day 1 and day 0, diff[1] the difference in price between day 2 and day 1, and so on:

Playground MaxProfit.priceToDiff()

We then process the diff[] array and find the maximum subarray using divide-and-conquer. The indices "s" and "e" indicate the section of the given array to process. "s" indicates the first element and "e" the element immediately following the last element to process, or the end of the array, if the last element to process is the last element of the array. The method returns an int[3], containing, in order:

  • the index of the first element of the maximum subarray.
  • the index of the element following the last element of the maximum subarray. Note that the last element of the maximum subarray is the last element of the array, then the index returned here will fall outside of the array.
  • the value of the sum, which is supposed to be maximum for given section [s, e) of the array.
public int[] recursiveMaxSubarray(int[] a, int s, int e) {

  //
  // detect the bottom and exit the recurrence 
  //

  if (s + 1 == e) {

    //
    // one element array, the trivial case of the maximum subarray problem
    //

    return new int[] {s, e, a[s]};
  }

  // 
  // we're at an intermediary level in the recurrence
  // 

  //
  // calculate the center to use to split the array in two
  //
  
  int c = s + (e - s)/2;

  //
  // Find the maximum subarray in the left half
  //
  
  int[] left = recursiveMaxSubarray(a, s, c);

  //
  // Find the maximum subarray in the right half
  //

  int[] right = recursiveMaxSubarray(a, c, e);

  //
  // The contiguous maximum subarray can be either in the left half,
  // right half or straddle the center. Luckily, we can figure out the
  // maximum subarray straddling the center in O(n)
  //

  int[] center = determineMaxSubarrayStraddlingTheCenter(a, s, e, c);

  //
  // we have three candidates, pick the one with the biggest sum. This method is O(1)
  //
  int[] max = determineMax(left, right, center);

  return max;
}

The maximum subarray straddling the center is determining starting from the center and expanding left, then right, to determine the maximum left subarray and maximum right subarray. The maximum subarray straddling the center is the sum of those local maximums. The method returns the same int[3] containing the start index, end index and the sum. Because we linearly expand to left and then right, the time complexity of this method is O(n).

public int[] determineMaxSubarrayStraddlingTheCenter(int[] a, int s, int e, int c) {

  //
  // expand left
  // 

  int l = c - 1;
  int sumLeft = 0;
  int maxSumLeft = Integer.MIN_VALUE;
  int maxStart = -1;
  while(l >= s) {
    sumLeft += a[l];
    if (sumLeft > maxSumLeft) {
       maxSumLeft = sumLeft;
       maxStart = l;
    }
    l --;
  }

  //
  // expand right
  // 

  int r = c;
  int sumRight = 0;
  int maxSumRight = Integer.MIN_VALUE;
  int maxEnd = -1;
  while(r < e) {
    sumRight += a[r];
    if (sumRight > maxSumRight) {
       maxSumRight = sumRight;
       maxEnd = r + 1;
    }
    r ++;
  }

  //
  // combine local maximums
  //

  return new int[] {maxStart, maxEnd, maxSumLeft + maxSumRight};

}

The time complexity of the recursive method can be estimated by building the recursion tree, which has a height of lg n + 1. The complexity of each level in the three is O(n), so the overall complexity of the algorithm is O(n⋅lg n).

O(n) Iterative Solution

CLRS page 75.