Master Method
=Internal=
* [[Recursive_Algorithms_Complexity#The_Master_Method_.28Master_Theorem.29|Recursive Algorithms Complexity]]
=TODO=
<font color=darkkhaki>
* intuition
* definition
* proof
* Emphasize that it is Θ, not O
</font>


=Overview=


The <span id='Master_Method'></span>'''master method''' or '''master theorem''' is a general mathematical tool for analyzing the running time of [[Algorithms#Divide_and_Conquer|divide and conquer algorithms]] that divide a problem into '''equal''' subproblems and call themselves recursively. The fact that the subproblems are equal in size is important: the master method does not apply to divide and conquer algorithms that divide the problem into unequal parts. There are generalizations of the master method that apply to unbalanced problems, but they are not addressed in this article.


Also, the master theorem offers running time predictions as Θ() - '''bounded from above and below''' - as long as the running time of the combine phase of the divide and conquer algorithm is itself given as Θ(). If the running time of the combine phase is only given as O(), then the master theorem can only be used to predict O().


The master theorem applies to recursive algorithms whose running times can be expressed with a generic formula called a recurrence, as described below.


=Master Method and Recurrences=
The master method can be applied to predict the running time of recursive algorithms whose running time can be expressed as a [[Recursive_Algorithms_Complexity#Recurrence_Equation|recurrence]] (or recurrence equation). The first step in applying the master method is to identify the recurrence that characterizes the algorithm.


We denote by T(n) the maximum number of operations the algorithm needs to solve the computational problem on an input of size n. This is the quantity we want to bound from above. The recurrence expresses T(n) in terms of T() of smaller arguments, specifically the sizes of the subproblems on which the algorithm calls itself recursively.


Every recurrence has two ingredients:
* a base case, which describes the running time when there is no further recursion.
* the general case, which describes the running time as the work done by the recursive calls invoked at that level plus the work done at the current level.
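
For example (a standard illustration, with the details deferred to the linked article), [[Merge_Sort#Time_Complexity_Analysis|Merge Sort]] makes two recursive calls on halves of the input and merges the results in linear time, so its recurrence is T(n) ≤ c for n = 1 (base case) and T(n) ≤ 2T(n/2) + O(n) otherwise (general case).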


In the most generic case, a recurrence looks like this:
<font size=-1>
        ┌
        │ ≤ c              <font color=teal># base case, running time bounded by a constant for sufficiently small n</font>
T(n) =  │
        │ ≤ aT(n/b) + O(n<sup>d</sup>) <font color=teal># general case, for larger n</font>
        └
</font>


For the general case:
* a is the number of subproblems (recursive calls).
* b is the factor by which the input size shrinks before the recursive call is applied. b is a constant bigger than 1. For example, if we recurse on each half of the current problem, then b is 2.
* d is the exponent in the running time of the "combine" phase, the amount of work that is done outside the recursive calls. d can be zero or larger.
The recurrence expresses the fact that solving the original problem involves dividing the problem into "a" subproblems of size "n/b", applying the algorithm recursively ''a'' times on ''n/b''-size problems and then combining the result in O(n<sup>d</sup>) time. If the problem size is small enough (n ≤ n<sub>0</sub>) the straightforward solution takes constant time.


The fact that we express the subproblem size as n/b shows that for the master theorem to apply, all subproblems must have the same size.
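
The following sketch (an illustration added here, not a canonical implementation) shows how a, b and d are read off an actual divide and conquer algorithm, using Merge Sort: there are a = 2 recursive calls, each on half of the input (b = 2), and the combine (merge) phase is linear, so d = 1.
<syntaxhighlight lang="python">
def merge_sort(v):
    # Base case: a problem of size 0 or 1 takes constant time.
    if len(v) <= 1:
        return v
    mid = len(v) // 2
    left = merge_sort(v[:mid])    # recursive call 1, on n/2 elements (a = 2, b = 2)
    right = merge_sort(v[mid:])   # recursive call 2, on n/2 elements
    # Combine phase: merge the two sorted halves in O(n) time, so d = 1.
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]
</syntaxhighlight>
With a = 2, b = 2 and d = 1 we have a = b<sup>d</sup>, so [[#Case_1|Case 1]] of the theorem below applies and the running time is O(n log n).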


=Formal Definition=
<font size=-1>
        ┌
        │                          a
        │ O(n<sup>d</sup>log n)  if a = b<sup>d</sup> (or ── = 1) <font color=teal><span id='Case_1'></span># Case 1</font>
        │                          b<sup>d</sup>
        │                          a
T(n) = │ O(n<sup>d</sup>)      if a < b<sup>d</sup> (or ── < 1) <font color=teal><span id='Case_2'></span># Case 2</font>
        │                          b<sup>d</sup> 
        │                          a
        │ O(n<sup>log<sub>b</sub>a</sup>)    if a > b<sup>d</sup> (or ── > 1) <font color=teal><span id='Case_3'></span># Case 3</font>
        │                          b<sup>d</sup> 
        └
</font>
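
As a quick sanity check, the three cases can be evaluated mechanically from a, b and d. The sketch below (added here as an illustration; the function name is hypothetical) compares a with b<sup>d</sup> and returns the predicted bound.
<syntaxhighlight lang="python">
import math

def master_theorem_bound(a, b, d):
    """Return the bound predicted by the master theorem for T(n) <= a*T(n/b) + O(n^d)."""
    if a < 1 or b <= 1 or d < 0:
        raise ValueError("master theorem requires a >= 1, b > 1, d >= 0")
    if math.isclose(a, b ** d):
        return f"O(n^{d} log n)"         # Case 1: a = b^d
    if a < b ** d:
        return f"O(n^{d})"               # Case 2: a < b^d
    return f"O(n^{math.log(a, b):.3f})"  # Case 3: a > b^d, i.e. O(n^(log_b a))

print(master_theorem_bound(2, 2, 1))  # merge sort:    O(n^1 log n)
print(master_theorem_bound(1, 2, 0))  # binary search: O(n^0 log n) = O(log n)
print(master_theorem_bound(3, 2, 1))  # Karatsuba:     O(n^1.585)
</syntaxhighlight>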


=Intuition=
The conditions expressed in terms of a/b<sup>d</sup> can be understood intuitively as a tug of war between two opposing forces: the force of good and the force of evil. "a" is the number of subproblems a problem at a certain level is divided into, so "a" can be seen as the rate at which the subproblems proliferate (RSP) as we descend in the recursion tree. This is the force of evil: the more subproblems there are, the slower the algorithm runs. On the other hand, with each recursion level we do less work per subproblem, and the extent to which we do less work is expressed by b<sup>d</sup>. b<sup>d</sup> can be seen as the rate of work shrinkage (RWS). Interestingly enough, the shrinkage of work grows with d - this seems a bit counterintuitive at first sight.


In case of a tie (RSP is equal to RWS, a = b<sup>d</sup>), we find ourselves in [[#Case_1|Case 1]]. The amount of work is the same at each level of the recursion tree, so the total amount of work is the work at the first level, cn<sup>d</sup>, multiplied by the number of levels, log<sub>b</sub>n, hence O(n<sup>d</sup>log<sub>b</sub>n). The base of the logarithm does not matter asymptotically, so the time complexity is O(n<sup>d</sup>log n).


When the forces of good prevail (RSP < RWS, a < b<sup>d</sup>, the work per subproblem shrinks faster than the number of subproblems grows), we find ourselves in [[#Case_2|Case 2]]. In this case, the biggest amount of work is at the root level, and the running time at the root dominates the total running time: O(n<sup>d</sup>).


When the forces of evil prevail (RSP > RWS, a > b<sup>d</sup>, subproblems proliferate more rapidly than the per-subproblem work shrinks), we find ourselves in [[#Case_3|Case 3]]. The largest amount of work is at the leaves; the leaves dominate up to a constant factor, so the amount of work is proportional to the number of leaves: O(#leaves) = O(a<sup>log<sub>b</sub>n</sup>). This can be rewritten as O(n<sup>log<sub>b</sub>a</sup>), as shown below.
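
A short justification of that last rewrite (added here as a sketch; it uses only the definition of the logarithm): take log<sub>b</sub> of both expressions.
<font size=-1>
 log<sub>b</sub>(a<sup>log<sub>b</sub>n</sup>) = (log<sub>b</sub>n)⋅(log<sub>b</sub>a) = (log<sub>b</sub>a)⋅(log<sub>b</sub>n) = log<sub>b</sub>(n<sup>log<sub>b</sub>a</sup>)
</font>
Since the two quantities have equal logarithms, a<sup>log<sub>b</sub>n</sup> = n<sup>log<sub>b</sub>a</sup>. The second form is preferred because it expresses the bound directly as a power of the input size n.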


=Proof=
{{External|https://www.coursera.org/learn/algorithms-divide-conquer/lecture/nx5Nf/proof-i}}
{{External|https://www.coursera.org/learn/algorithms-divide-conquer/lecture/h1eWQ/interpretation-of-the-3-cases}}
{{External|https://www.coursera.org/learn/algorithms-divide-conquer/lecture/39jDd/proof-ii}}
The proof uses a [[Recursion Trees#Overview|recursion tree]].
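
In outline (a sketch of the standard recursion-tree argument, not a full proof): at level j of the tree (j = 0, 1, ..., log<sub>b</sub>n) there are a<sup>j</sup> subproblems, each of size n/b<sup>j</sup>, so the non-recursive work at level j is at most:
<font size=-1>
 c⋅a<sup>j</sup>⋅(n/b<sup>j</sup>)<sup>d</sup> = c⋅n<sup>d</sup>⋅(a/b<sup>d</sup>)<sup>j</sup>
</font>
Summing over all levels gives a geometric series with ratio a/b<sup>d</sup>: when the ratio is 1, every level contributes c⋅n<sup>d</sup> (Case 1); when it is less than 1, the sum is dominated by the root term (Case 2); when it is greater than 1, the sum is dominated by the last level, which has a<sup>log<sub>b</sub>n</sup> = n<sup>log<sub>b</sub>a</sup> leaves (Case 3).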


=Examples=
* [[Merge_Sort#Time_Complexity_Analysis|Merge Sort]]
* [[Binary_Search#Running_Time_Analysis_with_Master_Theorem|Binary search on a sorted array]]
* [[Karatsuba_Multiplication#Time_Complexity|Karatsuba multiplication algorithm]]
* [[Matrix_Multiplication#Strassen.27s_Algorithm_Time_Complexity|Strassen's matrix multiplication]]
=More Details=
<font color=darkkhaki>[[CLRS]] page 93.</font>
