Algorithms
External
- Based on Introduction to Algorithms, 3rd edition by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein (CLRS)
Internal
Algorithm Definition
An algorithm is any well-defined computational procedure consisting of a sequence of steps, which takes some value, or set of values, called the input, and produces some value, or set of values, called the output.
Another definition is a set of well-defined rules - a tool - for solving a well-specified computational problem. In this context, a specific set of input values provided to the algorithm is called an instance of the problem. Algorithms manipulate data structures in various ways.
Algorithms should be considered a technology, the same as computer hardware or object-oriented programming. Total system performance depends on choosing efficient algorithms as much as on choosing fast hardware. Having a solid base of algorithmic knowledge and techniques is one of the factors that separates a skilled programmer from a novice.
Algorithm Correctness
One of the most important characteristics of an algorithm is its correctness. An algorithm is said to be correct if, for every input instance, it halts with the correct output. It is almost always desirable for an algorithm to be correct. However, in some cases, even incorrect algorithms are useful if we can control the error rate. An example of such an algorithm is the Miller-Rabin primality test. One of the techniques that can be used to demonstrate that an algorithm is correct is the loop invariant method.
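As an illustration of my own (not CLRS pseudocode), a loop invariant can be stated and checked with an assertion in a simple maximum-finding loop:

```python
def maximum(a):
    """Return the largest element of a non-empty list a."""
    m = a[0]
    for i in range(1, len(a)):
        # Loop invariant: at the start of each iteration, m == max(a[:i]).
        # The assert re-checks it at runtime (for illustration only; it
        # makes the loop quadratic).
        assert m == max(a[:i])
        if a[i] > m:
            m = a[i]
    # At termination i == len(a), so the invariant gives m == max(a).
    return m
```

Initialization, maintenance, and termination of the invariant together argue the function is correct.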
Algorithm Efficiency
Another characteristic of algorithms is efficiency. The obvious reason to analyze the efficiency of an algorithm is that computing time and memory space are bounded resources, and they must be utilized efficiently.
Algorithm Complexity
The efficiency of an algorithm can be analyzed through formal methods and expressed using a special notation, called asymptotic notation. The asymptotic notation uses functions that bound the algorithm's running time from above and from below. To say that the running time is asymptotically bounded from above by a specific function, say n^2, we use the "big-O" notation: O(n^2). All these notions are expanded upon in the algorithm complexity section.
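A small sketch of what the upper bound means: f(n) is O(n^2) if f(n) <= c*n^2 for some constant c and all n beyond some n0. The constants below are my own choice for the example function f(n) = 3n^2 + 10n:

```python
def f(n):
    # Example cost function: 3n^2 + 10n steps.
    return 3 * n * n + 10 * n

# f(n) <= c * n^2 for c = 4 and all n >= n0 = 10,
# since 3n^2 + 10n <= 4n^2 is equivalent to n >= 10.
c, n0 = 4, 10
for n in range(n0, 1000):
    assert f(n) <= c * n * n
```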
Iterative vs. Recursive Algorithms
Algorithms can be coarsely categorized into iterative (or incremental) and recursive.
Recursive Algorithms
A recursive algorithm solves a problem by calling itself recursively one or more times to deal with closely related sub-problems. The running time of recursive algorithms can be estimated using recurrences, or recurrence equations. A recursive algorithm requires a base case, where the input is sufficiently small that the output can be immediately computed, rather than recursing further, as expanded in the Divide and Conquer section below.
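A minimal example of the base case / recursive case structure (my own illustration):

```python
def factorial(n):
    # Base case: the input is small enough to answer immediately.
    if n <= 1:
        return 1
    # Recursive case: reduce to a smaller instance of the same problem.
    return n * factorial(n - 1)
```

Its running time is described by the recurrence T(n) = T(n-1) + Θ(1), which solves to Θ(n).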
Divide and Conquer
A common recursive technique is divide-and-conquer, which consists of three steps, applied at each level of the recursion:
- Divide the problem into several sub-problems that are similar to the original problem but smaller in size.
- Conquer the sub-problems by solving them recursively. If the sub-problem size is small enough, it may be solved through iterative techniques, or it may be a trivial case that does not need any work. When we reach this stage, we say that the recursion "bottomed out" and we have gotten down to the base case.
- Combine the solutions to the sub-problems to create a solution to the original problem. Merge sort is an example of sorting algorithm that uses the divide-and-conquer strategy.
Examples of algorithms that use divide-and-conquer method are:
- Karatsuba multiplication, which multiplies two n-digit integers in O(n^log2 3) ≈ O(n^1.585) time, asymptotically better than the naive n^2 method
- Merge sort
- Finding the number of inversions in an array
- Strassen's algorithm for matrix multiplication
- Maximum subarray problem
- Closest pair problem
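Merge sort, listed above, maps directly onto the three steps. A sketch (my own, not CLRS's in-place pseudocode; it returns a new list for clarity):

```python
def merge_sort(a):
    # Base case: a list of 0 or 1 elements is already sorted.
    if len(a) <= 1:
        return a
    # Divide: split the input in half.
    mid = len(a) // 2
    # Conquer: sort each half recursively.
    left = merge_sort(a[:mid])
    right = merge_sort(a[mid:])
    # Combine: merge the two sorted halves.
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```

The recurrence T(n) = 2T(n/2) + Θ(n) for this structure solves to Θ(n lg n).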
Are there other kinds of recursive algorithms besides divide-and-conquer?
Randomized Algorithms
Algorithms whose behavior is determined not only by their input, but also by the values produced by a random-number generator, are called randomized algorithms. A randomized algorithm implies an inherent probability distribution for one or more variables, so the running time of such an algorithm may differ between runs, even on the same input. Probabilistic analysis is used to analyze the running time of randomized algorithms. Canonical examples of randomized algorithms include randomized quicksort, primality testing, graph partitioning, and hashing.
Sorting
Sorting a sequence of numbers into nondecreasing order is a problem that arises frequently in practice. The class of algorithms that addresses this problem is the class of sorting algorithms. Sorting algorithms may or may not rely on key comparison. When analyzing a sorting algorithm, characteristics such as whether it sorts in place or whether it is stable may be discussed. Examples of sorting algorithms are insertion sort and merge sort.
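Insertion sort illustrates both properties at once: it is in place (only a constant amount of extra storage) and stable (equal keys keep their relative order). A sketch:

```python
def insertion_sort(a):
    """Sort list a in place. In-place, stable, Theta(n^2) in the worst case."""
    for j in range(1, len(a)):
        key = a[j]
        # Invariant: a[:j] holds the original first j elements, sorted.
        i = j - 1
        # Shift larger elements right; '>' (not '>=') keeps the sort stable.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return a
```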
The Selection Problem
The ith order statistic problem requires selecting the ith smallest element of a set. Finding the median is a particular case of an ith order statistic problem. These problems are known as the selection problem. They can be solved generically by sorting the entire set and then selecting the desired element. However, comparison sorting cannot be done in better than Ω(n lg n) time, and more specialized, faster algorithms for the selection problem exist.
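One such specialized algorithm is randomized selection (quickselect), which runs in expected O(n) time. A sketch of my own; CLRS's RANDOMIZED-SELECT partitions in place, while this version builds sublists for clarity:

```python
import random

def select(a, k):
    """Return the k-th smallest element of a (k is 1-based). Expected O(n)."""
    pivot = random.choice(a)
    # Partition around the pivot.
    less = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    # Unlike quicksort, recurse into only ONE side of the partition.
    if k <= len(less):
        return select(less, k)
    if k <= len(less) + len(equal):
        return pivot
    return select(greater, k - len(less) - len(equal))
```

Finding the median of n elements is then `select(a, (n + 1) // 2)`.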
Number-Theoretic Algorithms
Number-theoretic algorithms are important due in large part to the invention of cryptographic schemes based on large prime numbers. Algorithms in this category are used to generate large prime numbers. Some of these algorithms, for example the Miller-Rabin primality test, are not entirely correct, in the sense that there is a very small chance of error, but the chance of error is so small that it is considered acceptable.
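A sketch of the Miller-Rabin test (my own Python rendering, not CLRS pseudocode). It never rejects a prime; it may accept a composite, but with probability at most 4^(-rounds):

```python
import random

def is_probable_prime(n, rounds=20):
    """Miller-Rabin primality test. Error probability at most 4**(-rounds)."""
    if n < 2:
        return False
    # Trial division by a few small primes handles the easy cases.
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)  # modular exponentiation
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a witnesses that n is composite
    return True
```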
NP Completeness
Almost all the algorithms mentioned so far have been polynomial-time algorithms, which is to say that on an input of size n, their worst-case running time is O(n^k) for some constant k. Generally, we think of a problem that is solvable by a polynomial-time algorithm as tractable or easy. A problem that requires super-polynomial time is designated intractable or hard. There are also problems whose status is unknown: no polynomial-time algorithm has yet been discovered for them, nor has anyone yet been able to prove that no polynomial-time algorithm can exist for any of them. This class of problems is known as the NP-complete problems. The set of NP-complete problems has the property that if an efficient algorithm exists for any one of them, then efficient algorithms exist for all of them. There are methods to show that a problem is NP-complete, and if that is the case, an approximation algorithm, instead of a polynomial-time exact algorithm, can be developed for it.
Multithreaded Algorithms
Multicore processors require algorithms designed with parallelism in mind. These are the multithreaded algorithms.
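A sketch of the fork-join pattern that CLRS's multithreaded algorithms build on, here as a divide-and-conquer sum (my own illustration; in CPython the GIL limits true parallelism for CPU-bound work, so this mainly shows the algorithmic structure):

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_sum(a, executor, cutoff=1000):
    """Divide-and-conquer sum whose two halves can run as parallel tasks."""
    if len(a) <= cutoff:
        return sum(a)  # base case: small enough, sum sequentially
    mid = len(a) // 2
    # "Spawn" the left half as a separate task (fork), compute the right
    # half in the current thread, then wait for the spawned task (join).
    left = executor.submit(parallel_sum, a[:mid], executor, cutoff)
    right = parallel_sum(a[mid:], executor, cutoff)
    return left.result() + right
```

Note that recursively submitting to a bounded thread pool can deadlock if the recursion depth exceeds the worker count; the cutoff keeps the task tree shallow in this sketch.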
TODO
- Dynamic programming and memoization.
- Greedy algorithms.
- Matrix multiplication (NOKB and code).
- Random variable analysis. Indicator random variable.
- Probability Distribution.
- Consider starting upside down, with the bottom (math) sections, and NOKB concepts from them first.
- Mathematics: Understand and document induction.
- Need to understand aggregate analysis Section 17.1