Algorithms

From NovaOrdis Knowledge Base
Jump to navigation Jump to search

External

Internal

Overview

A data structure is an arrangement of data in computer's memory or external storage. Data structures include arrays, linked lists, stacks, binary trees, hash tables, etc.

Algorithms manipulate the data in these structures in various ways. An algorithm is any well-defined computational procedure consisting in a sequence of steps, which takes some value or set of values, called input and produces a value, or a set of values, called output. An algorithm can be viewed as a tool for solving a well-specified computational problem. In this context, a specific set of input values provided to the algorithm is called an instance of the problem.

One of the most important characteristics of an algorithm is its correctness. An algorithm is said to be correct if, for every input instance, it halts with the correct output.

Another characteristic of algorithms is efficiency, which can be analyzed through formal methods. A special notation, called asymptotic notation and a mathematical formalism is used to express the time efficiency of algorithms, by bounding algorithms running time from above and from below. All these notions are expanded upon in the algorithm complexity section.

Algorithms can be coarsely categorized in iterative and recursive. A common recursive technique is divide-and-conquer. The running time of recursive algorithms can be estimated using recurrences.

A problem that arises frequent in practice is to sort a sequence of numbers into nondecreasing order. The class of algorithms that addresses this problem are the sorting algorithms.

Algorithms whose behavior is determined not only by input, but also by the values produced by a random-number generator that are called randomized algorithms. A randomized algorithm implies an inherent probability distribution for one or more variable, so the running time of such an algorithm may differ on different inputs on the same size. Probabilistic analysis is used to analyze running time of randomized algorithms.

  • Arrays.
  • Lists (single linked and double linked)
  • Stack
  • Queue. Difference between list and queue.
  • Sorting
    • Comparison Sort. Comparison sort algorithm class. Know that comparison sort cannot do better than n ln n. Proof in a separate article.
    • Non-comparison Sort
    • Sorting algorithms characteristics (in-place, stability)
  • Algorithm complexity, Bounds. Understand and document O, Omega and Theta notations. Perform complexity analysis on all the algorithms I examine. Expected running time, worst-case running time. Average case running time.
  • Iteration and recursion – fundamentally different approaches.
  • Set
  • Map
  • Hash Map
  • Distributed Hash Map. Insist on this as is key to systems that scale.
  • Collision resistant hash function.
  • Trees
    • Binary Trees. Binary tree height.
      • Sorting trees. Difference between a binary search tree and a red-black tree.
    • BTrees
  • Tree walking algorithms. Difference between depth first and breadth first.
  • Graphs
  • Graph algorithms
  • Dynamic programming and memoization.
  • Greedy algorithms.
  • Matrix multiplication (NOKB and code).
  • lg and ln notation. Link from the Mathematics main page.
  • Random variable analysis. Indicator random variable.

Probability Distribution.

  • Consider starting upside down, with the bottom (math) sections, and NOKB concepts from them first.
  • Mathematics: Understand and document induction.
  • Need to understand aggregate analysis Section 17.1

Organizatorium