The Knapsack Problem

From NovaOrdis Knowledge Base

Overview

The knapsack problem shows up whenever there is a limited budget (the capacity) that we want to use in the best possible way, maximizing a second dimension (the value). It is a fundamental problem and a canonical application of the dynamic programming paradigm.

Problem Definition

The input consists of n items {1, 2 ... n}. Each item i comes with a non-negative value vi and a non-negative integral size wi. The integrality of the sizes is essential: it allows the dynamic programming algorithm to use discrete evaluation steps, as shown below. A non-negative integral capacity W is also given as input.

The output should be a subset S ⊆ {1, 2 ... n} that maximizes the total value of the selected items:

 ∑vi
i∈S

in such a way that they all "fit" within the capacity W:

 ∑wi ≤ W
i∈S
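
To make the definition concrete, the following minimal sketch enumerates every subset and keeps the most valuable one that fits. It is an illustration of the problem statement only, not the dynamic programming algorithm discussed on this page, and it takes exponential time. The function name `brute_force_knapsack`, the 0-based item indices and the sample input are assumptions introduced for this example.

```python
from itertools import combinations


def brute_force_knapsack(values, sizes, capacity):
    """Try every subset of items; return (best_value, best_subset).

    Items are identified by 0-based index (an assumption of this sketch;
    the text numbers items from 1).
    """
    n = len(values)
    best_value, best_subset = 0, ()
    for k in range(n + 1):
        for subset in combinations(range(n), k):
            total_size = sum(sizes[i] for i in subset)
            if total_size > capacity:
                continue  # the subset does not fit within the capacity W
            total_value = sum(values[i] for i in subset)
            if total_value > best_value:
                best_value, best_subset = total_value, subset
    return best_value, best_subset


# Hypothetical sample input: four items, capacity W = 6.
values = [3, 2, 4, 4]
sizes = [4, 3, 2, 3]
print(brute_force_knapsack(values, sizes, 6))  # (8, (2, 3))
```

For this input the best subset is the third and fourth items, with total size 5 and total value 8.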

Dynamic Programming Solution Discussion

A dynamic programming solution usually starts with an analysis of the optimal solution and an attempt to express it as a function (recurrence) of the solutions to smaller subproblems. Once the recurrence is established, the algorithm proceeds with a "brute force" computation of all smaller subproblems, obtaining the optimal solution at step i by comparing results already computed for previous steps and selecting the maximum. This "brute force" approach is feasible because the number of distinct subproblems is small: for the knapsack problem there are only (n+1)(W+1) of them, and each is solved in constant time from previously computed results. The third step is executed after the final optimal value is computed, and consists of tracing back through the (possibly multidimensional) array of subproblem solutions to establish which input elements are actually part of the solution.

The process is applied to the knapsack problem as shown below. Even if there is no intrinsic sequentiality of the items, it helps if we think of the n input items as labeled from 1 to n and sequentially ordered.

The final optimal solution S ⊆ {1, 2 ... n} falls into exactly one of the following two cases:

Case 1: item n does not belong to the optimal solution. In this case, the optimal solution includes only items from the first n - 1 items, their total size is at most W, and their total value V is the maximum possible.

Case 2: item n belongs to the optimal solution, which is possible only if wn ≤ W. In this case the optimal solution consists of item n plus an optimal solution for the first n - 1 items {1, 2, ... n-1} with the capacity reduced to W-wn (wn is already "allocated" to item n, which is part of the solution).

             │ ∑vi, where n ∉ S and ∑wi ≤ W,
             │i∈S                  i∈S
  V = max of │
             │vn+∑vi, where n ∈ S and ∑ wi ≤ W-wn
             │  i∈S-{n}              i∈S-{n}

The same reasoning can be applied to an intermediate subproblem in which only the first i items {1, 2, ... i} are considered, and it yields a recurrence we can then use algorithmically. To write down the recurrence, we introduce the following notation: let Vi,x be the maximum value that we can get by using only the first i items {1, 2, ... i} while their total size is at most x.

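The recurrence for step i, with i ≥ 1 and 0 ≤ x ≤ W, can be expressed as follows, using the base case V0,x = 0 for every x:

              │ V(i-1),x            (case 1: item i is excluded)
Vi,x = max of │  
              │ vi+V(i-1),(x-wi)    (case 2: item i is included, allowed only if wi ≤ x)

When wi > x, item i cannot fit and only the first case applies, so Vi,x = V(i-1),x.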
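
The recurrence can be transcribed almost literally as a memoized recursive function. This is a minimal sketch under stated assumptions: the names `knapsack_value` and `V` are illustrative, the base case V(0, x) = 0 is assumed for an empty set of items, and item i of the text is stored at 0-based list position i - 1.

```python
from functools import lru_cache


def knapsack_value(values, sizes, capacity):
    """Return the maximum total value achievable within the capacity."""

    @lru_cache(maxsize=None)
    def V(i, x):
        # Base case (assumed): with no items, the best value is 0.
        if i == 0:
            return 0
        # Case 1: item i is excluded.
        skip = V(i - 1, x)
        # Case 2 is allowed only when item i fits in the remaining capacity.
        if sizes[i - 1] > x:
            return skip
        take = values[i - 1] + V(i - 1, x - sizes[i - 1])
        return max(skip, take)

    return V(len(values), capacity)


print(knapsack_value([3, 2, 4, 4], [4, 3, 2, 3], 6))  # 8
```

The cache ensures each pair (i, x) is computed once. Because the recursion depth grows with the number of items, a very large n may exceed Python's recursion limit, which is one reason to prefer an iterative, table-filling formulation.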

The algorithm consists of a double loop over i and x that fills in the values defined by the recurrence:

initialize a two-dimensional array V[0..n][0..W] (n+1 rows, W+1 columns) with all entries set to 0
for i = 1 to n: # for each item, starting with the first
  for x = 0 to W: # for each possible capacity from 0 to W
    if wi > x:
      V[i][x] = V[i-1][x] # item i does not fit, so the second operand is ignored
    else:
      V[i][x] = max(V[i-1][x], vi+V[i-1][x-wi])
return V[n][W] # the maximum total value
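
The double loop, together with the final trace-back step mentioned in the discussion above, could be sketched in Python as follows. The function name `knapsack`, the 1-based item numbers in the returned list and the sample input are assumptions made for this example. When several subsets are optimal, this particular trace-back returns only one of them.

```python
def knapsack(values, sizes, capacity):
    """Return (max_value, chosen_items), with items numbered from 1."""
    n = len(values)
    # V[i][x]: best value using only the first i items with total size <= x.
    V = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        v_i, w_i = values[i - 1], sizes[i - 1]
        for x in range(capacity + 1):
            V[i][x] = V[i - 1][x]          # case 1: item i is excluded
            if w_i <= x:                   # case 2 only if item i fits
                V[i][x] = max(V[i][x], v_i + V[i - 1][x - w_i])

    # Trace back through the table to recover one optimal subset.
    chosen = []
    x = capacity
    for i in range(n, 0, -1):
        if V[i][x] != V[i - 1][x]:         # the value changed, so item i was taken
            chosen.append(i)
            x -= sizes[i - 1]
    chosen.reverse()
    return V[n][capacity], chosen


print(knapsack([3, 2, 4, 4], [4, 3, 2, 3], 6))  # (8, [3, 4])
```

Filling the table takes time and space proportional to n·W, and the trace-back adds one pass over the n rows.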