The Job Scheduling Problem

From NovaOrdis Knowledge Base

Overview

The job scheduling problem is a canonical example for the use of a greedy algorithm.

The Problem

Assume there is one shared resource (a CPU) and there are n jobs that must be processed by the shared resource. Each job has two known parameters:

  • a weight wj or priority, which quantifies its "importance". Jobs with a higher weight deserve to be processed before jobs with a lower weight.
  • a length ℓj, which codifies the processing time.

The completion time Cj of a job j is defined as:

Cj = Wj + ℓj

where Wj is the wait time, that is, how long job j had to wait before being scheduled. The wait time is the sum of the lengths of all jobs scheduled before j. Note that the wait time Wj (uppercase) is distinct from the weight wj (lowercase).
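
The definition above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original article; the function name wait_and_completion_times and the sample lengths are hypothetical.

```python
def wait_and_completion_times(lengths):
    """For jobs processed in the given order, return a list of (W_j, C_j) pairs."""
    times = []
    elapsed = 0  # total length of the jobs processed so far
    for length in lengths:
        wait = elapsed             # W_j: sum of lengths of all earlier jobs
        elapsed += length          # C_j = W_j + l_j
        times.append((wait, elapsed))
    return times


# Three jobs of lengths 3, 1 and 2, processed in that order.
print(wait_and_completion_times([3, 1, 2]))  # [(0, 3), (3, 4), (4, 6)]
```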

The question we need to resolve algorithmically is in what order we should sequence the jobs to minimize the objective function defined by the weighted sum of completion times:

    n
min ∑ wjCj
   j=1
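
To see how the order affects the objective, the weighted sum can be evaluated for a concrete sequence. The sketch below assumes each job is represented as a (weight, length) tuple; that representation, the function name weighted_completion_sum and the two sample jobs are illustrative choices, not something the article prescribes.

```python
def weighted_completion_sum(jobs):
    """Sum of w_j * C_j for jobs given as (weight, length) tuples in processing order."""
    elapsed = 0  # completion time of the most recently processed job
    total = 0
    for weight, length in jobs:
        elapsed += length          # C_j for the current job
        total += weight * elapsed
    return total


job_a = (3, 5)  # weight 3, length 5
job_b = (1, 2)  # weight 1, length 2

print(weighted_completion_sum([job_a, job_b]))  # 3*5 + 1*7 = 22
print(weighted_completion_sum([job_b, job_a]))  # 1*2 + 3*7 = 23
```

For these two jobs, processing the heavier job first yields the smaller objective, even though it is also the longer one.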

The Greedy Algorithm

The process of coming up with a greedy algorithm starts by looking at special cases of the problem, where it is reasonably intuitive what the optimal choice should be.

We could assume that all jobs have the same length ℓ and different weights. The objective function becomes ℓw1 + 2ℓw2 + ... + nℓwn = ℓ(w1 + 2w2 + ... + nwn), so the obvious way to minimize it is to schedule the job with the biggest weight first, followed by the job with the second biggest weight, and so on.

Another special case is one in which all jobs have the same weight w but different lengths. The objective function is ℓ1w + (ℓ1+ℓ2)w + ... + (ℓ1+ℓ2+...+ℓn)w = w(nℓ1 + (n-1)ℓ2 + ... + ℓn). To minimize it, we should schedule the shortest job first, because its length is counted in the completion time of every subsequent job, then the second shortest job, and so on, with the longest job last.
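
Both special cases can be checked by brute force on small inputs. The sketch below enumerates every ordering of three jobs and reports the orderings with the smallest objective; jobs are assumed to be (weight, length) tuples, and the function names and sample values are hypothetical. It only confirms the intuition on these particular inputs and is not a proof.

```python
from itertools import permutations


def weighted_completion_sum(jobs):
    """Sum of w_j * C_j for (weight, length) jobs in processing order."""
    elapsed = 0
    total = 0
    for weight, length in jobs:
        elapsed += length
        total += weight * elapsed
    return total


def best_orders(jobs):
    """Brute force: return every ordering that attains the minimum objective."""
    costs = {order: weighted_completion_sum(order) for order in permutations(jobs)}
    best = min(costs.values())
    return [order for order, cost in costs.items() if cost == best]


same_length = [(1, 3), (4, 3), (2, 3)]  # equal lengths, different weights
same_weight = [(2, 5), (2, 1), (2, 3)]  # equal weights, different lengths

print(best_orders(same_length))  # [((4, 3), (2, 3), (1, 3))]: biggest weight first
print(best_orders(same_weight))  # [((2, 1), (2, 3), (2, 5))]: shortest job first
```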

We then move beyond the special cases towards the general case. The analysis of the special cases indicates that we should favor jobs with bigger weights and jobs with shorter lengths. An obvious way to aggregate the weight and the length into a single score that indicates scheduling priority (the larger the score, the higher the priority) is the ratio:

 wj
────
 ℓj
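
A greedy scheduler based on this score can be sketched as a single sort. The code below assumes jobs are (weight, length) tuples with strictly positive lengths; the names greedy_schedule and weighted_completion_sum and the sample jobs are hypothetical. The brute-force comparison only shows that the ratio ordering is optimal for this one small instance and does not replace a correctness proof.

```python
from itertools import permutations


def weighted_completion_sum(jobs):
    """Sum of w_j * C_j for (weight, length) jobs in processing order."""
    elapsed = 0
    total = 0
    for weight, length in jobs:
        elapsed += length
        total += weight * elapsed
    return total


def greedy_schedule(jobs):
    """Order jobs by decreasing weight/length ratio (lengths assumed > 0)."""
    return sorted(jobs, key=lambda job: job[0] / job[1], reverse=True)


jobs = [(3, 5), (1, 2), (2, 1)]  # ratios 0.6, 0.5 and 2.0
order = greedy_schedule(jobs)
greedy_cost = weighted_completion_sum(order)

# Exhaustive check over all 3! orderings, feasible only for tiny inputs.
brute_force_cost = min(weighted_completion_sum(p) for p in permutations(jobs))

print(order)                          # [(2, 1), (3, 5), (1, 2)]
print(greedy_cost, brute_force_cost)  # 28 28
```

Sorting dominates the running time, so the schedule is produced in O(n log n), whereas the brute-force check grows factorially with the number of jobs.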

Correctness Proof