Queueing Theory: Difference between revisions

From NovaOrdis Knowledge Base
Jump to navigation Jump to search
Tag: Reverted
 
(37 intermediate revisions by the same user not shown)
Line 3: Line 3:
* [[Reactive Programming]]
* [[Reactive Programming]]
* [[Universal Scalability Law]]
* [[Universal Scalability Law]]
* [[Go_Pipelines#Stage_Queueing|Go Pipelines]]
=TODO=
<font color=darkkhaki>TODO: 2018.07.20 - The Essential Guide to Queueing Theory.pdf in Learning.</font>


=Overview=
=Overview=


<font color=darkgray>TODO: 2018.07.20 - The Essential Guide to Queueing Theory.pdf in Learning.</font>
=Queueing in Pipelines=
 
This section discusses queueing for pipelines.
 
A [[Go_Pipelines#Overview|pipeline]] consists in a series of stages, which process a stream of data fed into the pipeline. If there are no buffers (queues) between stages, a slow stage will block the upstream stages from working. This is because a slow stage will be unable to process, thus accept a new stream element from the upstream stage, making the upstream stage unable to hand over the stream element, thus blocking it. The delay will propagate upstream in the pipeline, possibly all the way to the pipeline input, preventing new stream elements to be processed altogether. Introducing queues seems like an obvious optimization, but in most situation, this is actually not the case.
 
'''Queueing''' is the technique of storing stream elements in a temporary location in memory, after they have been processed by a stage, so that other stage can retrieve it later. The producing stage does not hold a reference to the processed stream elements.
 
==Decoupling==
Introducing '''queues''' between stages makes possible to hold data stream elements processed by an upstream stage, while the downstream stage is still busy, thus freeing up the upstream stage to continue doing work. Queues '''decouple''' stages. However, if the difference in processing speed remains constant, the queue will fill up after a while, so introducing the queue will only delay blocking the upstream stage, not eliminate it.
 
==Queueing and Overall Pipeline Performance==
Adding queueing between the pipeline stages will almost never speed up the pipeline. It will only allow it to behave differently. The effect of introducing queues is that the time spent in a '''blocking state''' of the stage upstream of the queue is reduced.
 
The only situations when queueing '''can''' increase the overall situation of the pipeline are:
# <span id='Batching'></span>'''When batching requests in a stage save time'''. An operation that requires overhead is performed less often by batching (or chunking) more elements into a larger element, thus saving overall CPU cycles or other resources.
# <span id='Negative_Feedback_Loop'></span>'''When delays in a stage produce a feedback loop into the system'''. If the efficiency of a pipeline drops below a certain critical threshold, the upstream systems begin increasing their inputs Ito the pipeline, which causes the. pipeline to lose efficiency even more, causing a positive feedback loop, or a "death-spiral". Introducing a queue at the entrance of pipeline can break the feedback loop, or introduce a negative feedback loop, where accumulation of elements in the queue decreases the input rate. This is obtained at the cost of creating a lag in the existing requests. From the perspective of the caller into the pipeline, the requests appear to be processing but taking a very long time.
 
==Queueing and Synchronization Issues==
Adding queueing prematurely can hide synchronization issues such as [[Concurrent_(Parallel)_Programming#Deadlock|deadlocks]] and [[Concurrent_(Parallel)_Programming#Livelock|livelocks]]. As the program converges towards correctness, you may find that you need more or less queueing.
 
==Queue Tuning==
Queue tuning refers to where the queues should be placed and how large the buffers should be. As the [[#Queueing_and_Overall_Pipeline_Performance|Queueing and Overall Pipeline Performance]] section shows, it makes sense to implement queueing before stages where [[#Batching|batching]] (chunking) will lead to higher efficiency, and at the entrance of the pipeline, to create a [[#Negative_Feedback_Loop|negative feedback loop]] and thus save the pipeline from a death spiral.
 
==Pipeline Throughput==
===The Little's Law===
<font size=-2>
L = λW
</font>
where:
* L is the average number of units in the system.
* λ is the average arrival rate of units.
* W is the average time a unit spends in the system.
 
This equation applies to stable system.
 
In a pipeline, a '''stable''' system is one in which the rate that work enters the pipeline, or ingress, is equal to the rate in which it exits the system, or egress.
 
If the rate of ingress exceeds the rate of egress, the system is '''unstable''' and has entered a death spiral.
 
If the rate of egress exceeds the rate of ingress, the system is still unstable, but in the way that its resources are not being utilized efficiently.
 
For a multi-stage pipeline, Little's Law can be rewritten as:
 
<font size=-2>
L = λ∑<sub>i</sub>W<sub>i</sub>
</font>
 
which says that the pipeline will only be as fast as the slowest stage.
 
<font color=darkkhaki>To continue: Concurrency in Go, Katherine Cox-Buday. Chapter 4. Concurrency Patterns in Go. Section "Queueing".</font>

Latest revision as of 19:15, 7 February 2024

Internal

TODO

TODO: 2018.07.20 - The Essential Guide to Queueing Theory.pdf in Learning.

Overview

Queueing in Pipelines

This section discusses queueing for pipelines.

A pipeline consists in a series of stages, which process a stream of data fed into the pipeline. If there are no buffers (queues) between stages, a slow stage will block the upstream stages from working. This is because a slow stage will be unable to process, thus accept a new stream element from the upstream stage, making the upstream stage unable to hand over the stream element, thus blocking it. The delay will propagate upstream in the pipeline, possibly all the way to the pipeline input, preventing new stream elements to be processed altogether. Introducing queues seems like an obvious optimization, but in most situation, this is actually not the case.

Queueing is the technique of storing stream elements in a temporary location in memory, after they have been processed by a stage, so that other stage can retrieve it later. The producing stage does not hold a reference to the processed stream elements.

Decoupling

Introducing queues between stages makes possible to hold data stream elements processed by an upstream stage, while the downstream stage is still busy, thus freeing up the upstream stage to continue doing work. Queues decouple stages. However, if the difference in processing speed remains constant, the queue will fill up after a while, so introducing the queue will only delay blocking the upstream stage, not eliminate it.

Queueing and Overall Pipeline Performance

Adding queueing between the pipeline stages will almost never speed up the pipeline. It will only allow it to behave differently. The effect of introducing queues is that the time spent in a blocking state of the stage upstream of the queue is reduced.

The only situations when queueing can increase the overall situation of the pipeline are:

  1. When batching requests in a stage save time. An operation that requires overhead is performed less often by batching (or chunking) more elements into a larger element, thus saving overall CPU cycles or other resources.
  2. When delays in a stage produce a feedback loop into the system. If the efficiency of a pipeline drops below a certain critical threshold, the upstream systems begin increasing their inputs Ito the pipeline, which causes the. pipeline to lose efficiency even more, causing a positive feedback loop, or a "death-spiral". Introducing a queue at the entrance of pipeline can break the feedback loop, or introduce a negative feedback loop, where accumulation of elements in the queue decreases the input rate. This is obtained at the cost of creating a lag in the existing requests. From the perspective of the caller into the pipeline, the requests appear to be processing but taking a very long time.

Queueing and Synchronization Issues

Adding queueing prematurely can hide synchronization issues such as deadlocks and livelocks. As the program converges towards correctness, you may find that you need more or less queueing.

Queue Tuning

Queue tuning refers to where the queues should be placed and how large the buffers should be. As the Queueing and Overall Pipeline Performance section shows, it makes sense to implement queueing before stages where batching (chunking) will lead to higher efficiency, and at the entrance of the pipeline, to create a negative feedback loop and thus save the pipeline from a death spiral.

Pipeline Throughput

The Little's Law

L = λW

where:

  • L is the average number of units in the system.
  • λ is the average arrival rate of units.
  • W is the average time a unit spends in the system.

This equation applies to stable system.

In a pipeline, a stable system is one in which the rate that work enters the pipeline, or ingress, is equal to the rate in which it exits the system, or egress.

If the rate of ingress exceeds the rate of egress, the system is unstable and has entered a death spiral.

If the rate of egress exceeds the rate of ingress, the system is still unstable, but in the way that its resources are not being utilized efficiently.

For a multi-stage pipeline, Little's Law can be rewritten as:

L = λ∑iWi

which says that the pipeline will only be as fast as the slowest stage.

To continue: Concurrency in Go, Katherine Cox-Buday. Chapter 4. Concurrency Patterns in Go. Section "Queueing".