Kubernetes Resource Management Concepts

From NovaOrdis Knowledge Base





A quantity is a fixed-point representation of a number, providing convenient marshalling/unmarshalling in JSON and YAML.



Resource Quotas

Resource quotas are Kubernetes policies.

TODO: https://kubernetes.io/docs/concepts/policy/resource-quotas/

Limit Ranges

Limit ranges are Kubernetes policies.

TODO: https://kubernetes.io/docs/concepts/policy/limit-range/


Resource Request


A resource request is a container-level configuration element aimed at the pod scheduler. The scheduler uses the requests of a pod's containers to decide which node to place the pod on. If the configuration specifies a resource request, the process running inside the container is guaranteed at least the requested amount of that resource.

TODO: https://medium.com/@betz.mark/understanding-resource-limits-in-kubernetes-memory-6b41e9a955f9

Resource Limit

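A Kubernetes resource limit caps the amount of a resource a container may use. For memory, the limit instructs the Linux kernel to kill the process running inside the container if it tries to exceed the specified limit. For CPU, the container is throttled instead of killed.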

Also see:
CGroups Memory Limit

Compute Resources

Compute resource requests and limits are specified per container in the pod definition. Requests indicate to the scheduler how pods can best be placed on nodes to achieve satisfactory performance, and limits cap what each container may consume.

apiVersion: v1
kind: Pod
spec:
  containers:
  - name: container-name
    resources:
      requests:
        cpu: 500m
        memory: 100Mi
      limits:
        cpu: 1000m
        memory: 500Mi

CPU Usage

TODO: meaning of CPU: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-cpu

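CPU is measured in CPU units, where one unit corresponds to one physical or virtual core. Fractional amounts are usually written in millicores: one millicore (1m) is a thousandth of a CPU, so 500m is half a CPU.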
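
As a small illustration of the unit, the following Python sketch converts CPU quantities as they might appear in a pod definition into millicores. The helper name cpu_to_millicores is hypothetical, and it handles only plain decimal numbers and the "m" suffix.

```python
def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as "500m" or "1.5" to millicores.

    Hypothetical helper for illustration: it handles only the "m" suffix
    and plain decimal numbers.
    """
    quantity = quantity.strip()
    if quantity.endswith("m"):
        return int(quantity[:-1])
    # A bare number counts whole CPUs; 1 CPU = 1000 millicores.
    return round(float(quantity) * 1000)


print(cpu_to_millicores("500m"))  # 500
print(cpu_to_millicores("1"))     # 1000
print(cpu_to_millicores("0.25"))  # 250
```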

CPU Request

The CPU request is the amount of CPU a container needs to execute. A pod will not be scheduled on a node that does not have at least "requests.cpu" available. Once scheduled, if there is no contention for CPU, the pod is allowed to use all available CPU on the node. If there is CPU contention, the "requests.cpu" amount is used to calculate a relative weight across all containers on the system, which determines how much CPU time each container may use. CPU requests map to kernel CFS shares to enforce this behavior. The CPU request value is also used by the CPU-based autoscaling algorithm. For more details see:
CPU-based Autoscaling
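
The relative-weight behavior under contention can be sketched in Python. This is a simplified model, not the kubelet's code: it assumes cgroup v1, where one full CPU conventionally corresponds to 1024 CFS shares with a floor of 2, and it assumes every container wants more CPU than its share. Both function names are hypothetical.

```python
SHARES_PER_CPU = 1024  # assumed cgroup v1 convention: weight of one full CPU
MIN_SHARES = 2         # assumed smallest accepted cpu.shares value


def cpu_request_to_shares(millicores: int) -> int:
    """Map a CPU request in millicores to a CFS shares value."""
    return max(MIN_SHARES, millicores * SHARES_PER_CPU // 1000)


def cpu_split_under_contention(requests: dict, node_millicores: int) -> dict:
    """Divide node CPU among containers in proportion to their shares.

    Models only the fully contended case, where every container
    would use more CPU than it is given.
    """
    shares = {name: cpu_request_to_shares(m) for name, m in requests.items()}
    total = sum(shares.values())
    return {name: node_millicores * s / total for name, s in shares.items()}


print(cpu_request_to_shares(500))   # 512
print(cpu_request_to_shares(1500))  # 1536
print(cpu_split_under_contention({"a": 500, "b": 1500}, 4000))
# {'a': 1000.0, 'b': 3000.0}
```

In this example container "b" requested three times as much CPU as "a", so under full contention it receives three times as much CPU time.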

CPU Limit

CPU limit specifies the maximum amount of CPU the container may use, regardless of contention on the node. If the container attempts to exceed the specified limit, the system throttles it.
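
Throttling is typically enforced through the CFS bandwidth controller, which grants the container a quota of CPU time per enforcement period. The Python sketch below assumes a 100 ms period and uses hypothetical function names. It is a simplified model of a single period, not the kernel's accounting.

```python
CFS_PERIOD_US = 100_000  # assumed enforcement period: 100 ms in microseconds


def cpu_limit_to_quota_us(millicores: int, period_us: int = CFS_PERIOD_US) -> int:
    """CPU time in microseconds the container may use in each period."""
    return millicores * period_us // 1000


def deferred_time_us(demand_us: int, millicores: int,
                     period_us: int = CFS_PERIOD_US) -> int:
    """CPU time demanded in one period that must wait because the quota ran out."""
    return max(0, demand_us - cpu_limit_to_quota_us(millicores, period_us))


print(cpu_limit_to_quota_us(500))     # 50000
print(cpu_limit_to_quota_us(1000))    # 100000
print(deferred_time_us(80_000, 500))  # 30000
```

With a 500m limit, a container that wants 80 ms of CPU in a 100 ms period runs for 50 ms and is throttled for the rest of that period.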

Memory Usage

Memory is measured in bytes, but multipliers (k/Ki, M/Mi, G/Gi, T/Ti, P/Pi, E/Ei) can also be used. Ki/Mi/Gi/Ti/Pi/Ei are the power-of-two multipliers (1Ki = 1024 bytes), while k/M/G/T/P/E are the power-of-ten multipliers (1k = 1000 bytes).
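
The difference between the two multiplier families can be shown with a short Python sketch. The helper memory_to_bytes is hypothetical and accepts only integer values followed by an optional suffix. It uses the lowercase "k" spelling that Kubernetes uses for the decimal kilo suffix.

```python
MULTIPLIERS = {
    # power-of-two suffixes
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40, "Pi": 2**50, "Ei": 2**60,
    # power-of-ten suffixes
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12, "P": 10**15, "E": 10**18,
}


def memory_to_bytes(quantity: str) -> int:
    """Convert a memory quantity such as "100Mi" or "100M" to bytes.

    Hypothetical helper for illustration: integer values only.
    """
    quantity = quantity.strip()
    for suffix, factor in MULTIPLIERS.items():
        if quantity.endswith(suffix):
            return int(quantity[:-len(suffix)]) * factor
    # No suffix: the value is already in bytes.
    return int(quantity)


print(memory_to_bytes("100Mi"))  # 104857600
print(memory_to_bytes("100M"))   # 100000000
print(memory_to_bytes("1Gi"))    # 1073741824
```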

Memory Request

By default, a container is allowed to consume as much memory on the node as possible. However, a pod may request a minimum amount of guaranteed memory by specifying "requests.memory". This instructs the scheduler to place the pod only on a node that has at least that amount of memory not already requested by other pods. Specifying "requests.memory" alone does not cap usage: the pod may still consume as much memory as is available on the node.
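
The placement check can be approximated in Python as follows. This sketch assumes the scheduler compares the new request against the node's allocatable memory minus the requests of pods already placed there, rather than against live usage. The function name fits and the node sizes are made up for illustration.

```python
MI = 2**20  # one mebibyte in bytes


def fits(node_allocatable: int, scheduled_requests: list, new_request: int) -> bool:
    """Return True if the node can still honor the new memory request.

    Only requests are compared; actual memory usage is not considered.
    """
    return sum(scheduled_requests) + new_request <= node_allocatable


# Hypothetical node with 1024Mi allocatable and two pods already placed.
existing = [500 * MI, 400 * MI]
print(fits(1024 * MI, existing, 100 * MI))  # True
print(fits(1024 * MI, existing, 200 * MI))  # False
```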

Memory Limit

"limits.memory" specifies the upper bound on the amount of memory the container is allowed to use. If the container exceeds the specified memory limit, it is terminated and, depending on the container restart policy, possibly restarted.

"limits.memory" is exposed inside the container as /sys/fs/cgroup/memory/memory.limit_in_bytes (the cgroup v1 path).

Quality of Service

A compute resource is classified with a quality of service (QoS) attribute depending on the request and limit values used to request it. A container may have different quality of service for each computing resource.


BestEffort

A resource is provided as BestEffort when neither a request nor a limit is specified. A BestEffort CPU container is able to consume as much CPU as is available on the node, but runs with the lowest priority. A BestEffort memory container is able to consume as much memory as is available on the node, but there is no guarantee that the scheduler will place the container on a node with enough memory. In addition, BestEffort containers have the greatest chance of being killed if there is an out of memory event on the node.


Burstable

A resource is provided as Burstable when a "request" value is specified and it is less than an optionally specified limit. A Burstable CPU container is guaranteed to get the minimum amount of CPU requested, but it may or may not get additional CPU time. Excess CPU resources are distributed based on the amount requested across all containers on the node. A Burstable memory container will get the amount of memory requested, but it may consume more. If there is an out of memory event on the node, Burstable containers are killed after the BestEffort containers when attempting to recover memory.


Guaranteed

A resource is provided as Guaranteed when both "request" and "limit" are specified and they are equal. A Guaranteed CPU container is guaranteed to get the amount requested and no more, even if there is additional CPU available. A Guaranteed memory container gets the amount of memory requested, but no more. If an out of memory event occurs, it will only be killed if there are no more BestEffort or Burstable containers on the system.
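
The three classes can be summarized as a small decision function. The Python sketch below follows this page's per-resource description, and the function name qos_class is hypothetical. It also assumes that a limit given without a request is treated as if the request equaled the limit, which is how Kubernetes defaults a missing request.

```python
from typing import Optional


def qos_class(request: Optional[int], limit: Optional[int]) -> str:
    """Classify one compute resource of one container by its request and limit."""
    if request is None and limit is None:
        return "BestEffort"
    if request is None:
        # Assumption: a missing request defaults to the limit.
        request = limit
    if request == limit:
        return "Guaranteed"
    # A request below the limit, or a request with no limit at all.
    return "Burstable"


print(qos_class(None, None))  # BestEffort
print(qos_class(500, 1000))   # Burstable
print(qos_class(500, None))   # Burstable
print(qos_class(1000, 1000))  # Guaranteed
```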

Metrics in Kubernetes

See: Metrics in Kubernetes

Resource Information

kubectl describe node <node-name>