YuniKorn Concepts
Revision as of 21:53, 18 January 2024
Internal
YuniKorn Core
Kubernetes Implementation
A namespace can have a "queue" if annotated with "yunikorn.apache.org/queue", and a "parent queue" if annotated with "yunikorn.apache.org/parentqueue".
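A minimal sketch of the two annotations described above (the namespace and queue names are invented for illustration; only the annotation keys come from the note):

```yaml
# Namespace whose pods should be mapped to a specific queue.
apiVersion: v1
kind: Namespace
metadata:
  name: dev-team
  annotations:
    yunikorn.apache.org/queue: root.dev
---
# Namespace acting as a parent queue for dynamically created child queues.
apiVersion: v1
kind: Namespace
metadata:
  name: research
  annotations:
    yunikorn.apache.org/parentqueue: root.research
```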
Allocation
Allocation Ask
Each allocation has a key. The key is used by both the scheduler and the resource manager to track allocations. The key does not have to be the resource manager's internal allocation ID, such as the pod name.
If the allocation ask specifies an application ID, the application must have been registered in advance; otherwise the scheduler reports "failed to find application ..."
Each allocation ask requests resources.
Each allocation ask has a priority.
What is "MaxAllocations"?
After its first allocation ask, an application transitions to the "Running" state.
Identity
An application is submitted under a certain identity, that consists of a user and one or more groups.
TO PARSE:
- https://yunikorn.apache.org/docs/user_guide/usergroup_resolution/
- https://yunikorn.apache.org/docs/design/scheduler_configuration/#user-definition
User
Group
The identity an application is submitted under may be associated with one or more groups.
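Per the usergroup_resolution page linked above, the submitting identity can be carried on the pod itself via the yunikorn.apache.org/user.info annotation. A hedged sketch (user and group names are made up):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: sleep-pod
  annotations:
    # JSON payload: one user plus one or more groups.
    yunikorn.apache.org/user.info: |
      {"user": "alice", "groups": ["developers", "staff"]}
spec:
  schedulerName: yunikorn
  containers:
    - name: sleep
      image: busybox
      command: ["sleep", "3600"]
```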
Plugin Mode
Resource Manager (RM)
YuniKorn communicates with various implementations of resource management systems (Kubernetes, YARN) via a standard interface defined in the yunikorn-scheduler-interface package.
Task
Node
Can a node be declared part of a partition with the "si/node-partition" label? The node's Attributes seem to come, at least in part, from the node labels.
In the Kubernetes implementation, the node is first added, then updated.
As part of handling the RMNodeUpdateEvent, RMProxy calls callback.UpdateNode().
Configuration
Context
yunikorn-k8shim cache.Context
Resource
Quantity
Reservation
Manual Scheduling
Policy Group
The policy group is set in the scheduler when a new resource manager is registered.
Gang Scheduling
The gang scheduling style can be "hard" (the application fails after the placeholder timeout) or "soft" (after the timeout, the application is scheduled as a normal application).
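Assuming the task-group annotation keys from the YuniKorn gang-scheduling user guide (the exact keys, group name, and parameter values here are an unverified sketch, not confirmed by the note), the style could be selected per application roughly like this:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gang-member
  annotations:
    yunikorn.apache.org/task-group-name: workers
    yunikorn.apache.org/task-groups: |
      [{"name": "workers", "minMember": 4,
        "minResource": {"cpu": "1", "memory": "1Gi"}}]
    # "Hard": fail the application on placeholder timeout;
    # "Soft": fall back to scheduling it as a normal application.
    yunikorn.apache.org/schedulingPolicyParameters: "placeholderTimeoutInSeconds=60 gangSchedulingStyle=Soft"
spec:
  schedulerName: yunikorn
  containers:
    - name: worker
      image: busybox
      command: ["sleep", "3600"]
```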
Organizatorium
When a new pod with schedulerName: yunikorn needs scheduling, the API server (via the admission controller?) calls the "admission-webhook.yunikorn.mutate-pods" webhook with a POST https://yunikorn-admission-controller-service.yunikorn.svc:443/mutate?timeout=10s.
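The trigger is the pod's spec.schedulerName field. A minimal pod that would be routed through the mutating webhook might look like this (pod name and application ID are invented; the applicationId label is an assumption based on YuniKorn's documented labeling convention):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example
  labels:
    applicationId: app-0001   # groups pods into a YuniKorn application
spec:
  schedulerName: yunikorn     # hands scheduling of this pod to YuniKorn
  containers:
    - name: main
      image: busybox
      command: ["sleep", "3600"]
```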
Service "yunikorn-admission-controller-service"
When running locally, the service does not get deployed, yet pods still get scheduled. This works through the Kubernetes "informer" mechanism, which periodically updates the state of the resources the shim is interested in, delivering "add", "update" and "delete" notifications. When a new pod shows up, general.Manager.AddPod() is invoked, which creates an Application and a Task from the pod metadata → PodEventHandler.addPod() → cache.Context.AddApplication(). In parallel, the main KubernetesShim scheduling loop picks up the new application, and the scheduling process begins.