YuniKorn Core Concepts
Revision as of 00:46, 12 January 2024
Internal
Overview
YuniKorn core is a universal scheduler that can be used to assign Application resource Allocations to Nodes that expose resources. Its default implementation allocates Kubernetes pods to Kubernetes nodes, where multiple pods belong to an application and request resources like memory, cores and GPUs. However, Applications, Allocations and Nodes can be mapped onto an arbitrary domain. The scheduler assumes that different Allocations may have different priorities, and performs the higher priority Allocations first. The scheduler also has the concept of preemption.
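The priority ordering described above can be illustrated with a minimal sketch. It is not YuniKorn code: the `Ask` and `Node` types, the single memory dimension, and the first-fit placement in `schedule()` are all assumptions made for illustration. The only behavior taken from the text is that higher priority Allocations are attempted first.

```go
package main

import (
	"fmt"
	"sort"
)

// Ask is a hypothetical, simplified allocation request (not a YuniKorn type).
type Ask struct {
	Key      string
	Priority int
	MemoryMB int
}

// Node is a hypothetical node exposing a single resource dimension.
type Node struct {
	Name   string
	FreeMB int
}

// schedule tries the highest priority asks first and places each one on the
// first node with enough free memory (first-fit is an assumption).
// It returns a map from ask key to the name of the node it was placed on.
func schedule(asks []Ask, nodes []Node) map[string]string {
	// Work on copies so the caller's slices are not modified.
	ordered := append([]Ask(nil), asks...)
	sort.SliceStable(ordered, func(i, j int) bool {
		return ordered[i].Priority > ordered[j].Priority
	})
	free := append([]Node(nil), nodes...)

	placed := make(map[string]string)
	for _, ask := range ordered {
		for i := range free {
			if free[i].FreeMB >= ask.MemoryMB {
				free[i].FreeMB -= ask.MemoryMB
				placed[ask.Key] = free[i].Name
				break
			}
		}
	}
	return placed
}

func main() {
	asks := []Ask{
		{Key: "low", Priority: 1, MemoryMB: 600},
		{Key: "high", Priority: 10, MemoryMB: 600},
	}
	nodes := []Node{{Name: "node-1", FreeMB: 1000}}
	for key, node := range schedule(asks, nodes) {
		fmt.Printf("%s -> %s\n", key, node)
	}
}
```

With a single node that can hold only one of the two asks, the higher priority ask is the one that gets placed, even though it was submitted second.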
Application
An application is an abstract programmatic entity that requires resources to execute. The application expresses its resource needs by issuing Allocation requests. The scheduler handles each request by trying to find a Node that can accommodate the resources that specific request needs. In the default Kubernetes implementation, an application is any higher level workload resource that creates pods: deployments, jobs, etc.
Application Lifecycle
An application gets added as NEW. The application transitions from NEW to ACCEPTED when the first request (Ask) is added to the application. It then moves to STARTING when the Allocation is created, which is the point at which the request (Ask) gets assigned to a node. The request now shows as an Allocation on the application.
If another Ask is added and a second Allocation is made, the application state changes to RUNNING immediately. If there is no other Ask, and thus no second Allocation, the application stays in the STARTING state for a maximum of 5 minutes and then automatically transitions to RUNNING. This is to support state-aware scheduling. It has no impact on the scheduler or on the pods unless state-aware scheduling is turned on. To configure an application to transition to RUNNING after the first allocation Ask, place the tag "application.stateaware.disable": "true" on the AddApplicationRequest when creating the application.
Allocation
Allocation Ask
Allocation Ask Implementation
This is the sequence of operations of an Allocation Ask:
* scheduler.Scheduler handles an rmevent.RMUpdateAllocationEvent "update allocation" event in the handleRMEvent() function, which immediately calls into scheduler.ClusterContext#handleRMUpdateAllocationEvent().
* scheduler.ClusterContext#handleRMUpdateAllocationEvent() → scheduler.ClusterContext#processAsks().
* scheduler.ClusterContext#processAsks() locates the corresponding partition and calls into scheduler.PartitionContext#addAllocationAsk().
* scheduler.PartitionContext#addAllocationAsk() locates the corresponding application.
* scheduler.PartitionContext#addAllocationAsk() creates a new objects.AllocationAsk instance.
* scheduler.PartitionContext#addAllocationAsk() invokes into objects.Application#AddAllocationAsk() with the newly created objects.AllocationAsk instance.
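The call chain above can be sketched as follows. This is a simplified illustration, not the yunikorn-core source: the type and method names mirror the ones in the list, but every signature, field, and the `AskRequest` type are assumptions, and error handling is reduced to the two lookups the list mentions (partition, then application).

```go
package main

import "fmt"

// AskRequest is a hypothetical stand-in for one ask carried by the update allocation event.
type AskRequest struct {
	PartitionName string
	ApplicationID string
	AllocationKey string
}

// AllocationAsk is a simplified stand-in for objects.AllocationAsk.
type AllocationAsk struct {
	AllocationKey string
	ApplicationID string
}

// Application is a simplified stand-in for objects.Application.
type Application struct {
	ID   string
	asks map[string]*AllocationAsk
}

// AddAllocationAsk stores the ask on the application.
func (a *Application) AddAllocationAsk(ask *AllocationAsk) {
	a.asks[ask.AllocationKey] = ask
}

// PartitionContext is a simplified stand-in for scheduler.PartitionContext.
type PartitionContext struct {
	applications map[string]*Application
}

// addAllocationAsk locates the application, creates the ask, and hands it over.
func (p *PartitionContext) addAllocationAsk(req AskRequest) error {
	app, ok := p.applications[req.ApplicationID]
	if !ok {
		return fmt.Errorf("application %q not found", req.ApplicationID)
	}
	ask := &AllocationAsk{AllocationKey: req.AllocationKey, ApplicationID: req.ApplicationID}
	app.AddAllocationAsk(ask)
	return nil
}

// ClusterContext is a simplified stand-in for scheduler.ClusterContext.
type ClusterContext struct {
	partitions map[string]*PartitionContext
}

// processAsks locates the partition for each ask and delegates to it.
func (c *ClusterContext) processAsks(reqs []AskRequest) []error {
	var errs []error
	for _, req := range reqs {
		partition, ok := c.partitions[req.PartitionName]
		if !ok {
			errs = append(errs, fmt.Errorf("partition %q not found", req.PartitionName))
			continue
		}
		if err := partition.addAllocationAsk(req); err != nil {
			errs = append(errs, err)
		}
	}
	return errs
}

// handleRMUpdateAllocationEvent forwards the asks, as in the second step of the list.
func (c *ClusterContext) handleRMUpdateAllocationEvent(reqs []AskRequest) []error {
	return c.processAsks(reqs)
}

func main() {
	app := &Application{ID: "app-1", asks: map[string]*AllocationAsk{}}
	cluster := &ClusterContext{partitions: map[string]*PartitionContext{
		"default": {applications: map[string]*Application{"app-1": app}},
	}}
	errs := cluster.handleRMUpdateAllocationEvent([]AskRequest{
		{PartitionName: "default", ApplicationID: "app-1", AllocationKey: "ask-1"},
	})
	fmt.Println("errors:", len(errs), "asks on app-1:", len(app.asks))
}
```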