Architecture & Research | Updated June 2026

Scheduling

AI scheduling challenges

AI workloads pose scheduling challenges that general-purpose schedulers handle poorly. Distributed training jobs need all of their required resources allocated at the same time. Long-running jobs have different priority semantics than short batch jobs. GPU fragmentation, where free GPUs are scattered across nodes in amounts too small for any pending job to use, is a major source of inefficiency in multi-tenant clusters.

Gang scheduling

Distributed training jobs require all worker processes to start simultaneously: a partially started job holds the resources allocated to it without making progress. Fabric implements gang scheduling with admission control, so a job is either fully admitted or not admitted at all.
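
The all-or-nothing admission rule can be illustrated with a minimal sketch. This is not Fabric's implementation; the `Job` type, the `try_admit` function, and the per-node free-GPU map are hypothetical names chosen for illustration. The sketch assumes each worker needs a fixed number of GPUs on a single node.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Job:
    # Hypothetical job description, not a Fabric type.
    name: str
    workers: int          # number of worker processes in the gang
    gpus_per_worker: int  # GPUs each worker needs on one node (must be >= 1)


def try_admit(job: Job, free_gpus: Dict[str, int]) -> Optional[Dict[str, int]]:
    """Return a node -> worker-count placement covering every worker, or None.

    free_gpus is mutated only when the whole job fits, so a rejected job
    never holds a partial allocation.
    """
    placement: Dict[str, int] = {}
    remaining = job.workers
    # Visit nodes in name order so the result is deterministic.
    for node, free in sorted(free_gpus.items()):
        if remaining == 0:
            break
        fit = min(remaining, free // job.gpus_per_worker)
        if fit > 0:
            placement[node] = fit
            remaining -= fit
    if remaining > 0:
        return None  # not every worker fits: admit nothing
    # Commit the reservation only after the full gang has been placed.
    for node, count in placement.items():
        free_gpus[node] -= count * job.gpus_per_worker
    return placement
```

In this sketch the placement is computed first and committed in a second step, which is what keeps a rejected job from stranding GPUs.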

Backfill scheduling

Backfill scheduling improves cluster utilization by letting lower-priority jobs run on resources that would otherwise sit idle while a high-priority job waits for enough resources to become available. Fabric's scheduler implements backfill with configurable time bounds to prevent starvation.

Priority and preemption

Fabric supports multiple priority tiers with configurable preemption policies. Higher-priority workloads can preempt lower-priority workloads when resources are constrained. Preemption policies can be tuned to balance fairness, efficiency, and job completion guarantees.
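
One possible preemption policy is sketched below to make the trade-offs concrete. The policy is an assumption for illustration, not Fabric's documented behavior: only strictly lower-priority jobs are preempted, the lowest tier goes first, and within a tier the most recently started job goes first so that less completed work is lost. `RunningJob` and `select_victims` are hypothetical names, and a larger `priority` value is assumed to mean a more important job.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RunningJob:
    # Hypothetical running-job record, not a Fabric type.
    name: str
    priority: int   # larger value = more important (assumption)
    gpus: int
    started_s: int  # start timestamp in seconds


def select_victims(
    running: List[RunningJob],
    incoming_priority: int,
    gpus_needed: int,
    free_gpus: int,
) -> Optional[List[RunningJob]]:
    """Return the jobs to preempt so the incoming job fits.

    Returns [] when no preemption is needed, and None when even preempting
    every lower-priority job would not free enough GPUs (nothing is evicted).
    """
    if gpus_needed <= free_gpus:
        return []
    # Lowest priority first; within a tier, most recently started first.
    candidates = sorted(
        (j for j in running if j.priority < incoming_priority),
        key=lambda j: (j.priority, -j.started_s),
    )
    victims: List[RunningJob] = []
    freed = free_gpus
    for job in candidates:
        if freed >= gpus_needed:
            break
        victims.append(job)
        freed += job.gpus
    return victims if freed >= gpus_needed else None
```

Returning `None` instead of a partial victim list avoids evicting jobs when the incoming workload still could not start, which is one way to weigh efficiency against job completion guarantees.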
