Asymptotically Optimal Scheduling of Multiple Parallelizable Job Classes
arxiv(2024)
摘要
Many modern computing workloads are composed of parallelizable jobs. A single
parallelizable job can be completed more quickly if it is run on additional
servers, however each job is typically limited in the number of servers it can
run on (its parallelizability level). A job's parallelizability level is
determined by the type of computation the job performs and how it was
implemented. As a result, a single workload of parallelizable jobs generally
consists of multiple job classes, where jobs from different classes
may have different parallelizability levels. The inherent sizes of jobs from
different classes may also be vastly different.
This paper considers the important, practical problem of how to schedule an
arbitrary number of classes of parallelizable jobs. Here, each class of jobs
has an associated job size distribution and parallelizability level. Given a
limited number of servers, k, we ask how to allocate the k servers across a
stream of arriving jobs in order to minimize the mean response time
– the average time from when a job arrives to the system until it is
completed.
The problem of optimal scheduling in multiserver systems is known to be
difficult, even when jobs are not parallelizable. To solve the harder problem
of scheduling multiple classes of parallelizable jobs, we turn to asymptotic
scaling regimes. We find that in lighter-load regimes (i.e., Sub-Halfin-Whitt),
the optimal allocation algorithm is Least-Parallelizable-First (LPF), a policy
that prioritizes jobs from the least parallelizable job classes. By contrast,
we also find that in the heavier-load regimes (i.e., Super-NDS), the optimal
allocation algorithm prioritizes the jobs with the Shortest Expected Remaining
Processing Time (SERPT). We also develop scheduling policies that perform
optimally when the scaling regime is not known to the system a priori.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要