Analysis of cycle stealing with switching cost

Analysis of Cycle Stealing with Switching Cost
Takayuki Osogami
Mor Harchol-Balter
Alan Scheller-Wolf
October 2002
CMU-CS-02-192
School of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213
Abstract
We consider the scenario of two independent processors, each serving its own workload, where one of the
processors (known as the “donor”) can help the other processor (known as the “beneficiary”) with its jobs,
during times when the donor processor is idle. That is the beneficiary processor “steals idle cycles” from
the donor processor. We assume that both donor jobs and beneficiary jobs may have generally-distributed
service requirements. We assume that there is a switching cost required for the donor processor to start
working on the beneficiary jobs, as well as a switching cost required for the donor processor to return
to working on its own jobs. We also allow for threshold constraints, whereby the donor processor only
initiates helping the beneficiary if both the donor is idle and the number of jobs at the beneficiary exceeds
a certain threshold.
We analyze the mean response time for the donor and beneficiary processors. Our analysis is approximate, but can be made as close to exact as desired. Analysis is validated via simulation. Results of the
analysis illuminate several interesting principles with respect to the general benefits of cycle stealing and
the design of cycle stealing policies.
Carnegie Mellon University, CSD. Email: [email protected]
Carnegie Mellon University, Computer Science Department. Email: [email protected]. This work was supported by
NSF Career Grant CCR-0133077, by NSF ITR Grant 99-167 ANI-0081396 and by Spinnaker Networks via Pittsburgh Digital
Greenhouse Grant 01-1.
Carnegie Mellon University, GSIA. Email: [email protected]
Keywords: cycle stealing, switching cost, queueing theory, task assignment, server farm, distributed
system, unfairness, starvation, load sharing, supercomputing, Matrix-Analytic.
1 Introduction
Motivation
Since the invention of networks of workstations, systems designers have touted the benefits of allowing
a user to take advantage of machines other than her own, at times when those machines are idle. This
notion is often referred to as cycle stealing. Cycle stealing allows a user, Betty, with multiple jobs to
offload one of her jobs to Dan’s machine if Dan’s machine is idle, giving Betty two machines to process
her jobs. When Dan’s workload resumes, Betty must return to using only her own machine. We refer to
Betty as the beneficiary, to her machine as the beneficiary machine/server, and to her jobs as beneficiary
jobs. Likewise, we refer to Dan as the donor, to his machine as the donor machine/server, and to his jobs
as donor jobs.
Although cycle stealing provides obvious benefits to the beneficiary, these benefits come at some cost
to the donor. For example, the beneficiary’s job may have to be checkpointed and the donor’s working
set memory reloaded before the donor can resume, delaying the resumption of processing of donor jobs.
In our model we refer to these additional costs associated with cycle stealing as switching costs.
A primary goal of this paper is to understand what is the benefit of cycle stealing for the beneficiary
and what is the penalty to the donor, as a function of switching costs. A second goal is to derive parameter
settings for cycle stealing. In particular, given non-zero switching costs, cycle stealing may only pay if the
beneficiary’s queue is “sufficiently” long. We seek to understand the optimal threshold on the beneficiary
queue. More broadly, we seek general insights into which problem parameters have the most significant
impact on the effectiveness of cycle stealing.
The analytical model
We assume two independent M/GI/1/FCFS queues, the beneficiary queue and the donor queue. Jobs
arrive at rate
(respectively,
) at the beneficiary (respectively, donor) queue and have service re-
quirement represented by the random variable from distribution drawn from distribution (respectively, drawn
). The load made up by beneficiary (respectively, donor) jobs is denoted by and . If the donor server is idle, and if
the number of jobs at the beneficiary queue is at least , the donor transitions into the switching state,
for a random amount of time, . After time, the donor server is available to work on the bene
ficiary queue and the beneficiary queue becomes an M/ /2 queue. If a donor job arrives during ,
(respectively, ) where or during the time the donor is helping the beneficiary, the donor transitions into a switching back state,
for a random amount of time, !#" . After the completion of the switch back, the donor server resumes
working on its own jobs. We assume the first four moments of the service times are finite, and queues
1
are stable. No work is accomplished by the donor server while switching.
A few details are in order. First, in the above model, the donor processor continues to cooperate with
the beneficiary even if there is no beneficiary work left for it to do — the donor processor only switches
back when a donor job arrives. (We also analyzed the case where the donor processor switches back when
it is not needed, see Appendix A. We found that this has almost no effect on performance, even under
high switching costs). Second, we assume that if the donor processor is working on a beneficiary job and
a donor job arrives, that beneficiary job is returned to the beneficiary queue and will be resumed by the
beneficiary processor as soon as that processor is available. The work done on the job is not lost, i.e., the
job is checkpointed. Finally, our model can be generalized to the case where there is a threshold, ,
on the number of donor jobs as well — i.e., the donor processor only returns to the donor queue when
the number of jobs at the donor queue is at least . Throughout this paper, aside from the stability
section (Section 4), we assume for simplicity, and focus on the effect of . In Appendix B,
we extend the analysis to general . Our performance metric throughout is mean response time for
each class of jobs.
Difficulty of analysis and Previous work
Consider the simplest instance of our problem — where the service requirements of all jobs are each
drawn from exponential distributions and the switching costs and threshold are zero. Even for this simplest instance the continuous-time Markov chain, while easy to describe, is mathematically intractable.
This is due to the fact that the stochastic process having state (number beneficiary jobs, number donor
jobs ) grows infinitely in two dimensions and contains no structure that can be exploited to obtain an exact solution. While solution by truncation of the Markov chain is possible, the errors that are introduced
by ignoring portions of the state space (infinite in two dimensions) can be quite significant, especially at
higher traffic intensities. Thus truncation is neither sufficiently accurate nor robust for our purposes.
To our knowledge, there has been no previous analytical work on the problem of cycle stealing
with setup costs. Below we describe prior work on the “coupled processor model”. In this model two
processors each serve their own class of job, and if either is idle it may help the other, increasing the
rate of this other processor. This help incurs no switching cost and has an effect if even a single job is
present. These models inherently assume a preemptive resume discipline: when a processor returns to
It is easy to generalize our analysis to the case where the beneficiary requires a “switching time” before it can resume a job
that the donor started. We have chosen to ignore this generalization for purpose of clarity. It is also trivial to extend our results
to the case where all work on the job in progress is lost if a donor job arrives, provided that we assume that the job is restarted
with a new service time – which we feel is unrealistic. It is also possible to extend our results to the case where the donor must
complete the beneficiary job in progress before it switches back. This is the topic of current research, see Section 7.
For and , truncation leads to error, even with states, and takes minutes to compute.
As nears , the error increases indefinitely. Under jobs sizes more variable than exponential, the error increases.
2
its own queue, none of its work is lost. All works we mention below consider Poisson arrivals.
Early work on the coupled processor model was by Fayolle and Iasnogorodski [5] and Konheim,
Meilijson and Melkman [6]. Both papers assume exponential service times, deriving expressions for
the stationary distribution of the number of jobs of each type. Fayolle and Iasnogorodski use complex
algebra, eventually solving either a Dirichlet problem or a homogeneous Riemann-Hilbert problem for a
circle, depending on the accelerated rates of the servers. Konheim et. al. assume that the accelerated rate
is twice the original rate, which yields simpler expressions (still in the form of complex integrals). No
indication is given in either work as to how the evaluation of these expressions. In addition, because the
processors work in concert, rather than on different jobs, these systems will gain no multi-server benefit
when serving highly variable jobs; short jobs may get stuck waiting behind long jobs in a single queue.
The above work was extended by Cohen and Boxma [4] to the case of general service times. They
consider stationary workload, which they formulate as a Wiener-Hopf boundary problem. This leads
to complex expressions involving either integrals or infinite sums; if the queues are symmetric simpler
expressions for mean total workload are found. They again have the two processors working in concert,
without a switching cost, providing complex analytical expressions, rather than numerical values.
In more recent work, Borst, Boxma and van Uitert [3] apply a transform method to the expressions in
[4], yielding asymptotic relations between the workloads and the service requirement distributions. This
leads to the insight that if a processor has a load less than one, it is “insulated” from the heavy-tail of the
other, as long periods without cooperation will not lead to large backlogs. This is not the case if the load
is greater than one, as the queue now must rely on help to be stable. Borst, Boxma and Jelenkovic [2]
consider a very similar question under generalized processor sharing. Using probabilistic bounds, they
show that different service rates can either insulate the performance of different classes from the others
or not, again depending on whether the minimal rate is larger than the load. Both of these papers are
concerned with the asymptotic behavior of workload, whereas our work isolates mean performance. Our
work is thus complementary to these results.
Our approach
This paper presents the first analysis of cycle stealing under general service requirements with switching
costs and a switching threshold. Recall that the difficulty in analyzing cycle stealing is that the corresponding stochastic process defines a Markov chain which grows infinitely in two dimensions (2D),
making it intractable. The key idea in our approach is to find a way to transform this 2D chain into some
1D infinite Markov chain which can be analyzed. The questions in applying such a transformation are
(i) what should the 1D Markov chain track, and (ii) how can all the relevant information from the 2D
chain be captured in the 1D chain. Our 1D chain tracks the number of beneficiary jobs. For the donor
3
jobs, our state space contains only binary knowledge: zero jobs or
jobs. Nevertheless we are able
to capture detailed information on the number of donor jobs by using special transitions in our Markov
chain, where these transitions represent the lengths of an assortment of busy periods. The difficulty lies
in specifying the right busy periods, some of which transcend the definition of the analytical model.
Once the 1D Markov chain is specified, the hard work is finished, since this chain can be solved
efficiently using known numerical (Matrix analytic) techniques. While a closed-form solution may be
preferable, our chain is compact enough, and Matrix analytic methods powerful enough, that only a
couple seconds are required to generate any of the results plots in this paper. Furthermore, our method
very easily generalizes to more complex problem formulations e.g., multiple donors (Section 7).
Our analysis is approximate, but can be made as close to exact as desired. The only approximation
lies in representing the length of the busy periods by a Coxian distribution fit to a finite number of the
busy period moments. In this paper, we use a 2-stage Coxian to capture the first 3 moments of the
busy periods, and verify that this is sufficient via simulation. However, our method naturally extends to
matching more moments.
Our analysis assumes a Poisson arrival process for both classes of jobs. The service requirements
of the each class are assumed to be drawn i.i.d. from general distributions (which we approximate by a
Coxians). The arrival process can easily be generalized to a MAP – Markovian Arrival Process [1].
Summary of results
Our analysis yields many interesting results concerning cycle stealing, detailed in Section 6. While
cycle stealing obviously benefits the beneficiaries and hurts the donors, we find that when ,
cycle stealing is profitable overall even under significant switching costs as it may ensure stability of
the beneficiary queue. For , we define load-regions under which cycle stealing pays. We find
that in general the switching cost is only prohibitive when it is large compared with
switching cost, cycle stealing always pays.
Two counterintuitive results are that when . Under zero
, the performance of the beneficiaries is sur-
prisingly insensitive to the switching cost, and also insensitive to the variability of the donor job size
distribution. Even when the variability of the donor job sizes is very high, and donor help is very bursty,
the beneficiaries don’t seem to suffer.
We also study the effect of , and present several rules of thumb for choosing the optimal setting
of as a function of the server loads, the switching cost, and the mean job sizes.
4
Outline
In Section 2 we present our analysis for the case of zero switching cost, generalizing to non-zero cost in
Section 3. In Section 4 we provide stability conditions for the donor and beneficiary hosts. In Section 5
we validate our analysis by considering limiting analytical cases and via simulation. In Section 6 we
present results for mean response times of donor and beneficiary jobs under various loads, job size
distributions, switching costs, and thresholds. Section 7 describes extensions to the model.
2 Analysis without switching cost
, i.e. the donor server switches to help beneficiary jobs when there are zero
In this section we analyze the simpler case of zero switching cost. Figure 1 shows our Markov chain
for the case where donor jobs and there are at least three beneficiary jobs. Figure 1(a) shows a simplified form of our chain
where job sizes and busy periods are assumed to be exponentially-distributed. Figure 1(b) and (c) show
the case of generally-distributed (Coxian) job sizes and busy periods. The actual chain that we evaluate
in the paper is a superposition of the chains in Figure 1(b) and (c), shown in Appendix C.
The first two components of each state denote the number of beneficiary jobs and the number of
donor jobs respectively. The states of the Markov chain have been grouped into three rows, labeled (i)
cooperating row: indicates that the donor processor is cooperating with the beneficiary; (ii) independent
row, with
donor job: indicates that the donor and beneficiary processors are each working indepen-
dently on their own queues and there is at least 1 donor job present; (iii) independent row, with
donor
jobs: indicates that the donor and beneficiary processors are each working independently on their own
queues and there are zero donor jobs present.
Observe that while the states track the precise number of beneficiary jobs, they keep only a binary
record (zero or
) of the donor jobs. The key idea is that to determine beneficiary performance we can
avoid tracking the number of donor jobs because we only need to know when the donor queue is idle.
Thus we use transitions (labeled
) to represent the length of a busy period of donor jobs.
The logic behind the Markov chain in Figure 1(a) is as follows: If we are in the row where the
processors are working independently and the number of donor jobs is zero, the left-right transitions
allow us to track the number of beneficiary jobs. If a donor job arrives we transition to the row where
processors are working independently and the number of donor jobs is at least one. The time spent in this
row is the length of a donor busy period. If at the end of the busy period the number of beneficiary jobs
is below , then we return to the row where processors are working independently and the number of
donor jobs is zero. If however at the end of the busy period the number of beneficiary jobs is at least
, then we move to the cooperating row, where we stay until there is a donor arrival.
5
λ
Cooperating
Servers
µ
0B,0D,C
λ
B
B
λ
B
λD
B
B
λD
λ
λ
4B,0D,C
B
λD
BD
λ
BD
λ
BD
B
3B,1D
B
λD
B
2µ
B
µ
4B,1D
µ
B
B
B
B
2B,0D
1B,0D
µ
B
λD
B
BD
B
0B,0D
3B,0D,C
2B,1D
µ
BD
2µ
λ
1B,1D
µ
λ
B
λD
λ
0B,1D
Working
Independently,
# Donors = 0
2B,0D,C
λD
λD
Working
Independently,
# Donors = 1+
2µ
1B,0D,C
λ
B
µ
B
B
(a)
λ
µ
0B,0D,C
λ
B
D
0B,1D
β2
µ
β1
λ
µ
β3
1B,1D
B
D
λ
β2
λ
β3
λ
β2
B
λ
D
2µ
β2
λ
µ
β1
β3
µ
B
λ
B
µ
λ
β1
4B,0D,C
B
β3
B
2B,1D
λ
D
β3
β2
B
B
β1
B
3B,1D
4B,1D
µ
B
B
B
D
B
2B,0D
1B,0D
µ
B
B
D
λ
B
B
λ
B
3B,0D,C
λ
λ
µ
λ
B
2µ
B
µ
β1
D
0B,0D
B
D
λ
B
2B,0D,C
λ
B
B
λ
B
2µ
1B,0D,C
λ
λ
λ
λ
B
µ
B
B
(b)
b2
0B,0D,C
2b2
1B,0D,C
b1
b3
b3
b1
2b3
b2
b3
b2
1B,1D
b1
b3
2b3
b3
b1
2b3
b1
4B,0D,C
2b1
b2
b2
2B,1D
b1
b3
2b2
3B,0D,C
2b1
b2
b2
0B,1D
2b2
2B,0D,C
2b1
b3
b3
b2
3B,1D
b3
b1
4B,1D
b1
b3
b3
λB
b2
0B,0D
b3
1B,0D
b2
b1
b3
b3
λD
2B,0D
b2
b1
b1
b3
XB
ΒD
(c)
Figure 1: Markov chain for cycle stealing without switching cost. (a) Where is exponential and
is drawn using a single (exponential) transition. (b) Where is exponential, and
is represented
by a 2-stage Coxian. (c) Where is represented by a 2-stage Coxian and
is drawn using a single
(exponential) transition.
6
Figure 1(b) and (c) show the same Markov chain as 1(a), where now and
are represented
by 2-stage Coxian distributions. Observe that when the donor and beneficiary are cooperating, the state
space now must maintain a state for two partially-completed beneficiary jobs. If a donor job arrives while
we’re in the cooperating row, the partially-completed job that the donor was working on will be moved
to the head of the beneficiary queue (we currently assume zero cost for the transfer, but this is easy to
generalize to non-zero cost).
In order to specify the Markov chain, we need to compute the first three moments of
find a 2-stage Coxian which matches these. The Laplace transform of
The moments of
and then
is:
are obtained from the transform by taking derivatives. In [9] we derive necessary
and sufficient condition for matching the first three moments of a distribution using a 2-stage Coxian.
2.1 Computing response times
The mean response time for donor jobs is trivial to compute – it is simply the response time of the
M/ /1 queue, since the beneficiary jobs are preemptive and switching cost is zero.
The mean number of beneficiary jobs is easy to compute from the chain in Figure 1 using the MatrixAnalytic method, described in [8] and [7]. This is a simple, compact and efficient method for solving
QBD (quasi-birth-death) Markov chains which are infinite in only one dimension, where the chain repeats
itself after some point, as does Figure 1. We then apply Little’s Law to get their mean response time.
Every plot in this paper which uses Matrix-Analytic analysis (to solve multiple instances with different
parameter values) was produced within a couple of seconds using the Matlab 6 environment.
3 Analysis with switching cost
In this section, we analyze cycle stealing with switching costs in both directions. We assume that (i) the
donor only makes the switch if the donor queue is idle and the number of jobs at the beneficiary queue is
at least , and (ii) the donor stays at the beneficiary queue until there is a donor arrival.
Let denote the time required for the donor to switch to working on the beneficiary queue, and
#" the time to switch back to the donor queue. Figure 2 shows the Markov chain for cycle stealing with
. For simplicity, we have drawn and as being exponentiallyswitching cost where distributed — these are easy to replace with Coxian distributions.
The first two rows of the Markov chain in Figure 2 represent the case where the donor and beneficiary
servers are working independently, where row 2 indicates zero donor jobs and row 1 indicates at least
7
λ
Working
Independently
#Donors = 1+
0B,1D
λD
B
B D+ba
0B,1D,ba
λD
Switching
to help
λ
0B,0D,sw
λD
B D+ba
Ksw
0B,0D,C
Region 2
λ
1B,1D,ba
λD
λ
1B,0D,sw
B
λD
Ksw
λ
B
λ
2B,1D,ba
λ
B
µ
B
3B,1D,ba
λD
λ
2B,0D,sw
B
λD
2B,0D,C
B
λ
3B,0D,sw
λ
Figure 2: Markov chain for cycle stealing with switching cost, Ksw
B
2µ
3B,0D,C
B
λD
B D+ba
B
µ
B
λD
Ksw
4B,1D,ba
B
B D+ba
B
µ
B
µ
B
λD
B
B
2µ
1B,0D,C
B
λ
B
µ
B
µ
B D+ba
µ
B
B
µ
BD
BD
B
λ
B
µ
BD
B
2B,0D
µ
B
λ
4B,1D
µ
B
B
1B,0D
µ
Region 1
Switching
Back
λD
B
3B,1D
µ
B
BD
λ
0B,0D
λ
B
2B,1D
µ
B
BD
λ
Working
Independently
#Donors = 0
λ
B
1B,1D
µ
λD
Cooperating
Servers
λ
B
4B,0D,sw
B
λD
λ
Ksw
B
2µ
4B,0D,C
B
and exponential, .
one donor job. A transition from row 2 to row 1 starts a donor job busy period, the length of which is
beneficiary jobs in the system,
the Markov chain simply transitions back to row 2. However if there are
beneficiary jobs, the
represented by
. When this busy period completes, if there are
Markov chain transitions to row 4 – the row for switching to help. Observe that the Markov chain can
also go from row 2, the zero donor jobs row, directly to row 4, the switching to help row, as soon as there
are beneficiary jobs. After time, if no donor job arrived, the Markov chain transitions to row 5,
the cooperating row. As soon as a donor job arrives, either during or during the cooperating phase,
the Markov chain immediately transitions into row 3 – the switching back row. At the point of entering
row 3, there is a single donor job, which just arrived. The time to return to the case of zero donor jobs is
thus a busy period started by the sum of and #" . We denote this busy period by
#" .
In our Markov chain, we have drawn the busy period transitions with a single bold transition, although
in evaluating the chain we will replace this bold transition with a 2-stage Coxian as usual. Computing
the first three moments of a busy period of donor jobs, started by a job of size #" , the switch back time,
is straightforward by taking derivatives of the Laplace transform:
!#"
For some busy periods such as , there does not exist a 2-stage Coxian which matches the first 3 moments, see [9].
In such cases we divide into two busy periods: , the busy period started by a donor job, and
, the busy period
started by switching back. We then may use two 2-stage Coxians to represent each busy period, since .
!#"
#"
$%" & (' )#"
8
Calculation of response times
The mean response time for beneficiary jobs is easy to compute, since the chain keeps track of the
number of beneficiary jobs and can be analyzed via Matrix Analytic methods. The donor jobs see an
M/GI/1 queue where, at times, the first job in a busy period must wait for a time to switch back, #" .
Thus the response time for donor jobs is the response time under an M/GI/1 queue with setup time :
with probability #"
with probability We consider only regions 1 and 2, as
Region 1 Region 1 or 2 Region 2 Region 1 or 2 is defined by what the first job to start a busy period sees. The
expected waiting time for an M/GI/1 queue with only donor jobs and setup time
We thus have:
!
#"
Response time for donor
"
is known [10]:
" ! 4 Stability
In this section we derive stability conditions on and thresholds and .
for cycle stealing with switching costs and
Theorem 1 The stability condition (necessary and sufficient) for donor jobs is Proof: We first prove sufficiency. Assume In this case
. Let
.
denote the length of a busy period at the
period at the donor queue is started by a
., theA busy
mean length of a busy period is
#" %
$ donor queue. We first consider the case switching cost #" and donor jobs. As . In this case clearly has a proper distribution and thus the queue is positive regenerative, hence stable.
Next we consider the case is smaller than in the case because
there will be donor busy periods in which the donor hasn’t left the donor queue, implying (i) there is no
switching back cost, and (ii) the busy period is started by only one donor job.
9
Necessity follows immediately from the fact that the donor queue is unstable for all Before we derive the stability condition on .
, we prove a lemma allowing us to assume , then it is stable at , %$
Lemma 1.1 If the beneficiary queue is stable at .
.
denote the number of beneficiary jobs at time given . Consider a new
in which no service is given by either server to a beneficiary job if there are process jobs
stochastically dominates . Along any sample
present at the beneficiary queue. Note that .
path, will be equal to + . Therefore, if is proper, so too is Proof: Let
We now prove the stability condition on for exponential ! .
Theorem 2 For exponential , the stability condition for the beneficiaries with donor threshold is:
#"
. Let ,
Proof: By Lemma 1.1, we are allowed to assume "!"&#$ (*% ' )+
denote the time average fraction of
time that the donor helps the beneficiary. Then the strong law of large numbers can be used to show that
a necessary and sufficient condition for stability of the beneficiary jobs is
If ,
,
. Thus we assume and derive
,
,
using renewal reward theory. Let’s consider a
where
-
,
. By renewal-reward theory:
renewal to occur every time the donor becomes idle. Recall Fraction of time donor helps beneficiary
denotes the help (reward) during the renewal cycle, and
-
denotes the length of the renewal
cycle. Observe that there may be any number of donor arrivals ranging from
we switch back only after the arrival.
Let
denote the sum of the service times of donor jobs. Then, as .
#"
to during and
/ For non-exponential 02143 , conditions can be derived in terms of expectations and probabilities.
10
,
#"
Ksw
where
.
Donor
arrival
Donor
goes
idle
BS+ba
Donor helping
Donor
arrival
Donor
arrival
Donor
arrival
Donor
arrival
Donor
goes
idle
Nth arrivals
D
-
is the interarrival time for donor jobs.
To compute
we condition on the number of donor arrivals during . If there are arrivals,
then the expected time spent helping is the time until there are more donor arrivals, .
.
Let denote the probability that there are donor arrivals during . As is exponentiallydistributed,
Therefore,
-
"! #$ ! # $ %( ' )+
Thus, the stability condition for the beneficiary jobs is
!#" %$ <
"%$ ON "%$
BED
'&)( +* -, / ..80 130 192%42:45 57;=<?C
>A@
.0 G 5 .80 19H
IJ5 76 %" $ >F@
6
RQ8S
'&)( LK=M
'
&)( =P ?
<
&)( >F@
Note that the right hand side is an increasing function of ; in terms of stability, the larger , the
.
, UT $ , "
; as better the performance. In particular, when Figure 3 shows the stability condition for beneficiaries as a function of . In case (1), we set
#"
and . The region below each line satisfies the stability condition.
As increases as high as 100, the effect of switching overhead becomes negligible, and the stability
. In case (2), we set and #" .
"
condition approaches The switching cost is large, and it is observed that there is little benefit at moderate or high in terms
of stability. However, there is still large benefit at low . In case (3), we set and
#"
. The stability region is much larger; when , the stability region is
11
2
2
N th=1
D
N th=10
D
NDth=100
1.8
1.8
B
ρ
ρ
ρ
B
1.6
1.4
1.4
1.4
1.2
1.2
1.2
1
0
0.2
0.4
ρD
0.6
0.8
1
0
1
(1) N th=1
D
N th=10
D
NDth=100
1.8
1.6
B
1.6
2
N th=1
D
N th=10
D
NDth=100
(2) 0.2
0.4
ρD
0.6
0.8
1
0
1
0.2
(3) 0.4
ρD
0.6
0.8
1
Figure 3: Stability condition on beneficiaries for cycle stealing with switching costs and thresholds.
almost the same as that of in case (1). This is intuitive: when
and expected amount of donor work when the donor switches back is the same as that when .
, the
and
5 Validation of analysis
Since our analysis involves approximation of busy periods by 2-stage Coxians, it is of paramount importance to validate the analysis. Analytical validation against limiting cases is presented in Section 5.1,
and simulation validation is reported in Section 5.2.
5.1 Validation against known limiting cases
We evaluate the performance of cycle stealing under 4 limiting situations: T
and " , where "
is the stability condition for T
, , T
T
,
derived in Section 4. For these limiting
cases, the response times of the beneficiaries and donors are easy to evaluate analytically since they
converging in performance to either an M/GI/1, an M/GI/1 with setup cost, or an M/M/2. Figure 4
verifies that our analysis of cycle stealing has the correct limiting behavior in all these cases.
5.2 Validation against simulation
The accuracy of our analysis was also validated against simulation: a subset of out validation experiments
is shown in Figure 5. Job sizes are assumed to be exponential. We show three cases: In case (a),
. In case (b),
and
. In case (c), and
. In all cases, the results of analysis are in very close agreement with simulation. The only
mild discrepancy is the performance of the beneficiaries in case (b), under high load. Under case (b),
donor jobs are large, making their busy periods more variable, especially at high loads. As our analysis
is very dependent on these busy periods, matching only the first three moments may introduce error in
12
Performance of Beneficiaries
6
1.8
1.7
1.6
1.5
1.4
1.3
1.2
0
0.05
ρ
1.2
1.15
1.1
1.05
1
0.95
0
0.1
response time for beneficiaries
1.9
11
M/M/1
CS
0.05
ρ
d
2
M/M/1
CS
response time for beneficiaries
1.25
M/M/2
CS
response time for beneficiaries
response time for beneficiaries
2
10
9
8
7
6
0.9
0.1
0.95
ρ
B
x 10
CS
1.5
1
0.5
0
1
1
1.1
D
1.2
ρ
1.3
1.4
B
Performance of Donors
4
2.3
2.2
2.1
2
1.9
0
0.1
T
D
7.76
7.74
7.72
7.7
7.68
0
0.05
ρ
7.78
0.05
ρ
T
(2) B
8.85
M/G/1
CS
3
2
1
(3) 0.95
ρD
8.75
8.7
8.65
8.6
8.55
8.5
1
1
T
M/G/1 w/setup
CS
8.8
4
0
0.9
0.1
x 10
response time for donors
2.4
5
M/G/1
CS
7.8
response time for donors
response time for donors
2.5
(1) 7.82
M/G/1
CS
response time for donors
2.6
1.1
T
(4) 1.2
ρ
B
1.3
1.4
" Figure 4: Validation of our cycle stealing analysis against four limiting cases. For lack of space we
, #" , is exponentiallyshow graphs only for the specific parameters: distributed with mean 1 and is a Coxian with mean 1 and
.
this case. We hypothesize that the accuracy of our analysis would improve if we matched more moments
of the busy periods using Coxians with more stages.
6 Results of Analysis
This section discusses our results, organized as take-home messages. We begin with a summary:
6.1 Take-home messages
(Section 6.2) Cycle stealing is always a win when When (and , but doesn’t pay when .
), cycle stealing can provide an infinite benefit to beneficiaries over
dedicated servers, with comparatively little pain to the donors. Recall that the stability region for
the donors under cycle stealing is much greater than under dedicated servers. Factors such as increased switching costs, increased When , there is so little gain to the beneficiaries that cycle stealing with non-zero switch-.
However since the gain is still infinite these factors are less important in the domain , and increased result in lower gains for the beneficiary.
ing overhead doesn’t pay.
We therefore focus the rest of the results section on the remaining case: 13
.
Performance of beneficiaries
8
Analysis
Simulation
7
6
5
4
3
2
0.2
0.4
0.6
ρD
0.8
7
6
5
4
3
2
0.2
1
80
Analysis
Simulation
response time for beneficiaries
8
response time for beneficiaries
response time for beneficiaries
9
0.4
0.6
ρD
0.8
Analysis
Simulation
70
60
50
40
30
20
0.2
1
0.4
0.6
ρD
0.8
1
Performance of Donors
120
10
8
6
4
2
0.2
12
Analysis
Simulation
100
response time for donors
Analysis
Simulation
response time for donors
response time for donors
12
80
60
40
20
0.4
(a) 0.6
ρD
0.8
8
6
4
0.4
(b) 0.6
ρD
0.8
1
0
0.2
0.4
0.6
ρD
0.8
1
(c) Figure 5: Validation of analysis against simulation. #"
. We fix , and vary
.
and are exponentially-distributed.
. Analysis
Simulation
2
0
0.2
1
10
(Section 6.3) For , cycle stealing has regions of high gain/low pain and also regions where the reverse is true. These regions depend on job sizes, switching costs, thresholds
and loads. We organize performance into these gain/pain regions. We find that when the switching
costs are low, cycle stealing leads to high gain and low pain. However high switching costs can
reverse this effect. We find that the performance of the donors is sensitive to the switching cost,
while surprisingly, the performance of the beneficiaries is far less sensitive to the switching cost.
We also see that higher implies higher gains.
(Section 6.4) For , variability of donor job sizes has very little effect. We find that
the variability of the donor jobs has little effect on beneficiary performance. This is very surprising;
we expected the beneficiary to gain far less from the bursty help of a donor with irregular (highly
variable) job sizes. For the above intuition is in fact correct. This echoes findings in [3]
and [2] for a different model.
(Section 6.5) Setting the best threshold is non-trivial.
Finally we tackle the practical problem of how to set . Higher implies less gain for
the beneficiaries and less pain for the donors. However, the performance of the beneficiaries is
relatively insensitive to increasing , while the donors benefit quite a bit from raising . The
absolute effect of depends greatly on the particular loads 14
and . The effect of on
Performance of beneficiary jobs
30
20
10
0.5
1
ρ
(a) 20
10
0
0
B
30
0.5
ρ
20
10
0
0
1
2.2
2.05
2
1.95
1.9
1
ρB
(a) 20
10
0
0
0.5
ρB
2.8
2.6
2.4
2.2
1
(d) E[K]=1
CS with Nth=1
B
CS with Nth=10
6.5
5.4
3
1.8
0
Dedicated
30
1
Performance of donor jobs
2
1.85
40
(c) E[K]=0
Response time for donors
Response time for donors
2.1
CS with Nth
=1
B
CS with Nth
=10
B
B
3.2
2.15
0.5
0.5
ρ
(b) E[K]=1
Response time for donors
30
B
1.8
0
40
50
B
Response time for donors
0
0
40
Response time for beneficiaries
40
50
Response time for beneficiaries
50
Response time for beneficiaries
Response time for beneficiaries
50
5.2
5
4.8
Dedicated
6
5.5
5
4.6
0.5
ρ
1
0
0.5
ρ
B
1
0
(b) E[K]=1
0.5
ρ
1
B
B
(c) E[K]=0
(d) E[K]=1
Figure 6: The response time for beneficiaries and donors as a function of . Graphs show the case of (i)
, (ii) cycle stealing with , and (iii) dedicated servers. In all figures
cycle stealing with are exponential with mean 1. Switching cost is exponential with mean 0 or 1, as labeled.
and mean performance (of beneficiaries and donors together) is irregular and delicate. In this section
we derive five rules of thumb for setting as a function of the server loads, the switching costs,
and the donor job size.
Due to limited space we typically only show a couple options for each parameter. Many more graphs
are available in Appendix D.
6.2 Benefits of cycle stealing for wide range of
Figure 6 shows the response time for beneficiary jobs (top row) and donor jobs (bottom row) as a function
of , where is held fixed at (columns 1 and 2) and at (columns 3 and 4). Each graph shows
, (ii) cycle stealing with , and (iii) dedicated
response time under (i) cycle stealing with servers. The first and third columns have no switching cost, and the second and fourth columns have
exponential switching cost with mean 1.
When , cycle stealing can provide an infinite benefit to beneficiary jobs over dedicated servers.
We find that factors like increased switching costs, increased load at donor, and increased all result
15
in lower gains for the beneficiary. However since the gain is infinite these factors are less important when
. We also see that the response time of donor jobs is bounded by the response time for an M/GI/1
queue with setup cost , where is the switch back cost. Increasing results in less penalty to the
reaches its maximum.
When assuming non-zero
, the beneficiaries benefit so little that cycle stealing doesn’t pay,
switching cost. We therefore focus the rest of the results section on the case: .
donor, converging to the same point for all values when 6.3 Benefit of cycle stealing for
Gain of Beneficiaries and Loss of Donors — 0.8
mid gain
low loss
1
0.8
0.8
ρ
high gain
low loss
0.4
1
low gain
low loss
GOOD
BAD
0.8
mid gain
mid loss
0.6
ρD
0.6
D
0.6
1
0.6
ρD
low gain
low loss
ρD
1
0.4
0.4
0.2
0.2
GOOD
0.4
mid gain
high loss
0.2
0
0.6
ρB
0.8
0
1
(a) 0.6
ρB
0.8
0
1
high gain
high loss
0.6
ρB
0.8
0.2
0
1
0.6
ρB
0.8
1
(b) E[K]=1
Gain of Beneficiaries and Loss of Donors — 1
0.8
0.8
0.6
1
low gain
low loss
GOOD
0.6
mid gain
mid loss
0.4
0.4
0.4
0.2
0.2
0.2
BAD
0.8
0.6
ρD
D
0.8
0.6
high gain
low loss
mid gain
low loss
ρ
1
ρD
low gain
low loss
ρD
1
GOOD
0.4
high gain
mid loss
0.2
high gain
high loss
0
0.6
ρB
0.8
1
(a) 0
0.6
ρB
0.8
0
1
0.6
ρB
0.8
1
0
0.6
ρB
0.8
1
(b) E[K]=1
Figure 7: Columns 1 and 3 show the gain of beneficiary jobs and loss of donor jobs relative to dedicated
servers. High gain/loss refers to a
difference. Low gain/loss refers to a
difference.
Columns 2 and 4 show the mean performance (over all jobs) relative to dedicated servers. In all figures:
and are exponentially-distributed with mean 1.
Figures 7 and 8 evaluate cycle stealing in the region , as a function of . In Figure 7
and both have mean 1, while Figure 8 assumes has mean 1 and has mean 10. Both
figures assume switching cost is 0 or 1, and is 1 or 3. In our discussion below, we first concentrate
on the effect of switching cost, and then look at the effect of .
A few definitions are necessary: The term “gain” refers to the improvement in mean response time
experienced by beneficiary jobs, as compared with dedicated servers. This gain is expressed as a per16
Gain of Beneficiaries and Loss of Donors — 1
low gain
low loss
1
0.8
mid gain
low loss
0.8
0.4
high gain
low loss
0.2
0
0.6
ρB
0.8
GOOD
0.4
0.4
0.2
0.2
0
1
(a) 0.8
0.6
ρD
ρ
D
0.6
0.6
ρB
0.8
0
1
BAD
0.6
ρD
0.8
0.6
1
low gain
low loss
ρD
1
mid gain
low loss
high gain
low loss
0.6
GOOD
0.4
ρB
0.2
0.8
0
1
0.6
ρB
0.8
1
ρB
0.8
1
(b) E[K]= 1
Gain of Beneficiaries and Loss of Donors — 1
low gain
low loss
1
0.8
mid gain
low loss
0.8
0.6
ρD
ρ
D
0.6
0.4
high gain
low loss
0.2
0
0.6
ρB
0.8
1
(a)
GOOD
0.4
0.4
0.2
0.2
0
0.6
ρB
0.8
0
1
BAD
0.8
0.6
ρD
0.8
0.6
1
low gain
low loss
ρD
1
mid gain
low loss
high gain
low loss
0.6
GOOD
0.4
ρB
0.2
0.8
0
1
0.6
(b) E[K]=1
Figure 8: Columns 1 and 3 show the gain of beneficiary jobs and loss of donor jobs relative to dedicated
servers. High gain/loss refers to a
difference. Low gain/loss refers to a
difference.
Columns 2 and 4 show the mean performance (over all jobs) relative to dedicated servers. In all figures:
and are exponentially-distributed, where has mean 1 and has mean 10.
centage of the mean response time under dedicated servers. The beneficiary jobs may experience low
gain (
), mid gain (between
and
), or high gain (
). The term “loss” refers to the
relative pain (increase) in response time for donor jobs, as compared with dedicated servers. Loss is
also expressed as a percentage of the response time under dedicated servers. Odd-numbered columns
show various regions of gain and loss. Even-numbered columns use different terms: “good” and “bad.”
The “good” and “bad” terms refer to mean response time overall (over all jobs including beneficiaries
and donors). “Good” indicates that mean response time has dropped as a consequence of cycle stealing.
“Bad” indicates that mean response time has increased as a consequence of cycle stealing.
We start by looking at Figure 7. The first and second columns show the case of zero switching cost.
We see that all regions are low loss regions (in fact zero loss), and higher yields higher gain for
the beneficiaries. The third and fourth columns show switching cost of 1. The non-zero switching cost
creates slightly reduced gain for the beneficiaries (Recall from Section 6.2 that switching cost has a much
more pronounced effect on the beneficiaries when , as the beneficiary is dependent on the donor).
More importantly switching cost creates loss for the donor jobs. When 17
is very low, the loss appears
high, but this is primarily due to the fact that “loss” is defined as a percentage relative to the response
time under dedicated servers, and the response time under dedicated servers is obviously low for small
. The mean gain over all jobs is no longer always positive as shown in the fourth column of Figure 7.
Although not shown, we have also investigated higher switching costs, and these lead to the same trend
of slightly less gain for beneficiaries and more loss for donors.
Figure 8 differs from Figure 7 only in which now has mean 10. The effect of this change is
dramatic: now a switching cost of 1 has almost no effect on either beneficiaries or donors. This makes
sense since the setup cost experienced by the donor job is now relatively small compared to its size. We
can conclude that cycle stealing is most effective when the switching cost is small relative to the size of
the donor jobs.
Consider now columns 2 and 4 of both figures, which depict the effect on overall mean response
time. When the switching cost is zero, cycle stealing always overwhelms the dedicated policy. When
switching cost is non-zero, cycle stealing is a good idea provided either is low compared to is high, or the switching cost
. These trends continue in our experiments with higher switching costs.
Finally, we discuss the effect of in both figures. The performance of the beneficiaries is relatively
insensitive to increasing from
to
in both figures; however the donors benefit quite a bit from
raising . We further investigate selection of in Section 6.5.
6.4 Effect of donor job size variability
In this section, we consider the effect of variability in the donor job sizes. It seems intuitive that when
donor job sizes are made more variable, two things should happen:
1. The donor pain percentage should go down. This is because the donor response times will be
higher overall and so the relative pain will appear to be less, and
2. The beneficiary gain should go down. This is because high variability in the donor job sizes,
implies high variability in the length of the donor busy periods, which implies that the donor’s
visits to the beneficiary queue will become more irregular. Sporadic help should be inferior to
regular help, for the beneficiary.
Figure 9 below shows that the first hypothesis above is in fact true, while the second hypothesis is
surprisingly false, at least for (
. Comparing row 1 (
) with row 2
has high variability: ) we see that there is no discernible difference in donor performance.
To study this effect more closely, we next increase the variability in the donor job sizes to .
has low variability:
Figure 10 shows the performance of just the beneficiary jobs under the case of zero switching cost, when
the donor job size variability is 1, 8, or 50, and is fixed at 0.5. 18
is allowed to vary from 0 to the
Gain of Beneficiaries and Loss of Donors — 1
0.8
1
low gain
low loss
GOOD
0.6
mid gain
mid loss
0.4
0.4
0.4
0.2
0.2
0.2
BAD
0.8
0.6
ρD
D
ρ
0.8
0.6
high gain
low loss
mid gain
low loss
1
0.8
0.6
ρD
low gain
low loss
ρD
1
with
GOOD
0.4
high gain
mid loss
0.2
high gain
high loss
0
0.6
ρB
0.8
0
1
(a) 0.6
ρB
0.8
0
1
0.6
ρB
0.8
Gain of Beneficiaries and Loss of Donors — 1
1
0
1
(b) 0.6
ρB
0.8
1
with
1
1
low gain
low loss
mid gain
low loss
0.8
ρ
0.4
high gain
low loss
0.2
0
0.6
ρB
0.8
GOOD
0.4
0.4
0.2
0.2
0
1
(a) 0.6
ρB
0.8
0
1
BAD
0.8
mid gain
low loss
0.6
ρD
0.6
ρD
0.6
D
0.6
0.8
ρD
0.8
GOOD
0.4
high gain
mid loss
mid gain
mid loss
0.2
high gain
high loss
0.6
ρB
0.8
1
0
(b) 0.6
ρB
0.8
1
Figure 9: Beneficiary gain and donor pain shown for donor job sizes with low variability (top) and
high variability (bottom). Under higher donor variability, percentage of donor pain is lessened, but
.
percentage of beneficiary gain stays the same. All graphs assume: and have mean 1, stability condition. As is observed in Figure 9, the effect of variability of is very small when . When on the performance of , the effect of variability in donor sizes is non-negligible, but
is still not as great as one might expect.
6.5 Optimal setting for beneficiary threshold, This section investigates the optimal setting for . We start with some obvious rules: (i) increasing
leads to lower gain for the beneficiaries and lower pain for the donors; (ii) if the switching cost is
zero, the optimal is 1, since there is never any pain for the donors.
Figure 11 shows optimal values of for minimizing mean response time over all jobs (donors
and beneficiaries) as a function of and under various switching costs and donor job sizes. The
numbers on the contour curves show the best at each load. For clarity we only show lines up to
. The following additional rules of thumb are implied by the figure: (iii) the optimal is an
increasing function of and a decreasing function of ; (iv) increasing the switching cost increases
the optimal ; (v) increasing the donor job size leads to lower optimal .
19
Effect of variability of donor jobs on performance of beneficiary jobs
response time for beneficiaries
100
Exp
2
C =8
2
C =50
80
60
40
20
0
0
0.5
1
ρB
1.5
Figure 10: The response time for beneficiary jobs under different donor job size variability. is fixed
at 0.5. varies from 0 to the stability condition of 1.5. The mean donor job size and beneficiary job
size is 1, and switching cost is zero.
Selecting the optimal 8
0.9
6
0.7
,
3
8
0.9
6
4
2
0.7
2
2
0.6
0.5
4
6
0.8
0.7
D
4
0.5
4
3
0.8
ρ
D
0.6
ρ
D
0.6
ρ
14
12 0
1
8
0.7
4
4
10
0.8
65
0.9
14
12
8
6
8
0.8
6
0.9
,
0.6
D
14
1120
0.5
ρ
2
0.5
6
0.4
0.4
0.3
0.3
2
0.2
4
4
0.2
0.1
0.1
0.6
0.4
0.3
0.3
0.2
0.2
2
2
0.5
0.4
0.7
ρ
B
E[K]=1
0.8
0.9
0.5
2
0.1
0.6
0.7
ρ
0.8
0.9
0.1
0.5
B
0.6
0.7
ρ
0.8
B
E[K]=2
E[K]=1
0.9
0.5
0.6
0.7
ρ
0.8
0.9
B
E[K]=2
Figure 11: Graphs showing the optimal value of with respect to mean response time over all jobs.
7 Extensions and Current Work
This paper solves the problem of cycle stealing with switching costs and thresholds, presenting many
insights into the characteristics and performance of cycle stealing. Our analytical approach easily generalizes to more complex cycle stealing models. For example, in this paper we have assumed that the donor
immediately switches back when a donor job arrives. As we’ve stated earlier, we can also solve the more
general case where the donor only switches back when donor jobs are waiting. Furthermore, we
don’t even need to require that the donor switches back immediately at this point; we can allow the donor
to first complete the beneficiary job in progress. Completing the beneficiary job obviates the need for
checkpointing the job, however it sometimes reduces system performance, particularly when beneficiary
jobs have higher variability than donor jobs. We can also handle the problem of one beneficiary and two
20
or more donors, where if donors are helping, the beneficiary sees an
. queue. This extension
also allows the different donors to have different switching thresholds and switching costs.
References
[1] S. Asmussen. Phase-type distributions and related point processes: Fitting and recent advances. In S. R. Chakravarthy
and A. S. Alfa, editors, Matrix-Analytic Methods in Stochastic Models, pages 137–149. Marcel Dekker, 1997.
[2] S. Borst, O. Boxma, and P. Jelenkovic. Reduced-load equivalence and induced burstiness in gps queues with long-tailed
traffic flows. Under submission.
[3] S. Borst, O. Boxma, and M. van Uitert. The asymptotic workload behavior of two coupled queues. Under submission.
[4] J. Cohen and O. Boxma. Boundary Value Problems in Queueing System Analysis. North-Holland Publ. Cy., 1983.
[5] G. Fayolle and R. Iasnogorodski. Two couple processors: the reduction to a riemann-hilbert problem. Zeitschrift fur
Wahrscheinlichkeitstheorie und vervandte Gebiete, 47:325–351, 1979.
[6] A. Konheim, I. Meilijson, and A. Melkman. Processor-sharing of two parallel lines. Journal of Applied Probability,
18:952–956, 1981.
[7] G. Latouche and V. Ramaswami. Introduction to Matrix Analytic Methods in Stochastic Modeling. ASA-SIAM, Philadelphia, 1999.
[8] M. F. Neuts. Matrix-Geometric Solutions in Stochastic Models. Johns Hopkins University Press, 1981.
[9] T. Osogami and M. Harchol-Balter. Matching three moments of distributions and busy periods by -stage coxians.
Technical Report CMU-CS-02-178, School of Computer Science, Carnegie Mellon University, September 2002.
[10] H. Takagi. Queueing Analysis – A Foundation of Performance Evaluation, volume 1: Vacation and Priority Systems, Part
1. North Holland, 1991.
21
A
Analysis of a variant: Donor switches back when unneeded
λ
0B,1D
B
BD
λ
B
B
λ
0B,1D,ba
λD
λ
B
B
B
λD
B D+ba
λ
B
µ
BD
BD
2B,0D,ba
µ
B
B D+ba
B
B
1B,0D,ba
µ
BD
Kba
λ
0B,0D,ba
B
2B,0D
µ
Kba
λ
4B,1D
µ
B
1B,0D
µ
Kba
λD
B
λD
B
3B,1D
µ
BD
B
0B,0D
Region 1
λ
B
2B,1D
µ
λD
λ
Region 2
λ
B
1B,1D
µ
λD
λ
B
1B,1D,ba
λ
B
µ
B
B D+ba
λB
2B,1D,ba
µ
B
µ
λ
B
λD
Ksw
2B,0D,C
Region 2
λ
3B,0D,sw
B
λD
λ
B
2µ
B
3B,0D,C
4B,1D,ba
B
B D+ba
B
µ
B
µ
B
λD
2B,0D,sw
2µ
3B,1D,ba
λD
B
λ
B
Ksw
B D+ba
λD
B
µ
4B,0D,sw
B
λD
λ
Ksw
B
2µ
B
4B,0D,C
Figure 12: Markov chain of a variant. The donor server switches back when there is at most one beneficiary job regardless of the number of donor jobs.
In this section, we consider a variant of our algorithm: the donor server switches back when there is
beneficiary job, even if there are no donor jobs at the time. One might imagine that the performance
of donor jobs would improve under this policy, especially when the arrival rate of donor jobs is high
compared to that of beneficiary jobs, and the cost of switching back is relatively high.
Figure 12 shows the Markov chain for the analysis of the variant in the case of . It is assumed
that if the beneficiary server becomes idle while the donor server is still working on a beneficiary job,
then the job in progress at the donor server is passed over to the beneficiary server without any cost. It
is also assumed that ! , #" , and are exponentially distributed, all of which can be generalized by
using Coxian distributions. The response time of the donor job is analyzed in the same way as before,
except that Region 2 in the chain has been expanded.
Having analyzed the above Markov chain, we find that counter to our intuition this variant does not
significantly affect the response time for the donor or the beneficiary, under all conditions studied.
22
.
B Incorporating a Donor threshold
Throughout the paper we have assumed that the donor server switches back immediately when a new
donor job arrives at an empty donor queue. However, this is not necessarily the best policy with respect
to overall mean performance; given high switching costs, it might be better for the donor server to only
switch back if there are at least donor jobs waiting. Furthermore, in Section 4 we found that the
stability region for the beneficiary is an increasing function of , thereby giving further justification for
raising . In this section, we show how to analyze a generalization of our algorithm, which assumes
general .
" and Figure 13 shows a Markov chain that corresponds to cycle stealing where λ
Region A
λ
B
0B,1D
1B,1D
µ
λD
λ
B
2B,1D
µ
B
λD
BD
λ
λ
B
B
λD
BD
λ
B
B
3B,1D
µ
.
4B,1D
µ
B
BD
B
BD
BD
B
Region B
0B,0D
1B,0D
µ
Region C1
B 2D+ba
λ
0B,2D,ba
B
B 2D+ba
µ
1B,2D,ba
0B,1D,sw
Region C2
λ
0B,0D,sw
1B,1D,sw
B
Ksw
λ
1B,0D,sw
B
µ
Region D
λD
2B,1D,sw
Ksw
λ
2B,0D,sw
Ksw
1B,0D,C
2µ
B
µ
1B,1D,C
B
λD
λ
3B,1D,sw
B
λ
4B,1D,sw
B
3B,0D,sw
λD
Ksw
2B,0D,C
2µ
B
λD
Ksw
Figure 13: Markov chain for cycle stealing with and .
23
B
Ksw
3B,0D,C
2µ
B
λD
3B,1D,C
B
" and Ksw
Ksw
4B,0D,C
B
λD
λ
B
2µ
B
λ
4B,0D,sw
B
λD
λ
2B,1D,C
B
µ
B
B
B 2D+ba
B
µ
λ
B
µ
Ksw
B
2µ
λD
B 2D+ba
λD
λ
4B,2D,ba
B
λD
λ
B
B
B
µ
B
B
µ
B
λD
λ
3B,2D,ba
λD
λ
B
µ
B
λD
µ
B
λ
B
λD
B
Ksw
0B,0D,C
Region C3
2B,2D,ba
B
µ
λ
B
µ
λ
B
λD
λD
B
λD
λ
B
µ
λ
B 2D+ba
µ
B
B
λD
λ
0B,1D,C
λ
B
λD
λD
2B,0D
µ
B
2µ
4B,1D,C
B
and exponentially-distributed
B.1
Response time for donor jobs
We first analyze the response time for donor jobs when " . Then, the results are extended to
general .
The states of the Markov chain are divided into four regions
,
further subdivided into three regions
, and
,
,
, and
, where Region
is
, as shown in Figure 13. The time in queue for a
donor arrival is analyzed by conditioning on the region it arrives into. That is, the expected time in queue
is
where
+
! # " $ %&"('*) ) is the expected time in queue given that the arrival sees Region .
When a donor job arrives in Region A, it sees a busy M/GI/1 queue, where the job sizes
the arrival rate is
, and
. Recall that the expected time in queue for an M/GI/1 is
"
which is also written as
-/. 1 0 243 5- . 6 0
2=3 87:9<;
>7:9<;
-5. 60 is the expected time in queue in an M/GI/1 given that the arrival finds the system
243 -5. 60 busy. Observing that
, and the probability that the system is busy is
87:9<;
, we have
where
+ @? -5. 1 0 " When a donor job arrives in Region B, the waiting time is zero, since the donor server is staying at
an empty donor queue. Thus,
A CB To analyze the performance of a donor job, given that it arrives in Region C, first note that if we
aggregate Regions
with rate
of size and
into
, and the duration of staying at Region
and a setup time
ED , the duration of staying at Region ED is exponentially distributed
#"
. Furthermore, Region
Therefore, the fraction of time that the system spends in Region
is the length of a busy period started by a job
alternately; Regions A, B, and D are visited on the way from Region
Region
and Region
to Region
ED are visited
ED (see Figure 14).
given that the system is either in
or in Region ED is the same as the fraction of time that the M/GI/1 queue with setup time
24
A
B
C1
C3
C2
D
Figure 14: Transitions among regions of Markov chain in figure 13
is busy. The expected time in queue given that an arrival sees Region
(
ED , respectively) equals the
expected time in queue for an M/GI/1 queue with setup time, given that the arrival finds the system busy
(idle, respectively). Therefore, the expected time in queue given that an arrival sees Region C is just the
expected time in queue in a corresponding M/GI/1 queue with setup time :
!
"
" " When a donor job arrives in Region D, it waits for one more donor job to arrive, after which the
donor server switches back. Thus, the expected time in queue of an arrival into Region D is
A #"
In summary, the time in queue for donor jobs is
A -5. 6 0 ! +
E ! ! The analysis is easily extended to general . The Markov chain describing the states of such a
system can still be divided into four regions A, B, C, and D, but Region D is further divided into
regions , ..., (Region C is still subdivided into three regions, as before). In Region A, the donor
server is working on donor jobs. In Region B, the donor server is idle at the donor queue. In Region
the donor server is in the process of switching back to the donor queue. In Region
is in the process of switching to the beneficiary queue and there are
the donor server is working on beneficiary jobs and there are
25
,
, the donor server
donor jobs. In Region
,
donor jobs. In Region , the donor
server is either switching to the beneficiary queue or working on beneficiary jobs and there are
donor jobs, for
-*
. A similar argument as above leads to the time in queue for donor jobs:
-5. 6 0 ! +
where the setup time is now
#"
!
.
26
E ! ! A C
Superposition of the chains
Figure 15 shows a superposition of the chains in Figure 1(b) and (c).
b2
0B,0D,C
b1
b3
2b2
3B,0D,C
b3
2b1
2b3
b1
2b2
2B,0D,C
b3
2b1
2b3
b1
1B,0D,C
2b3
b2
b2
β2
b3
0B,1D
β2
β1
b2
b2
1B,1D
b1
b3
b3
b3
β1
β2
b1
β1
β1
β1
b2
β3
0B,0D
β2
b3
β3
b3
b3
β3
b2
b1
b3
β1
β3
β2 β3
β3
β3
β1
β1
β1
b2
b3
b3
b2
b1
b1
b3
β2
β1
β2
β1
β1
b2
3B,1D
1B,0D
b3
β2
β3
β2
4B,1D
b1
b3
b3
λB
b1 β
2
β3
b2
b3
b1
β3
b2
b1
b3
β2
b2
b1
b3
2B,1D
β3 β2
2b2
4B,0D,C
b3
2b1
b1
b2
λD
β3
b2
b1
b3
β2
2B,0D
β1
Figure 15: Markov chain for cycle stealing without switching cost, where sented by 2-stage Coxians.
27
and
β3
XB
ΒD
are both repre-
D
More graphs
In this section we include a few more result graphs omitted from the body of the paper for lack of space.
Figure 16 shows the gain/loss performance regions from Section 6.3 for the case where the beneficiaries
now have mean size 10 and the donors have mean size 1. The performance of the beneficiaries is virtually
identical to the case where and have mean 1. The donor performance is improved. The reason
for the improvement in donor performance is that the donor visits the beneficiary less frequently, since
the number of jobs queued at the beneficiary isn’t as high, for large jobs.
Figure 17 shows additional optimal computations, for the case of large beneficiary jobs and the
case of highly variable donor jobs. The case of large beneficiary jobs, results in lower optimal .
Increasing the variability of donor jobs doesn’t seem to have any effect.
Gain of Beneficiaries and Loss of Donors — 1
low gain
low loss
mid gain
low loss
0.8
0.6
ρD
ρ
D
0.6
high gain
low loss
0.2
0.6
ρB
0.8
GOOD
0.4
0.4
0.2
0.2
0
1
(a) 0.6
ρB
0.8
0
1
BAD
0.8
mid gain
mid loss
0.6
ρD
0.8
0.4
0
1
low gain
low loss
0.8
0.6
1
ρD
1
high gain
mid loss
high gain
high loss
0.6
ρB
0.8
GOOD
0.4
0.2
0
1
0.6
ρB
0.8
1
ρB
0.8
1
(b) E[K]= 1
Gain of Beneficiaries and Loss of Donors — 1
low gain
low loss
mid gain
low loss
0.8
0.6
ρD
ρ
D
0.6
high gain
low loss
0.2
0.6
ρB
0.8
1
(a) GOOD
0.4
0.2
0.2
0.6
ρB
0.8
mid gain
low loss
0
1
BAD
0.6
mid gain
mid loss
0.4
0
0.8
ρD
0.8
0.4
0
1
low gain
low loss
0.8
0.6
1
ρD
1
GOOD
0.4
high gain
mid loss
0.6
ρB
0.8
0.2
1
0
0.6
(b) E[K]=1
Figure 16: Columns 1 and 3 show the gain of beneficiary jobs and loss of donor jobs relative to dedicated
difference. Low gain/loss refers to a
difference.
servers. High gain/loss refers to a
Columns 2 and 4 show the mean performance (over all jobs) relative to dedicated servers. In all figures,
and are exponentially-distributed, where has mean 10 and has mean 1.
28
Selecting the optimal 142
110
8
0.9
4
142
1
10
8
6
0.7
0.5
0.5
8
6
14120
1
8
0.8
0.7
4
0.6
4
1412
10
8
0.5
6
0.4
0.4
0.3
0.3
0.3
0.2
0.4
4
0.3
2
0.2
0.2
4
2
2
0.1
0.1
2
0.5
0.6
0.7
ρ
B
E[K]=1
0.8
0.9
0.5
0.6
0.9
4
0.4
0.2
: mean 1 and
0.6
D
ρ
0.7
4
,
8
6
0.8
ρ
D
0.6
0.5
1142
10
0.9
6
0.8
0.6
D
4
6
0.7
ρ
6
6
, D
14 1012
8
0.9
12
10
0.8 8
ρ
0.7
ρ
0.8
0.1
2
0.9
0.5
B
0.6
0.7
ρ
0.8
B
E[K]=2
E[K]=1
2
0.1
0.9
0.5
0.6
0.7
ρ
0.8
0.9
B
E[K]=2
Figure 17: Graphs showing the optimal value of with respect to mean response time over all jobs.
29