Table Of Content2011 Third International Conference on Computational Intelligence, Communication Systems and Networks
Utilizing Bee Colony to Solve Task Scheduling Problem in Distributed Systems
M. H. Kashani
M. Jamei
Department of Computer Engineering Department of Computer Engineering
Islamic Azad University, Shahr-e-Qods Branch Islamic Azad University, Shahr-e-Qods Branch
Tehran, Iran Tehran, Iran
[email protected] [email protected]
M. Akbari R. Moosavi Tayebi
Department of Computer Engineering Department of Computer Engineering
Islamic Azad University, Shahr-e-Qods Branch Islamic Azad University, Shahr-e-Qods Branch
Tehran, Iran Tehran, Iran
[email protected] [email protected]
Abstract—Tasks scheduling problem is a key factor for algorithm schedules a set of tasks with known processing
distributed systems to gain better performance. Even in the and communication characteristics on processors to
best conditions, the scheduling in distributed systems is optimize Makespan and CPU utilization. In this paper, the
known as an NP-complete problem. Hence, many genetic we focused on static scheduling. The proposed methods
algorithms have been proposed for searching optimal can be generally classified into three categories: Graph-
solutions from entire solution space. However, these existing
theory-based approaches [2], mathematical models-based
approaches are going to scan the entire solution space
methods [3] and heuristic Techniques [4-8].
without considering the techniques that can reduce the
One of the crucial aspects of the scheduling problem is
complexity of the optimization. Spending too much time for
load balancing. When newly created processes enter the
doing scheduling is considered the main shortcoming of
system randomly, some processors may be overloaded
these approaches. Therefore, in this paper memetic
algorithm has been used to cope with this shortcoming. With heavily while the others are under-loaded or idle. The
regard to load balancing efficiently, Bee Colony main objectives of load balancing are to spread load on
Optimization (BCO) has been applied as local search in the processors equally, maximizing processors utilization, and
proposed memetic algorithm. Extended experimental results minimizing total execution time. In dynamic load
demonstrated that the proposed method outperform the balancing, processes must be dynamically allocated to
existent GA-based method in term of Makespan.
processors in arrival time and obtain a near optimal
schedule; therefore, the execution of the dynamic load
Keywords-Task Scheduling; Distributed Systems; Memetic
balancing algorithm should not take too long to arrive at a
Algorithm; Bee Colony Optimization;
rapid task assignments decision [9-11]. In this paper
I. INTRODUCTION scheduling algorithm considering load balancing has been
proposed.
In many achievements such as computer architectures
As mentioned above the scheduling problem has been
and software packages, distributed systems are used in a
known to be NP-complete. Therefore using heuristic
great variety of applications. The problem of task
Techniques can solve this problem more efficiently. Three
scheduling in these systems has received a great attention
most well-known heuristics are the iterative improvement
recently. Task scheduling in a distributed operating
algorithms [12], the probabilistic optimization algorithms,
system can be defined as allocating tasks to processors of
and the constructive heuristics. In the probabilistic
each computer in a way that the optimum performance is
optimization group, GA-based methods [13-18] are
obtained. The aim of task scheduling is minimizing
considerable. The main distinction among them is
Makespan (job completion time) and communication cost
chromosomal representation used for a schedule.
while maximizing CPU utilization. This problem is
However, these approaches scan the entire solution space
known as NP-complete [1].
without considering the techniques that can reduce the
There are two categories for task scheduling: static
and dynamic. In dynamic scheduling, schedules are complexity of the optimization. In other words, their main
created during run time without having any knowledge of shortcoming is spending much time doing scheduling.
the task in hand before arriving. While in static This shortcoming of GA-based methods can be reduced
scheduling, schedules are created before run time and do by combing GA with another optimization technique.
not change. Similarly, tasks must be all known in Hence, this paper proposes a new algorithm by using
advance. In other words, a static task scheduling memetic algorithm to cope with this shortcoming. Bee
978-0-7695-4482-3 2011 298
U.S. Government Work Not Protected by U.S. Copyright
DOI 10.1109/CICSyN.2011.69
Colony Optimization has been applied as local search in values need to be set prior the algorithm execution are as
the proposed memetic algorithm. Experimental results follows:
demonstrate that our proposed method outperforms the B - The number of bees in the hive.
existent GA-based method in term of Makespan. N -The number of constructive moves during one forward
The rest of this paper is organized as follow: In pass.
section 2 used method as local search in the proposed At the beginning of search process, all bees are in the
memetic algorithm is presented. Proposed method comes hive. To continue, it is the pseudo code of the BCO
in section 3. Experimental results are given in section 4. algorithm:
Section 5 concludes the paper. 1. Initialization: every bee is set to an empty solution;
2. For every bee do the forward pass:
II. USED METHOD AS LOCAL SEARCH IN THE
a) Set k = 1;
PROPOSED ALGORITHM
//counter for constructive moves in the forward pass;
In this section the method which has been used as b) Evaluate all possible constructive moves;
local search in the proposed memetic algorithm has been c) According to evaluation, choose one move using
described. the roulette wheel;
Bee Colony Optimization (BCO) is a metaheuristic for d) k = k + 1; If k ≤ NC Go To step b.
solving combinatorial optimization problems. The BCO is 3. All bees are back to the hive; // backward pass starts;
inspired by bees' behavior in the nature. In a honey bee 4. Sort the bees by their objective function value;
colony forager bees search the environment for flower 5. Every bee decides randomly whether to continue its
paths and if they find a good source of food, share it with own exploration and become a recruiter, or to become
other bees. While the forager bees come back to the hive a follower (bees with higher objective function value
they share the found information about food sources by a have greater chance to continue its own exploration);
special movement named waggle dance. The studies on 6. For every follower, choose a new solution from
this type of bee dance shows that in the midst of this recruiters by the roulette wheel;
dance some information like direction, distance, quantity 7. If the stopping condition is not met Go To step 2;
and quality of the food source are shared with respect to 8. Output the best result.
other bees [19-21]. Initial bee colony is constructed by the following
The BCO is a population based algorithm. Population algorithm:
of the artificial bees searches for the optimal solution. 1. At each step, a next task to be assigned is selected.
Every artificial bee generates one solution to the problem. 2. The processor (the selected task is going to be
The algorithm consists of two alternating phases: forward assigned to) is determined.
pass and backward pass. During each forward pass, every 3. Repeat these two steps until all tasks have been
bee is exploring the search space. It applies a predefined assigned to a processor.
number of moves, which construct and improve the The choice is probabilistic bias to a probability function.
solution, yielding to a new solution. This function is updated at each iteration in a
Having obtained new partial solutions, the bees return reinforcement way by using the features of good
to the nest and start the second phase, the so-called solutions.
backward pass. During the backward pass, all bees share In the relation (1), after selecting a task, a randomly
information about their solutions. In nature, bees would chosen processor will be selected.
perform a dancing ritual, which would inform other bees
about the amount of food they have found, and the pro = aij/Load(pj) (1)
ij m
proximity of the patch to the nest. In the search algorithm, ∑[a /Load(p )]
ik k
the bees announce the quality of the solution, i.e. the k=1
value of objective function. During the backward pass,
every bee decides with a certain probability whether it In the above a , 1≤i≤n , 1≤ j≤m is the
ij
will advertise its solution or not. The bees with better execution time of task t on processor P. The processor
i j
solutions have more chances to advertise their solutions. with a lower value of the running times has greater
The remaining bees have to decide whether to continue to probability to be chosen.
explore their own solution in the next forward pass, or to All bees return to the hive after generating the partial
start exploring the neighborhood of one of the solutions solutions. All these solutions are then evaluated by all
being advertised. Similarly, this decision is taken with a bees. (The latest time point of finishing the last task at any
probability, so that better solutions have higher processor characterizes every generated partial solution).
probability of being chosen for exploration. Let us denote by Load(P )(b=1, 2, ..., B) the latest
b
The two phases of the search algorithm, forward and time point of finishing the last task at any processor in the
backward pass, are performed iteratively, until a stopping partial solution generated by the b-th bee. We denote by
condition is met. The BCO algorithm parameters whose O normalized value of the time point Load(P ), i.e.:
b b
299
Makespan−Load(p ) In our model there are finite numbers of tasks, each
O = b (2)
b Makespan−Minspan having a task number and an execution time and placed in
a task pool from which tasks are assigned to processors.
In the above Load(P ) is the processor load for each
b Figure 1 shows the proposed chromosome. In this
processor, and Makespan and Minspan are respectively
chromosome, tasks 1,2,3,4,5,6,7 and 8 are assigned to
the smallest and the largest time point among all time
processors 2,1,2,3,3,4,2 and 1 respectively. Both tasks 1
points produced by all bees. The probability that b-th bee
and 3 are assigned to processor 2 but first task 1 is
(at the beginning of the new forward pass) is loyal to its
executed then 3.
previously generated partial solution is expressed as
follows:
−Makespan−Ob
pu+1 =e u , b=1,2,...,B (3) Figure 1. An example of proposed chromosome
b
Which u is the ordinary number of the forward pass (e.g.,
Before describing the proposed method, it is necessary
u = 1 for first forward pass, u = 2 for second forward pass,
to state some definitions which have been stated in [29] as
etc.).
follow:
For each uncommitted bee with a certain probability it • T ={t ,t ,t ,...,t }is the set of tasks to execute.
is decided which recruiter it would follow. The 1 2 3 n
probability that b’s partial solution would be chosen by • P={p,p ,p ,...,p } is a set of processors in the
1 2 3 m
any uncommitted bee is equal to:
distributed system. Each processor can only execute one
O
pro = b ,b=1,2,...,R (4) task at each moment, a processor completes current task
b R before executing a new one, and a task cannot be moved
∑O
k to another processor during execution.
k=1 • R is an m×m matrix, where the element
Where O represents normalized value for the
objective funcktion of the k-th advertised partial solution ruv 1≤u,v≤m of R, is the communication delay rate
and R denotes the number of recruiters. Using relation (4) between p and p .
u v
and a random number generator, each uncommitted • H is an m×m matrix, where the element
follower joins one recruiter.
Genetic Algorithms (GA) are search methods that take huv 1≤u,v≤m of H, is the time required to transmit a
their inspiration from natural selection and survival of the unit of data from p to p . It is obvious that h =0
u v uu
fittest in the biological world [22-24]. GAs differ from and r =0.
more traditional optimization techniques in that they uu
involve a search from a "population" of solutions, not • A is an n×m matrix, where the element
from a single point. Each iteration of a GA involves a a 1≤i≤n , 1≤ j≤m of A, is the execution time of
ij
competitive selection that weeds out poor solutions. The
task ti on processorp .
solutions with high "fitness" are "recombined" with other j
solutions by swapping parts of a solution with another • D is a linear matrix, where the element
(crossover). After selection and crossover, there is a new d 1≤i≤n of D, is the data volume for task ti to be
i
population full of individuals. Some of them are directly
transmitted, when task ti is to be executed on a remote
copied, and the others are produced by crossover. In order
to ensure that the individuals are not all exactly the same, processor.
a small chance of mutation is allowed. Solutions are also • F is a linear matrix, where the element
"mutated" by making a small change to a single element fi 1≤i≤n of F, is the target processor that is selected
of the solution. Recombination and mutation are used to for task ti to be executed on.
generate new solutions that are biased towards regions of
• C is a linear matrix, where the element
the space for which good solutions have already been
seen. Mutation is, however, vital to ensuring genetic c 1≤i≤n of C, is the processor that the task ti is
i
diversity within the population. worked on.
As form of genetic algorithm, Memetic algorithms • The processor load for each processor is stated as
apply a local search process to refine solutions to hard follows:
problems [25-28]. In this paper Bee Colony Optimization No.of allocated No.of NewAssigned
processeson processesto
has been applied as local search in the proposed memetic
processori processor.i
algorithm. Load(p )= ∑a + ∑a (5)
i j,i k,i
j=1 k=1
• The Makespan of a schedule is the maximal finishing
III. PROPOSED METHOD
time of all processes or maximum load.
300
makespan(T)=max(Load(p)) Bee Colony Optimization (BCO) has been applied as
i (6) local search in the proposed memetic algorithm. The
∀1≤i≤Numberof Processors
proposed method works as follows:
• The minpan of a schedule is the miniimal finishing In each step of iteration, a task in the chromosome is
time of all processes or minimum load. randomly selected (index of the chromosome) according
Minspan(T)=min(Load(p)) to probability function, and then the selected task is
i (7) allocated to its processor. The bee colony optimization is
∀1≤i≤Numberof Processors checked to see whether a new chromosome has been
• The Processor utilization for each processor and the found. This function is updated at each iteration in a
average of processors utilization are also computed as reinforcing way by using the features of good solutions.
follows: The update of the probability function is done by the bee
Load(p ) that produced the best overall chromosome.
U(p )= i (8)
i Makespan
IV. SIMULATION RESULTS
Noofprocessors (9)
AveU =( ∑ U(p)) NumberOf Processors In this section, the proposed method compared with [29]
i
i=1 regarding Makespan are evaluated.
• Number of Acceptable Processor Queues (NoAPQ):
We must define thresholds for light and heavy load on
First experiment:
processors. If the tasks completion time of a processor is
In this experiment attempt has been made to increase
within the light and heavy thresholds, this processor
the number of tasks and compute average of all
queue will be acceptable. If it is above the heavy
Makespan. Figure 3 depict experimental results. As it can
threshold or below the light-threshold, then it is
be obtained from figure 3, average of all Makespan in
unacceptable. But what is important is average of number
proposed method is less than GA-based method (reference
of acceptable processors queues, which is achievable by :
AveNoAPQ=NoAPQ NumberOf processors (10) [29]); increasing the number of tasks, BCO shows better
results than GA method.
A Queue associated with every processor, shows the tasks
that processor has to execute. The execution order of tasks
on each processor is based on queues. Finally the fitness
of the chromosome (Schedule T) can be computed as
follows:
(γ×AveU)×(θ×AveNoAPQ)
fitness(T)= (11)
(α'×makespan(T))×(β'×CC(T))
In the above mentioned formula, 0<α,β,γ,θ≤1 are
control parameters to control effect of each part according
to special cases and their default value is one. This
equation shows that a fitter solution (Schedule) has less
Makespan, less Communication Cost, higher processor
utilization and higher Average number of acceptable
Figure 3. Makespan in both methods considering number of task
processor queues.
Now the proposed method in details is described.
Second experiment:
Figure 2 depicts the proposed method. Bee Colony
Similarly, in this experiment, the aim is computing the
Optimization has been applied as local search in the
Makespan of all processors. Figure 4 shows that
proposed memetic algorithm.
Makespan in GA-Based method is more than the proposed
method. It means that if population size is increased, then
the trend of Makespan in GA-based method will be
increased. On the other hand, it can be concluded that
GA-based method has not scalability.
Figure 2. Structure of the proposed method
301
[5] C.I.Park and T.Y.Choe, “An optimal scheduling algorithm based on
task duplication,” IEEE Trans. on Computers, 51(4), 2002, pp. 444–
448.
[6] C. M. Woodside, and G. G. Monforton, “Fast Allocation of Processes
in Distributed and Parallel Systems,” IEEE Trans. On Parallel and
Distributed Systems, 4(2), 1993, pp. 164-174.
[7]Andrew J. Page, Thomas M. Keane and Thomas J. Naughton, “Multi-
heuristic dynamic task allocation using genetic algorithms in a
heterogeneous distributed system,” Journal of Parallel and
Distributed Computing, Volume 70, Issue 7, 2010, pp.758-766.
[8] A. K. Sarje and G. Sagar, “Heuristic Model for Task Allocation in
Distributed Computer Systems,” Proc. of the IEEE, 138(5), 1991,
pp. 313-318.
[9] Y. Lan, & T. Yu, “A Dynamic Central Scheduler Load-Balancing
Figure 4. Makespan in both methods regarding pop size
Mechanism,” Proc. of the 14th IEEE Ann. Int'l Phoenix Conf. on
Computers and Communication, 1995, pp. 734-740.
Third Experiment:
[10] A.K.Sarje & G.Sagar, “Heuristic Model for Task Allocation in
The objective of this experiment is to compute the
Distributed Computer Systems,” Proc. of the IEEE, 138(5), 1991,
above metrics when the numbers of generations are
pp. 313-318.
increased. Figure 5 shows that Makespan in GA-based [11] F. Bonomi, & A. Kumar, “Adaptive Optimal Load-Balancing in a
method is more than BCO method when the number of Heterogeneous Multiserver System with a Central Job Scheduler,”
generation is increased. IEEE Trans. on Computers, 39(10), 1990, pp. 1232-1250.
[12] M. Lin and L.T.Yang, “Hybrid Genetic Algorithms for Scheduling
Partially Ordered Tasks in a Multi-processor Environment,” Proc.
of the 6th IEEE Conf. on Real-Time Computer Systems and
Applications, 1999, pp. 382–387.
[13] W.Yao, J.Yao and B.Li, “Main Sequences Genetic Scheduling For
Multiprocessor Systems Using Task Duplication,” International
Journal of Microprocessors and Microsystems, 28, 2004, pp. 85-94.
[14] M. Moore, “An Accurate and Efficient Parallel Genetic Algorithm
to Schedule Tasks on a Cluster,” Proceedings of the IEEE
International Parallel and Distributed Processing Symposium, 2003.
[15] V. D. Martino, “Sub Optimal Scheduling in a Grid using Genetic
Algorithms,” Proceedings of the IEEE International Parallel and
Distributed Processing Symposium, 2003.
[16] A.Y. Zomaya, C. Ward and B. Macey, “Genetic Scheduling for
Figure 5. Makespan in both methods with respect to number of
Parallel Processor Systems: Comparative Studies and Performance
generation
Issues,” IEEE Trans. On Parallel and Distributed Systems, 10(8),
1999 pp. 795-812.
IV. CONCLUSION
[17] S. H. Woo, S. Yang, S. Kim and T. Han, “Task scheduling in
Scheduling in distributed operating systems has a distributed computing systems with a genetic algorithm,” High-
significant role in overall system performance and Performance Computing on the Information Superhighway, HPC-
Asia '97, 1997, pp. 301-307.
throughput. In this paper, Memetic algorithm has been
[18] E. S. H. Hou, N. Ansari and H. Ren, “A Genetic Algorithm for
used for task scheduling. We applied Bee Colony
Multiprocessor Scheduling,” IEEE Trans. On Parallel and
Optimization as a local search in memetic. Proposed Distributed Systems, 5(2), 1994, pp. 113-120.
method minimizes Makespan. Extended experimental [19] D. Teodorovic, T. Davidovic and M. Selmic, “Bee Colony
results demonstrate that the proposed method outperforms Optimization: The Applications Survey,” ACM Transactions on
Computational Logic, 2011, pp. 1-20.
the existent GA-based method in term of Makespan.
[20] D. Karaboga and B. Basturk, “On the performance of artificial bee
colony (ABC),” algorithm. Appl. Soft Comput, 2008, 8(3), pp. 687–
REFERENCES 697.
[1] Y. Chow and W.H. Kohler, “Models for Dynamic Load Balancing in [21] A. Hanani, S. Nourossana, H. Haj seyed javadi and A. Rahmani,
a Heterogeneous Multiple Processor System,” IEEE Transactions on ”Solving the Scheduling Problem in Multi-Processor Systems with
Computers, Vol. 28, 1979, pp. 354-361. Communication Cost and Precedence using Bee Colony System,” in
[2] C.C.Shen, & W.H.Tsai, “A Graph Matching Approach to Optimal Proc. of the 3rd International Conference on Advanced Computer
Task Assignment in Distributed Computing Using a Minimax Theory and Engineering, Chengdu, china, 2010, pp. V5-464-V5-
469.
Criterion,” IEEE Trans. On Computers, 34(3), 1985, pp. 197-203.
[22] M. F. Bramlette, “Initialization, Mutation, and Selection Methods in
[3] P.Y.R.Ma, E.Y. S. Lee and J.Tsuchiya, “A Task Allocation Model
Genetic Algorithms for Function Optimization,” Proc of the Fourth
for Distributed Computing Systems,” IEEE Trans. On Computers,
International Conference on Genetic Algorithms. Los Altos, CA,
31(1), 1982, pp. 41-47.
1991.
[4] G. L. Park, “Performance Evaluation of a List Scheduling Algorithm
[23] Lawrence David Davis, Handbook of Genetic Algorithms, First
In Distributed Memory Multiprocessor Systems,” International
Edition, Van Nostrand Reinhold, 1991.
Journal of Future Generation Computer Systems 20, 2004, pp. 249-
256.
302
[24] S. Forrest, Genetic Algorithms: Principles of Natural Selection
Applied to Computation, Science, 261 (1993) 872-878.
[25] Speel H. C, “Memetics: On a conceptual framework for cultural
evolution,” Symposium of Einstein Meets Margritte, Free
University of Brussels, 1995.
[26] S. Areibi, M. Moussa, and H. Abdullah, “A comparison of
genetic/memetic algorithms and heuristic searching,” In Proc of the
2001 International Conference on Artificial Intelligence ICAI 2001,
Las Vegas, Nevada, 25, 2001.
[27] H. Ishibuchi, T. Yoshida, and Tadahiko Murata, “Balance between
genetic search and local search in memetic algorithms for
multiobjective permutation owshop scheduling,” IEEE Transactions
on Evolutionary Computation,7(2) (2003) 204-223.
[28] P. Moscato, New Ideas in Optimization, chapter Memetic
Algorithms: A Short Introduction, McGraw Hill, 1999, pp. 219-
234.
[29] M. Nikravan and M. H. Kashani, “A Genetic Algorithm For Process
Scheduling In Distributed Operating Systems Considering Load
balancing,” in Proc. Proceedings 21th European Conference on
Modelling and Simulation, Prague, Czech Republic (2007) 645-650.
303