Capacity Allocation Mechanisms
for Grid Environments
Peter Gardfjäll
Licentiate Thesis, October 2006
Department of Computing Science
Umeå University
Sweden
Department of Computing Science
Umeå University
SE-901 87 Umeå, Sweden
[email protected]
Copyright © 2006 by Peter Gardfjäll
Except Paper I, © World Scientific Publishing Company, 2006
Paper II, © Springer-Verlag, 2006
Paper IV, © IEEE Computer Society Press, 2005
ISBN 91-7264-216-5
ISSN 0348-0542
UMINF 06.38
Print & Media, Umeå University
Abstract
During the past decade, Grid computing has gained popularity as a means
to build powerful computing infrastructures by aggregating distributed com-
puting capacity. Grid technology allows computing resources that belong to
different organizations to be integrated into a single unified system image – a
Grid. As such, Grid technology constitutes a key enabler of large-scale, cross-
organizational sharing of computing resources. An important objective for the
Virtual Organizations (VOs) that result from such sharing is to tame the dis-
tributed capacity of the Grid in order to manage it and make fair and efficient
use of the pooled computing resources.
Most Grids to date have, however, been completely unregulated, essentially
serving as a “source of free CPU cycles” for authorized Grid users. Whenever
unrestricted access is admitted to a shared resource there is a risk of over-
exploitation and degradation of the common resource, a phenomenon often
referred to as “the tragedy of the commons”. This thesis addresses this problem
by presenting two complementary Grid capacity allocation systems that allow
the aggregate computing capacity of a Grid to be divided between users in order
to protect the Grid from overuse while delivering fair service that satisfies the
individual computational needs of different user groups.
These two Grid capacity allocation mechanisms constitute the core contri-
bution of this thesis. The first mechanism, the SweGrid Accounting System
(SGAS), addresses the need for coordinated soft, real-time quota enforcement
across Grid sites. The SGAS project was an early adopter of the service-
oriented principles that are now common practice in the Grid community, and
the system has been tested in the SweGrid production environment. Furthermore,
SGAS has been included in the Globus Toolkit, the de-facto standard
Grid middleware toolkit. SGAS employs a credit-based allocation model where
research projects are granted quota allowances that can be spent across the Grid
resources, which charge users for their resource consumption. This enforcement
of usage limits thus produces real-time overuse protection.
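The credit-based model described above can be illustrated with a minimal sketch. The class and function names (`QuotaBank`, `submit`), the use of CPU hours as the charging unit, and the `strict` flag are all hypothetical choices made for this illustration; they are not the SGAS interface. The sketch assumes that "soft" enforcement means a resource may still accept a job from a project that has exhausted its quota.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class QuotaBank:
    """Toy stand-in for a Grid-wide bank (hypothetical, not the SGAS API)."""
    balances: Dict[str, float] = field(default_factory=dict)

    def grant(self, project: str, credits: float) -> None:
        # Add a quota allowance to the project's account.
        self.balances[project] = self.balances.get(project, 0.0) + credits

    def charge(self, project: str, cost: float) -> None:
        # Debit the project's account; the balance may go negative.
        self.balances[project] = self.balances.get(project, 0.0) - cost


def submit(bank: QuotaBank, project: str, cpu_hours: float, strict: bool = False) -> str:
    # A resource checks the remaining quota before running the job.
    within_quota = bank.balances.get(project, 0.0) >= cpu_hours
    if not within_quota and strict:
        # Strict policy (assumed): refuse jobs that would exceed the quota.
        return "rejected"
    # Soft policy (assumed): run the job anyway and record the overdraft.
    bank.charge(project, cpu_hours)
    return "accepted" if within_quota else "accepted-over-quota"


bank = QuotaBank()
bank.grant("projA", 100.0)
first = submit(bank, "projA", 60.0)
second = submit(bank, "projA", 60.0, strict=True)
third = submit(bank, "projA", 60.0)
```

Because every resource charges the same account, a project's spending on one site reduces what it can spend on any other, which is the coordination property the abstract refers to.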
The second approach, employed by the Fair Share Grid (FSGrid) system,
uses a share-based allocation model where project entitlements are expressed
in terms of hierarchical share policies that logically divide the Grid capacity
between user groups. By coordinating local job scheduling to maintain these
global capacity shares, the Grid resources collectively strive to schedule users
for a “share of the Grid”. We refer to this cooperative scheduling model as
decentralized Grid-wide fairshare scheduling.
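As a rough illustration of hierarchical share policies, the sketch below flattens a nested policy into absolute target shares of the Grid and then ranks groups by how far their consumed fraction falls below their target. The policy layout, the function names (`leaf_shares`, `fairshare_order`), and the ranking rule are assumptions made for this example; the actual FSGrid algorithm is described in Paper IV and may differ.

```python
def leaf_shares(policy, parent_share=1.0, prefix=""):
    """Flatten a nested share policy into absolute Grid shares per leaf.

    `policy` maps a name to (weight, children), where children is either
    another policy dict or None for a leaf (hypothetical layout).
    """
    total = sum(weight for weight, _ in policy.values())
    shares = {}
    for name, (weight, children) in policy.items():
        path = f"{prefix}/{name}"
        # A node's share is its normalized weight times its parent's share.
        share = parent_share * weight / total
        if children:
            shares.update(leaf_shares(children, share, path))
        else:
            shares[path] = share
    return shares


def fairshare_order(shares, usage):
    """Order leaves from most underserved to most overserved."""
    total = sum(usage.values()) or 1.0
    # Deficit = target share minus the fraction of capacity actually consumed.
    return sorted(
        shares,
        key=lambda leaf: shares[leaf] - usage.get(leaf, 0.0) / total,
        reverse=True,
    )


policy = {
    "physics": (60, {"alice": (1, None), "bob": (3, None)}),
    "biology": (40, None),
}
shares = leaf_shares(policy)
usage = {"/physics/alice": 50.0, "/physics/bob": 10.0, "/biology": 40.0}
order = fairshare_order(shares, usage)
```

In a decentralized setting, each resource would presumably apply an ordering like this locally to its own job queue, using Grid-wide usage data, so that the sites collectively converge on the global shares.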
Preface
The thesis consists of an introduction and the following four papers:
Paper I T. Sandholm, P. Gardfjäll, E. Elmroth, L. Johnsson, and O. Mulmo.
A Service-oriented Approach to Enforce Grid Resource Allocations.
International Journal of Cooperative Information Systems,
Vol. 15, No. 3, pp. 439-459, 2006.
Paper II E. Elmroth, P. Gardfjäll, O. Mulmo, and T. Sandholm. An OGSA-based
Bank Service for Grid Accounting Systems. In J. Dongarra
et al. (eds), Applied Parallel Computing. State-of-the-art in Scientific
Computing. Springer-Verlag, Lecture Notes in Computer
Science, Vol. 3732, pp. 1051-1060, 2006.
Paper III P. Gardfjäll, E. Elmroth, L. Johnsson, O. Mulmo, and T. Sandholm.
Scalable Grid-wide Capacity Allocation with the SweGrid
Accounting System (SGAS). Submitted for journal publication, 2006.
Paper IV E. Elmroth and P. Gardfjäll. Design and Evaluation of a Decentralized
System for Grid-wide Fairshare Scheduling. In Heinz
Stockinger et al. (eds), e-Science 2005. First IEEE Conference
on e-Science and Grid Computing, IEEE Computer Society Press,
USA, pp. 221-229, 2005.
Paper I is an extended and revised version of “An OGSA-Based Accounting
System for Allocation Enforcement across HPC Centers” published in the
proceedings of the International Conference on Service-Oriented Computing 2004,
ACM.
This work has been supported by the Swedish Research Council (VR) under
contracts 343-2003-953 and 621-2005-3667.
Acknowledgements
I am grateful to a great many people. Without their help and support, I am
sure that the outcome would not have been nearly as good as it turned out.
First and foremost, I thank my supervisor Erik Elmroth. Besides being a
co-author on all papers, Erik is always eager to discuss problems and ideas, and his
constant enthusiasm and positive spirit have been a great source of inspiration.
I also thank Bo Kågström, my assistant supervisor, for constructive comments
and support.
I express my gratitude to Thomas Sandholm, with whom I collaborated
closely during the design and development phases of the SweGrid Accounting
System (SGAS). Thomas and the other PDC affiliates Olle Mulmo and Lennart
Johnsson also deserve to be thanked for their contributions to the SGAS papers.
Thanks to Lars Malinowsky and Åke Sandgren for providing helpful comments
and technical assistance, as well as for pointing out technical problem spots.
I also thank Åke, together with Mats Nylén, for sharing experiences from
scheduling and resource allocation in high performance computing environments.
Björn Torkelsson, who has administered the “Grid playground” machines
at HPC2N, also deserves to be thanked.
A special thanks to Johan Tordsson, my research colleague. Over the years,
we have come to share a lot: research field, courses, teaching, office, hotel
rooms, colds, thoughts, and laughter. It has been a pleasure.
I also thank Pedher Johansson, whose LaTeX skills really came to the rescue
as I was compiling this thesis document. I also take the opportunity to thank
the rest of my colleagues at the department for all work-related and unrelated
discussions.
Last but not least, I thank the ones I hold dear: mom and dad, my brother,
relatives and friends for cheering me up and for making my spare time a great
pleasure.
Thank you all.
Umeå, October 2006
Peter Gardfjäll
Contents
1 Background and Motivation
2 Introduction to Grid Computing
2.1 Grid System Characteristics
2.2 Executing a Job on the Grid
2.3 Interoperability, SOA and Web Services
3 Two Approaches to Grid Capacity Allocation
3.1 Operation Context
3.2 The SweGrid Accounting System (SGAS)
3.2.1 The SGAS Software
3.3 The Fair Share Grid (FSGrid) System
3.4 Summary of Papers
3.4.1 Paper I
3.4.2 Paper II
3.4.3 Paper III
3.4.4 Paper IV
3.5 Related Work
3.5.1 SGAS Related Work
3.5.2 FSGrid Related Work
3.6 Discussion
3.6.1 A Comparison of the Systems
3.6.2 Open Issues and Future Work
Paper I
Paper II
Paper III
Paper IV