THEORY OF COMPLEXITY CLASSES
VOLUME 1
Chee Keng Yap
Courant Institute of Mathematical Sciences
New York University
251 Mercer Street
New York, NY 10012
September 29, 1998
Copyright: This book will be published by Oxford University Press. This prelim-
inary version may be freely copied, in part or wholly, and distributed for private
or class use, provided this copyright page is kept intact with each copy. Such users
may contact the author for the on-going changes to the manuscript. The reader
is kindly requested to inform the author of any errors, typographical or otherwise.
Suggestions welcome. Electronic mail: [email protected].
PREFACE
Complexity Theory is a thriving and growing subject. Its claim to a central
position in computer science comes from the fact that algorithms permeate every
application of computers. For instance, the term NP-completeness is by now in
the vocabulary of the general computer science community. Complexity Theory
has evolved considerably within the last decade. Yet there are few books that
systematically present the broad sweep of the subject to a prospective student from
a ‘modern’ point of view, that is, the subject as a practitioner sees it. The few
available treatments are embedded in volumes that tend to treat it as a subfield
of related and older subjects such as formal language theory, automata theory,
computability and recursive function theory. There are good historical reasons for
this, for our legacy lies in these subjects. Today, such an approach would not do
justice to the subject; we might say that Complexity Theory has come of age. So
one goal of this book is to present the subject fully on its own merits.
Complexity Theory can be practiced at several levels, so to speak. At one ex-
treme we have analysis of algorithms and concrete complexity. Complexity Theory
at this low level is highly model-dependent. At the other end, we encounter ax-
iomatic complexity, recursive function theory and beyond. At this very high level,
the situation becomes quite abstract (which is also one of its beauties); the main
tools here center around diagonalization and the results often concern somewhat
pathological properties of complexity classes. So what we mean by Complexity
Theory in this book might be termed medium level. Here diagonalization remains
an effective tool but powerful combinatorial arguments are also available. To dis-
tinguish the subject matter of this book from these closely related areas, I have
tried to emphasize what I perceive to be our central theme: Complexity Theory is
ultimately about the relationships among complexity classes. Of course, there are
reasonable alternative themes from which to choose. For instance, one can construe
Complexity Theory to be the theory of various models of computation; indeed, this
seems to be the older point of view. But note that our chosen viewpoint is more
abstract than this: models of computation define complexity classes, not vice-versa.
How Complexity Theory arrives at such an abstract theme may be of interest:
hopefully such evidence is scattered throughout the entire book. But for a practi-
tioner in the field, perhaps the most cogent argument is that most of the major open
problems in the subject can be posed in the form: "Is J included in K?" where J
and K are complexity classes. This certainly includes the P versus NP, the DLOG
versus NLOG and the LBA questions. Other questions which prima facie are not
about complexity classes (e.g. space versus time, determinism versus nondetermin-
ism) can be translated into the above form. Even investigations about individual
languages (e.g., complete languages) can be viewed as attempts to answer questions
about complexity classes or to place the language into a well-known complexity class
(e.g., is graph isomorphism in P?). Of course, these by no means exhaust all the
work in Complexity Theory but their centrality cannot be denied.
As for the student of complexity, some basic questions are more immediate: for
a phenomenon as complex as complexity (no tautology intended) it is no surprise
that the theory contains many assumptions, not all of which can be rigorously or
even convincingly justified. We can only offer heuristic arguments and evidence.
This is the burden of chapter one, which presents the methodological assumptions of
Complexity Theory. It comes from my attempt to distill what practitioners in this
field had been (subconsciously or otherwise) cultivating in the past twenty years or
more. By making them explicit (though by no means to our total satisfaction), it
is hoped that they will speed up the entry of novitiates into the field. Indeed, this
is out of self-interest: a field thrives with fresh ideas from each new generation of
practitioners. Another possible result of explicating the underlying assumptions is
renewed criticism of them, leading to alternative foundations: again, this is most
healthy for the field and certainly welcome. In any case, once these assumptions are
accepted, we hope the student will discover a powerful and beautiful theory that
offers much insight into the phenomenon of complexity. This is enough to justify
the present theory (though not to exclude others).
There are several notable features of this book:
(a) We have avoided traditional entry-points to Complexity Theory, such as au-
tomata theory, recursive function theory or formal language theory. Indeed,
none of these topics are treated except where they illuminate our immediate
concern.
(b) The notion of computational modes is introduced early to emphasize its cen-
trality in the modern view of Complexity Theory. For instance, it allows
us to formulate a polynomial analogue of Church’s thesis. Many of us are
brought up on the sequential-deterministic mode of computation, called here
the fundamental mode. Other computational modes include nondeterministic,
probabilistic, parallel, and alternating, to name the main ones. These modes
tend to be viewed with suspicion by non-practitioners (this is understandable
since most actual computers operate in the fundamental mode). However, it
is important to wean students from the fundamental mode as early as possible,
for several reasons: not only are the other modes theoretically important,
the technological promise of economical and vast quantities of hardware has
stirred considerable practical interest in parallel and other modes of computation.
The student will also come to appreciate the fact that computational
modes such as those represented by proof systems or grammars in formal language
theory are valid concepts of computation. On the other hand, these alternative
computational modes have distinct complexity properties, which is in fact
what makes the subject so rich.
(c) Traditionally, the computational resources of time and space have been em-
phasized (other resources are more or less curiosities). The recent discovery
of the importance and generality of the computational resource of reversals
is stressed from the beginning of the book. Of course, the number of results
we currently have on reversals is meager compared to the number of papers
devoted to time and space; indeed, through the writing of this book I have become
convinced that this imbalance should be remedied. (Paucity of results of other
sorts is also evident: in the chapters on reducibilities, diagonalization and
relative classes, we see no results for modes other than the fundamental and
nondeterministic modes. It is very likely that the filling of these gaps will
require new techniques.) It is my strong belief that the new triumvirate of
time-space-reversal gives us a more complete picture of computational com-
plexity. We also treat simultaneous resource bounds, such as simultaneous
time-space. Again this approach gives us a more rounded view of complexity.
It is also a topic of increasing importance.
(d) I believe an important contribution of this book is the theory of valuations
and choice machines. Besides its unifying and simplifying appeal, it addresses
many foundational questions raised by the newer notions of computation such
as interactive proofs. Researchers were able to simply ignore some of these
questions in the past because they focused on "nice" situations such as polynomial
time. But once the proper foundation is introduced, new issues arise in
their own right. For instance, the use of interval algebra exposes new subtleties
in the concepts of error. We hope that this inquiry is only the beginning.
(e) In attempting to give a coherent treatment of the subject, it is necessary
to unify the widely varying notations and definitions found in the literature.
Thus, many results on space complexity are proved using a version of the Turing
machine different from that used for time complexity. Yet a common machine
model must be used if we want to study simultaneous time-space complexity.
We choose the off-line multitape Turing machines. Another case where
uniformity is badly needed is something so basic as the definition of "time
complexity": to say that a nondeterministic machine accepts within t steps,
some definitions require all paths to halt within this time bound; others are
satisfied if some accepting path halts within this bound. We distinguish them
as running time and accepting time, respectively. Generalizing this distinction
to other measures, we hence speak of running complexity versus accepting
complexity. How comparable are these results? By and large, we would prefer
to stick to accepting complexity because it seems more fundamental and affords
simpler proofs. This we manage to do for most of the first half of the book.
Unfortunately (or fortunately?), the corpus of known results seems rich enough
to defeat any artificial attempt to impose uniformity in this respect. This is
most evident in probabilistic computations where running complexity seems
to be the more fruitful concept. A final example is the disparate notations
for reducibilities. From the diverse possibilities, a unifying choice appears to
be $\leq^{c}_{t}$ where t indicates the type of reduction (many-one, Turing-reducibility,
etc.) and c indicates the complexity considerations (polynomial time, log-space,
etc.). Or again, for a complexity class K, we prefer to say that a language is
"K-complete under $\leq$-reducibility" rather than "$\leq$-complete for K": here the
choice is driven by the wide currency of the term "NP-complete language".
We are aware of the dark side of the grand unification impulse, which can
rapidly lead to unwieldy notations: it is hoped that a reasonable compromise
has been made within the scope of this book.
(f) In selecting results (many appearing in book form for the first time) for inclu-
sion in this book, I have tried to avoid those results that are essentially about
particular machine models. This is consistent with our book’s theme. The
main exception is chapter two where it is necessary to dispose of well-known
technical results concerning the Turing model of computation.
Finally, this book is seen as a self-contained and complete (though clearly non-
exhaustive) introduction to the subject. It is written so as to be usable as a textbook
for an advanced undergraduate course or an introductory graduate course. The
didactic intent should be evident in the early chapters of this book: for instance, we
may offer competing definitions (running complexity versus accepting complexity,
one-way oracle machines versus two-way oracle machines) even when we eventually
only need one of them. By exposing students to such definitional undercurrents, we
hope they will better appreciate the choices (which most experts make without a
fuss) that are actually made. The later part of the book, especially volume two, is
intended more as a reference and the treatment is necessarily more condensed.
A quick synopsis of this two-volume book is as follows: There are twenty chapters,
with ten in each volume. Chapter 1 attempts to uncover the foundations
and presuppositions of the enterprise called Complexity Theory. Chapter 2 establishes
the basic machinery for discussing Complexity Theory; we try to confine most
model-dependent results to this chapter. Chapter 3 on the class NP is really the
introduction to the heart of this book: a large portion of the remainder of the book
is either an elaboration of motifs begun here, or can be traced back to attempts to
answer questions raised here. Chapters 4 to 6 study the tools that might
be termed 'classical', if this is appropriate for such a young field as ours: reducibilities,
complete languages, diagonalization and translation techniques for separating
complexity classes. Chapters 7 and 8 consider two important computational modes:
probabilism and alternation. Although these two modes are seldom viewed as closely
related, we choose the unusual approach of introducing them in a common machine
model. One advantage is that some results known for one mode can be strengthened
to their combination; also, contrasts between the two modes become accented.
Chapter 9 is about the polynomial-time hierarchy and its cognates. Chapter 10
introduces circuit complexity: superficially, this is an unlikely topic since circuits
describe finite functions. Traditionally, the interest here comes from the hope that
combinatorial techniques may yield non-linear lower bounds for circuit complexity,
which in turn translate into non-trivial lower bounds for machine complexity.
This hope has (as yet) not been borne out, but circuit complexity has yielded other
unexpected insights into our main subject. Two other topics which we would have
liked in volume 1, to round out what we regard as the core topics of the field, are
relativized classes and parallel computation. Unfortunately we must defer them to
volume 2. The rest of volume 2 covers topics that are somewhat esoteric as well as
some directions that are actively being pursued: randomness, structural approaches
to complexity, alternative models of computation (storage modification machines,
auxiliary Turing machines, etc.), alternative complexity theories (such as Kolmogorov
complexity, optimization problems, Levin's average complexity), specialized
approaches to complexity (such as mathematical logic), and complexity of some
classes of languages that have inherent interest (such as context-free languages, the
theory of real addition, etc.).
Ideally, Complexity Theory should be taught in a two-course sequence; this is
probably a luxury that many Computer Science curricula cannot support. For a
single course, I suggest selections from the first 7 chapters and perhaps some
advanced topics; preferably such a course should come after a more traditional course
on the theory of automata and computability.
The reader is kindly requested to inform the author of any errors of commission
as well as of omission. Since much of our material is organized in this manner for the
first time, there will be inevitable rough spots; we ask for the readers' indulgence.
All suggestions and comments are welcome.
I have variously taught from this material since 1981 at the Courant Institute
at New York University, and most recently, at the University of British Columbia.
As expected, this ten-year-old manuscript has evolved considerably over time, some
parts beyond recognition. Although many individuals, students and colleagues,
have given me much thoughtful feedback over the years, it is clear that the most
recent readers have had the greatest visible impact on the book. Nevertheless, I am very
grateful to all of the following, arranged somewhat chronologically: Norman
Schulman, Lou Salkind, Jian-er Chen, Jim Cox, Colm Ó Dúnlaing, Richard Cole,
Bud Mishra, Martin Davis, Albert Meyer, Dexter Kozen, Fritz Henglein and Richard
Beigel. The extensive comments of Professors Michael Loui and Eric Allender, and
most recently, Professors Jim Cox and Kenneth Regan, from their use of these
notes in classes are especially appreciated. Finally, I am grateful for Professor David
Kirkpatrick's hospitality and for the facilities at the University of British Columbia
while completing the final portions of the book.
C.K.Y.
New York, New York
March, 1991
Chapter 1
Initiation to Complexity Theory
January 16, 2001
This book presumes no background in Complexity Theory. However, "general mathematical maturity"
and rudiments of automata and computability theory would be useful. This introductory chapter explores
the assumptions of Complexity Theory: here, we occasionally refer to familiar ideas from the theory of
computability in order to show the rationale for some critical decisions. Even without such a background
the reader will be able to understand the essential thrusts of this informal chapter.

The rest of the book does not depend on this chapter except for the asymptotic notations of section 3.

This chapter has an appendix that establishes the largely standard notation and terminology of naive
set theory and formal language theory. It should serve as a general reference.
1.1 Central Questions
Most disciplines center around some basic phenomenon, the understanding of which is either intrinsically
interesting or could lead to practical benefits. For us, the phenomenon is the intuitive notion of complexity
of computational problems as it arises in Computer Science. The understanding of what makes a problem
(computationally) complex is one cornerstone of the art and science of algorithm design. The stress in
'complexity of computational problems' is on 'complexity'; the concept of 'computational problem' is
generally relegated to the background.¹

To set the frame of mind, we examine some rather natural questions. A main motivation of our
subject is to provide satisfactory answers to questions such as:
(1) Is multiplication harder than addition?
The appeal of this question, first asked by Cobham [5], is that it relates to what are probably the two
most widely known non-trivial algorithms in the world: the so-called high school algorithms for addition
and multiplication. To add (resp. multiply) two n-digit numbers using the high school algorithm takes
linear (resp. quadratic) time. More precisely, the addition (resp. multiplication) algorithm takes at most
$c_1 n$ (resp. $c_2 n^2$) steps, for some positive constants $c_1$ and $c_2$. Here a 'step' is a basic arithmetic operation
($+$, $-$, $\times$, $\div$) on single-digit numbers. So a simple but unsatisfactory answer to (1) is 'yes' because $c_2 n^2$
dominates $c_1 n$ when n gets large enough. It is unsatisfactory because the answer only says something
about particular methods of adding and multiplying. To provide a more satisfactory answer, we probe
deeper.

¹There is no general theory of computational problems except in special cases. Such a theory should study the logical
structure of problems, their taxonomy and inter-relationships. Instead, Complexity Theory obliterates all natural structures
in computational problems by certain sweeping assumptions we will come to.
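As a concrete aside, here are the two high school algorithms side by side, in a Python sketch of ours
(the text never writes them out; the function names are invented for illustration). Digits are stored
least-significant first. The addition loop visits each digit position once (the $c_1 n$ bound), while
multiplication pairs every digit of one operand with every digit of the other (the $c_2 n^2$ bound).

    def school_add(a, b):
        """High school addition of two n-digit numbers: O(n) digit steps.
        Digits are least-significant first; a and b both have length n."""
        result, carry = [], 0
        for x, y in zip(a, b):
            carry, digit = divmod(x + y + carry, 10)
            result.append(digit)
        if carry:
            result.append(carry)
        return result

    def school_mul(a, b):
        """High school multiplication: O(n^2) digit steps, since every
        digit of b is multiplied against every digit of a."""
        result = [0] * (len(a) + len(b))  # the product has at most 2n digits
        for i, y in enumerate(b):
            carry = 0
            for j, x in enumerate(a):
                carry, result[i + j] = divmod(result[i + j] + x * y + carry, 10)
            result[i + len(a)] += carry
        return result

    # 1234 + 5678 = 6912; 1234 * 5678 = 7006652 (note the unused leading digit)
    assert school_add([4, 3, 2, 1], [8, 7, 6, 5]) == [2, 1, 9, 6]
    assert school_mul([4, 3, 2, 1], [8, 7, 6, 5]) == [2, 5, 6, 6, 0, 0, 7, 0]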
It is intuitively clear that one cannot hope to do additions in less than n steps. This is because any
algorithm must at least read all the n input digits. So the high school method for addition is optimal;
note that here and throughout the book, optimality is taken up to some multiplicative constant factor. The
situation is less clear with multiplication: Is there any algorithm for multiplication that is asymptotically
faster than the high school method? The answer turns out to be 'yes'; in 1971 (culminating a series of
developments) Schönhage and Strassen [30] discovered what is today asymptotically the fastest known
multiplication algorithm. Their algorithm takes $c_3 n \log n \log\log n$ steps² on a Turing machine (to be
introduced in Chapter 2). Since a Turing machine turns out to be more primitive than any real-world
computer, this means the same time bound is achievable on actual computers. However, the large constant
$c_3$ in the Schönhage-Strassen algorithm renders it slower than other methods for practical values of n.
For now, let us just accept that Turing machines are indeed fundamental and thus statements about
computations by Turing machine are intrinsically important. But is the Schönhage-Strassen algorithm
the best possible? More precisely,
(2) Must every Turing machine that multiplies, in the worst case, use at least $c n \log n \log\log n$
steps, for some $c > 0$ and for infinitely many values of n? Here c may depend on the
Turing machine.
This is an important open problem in Complexity Theory. A negative answer to question (2) typically
means that we explicitly show an algorithm that is faster than that of Schönhage-Strassen. For instance,
an algorithm with running time of $n \log n$ will do. We say such an algorithm shows an upper bound of
$n \log n$ on the complexity of multiplication. (It is also conceivable that the negative answer comes from
showing the existence of a faster algorithm, but no algorithm is exhibited in the proof³.) On the other
hand, answering question (2) in the affirmative means showing a lower bound of $c n \log n \log\log n$ on the
complexity of multiplication. Such a result would evidently be very deep: it says something about all
possible Turing machines that multiply! Combining such a result with the result of Schönhage-Strassen,
we would then say that the intrinsic complexity of multiplication is $n \log n \log\log n$. In general, when
the upper and lower bounds on the complexity of any problem P meet (up to a constant multiplicative
factor), we have obtained a bound which is intrinsic⁴ to P. We may now (satisfactorily) interpret question
(1) as asking whether the intrinsic complexity of multiplication is greater than the intrinsic complexity
of addition. Since the intrinsic complexity of addition is easily seen to be linear, Cobham's question
amounts to asking whether multiplication is intrinsically non-linear in complexity. (Most practitioners in
the field believe it is.) Generally, for any problem P, we may ask
(3) What is the intrinsic complexity of P?
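(A digression of ours, not from the text, to make 'showing an upper bound' concrete: one early step
in the series of developments culminating in Schönhage-Strassen was Karatsuba's divide-and-conquer
method, which improves the high school $O(n^2)$ bound to $O(n^{\log_2 3}) \approx O(n^{1.585})$ steps by
trading the four half-size products of the obvious recursion for three. A minimal Python sketch, using
built-in integers:

    def karatsuba(x, y):
        """Karatsuba multiplication. The recurrence T(n) = 3T(n/2) + O(n)
        solves to O(n^{log2 3}) digit steps."""
        if x < 10 or y < 10:                   # base case: a single-digit operand
            return x * y
        m = max(len(str(x)), len(str(y))) // 2
        p = 10 ** m
        xh, xl = divmod(x, p)                  # x = xh * 10^m + xl
        yh, yl = divmod(y, p)                  # y = yh * 10^m + yl
        a = karatsuba(xh, yh)                  # product of the high parts
        b = karatsuba(xl, yl)                  # product of the low parts
        # one extra product recovers both cross terms xh*yl + xl*yh:
        c = karatsuba(xh + xl, yh + yl) - a - b
        return a * p * p + c * p + b

    assert karatsuba(1234, 5678) == 1234 * 5678

This is still far from the $n \log n \log\log n$ bound, but it shows how exhibiting an explicit algorithm
settles an upper-bound question.)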
It turns out that there is another very natural model of computers called Storage Modification Machines
(see §5) which can multiply in linear time. This shows that complexity is relative to a given model
of computation. The student may recall that as far as the theory of computability goes, all reasonable
models of computation are equivalent: this is known as Church's Thesis. But simplistic attempts to
formulate analogous statements in Complexity Theory would fail. For instance, there are problems which
can be solved in linear time in one model but provably take $c n^2$ steps (for some $c > 0$, for infinitely many n)
in another model. So a fundamental question is:
(4) Which model of computation is appropriate for Complexity Theory?
²Unless otherwise indicated, the reader may always take logarithms to the base 2. We shall see that the choice of the
base is inconsequential.
³Such proofs are known but are rare. For instance, the recent work of Robertson and Seymour on graph minors leads to
precisely such conclusions (see for example [18]). Indeed the situation could be more complicated because there are degrees
of explicitness.
⁴See section 8 for more discussion of this.