
Limits of Computation
15 – Hierarchy Theorems
 (short version) Bernhard Reus
1
The complexity story so far
• how to measure running time for different models
• runtime bounds
• complexity classes, e.g. LIN and P
• Cook’s (Invariance) Thesis and Cook-Karp Thesis
2

THIS TIME
Hierarchy Theorems
• Tackle the question: “Can we decide more problems if we have ‘more’ time?”
“Runtime Complexity” Hierarchy – 
 for Animals (on land)
image: www.eduplace.com (Leigh Haeger)

The exact formulation leads to Hierarchy Theorems:
• for WH1LE, constants in time bounds do matter.
• in general, "more" means (asymptotically) faster growing time bounds.
3
Can we decide more problems if we increase the time bound?
The “fathers” of computational complexity (this line of work started in 1965):
J. Hartmanis, R.E. Stearns, P. Lewis II (Turing Award 1993)
4

15.1 Linear Time Hierarchy Theorems

…in other words, if time bound f is "smaller" than g, are there problems in TIME^L(g) that are not in TIME^L(f)?

A proof technique from previous chapters, diagonalisation, will save the day once again. It is possible to construct a program that involves runtime information with the help of a self-interpreter that counts runtime. Using diagonalisation, i.e. a form of self-application, one can then define the problem that shows the Hierarchy Theorem in question, to be decided by this particular program.

Let us first consider in more detail what "bigger" means for time bounds. If we talk about linear time bounds, i.e. linear functions, then it should be obvious that f(n) = a×n is "bigger" than g(n) = b×n if a > b. Therefore, we can reformulate the above question:

Do constants matter for linear time complexity (classes)?
5
We can thus rephrase the original question for the special case of linear runtime as follows:

Can L-programs decide more problems with a larger constant in their linear "running time allowance"?

or, more formally:

Does a < b imply TIME^L(a×n) ⊊ TIME^L(b×n) ?

or equivalently (proper inclusion):

Does a < b imply TIME^L(b×n) ∖ TIME^L(a×n) ≠ ∅ ?

We can answer this question for the language WH1LE (recall that this is WHILE where programs can only have one variable). It can be shown that, in line with our intuition, this is true: constants do matter. In order to prove it we need a universal program with a timeout parameter, called a timed universal program.

Definition 15.1 (timed universal program). A WH1LE program tu is a timed universal program if for all p ∈ WH1LE-programs, d ∈ D and n ≥ 1:

1. If time_p^WH1LE(d) ≤ n then ⟦tu⟧^WH1LE [p,d,n] = [ ⟦p⟧^WH1LE(d) ]
2. If time_p^WH1LE(d) > n then ⟦tu⟧^WH1LE [p,d,n] = nil

This definition states that the effect of ⟦tu⟧^WH1LE [p,d,n] is to simulate p for either time_p^WH1LE(d) or n steps, whichever is smaller. If time_p^WH1LE(d) ≤ n, i.e. p terminates within the allowed time, the (wrapped) result of p is returned.

}
else ( out of time so stop )
{CSt := nil;
} } (* end of while loop *)
** X := nil (* and return nil *)
}write X
Time Hierarchy for WH1LE
Fig. 15.1: Timed universal program tw for WH1LE in WHILE
Neil D Jones
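The behaviour required by Definition 15.1 can be sketched in ordinary Python (a toy illustration only, not the WHILE construction of Fig. 15.1: here a "program" is a Python generator function that yields once per simulated step, and the names `timed_universal` and `double` are made up for this sketch):

```python
def timed_universal(program, data, n):
    """Run `program` (a generator function: one `yield` per simulated step,
    `return` delivers the result) on `data` for at most n steps.
    Mirrors Definition 15.1: returns [result] if the program halts within
    n steps, and None (playing the role of nil) otherwise."""
    steps = program(data)
    try:
        # n steps, plus one final next() to collect the return value
        for _ in range(n + 1):
            next(steps)
    except StopIteration as halt:
        return [halt.value]   # halted in time: wrapped result
    return None               # out of time: nil

def double(d):
    # toy "program": one simulated step per loop iteration
    acc = 0
    for _ in range(d):
        yield
        acc += 2
    return acc
```

For example, `timed_universal(double, 3, 10)` yields the wrapped result `[6]`, while `timed_universal(double, 3, 2)` runs out of time and yields `None`.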
Now we are able to prove a concrete version of the question

Does a < b imply TIME^L(a×n) ⊊ TIME^L(b×n) ?

instantiating L to be WH1LE.

Theorem 15.2 (Linear Time Hierarchy Theorem, [5, Thm. 19.3.1]). There is a constant b such that for all a ≥ 1 there is a problem A in TIME^WH1LE(a×b×n) that is not in TIME^WH1LE(a×n).

So constants do actually matter for WH1LE.

This states the existence of a set (or problem) A that can be decided in linear time with bound f(n) = (a×b)×n but not with time bound f(n) = a×n. Therefore, in order to decide A, the bigger constant a×b is sufficient but a is insufficient.

The proof uses a well-known technique again that we used in the proof of the undecidability of the Halting Problem (and the Barber's Paradox): self-reference (diagonalisation).

Proof. To show the existence of such a problem we construct one by diagonalisation, a concept we have already used before (for the proof of the undecidability of the Halting problem). Let us first sketch the idea of this proof. We define A via diagonalisation such that the following hold:

7
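The diagonalisation idea can be seen in miniature with a finite table of toy deciders (all names here are illustrative, not part of the proof):

```python
def diagonal(deciders):
    """Build a decision table that differs from decider i on input i:
    the resulting 'problem' cannot be decided by any decider in the list,
    which is the heart of the hierarchy-theorem argument."""
    return [not deciders[i](i) for i in range(len(deciders))]

deciders = [lambda n: n % 2 == 0,   # "even numbers"
            lambda n: n > 10,       # "big numbers"
            lambda n: True]         # "everything"

table = diagonal(deciders)
# table[i] != deciders[i](i) for every i, so no listed decider
# decides the diagonal problem
```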
Open problem
• Does this theorem hold for WHILE and
GOTO as well?
• Remark: one can show, however, an analogous theorem for SRAM (using logarithmic time measure).
8

For Turing machines we can do it easily. The reason is that Turing machines consist of hardware and software, and we can redefine parts of the hardware (the tape alphabet, to compress the information stored on the tape) as well as the software (the transition function of the Turing machine program). Note that the factor needed for the compression here is 2n+4; it may be different for varying Turing machines.

Some people claim that this is evidence for ignoring constants altogether, as future machines will be a constant factor quicker. But Jones warns that the "physical realizability of this trick" used in order to get a speedup "is dubious" (in his book, Section 18). So one can see that for some languages other than Turing machines, constants do matter, and for those languages such a constant speed-up theorem does not hold.

Let us now consider super-linear time bounds (not just WH1LE).

First of all, super-linear time refers to time bounds that grow at least as fast as linear functions; this is true e.g. for f(n) = 2·n or f(n) = n·log n. Let us now consider other languages and other (super-linear) time bounds. The original question we tackle in this chapter is now rephrased accordingly.

9
Let f and g be time bound functions that are super-linear. Do programs decide more problems with time bound f than with time bound g if f grows (asymptotically) faster than g? What does "smaller" mean for general (super-linear) time bounds?

The asymptotic complexity is built into the kind of time bounds we have been using so far. Time bounds describe how running time grows as the input data size grows to an arbitrarily large amount. This is fair enough, as our investigation about the feasibility of problems concerns large inputs.

• Let f and g be time bounds, i.e. functions of type N → N. We call g "smaller than" f if f grows (asymptotically) much faster than g.

• f grows much faster than g if eventually f(n) ≫ g(n) for all n "large enough" (here ≫ indicates "much larger"). In other words:

lim_{n→∞} g(n)/f(n) = 0    (asymptotic growth)

Note that this does not make any statement about whether f(n) > g(n) for small n. We just know that g(n)/f(n) gets close to 0 if n gets really big (in other words, if n is large enough).

10

15.2 BEYOND LINEAR TIME

Time constructible bounds

The time bounds need to be time constructible. What does that mean? Informally, f is time constructible means that we can "find out how much time f(n) is available by a computation not taking more than the order of f(n) steps" [5, Chap. 19.5]. This is an intuitive restriction, and many familiar functions are time constructible (in particular linear functions, polynomials, exponentials, and arithmetic combinations of those, see Exercise 5).

Definition 15.3. A time bound f : N → N is called time constructible if there is a program p and a constant c > 0 such that for all n ≥ 0 we have that

⟦p⟧(⌜n⌝) = ⌜f(n)⌝  and  time_p(d) ≤ c × f(|d|) for all d

where ⌜n⌝ denotes the encoding of number n in the datatype of the language used. This definition works for any programming language, as every language should be able to encode natural numbers (thus the language superscripts have been omitted).

• The time bound is computable.
• The time it takes to compute the time bound is itself bounded by the time bound, up to a constant factor.

Non-computable functions (like the Busy-Beaver function) cannot be time constructible. Also, we need enough time to produce the result, thus sub-linear (slowly growing) functions are often not time constructible. In the linear Hierarchy Theorem 15.2 this condition was implicit, as linear functions are automatically time constructible.
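The two conditions of Definition 15.3 can be illustrated with a step-counting Python sketch (an illustration, not a WHILE program; `square_with_step_count` is a made-up name): the function below computes f(n) = n² while the number of loop steps it takes is itself at most 1·f(n).

```python
def square_with_step_count(n):
    """Compute f(n) = n*n by repeated increments while counting loop steps.
    The step count equals f(n), so the bound time_p(d) <= c*f(|d|) of
    Definition 15.3 holds with c = 1: f is (intuitively) time constructible."""
    acc, steps = 0, 0
    for _ in range(n):
        for _ in range(n):
            acc += 1     # one "step" of work
            steps += 1
    return acc, steps

value, steps = square_with_step_count(7)
# value is f(7) = 49, and computing it took exactly f(7) steps
```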
In our quest for infeasible problems, constant factors are only relevant for small values anyway. For instance, it makes a big difference to us whether a program runs for 1 second or 50 seconds. However, if we consider large inputs and thus long running times, it won't make any difference to us whether a program runs for 1 century or 50 centuries; both are equally useless to us.

The "Big-O" notation will be useful to indicate the quality of an algorithm in terms of asymptotic worst-case runtime.

We say g(n) ∈ O(f(n)), or equivalently g ∈ O(f), to mean that g(n) approximately (up to a constant factor) grows at most as quickly as f(n). So g can grow even more slowly than f.

Definition 13.3 (Big-O). Let f : N → N be a function. The order of f, short O(f), is the set of all functions defined below:

O(f) = { g : N → N | ∀n > n₀. g(n) ≤ c × f(n) for some c ∈ N∖{0} and n₀ ∈ N }

In other words, O(f) are those functions that up to constant factors grow at most as fast as f (or, in other words, not faster) and are thus at least "not worse" a runtime bound than f (maybe even "better"). For g ∈ O(f) we also say g is O(f).

In Def. 13.3 the n₀ specifies the upper bound for "small numbers" one ignores for the purpose of asymptotic complexity considerations, and c allows one to ignore constant differences that in the limit are negligible as well. Recall that lim_{n→∞} c/n = 0 whatever constant c is.

(This should be familiar from the module Program Analysis.)

11

Example 13.1. Consider the following examples of O(·):

1. n is O(n²) as it grows "at most like n²". In this case the constant factor can be one, as obviously n ≤ n² for all n.

2. 3n² is O(n²), as it grows "like n² up to a constant factor." This can be seen by setting n₀ from Def. 13.3 to 0 and c to 3.

3. 34n² + 23n + 12 is O(n²), as it grows "like n² up to a constant factor." This can be seen by setting n₀ from Def. 13.3 to 0 and c to 69. This suffices as we have 34n² + 23n + 12 ≤ 69n² for all n ≥ 1, because 35n² ≥ 23n + 12 for n ≥ 1.

(For the dual notion "Little-o", defined in Def. 13.5 below: n² is not in o(34n² + 23n + 12), as lim_{n→∞} n²/(34n² + 23n + 12) = 1/34 and thus not 0.)

12
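The witnesses n₀ and c of Definition 13.3 can be checked numerically over a finite range (a sketch only: a finite check is evidence, not a proof, since Big-O quantifies over all n > n₀; `witnesses_big_o` is a made-up helper name):

```python
def witnesses_big_o(g, f, c, n0, n_max=10_000):
    """Check g(n) <= c*f(n) for all n0 < n <= n_max: finite evidence
    for g in O(f) with witnesses c and n0."""
    return all(g(n) <= c * f(n) for n in range(n0 + 1, n_max + 1))

# Example 13.1(3): c = 69, n0 = 0 witness that 34n^2 + 23n + 12 is O(n^2)
ok = witnesses_big_o(lambda n: 34*n*n + 23*n + 12, lambda n: n*n, c=69, n0=0)

# but no fixed constant rescues "n^2 is O(n)" on a long enough range
bad = witnesses_big_o(lambda n: n*n, lambda n: n, c=1000, n0=0)
```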
From these examples we see the advantage of "Big-O": when talking about algorithms in general, one can only say things like

"(The runtime of) quick sort is O(n²)."

where n is understood to be the size of the array to be sorted (i.e. the input). But on the other hand, when talking about concrete programs, we are able to say

"The WHILE-program p that 'does' quick sort is in WHILE^time(9n²+37)."

although we might only be interested in the fact that

"The WHILE-program p that 'does' quick sort is in WHILE^time(f) where f(n) ∈ O(n²)."

From these examples we see the advantage of allowing the use of "Big-O", and we would like to make it available to the definition of the complexity class TIME:

Generalising TIME^L(f)

Definition 13.4. For any timed programming language L we define another complexity class using "Big-O" as follows:

TIME^L(O(f)) = ⋃_{g ∈ O(f)} TIME^L(g)

This is the class of problems L-decidable in O(f), that is with a runtime bound asymptotically growing, up to constants, not more than f. This definition relaxes TIME^L(f) with its given worst-case time bound f in the spirit of asymptotic complexity.

Lemma 13.5. If g ∈ O(f) then TIME(g) ⊆ TIME(O(f)).

Proof. Exercise 14.

To express the dual concept of O(·), namely that a function g grows asymptotically much faster than a function f, one uses the "Little-o" notation, defined as follows:

Definition 13.5 (Little-o). Let f and g be functions of type N → N. Then o(g) are those functions that eventually grow much slower than g. Formally, we can define this as follows:

o(g) = { f : N → N | ∀c > 0. ∃n₀. ∀n > n₀. c × f(n) ≤ g(n) }

In other words, f ∈ o(g) iff lim_{n→∞} f(n)/g(n) = 0: however small a fraction of g we consider, f eventually stays below it; g grows very much faster than f.

13
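The limit characterisation of Little-o can be explored numerically by sampling the ratio f(n)/g(n) at growing n (numeric evidence only, not a proof; `ratio_tail` is a made-up helper name):

```python
from math import log

def ratio_tail(f, g, ns=(10, 1000, 10**6)):
    """Sample f(n)/g(n) at growing n: if f is in o(g), the ratios
    should head towards 0."""
    return [f(n) / g(n) for n in ns]

# n is in o(n*log n): the sampled ratios shrink towards 0
shrinking = ratio_tail(lambda n: n, lambda n: n * log(n))

# n^2 is NOT in o(34n^2 + 23n + 12): the ratios approach 1/34, not 0
stuck = ratio_tail(lambda n: n * n, lambda n: 34 * n * n + 23 * n + 12)
```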
Arbitrary Hierarchy Theorem
Theorem 15.3 (Asymptotic Hierarchy Theorem WH1LE). If functions f and g are time constructible, f(n) ≥ n, g(n) ≥ n and g ∈ o(f), then it holds that:

TIME^WH1LE(O(f)) ∖ TIME^WH1LE(O(g)) ≠ ∅

In other words: if f grows asymptotically faster than g (under the given assumptions on f and g), then we can decide more problems with WH1LE programs in time O(f) than in time O(g). The proof is similar to the Linear Time Hierarchy Theorem.

Proof. We proceed analogously to the Linear Time Hierarchy Theorem 15.2, i.e. we need to find a problem A such that:

1. A is decidable by a WH1LE program decA

2. decA ∈ WH1LE^time(b×f(n)), in other words A ∈ TIME^WH1LE(b×f(n)) and thus A ∈ TIME^WH1LE(O(f))

3. if A ∈ TIME^WH1LE(O(g)) with a decision procedure p ∈ WH1LE^time(O(g)), then we get a contradiction if we ask whether p ∈ A using decision procedure p, thus asking whether actually ⟦p⟧^WH1LE(p) = true.

To obtain the appropriate decA we modify diag from Fig. 15.2, changing the assignment to variable Timebound to

Timebound := <f> n;

where <f> is the program that computes the time bound f. This program exists since f is
15
time constructible. Again, we can compile diag into a WH1LE program decA and thus define

A = { d | ⟦decA⟧^WH1LE(d) = true }

Exactly as for the Linear Time Hierarchy Theorem we can prove that decA ∈ WH1LE^time(b×f(n)), for some b, so condition (2) is met. For condition (3), that A ∉ TIME^WH1LE(O(g)), as before, we carry out a proof by contradiction and assume A ∈ TIME^WH1LE(O(g)).

More Theorems
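The diagonal decider decA can be mimicked in Python (a toy sketch under made-up conventions: "programs" are generator functions yielding once per step, a program serves as its own "encoding", and `run_with_budget`, `make_decA`, `size` are invented names, not the book's construction):

```python
def run_with_budget(prog, data, budget):
    """Toy stand-in for the timed universal program: run `prog`
    (a generator function, one yield per step) on `data` for at most
    `budget` steps; return [result] on a timely halt, else None."""
    g = prog(data)
    try:
        for _ in range(budget + 1):
            next(g)
    except StopIteration as halt:
        return [halt.value]
    return None

def make_decA(f, size):
    """Diagonal decider sketch: accept p exactly when the timed run of p
    on (an encoding of) itself, within f(size(p)) steps, does NOT return
    true -- so decA disagrees with every fast decider p on input p."""
    def decA(p):
        return run_with_budget(p, p, f(size(p))) != [True]
    return decA

def always_true(d):
    # a fast toy decider that accepts every input, including itself
    yield
    return True

decA = make_decA(lambda n: 10, lambda p: 1)  # every "encoding" has size 1 here
verdict = decA(always_true)
# always_true accepts itself, but decA rejects always_true:
# decA differs from always_true on the input always_true
```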
• Similar Hierarchy Theorems hold for SRAM and TM.
• Other fascinating results exist (the Gap Theorem, Blum's Speedup Theorem, Levin's Optimality Theorem). There is not enough time to cover them here, but they are recommended to the interested reader.
16

Runtime Complexity Hierarchy – for classes of WH1LE-programs
(cf. the "Runtime Complexity" Hierarchy for Animals)

Time(d*2^n)        exponentially slow …
Time(d*n^k)
Time(d*n^3)        cubic
Time(c*n^2)        quadratic
Time(c*n*log n)
Time(a*249*n)      larger constant
Time(a*n)          linear: fast
Time(a)            constant: super fast
17
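The ordering pictured above can be sanity-checked numerically (the constants a, c, d from the slide are set to 1 for this illustration; already for a modest input size the bounds are strictly ordered):

```python
from math import log2

# time bounds from the hierarchy slide, smallest class first
bounds = [
    ("constant",    lambda n: 1.0),
    ("linear",      lambda n: float(n)),
    ("n*log n",     lambda n: n * log2(n)),
    ("quadratic",   lambda n: float(n ** 2)),
    ("cubic",       lambda n: float(n ** 3)),
    ("exponential", lambda n: 2.0 ** n),
]

# already at n = 30 the listed bounds are strictly increasing
values = [f(30) for _, f in bounds]
```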
END
© 2008-21. Bernhard Reus, University of Sussex
Next time: important problems and their complexity classes
18