
Why Study the Theory of Computation?
Implementations come and go.
Chapter 1

IBM 7090 Programming in the 1950’s
ENTRY      SXA     4,RETURN
           LDQ     X
           FMP     A
           FAD     B
           XCA
           FMP     X
           FAD     C
           STO     RESULT
RETURN     TRA     0
A          BSS     1
B          BSS     1
C          BSS     1
X          BSS     1
TEMP       BSS     1
STORE      BSS     1
           END

Programming in the 1970’s
(IBM 360)
//MYJOB    JOB (COMPRESS),'VOLKER BANDKE',CLASS=P,COND=(0,NE)
//BACKUP  EXEC PGM=IEBCOPY
//SYSPRINT DD  SYSOUT=*
//SYSUT1   DD  DISP=SHR,DSN=MY.IMPORTNT.PDS
//SYSUT2   DD  DISP=(,CATLG),DSN=MY.IMPORTNT.PDS.BACKUP,
//         UNIT=3350,VOL=SER=DISK01,
//         DCB=MY.IMPORTNT.PDS,SPACE=(CYL,(10,10,20))
//COMPRESS EXEC PGM=IEBCOPY
//SYSPRINT DD  SYSOUT=*
//MYPDS    DD  DISP=OLD,DSN=*.BACKUP.SYSUT1
//SYSIN    DD  *
COPY INDD=MYPDS,OUTDD=MYPDS
//DELETE2 EXEC PGM=IEFBR14
//BACKPDS  DD  DISP=(OLD,DELETE,DELETE),DSN=MY.IMPORTNT.PDS.BACKUP

Guruhood

Returns 1 if the largest element in a three-element vector is greater than the sum of the other two. Otherwise, returns 0.

Applications of the Theory
FSMs for parity checkers, vending machines, communication protocols, and building security devices.
Interactive games as nondeterministic FSMs.
Programming languages, compilers, and context-free grammars.
Natural languages are mostly context-free. Speech understanding systems use probabilistic FSMs.
Computational biology: DNA and proteins are strings.
The undecidability of a simple security model.
Artificial intelligence: the undecidability of first-order logic.

Limitations of Mathematics

This sentence is false.

Limitations of Computing

Is my program correct?

Languages and Strings
Chapter 2

Let’s Look at Some Problems
int alpha, beta;
alpha = 3;
beta = (2 + 5) / 10;

(1) Lexical analysis: Scan the program and break it up into variable names, numbers, etc.
(2) Parsing: Create a tree that corresponds to the sequence of operations that should be executed, e.g.,

        /
       / \
      +   10
     / \
    2   5

(3) Optimization: Realize that we can skip the first assignment since the value is never used and that we can precompute the arithmetic expression, since it contains only constants.
(4) Termination: Decide whether the program is guaranteed to halt.
(5) Interpretation: Figure out what (if anything) useful it does.
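The first and third tasks on this slide (lexical analysis and precomputing a constant expression) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the token classes, the names tokenize and fold, and the tuple representation of the parse tree are all hypothetical choices, not part of the slides.

```python
import re

# Assumed token classes: numbers, names, and single-character punctuation.
TOKEN = re.compile(
    r"\s*(?:(?P<number>[0-9]+)|(?P<name>[A-Za-z_][A-Za-z_0-9]*)|(?P<punct>[=+/(),;]))"
)

def tokenize(source):
    """Lexical analysis: split source text into (kind, text) pairs."""
    source = source.rstrip()
    tokens, pos = [], 0
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        kind = match.lastgroup
        tokens.append((kind, match.group(kind)))
        pos = match.end()
    return tokens

def fold(tree):
    """Optimization: evaluate a constant tree of ints and (op, left, right) tuples."""
    if isinstance(tree, int):
        return tree
    op, left, right = tree
    left, right = fold(left), fold(right)
    if op == "+":
        return left + right
    if op == "/":
        return left // right  # agrees with C integer division for nonnegative operands
    raise ValueError(f"unsupported operator {op!r}")

print(tokenize("beta = (2 + 5) / 10;"))
print(fold(("/", ("+", 2, 5), 10)))
```

Tasks (4) and (5) are deliberately absent: no comparably short procedure is offered for them here.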

A Framework for Analyzing Problems

We need a single framework in which we can analyze a very diverse set of problems.

The framework we will use is

Language Recognition

A language is a (possibly infinite) set of finite length strings over a finite alphabet.

Strings
A string is a finite sequence, possibly empty, of symbols drawn from some alphabet Σ.
• ε is the empty string.
• Σ* is the set of all possible strings over an alphabet Σ.

Alphabet name          Alphabet symbols             Example strings
The English alphabet   {a, b, c, …, z}              ε, aabbcg, aaaaa
The binary alphabet    {0, 1}                       ε, 0, 001100
A star alphabet        {six star symbols}           ε and strings of star symbols
A music alphabet       {♩, ♪, ♫, ♬, ♭, ♮, ♯, |}     ε, ♪, ♪|♬♬♬

Functions on Strings
Length: |s| is the number of symbols (characters, letters) in s.

|ε| = 0
|1001101| = 7

#c(s) is the number of times that c occurs in s.

#a(abbaaa) = 4.
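As a minimal sketch, the two functions above map directly onto Python strings. The names length and count are illustrative choices, and the empty Python string "" stands in for the empty string.

```python
def length(s):
    """|s|: the number of symbols in s."""
    return len(s)

def count(c, s):
    """#c(s): the number of times the symbol c occurs in s."""
    return sum(1 for symbol in s if symbol == c)

print(length(""))
print(length("1001101"))
print(count("a", "abbaaa"))
```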

More Functions on Strings

Concatenation: st is the concatenation of s and t.

If x = good and y = bye, then xy = goodbye.

Note that |xy| = |x| + |y|.

ε is the identity for concatenation of strings. So:

∀x (x ε = ε x = x).

Concatenation is associative. So:

∀s, t, w ((st)w = s(tw)).

More Functions on Strings
Repetition (or power): For each string w and each natural number i, the string w^i is:

w^0 = ε
w^(i+1) = w^i w

Examples:

a^3 = aaa
(bye)^2 = byebye
a^0 b^3 = bbb
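The recursive definition of repetition can be transcribed almost literally. The sketch below assumes Python strings as the representation; the function name power is a hypothetical choice.

```python
def power(w, i):
    """w^i, following the recursive definition: w^0 = empty string, w^(i+1) = w^i w."""
    if i < 0:
        raise ValueError("i must be a natural number")
    if i == 0:
        return ""
    return power(w, i - 1) + w

print(power("a", 3))
print(power("bye", 2))
print(power("a", 0) + power("b", 3))
```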

More Functions on Strings
Reverse: For each string w, w^R is defined as:

if |w| = 0 then w^R = w = ε

if |w| ≥ 1 then:
∃a ∈ Σ (∃u ∈ Σ* (w = ua)).
So define w^R = a u^R.

Concatenation and Reverse of Strings
Theorem: If w and x are strings, then (w x)^R = x^R w^R.

Example:

(nametag)^R = (tag)^R (name)^R = gateman

Concatenation and Reverse of Strings
Proof: By induction on |x|:

|x| = 0: Then x = ε, and (w x)^R = (w ε)^R = (w)^R = ε w^R = ε^R w^R = x^R w^R.

∀n ≥ 0 (((|x| = n) → ((w x)^R = x^R w^R)) →
((|x| = n + 1) → ((w x)^R = x^R w^R))):

Consider any string x, where |x| = n + 1. Then x = u a for some character a and |u| = n. So:

(w x)^R = (w (u a))^R      rewrite x as ua
        = ((w u) a)^R      associativity of concatenation
        = a (w u)^R        definition of reversal
        = a (u^R w^R)      induction hypothesis
        = (a u^R) w^R      associativity of concatenation
        = (ua)^R w^R       definition of reversal
        = x^R w^R          rewrite ua as x
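The recursive definition of reversal, and the theorem just proved, can be spot-checked by brute force. The sketch below is a sanity check on short strings over an assumed alphabet {a, b}, not a substitute for the induction; the helper names are hypothetical.

```python
from itertools import product

def reverse(w):
    """w^R by the recursive definition: if w = ua then w^R = a u^R."""
    if len(w) == 0:
        return w
    u, a = w[:-1], w[-1]
    return a + reverse(u)

def all_strings(alphabet, max_len):
    """Every string over the alphabet of length at most max_len."""
    for n in range(max_len + 1):
        for symbols in product(alphabet, repeat=n):
            yield "".join(symbols)

def theorem_holds(alphabet="ab", max_len=3):
    """Check (w x)^R = x^R w^R for all pairs of short strings."""
    strings = list(all_strings(alphabet, max_len))
    return all(reverse(w + x) == reverse(x) + reverse(w)
               for w in strings for x in strings)

print(reverse("nametag"))
print(theorem_holds())
```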

Relations on Strings
aaa is a substring of aaabbbaaa

aaaaaa is not a substring of aaabbbaaa

aaa is a proper substring of aaabbbaaa

Every string is a substring of itself.

ε is a substring of every string.

The Prefix Relations

s is a prefix of t iff: ∃x ∈ Σ* (t = sx).

s is a proper prefix of t iff: s is a prefix of t and s ≠ t.

Examples:

The prefixes of abba are: ε, a, ab, abb, abba.
The proper prefixes of abba are: ε, a, ab, abb.

Every string is a prefix of itself.

ε is a prefix of every string.

The Suffix Relations

s is a suffix of t iff: ∃x ∈ Σ* (t = xs).

s is a proper suffix of t iff: s is a suffix of t and s ≠ t.

Examples:

The suffixes of abba are: ε, a, ba, bba, abba.
The proper suffixes of abba are: ε, a, ba, bba.

Every string is a suffix of itself.

ε is a suffix of every string.
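The prefix and suffix relations can be made concrete by listing every prefix and suffix of a string. This is a small sketch using Python slicing; the function names are illustrative, and "" plays the role of the empty string.

```python
def prefixes(t):
    """All s such that t = s x for some string x, shortest first."""
    return [t[:i] for i in range(len(t) + 1)]

def proper_prefixes(t):
    """Prefixes of t other than t itself."""
    return [s for s in prefixes(t) if s != t]

def suffixes(t):
    """All s such that t = x s for some string x, shortest first."""
    return [t[i:] for i in range(len(t), -1, -1)]

def proper_suffixes(t):
    """Suffixes of t other than t itself."""
    return [s for s in suffixes(t) if s != t]

print(prefixes("abba"))
print(proper_suffixes("abba"))
```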

Defining a Language
A language is a (finite or infinite) set of strings over a finite alphabet Σ.

Examples: Let Σ = {a, b}

Some languages over Σ:
∅,
{ε},
{a, b},
{ε, a, aa, aaa, aaaa, aaaaa}

The language Σ* contains an infinite number of strings, including: ε, a, b, ab, ababaa.

Example Language Definitions
L = {x ∈ {a, b}* : all a’s precede all b’s}

ε, a, aa, aabbb, and bb are in L.

aba, ba, and abc are not in L.

What about: ε, a, aa, and bb?
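One way to settle the boundary cases is to write the membership test down. The sketch below reads "all a's precede all b's" as "no b is ever followed by an a", which is vacuously true for strings with no a's or no b's; that reading, and the name in_L, are assumptions of this example.

```python
def in_L(x):
    """True iff x is over {a, b} and no b is followed by an a."""
    if any(symbol not in "ab" for symbol in x):
        return False
    # In a string over {a, b}, a b followed later by an a forces an adjacent "ba".
    return "ba" not in x

for candidate in ["", "a", "aa", "aabbb", "bb", "aba", "ba", "abc"]:
    print(repr(candidate), in_L(candidate))
```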

Example Language Definitions
L = {x : ∃y ∈ {a, b}* : x = ya}

Simple English description:

The Perils of Using English
L = {x#y: x, y ∈ {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}* and, when x and y are viewed as the decimal representations of natural numbers, square(x) = y}.

Examples:

3#9, 12#144

3#8, 12, 12#12#12

#

More Example Language Definitions
L = {} = ∅

L = {ε}

English
L = {w: w is a sentence in English}.

Examples:

Kerry hit the ball.

Colorless green ideas sleep furiously.

The window needs fixed.

Ball the Stacy hit blue.

A Halting Problem Language
L = {w: w is a C program that halts on all inputs}.

Well specified.
Can we decide what strings it contains?

Prefixes

What are the following languages:

L = {w ∈ {a, b}*: no prefix of w contains b}

L = {w ∈ {a, b}*: no prefix of w starts with a}

L = {w ∈ {a, b}*: every prefix of w starts with a}

Using Repetition in a Language Definition
L = {a^n : n ≥ 0}

Languages Are Sets
Computational definition:

• Generator (enumerator)

• Recognizer

Enumeration
Enumeration:

• Arbitrary order

• More useful: lexicographic order
• Shortest first
• Within a length, dictionary order

The lexicographic enumeration of:

• {w ∈ {a, b}* : |w| is even} :
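A lexicographic enumerator (shortest first, dictionary order within each length) can be sketched as a Python generator. The function names are hypothetical, and the language of even-length strings is obtained here simply by filtering the enumeration of all strings.

```python
from itertools import count, islice, product

def lexicographic(alphabet):
    """Yield every string over the alphabet: shortest first, then dictionary order."""
    symbols = sorted(alphabet)
    for n in count(0):
        for combo in product(symbols, repeat=n):
            yield "".join(combo)

def even_length(alphabet):
    """Lexicographic enumeration of the strings whose length is even."""
    return (w for w in lexicographic(alphabet) if len(w) % 2 == 0)

print(list(islice(even_length("ab"), 7)))
```

The generator never terminates on its own, since the language is infinite, so islice is used to look at a finite initial segment.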

How Large is a Language?

The smallest language over any Σ is ∅, with cardinality 0.

The largest is Σ*. How big is it?

How Large is a Language?
Theorem: If Σ ≠ ∅ then Σ* is countably infinite.

Proof: The elements of Σ* can be lexicographically enumerated by the following procedure:
• Enumerate all strings of length 0, then length 1, then length 2, and so forth.
• Within the strings of a given length, enumerate them in dictionary order.

This enumeration is infinite since there is no longest string in Σ*. Since there exists an infinite enumeration of Σ*, it is countably infinite.

How Large is a Language?
So the smallest language has cardinality 0.

The largest is countably infinite.

So every language is either finite or countably infinite.

How Many Languages Are There?
Theorem: If Σ ≠ ∅ then the set of languages over Σ is uncountably infinite.

Proof: The set of languages defined on Σ is P(Σ*). Σ* is countably infinite. If S is a countably infinite set, P(S) is uncountably infinite. So P(Σ*) is uncountably infinite.

Diagonalization
Integers – countable
Rational numbers – countable
Irrational numbers – uncountable
Proof idea:
Assume they are countable: n1, n2, n3, …
Construct N as follows:
First decimal of N ≠ first decimal of n1
Second decimal of N ≠ second decimal of n2
and so on
N ≠ ni for any i ≥ 1

Functions on Languages
• Set operations
• Union
• Intersection
• Complement

• Language operations
• Concatenation
• Kleene star

Concatenation of Languages
If L1 and L2 are languages over Σ:

L1L2 = {w ∈ Σ* : ∃s ∈ L1 (∃t ∈ L2 (w = st))}

Examples:

L1 = {cat, dog}
L2 = {apple, pear}
L1 L2 = {catapple, catpear, dogapple, dogpear}

L1 = a* L2 = b*
L1 L2 =

Concatenation of Languages
{ε} is the identity for concatenation:

L{ε} = {ε}L = L

∅ is a zero for concatenation:

L ∅ = ∅ L = ∅
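For finite languages, concatenation and its identity and zero can be checked directly with Python sets. This is a minimal sketch; the name concat is an illustrative choice and "" stands for the empty string.

```python
def concat(L1, L2):
    """L1 L2: every string s t with s in L1 and t in L2."""
    return {s + t for s in L1 for t in L2}

L1 = {"cat", "dog"}
L2 = {"apple", "pear"}

print(sorted(concat(L1, L2)))
print(concat(L1, {""}) == L1)      # the language containing only the empty string is the identity
print(concat(L1, set()) == set())  # the empty language is a zero
```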

Concatenating Languages Defined Using Variables
The scope of any variable used in an expression that invokes replication will be taken to be the entire expression.

L1 = {a^n : n ≥ 0}
L2 = {b^n : n ≥ 0}

L1 L2 = {a^n b^m : n, m ≥ 0}

L1 L2 ≠ {a^n b^n : n ≥ 0}

Kleene Star
L* = {ε} ∪
{w ∈ Σ* : ∃k ≥ 1
(∃w1, w2, … wk ∈ L (w = w1 w2 … wk))}

Example:
L = {dog, cat, fish}
L* = {ε, dog, cat, fish, dogdog,
dogcat, fishcatfish,
fishdogdogfishcat, …}
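L* is infinite whenever L contains a nonempty string, so a program can only build a finite approximation of it. The sketch below bounds the number of concatenated pieces k; the function name and the max_pieces parameter are assumptions of this example.

```python
def kleene_star(L, max_pieces):
    """Strings of L* that can be formed from at most max_pieces elements of L."""
    result = {""}    # the empty string is always in L*
    frontier = {""}  # strings built from exactly k pieces, starting with k = 0
    for _ in range(max_pieces):
        frontier = {w + s for w in frontier for s in L}
        result |= frontier
    return result

L = {"dog", "cat", "fish"}
approximation = kleene_star(L, 5)
print("fishdogdogfishcat" in approximation)
print("dogfis" in approximation)
```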

The + Operator
L+ = L L*

L+ = L* – {ε} iff ε ∉ L

L+ is the closure of L under concatenation.

Concatenation and Reverse of Languages
Theorem: (L1 L2)^R = L2^R L1^R.

Proof:
∀x (∀y ((xy)^R = y^R x^R))                      (Theorem 2.1)
(L1 L2)^R = {(xy)^R : x ∈ L1, y ∈ L2}           (Def. of concat. and reverse)
          = {y^R x^R : x ∈ L1, y ∈ L2}          (Theorem 2.1)
          = L2^R L1^R                           (Def. of concat. and reverse)

What About Meaning?
AnBn = {a^n b^n : n ≥ 0}.

Do these strings mean anything?

Syntax = form
Semantics = meaning

Semantic Interpretation Functions
For “natural” languages:
English
DNA

For formal languages:

• Programming languages

• Network protocol languages

• Database query languages

• HTML

• BNF

The Big Picture
Chapter 3

Decision Problems

A decision problem is simply a problem for which the answer is yes or no (True or False). A decision procedure answers a decision problem.

Examples:

• Given an integer n, does n have a pair of consecutive integers as factors?

• The language recognition problem (our focus): Given a language L and a string w, is w in L?

The Power of Encoding
Everything is a string.

Problems that don’t look like decision problems can be recast into new problems that do look like that.

The Power of Encoding
Pattern matching:

• Problem: Given a search string w and a web
document d, do they match? In other words,
should a search engine, on input w, consider
returning d?

• The language to be decided: {<w, d> : d is a candidate match for the query w}

The Power of Encoding
Does a program always halt?

• Problem: Given a program p, written in some standard programming language, is p guaranteed to halt on all inputs?

• The language to be decided:

HPALL = {p : p halts on all inputs}

What If We’re Not Working with Strings?
Anything can be encoded as a string.

<X> is the string encoding of X.
<X, Y> is the string encoding of the pair X, Y.

Primality Testing

• Problem: Given a nonnegative integer n, is it
prime?

• An instance of the problem: Is 9 prime?

• To encode the problem we need a way to encode
each instance: We encode each nonnegative
integer as a binary string.

• The language to be decided:

PRIMES = {w : w is the binary encoding of
a prime number}.
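A decision procedure for PRIMES can be sketched with trial division. Two choices below are assumptions of this example rather than part of the slide: leading zeros in the binary encoding are tolerated, and the empty string is rejected because it encodes no integer.

```python
def in_primes(w):
    """Decide whether w is the binary encoding of a prime number."""
    if w == "" or any(symbol not in "01" for symbol in w):
        return False
    n = int(w, 2)
    if n < 2:
        return False
    divisor = 2
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 1
    return True

print(in_primes("1001"))  # the instance from the slide: is 9 prime?
print(in_primes("1011"))
```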

The Power of Encoding

• Problem: Given an undirected graph G, is it connected?

• Instance of the problem:

[Figure: an undirected graph on the vertices 1, 2, 3, 4, 5]

• Encoding of the problem: Let V be a set of binary numbers, one for each vertex in G. Then we construct <G> as follows:
   • Write |V| as a binary number,
   • Write a list of edges,
   • Separate all such binary numbers by “/”.

101/1/10/10/11/1/100/10/101

• The language to be decided: CONNECTED = {w ∈ {0, 1, /}* : w = n1/n2/…ni, where each ni is a binary string and w encodes a connected graph, as described above}.
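The graph encoding can be decoded and checked for connectivity in a short routine. This sketch makes several assumptions that the slide does not state: vertices are numbered 1 through |V|, each edge is written as two consecutive vertex numbers, and an encoding with zero vertices is rejected.

```python
def in_connected(w):
    """Decide whether w encodes a connected undirected graph."""
    parts = w.split("/")
    if any(p == "" or set(p) - {"0", "1"} for p in parts):
        return False  # every field must be a nonempty binary string
    numbers = [int(p, 2) for p in parts]
    n, endpoints = numbers[0], numbers[1:]
    if n < 1 or len(endpoints) % 2 != 0:
        return False
    if any(v < 1 or v > n for v in endpoints):
        return False
    neighbors = {v: set() for v in range(1, n + 1)}
    for u, v in zip(endpoints[0::2], endpoints[1::2]):
        neighbors[u].add(v)
        neighbors[v].add(u)
    # Depth-first search from vertex 1; the graph is connected iff all vertices are reached.
    seen, stack = {1}, [1]
    while stack:
        for v in neighbors[stack.pop()]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == n

print(in_connected("101/1/10/10/11/1/100/10/101"))
```

Under these assumptions the example string decodes to five vertices and the edges (1, 2), (2, 3), (1, 4), (2, 5).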

The Power of Encoding

• Protein sequence alignment:

• Problem: Given a protein fragment f and a complete protein molecule p, could f be a fragment from p?

• Encoding of the problem: Represent each protein molecule or fragment as a sequence of amino acid residues. Assign a letter to each of the 20 possible amino acids. So a protein fragment might be represented as AGHTYWDNR.

• The language to be decided: {<f, p> : f could be a fragment from p}.

Turning Problems Into Decision Problems

Casting multiplication as decision:

• Problem: Given two nonnegative integers, compute their product.

• Encoding of the problem: Transform computing into verification.

• The language to be decided:

L = {w of the form:
<integer1>×<integer2>=<integer3>, where:
each <integer_n> is any well-formed integer, and
integer3 = integer1 × integer2}

12×9=108 ∈ L
12=12 ∉ L
12×8=108 ∉ L
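A recognizer for the multiplication language can be sketched with a regular expression for the form and an arithmetic check for the content. The sketch assumes that a well-formed integer is a nonempty string of decimal digits and that the multiplication sign is the character U+00D7; the names are hypothetical.

```python
import re

TIMES = "\u00d7"  # the multiplication sign used in the example strings
FORM = re.compile(rf"([0-9]+){TIMES}([0-9]+)=([0-9]+)")

def in_mult(w):
    """Decide whether w has the form a×b=c with c equal to the product of a and b."""
    match = FORM.fullmatch(w)
    if match is None:
        return False
    a, b, c = (int(group) for group in match.groups())
    return a * b == c

for w in [f"12{TIMES}9=108", "12=12", f"12{TIMES}8=108"]:
    print(w, in_mult(w))
```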

Turning Problems Into Decision Problems

Casting sorting as decision:

• Problem: Given a list of integers, sort it.

• Encoding of the problem: Transform the sorting problem into one of examining a pair of lists.

• The language to be decided:

L = {w1 # w2: ∃n ≥ 1
(w1 is of the form <int1, int2, …, intn>,
w2 is of the form <int1, int2, …, intn>, and
w2 contains the same objects as w1 and
w2 is sorted)}

Examples:
1,5,3,9,6#1,3,5,6,9 ∈ L
1,5,3,9,6#1,2,3,4,5,6,7 ∉ L
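The sorting language can be decided by parsing both lists and comparing the second with a sorted copy of the first. The sketch assumes nonnegative decimal integers separated by commas with no surrounding spaces; the function names are illustrative.

```python
def parse_list(text):
    """Return the list of integers encoded by text, or None if it is malformed."""
    items = text.split(",")
    if not all(item.isascii() and item.isdigit() for item in items):
        return None
    return [int(item) for item in items]

def in_sorted_pair(w):
    """Decide whether w = w1#w2 where w2 is w1 rearranged into sorted order."""
    halves = w.split("#")
    if len(halves) != 2:
        return False
    w1, w2 = parse_list(halves[0]), parse_list(halves[1])
    if w1 is None or w2 is None:
        return False
    # "Same objects and sorted" holds exactly when w2 equals the sorted copy of w1.
    return w2 == sorted(w1)

print(in_sorted_pair("1,5,3,9,6#1,3,5,6,9"))
print(in_sorted_pair("1,5,3,9,6#1,2,3,4,5,6,7"))
```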

The Traditional Problems and their Language Formulations are Equivalent

By equivalent we mean that either problem can be reduced to the other.

If we have a machine to solve one, we can use it to build a machine to solve the other, using just the starting machine and other functions that can be built using a machine of equal or lesser power.

An Example

Consider the multiplication example:
L = {w of the form:
<integer1>×<integer2>=<integer3>, where:
each <integer_n> is any well-formed integer, and
integer3 = integer1 × integer2}

Given a multiplication machine, we can build the language recognition machine:

Given the language recognition machine, we can build a multiplication machine:
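Both directions of the equivalence for the multiplication example can be sketched as functions that wrap one "machine" to obtain the other. This is one possible construction, not necessarily the one drawn on the slide: the recognizer calls the multiplier once, and the multiplier tries candidate products 0, 1, 2, … until the recognizer accepts. All names are hypothetical, and integers are assumed to be nonnegative decimals.

```python
TIMES = "\u00d7"  # the multiplication sign used in the example strings

def make_recognizer(multiply):
    """Build a recognizer for the language from a multiplication machine."""
    def recognize(w):
        left, equals, c = w.partition("=")
        a, times, b = left.partition(TIMES)
        fields = (a, b, c)
        if not equals or not times:
            return False
        if not all(f.isascii() and f.isdigit() for f in fields):
            return False
        return multiply(int(a), int(b)) == int(c)
    return recognize

def make_multiplier(recognize):
    """Build a multiplication machine from a recognizer for the language."""
    def multiply(a, b):
        candidate = 0
        # Terminates because the true product is some natural number.
        while not recognize(f"{a}{TIMES}{b}={candidate}"):
            candidate += 1
        return candidate
    return multiply

recognize = make_recognizer(lambda a, b: a * b)
multiply = make_multiplier(recognize)
print(recognize(f"12{TIMES}9=108"))
print(multiply(12, 9))
```

The search in make_multiplier is slow, but the point of the construction is only that it uses nothing more powerful than the recognizer it starts with.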

Languages and Machines

Finite State Machines
(Finite Automata)
An FSM to accept a*b*:
An FSM to accept AnBn = {a^n b^n : n ≥ 0}
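A deterministic FSM for a*b* can be simulated with a transition table. The state names q0 and q1 are assumptions of this sketch, since the slide's diagram is not reproduced here; missing table entries stand for a dead state.

```python
# q0: only a's seen so far; q1: at least one b seen. Both are accepting.
DELTA = {
    ("q0", "a"): "q0",
    ("q0", "b"): "q1",
    ("q1", "b"): "q1",
}
ACCEPTING = {"q0", "q1"}

def accepts(w):
    """Run the FSM on w and report whether it ends in an accepting state."""
    state = "q0"
    for symbol in w:
        state = DELTA.get((state, symbol))
        if state is None:  # no transition: the machine rejects
            return False
    return state in ACCEPTING

for w in ["", "aabbb", "bb", "aba", "abc"]:
    print(repr(w), accepts(w))
```

The machine has a fixed, finite set of states, so it has no way to count how many a's it has read.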

Pushdown Automata
A PDA to accept AnBn = {a^n b^n : n ≥ 0}
Example: aaabb

Stack:
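The stack discipline of such a PDA can be imitated with a Python list: push one symbol per a, pop one per b, and accept only if the input is exhausted with an empty stack. This is a deterministic sketch of the idea, with hypothetical names, rather than a formal PDA definition.

```python
def accepts_anbn(w):
    """Accept exactly the strings consisting of n a's followed by n b's."""
    stack = []
    seen_b = False
    for symbol in w:
        if symbol == "a" and not seen_b:
            stack.append("a")  # remember one a
        elif symbol == "b" and stack:
            seen_b = True
            stack.pop()        # match this b against a remembered a
        else:
            return False       # an a after a b, an unmatched b, or a foreign symbol
    return not stack           # every a must have been matched

print(accepts_anbn("aaabb"))  # one a is left on the stack
print(accepts_anbn("aabb"))
```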

Another Example

Bal, the language of balanced parentheses

Trying Another PDA
A PDA to accept strings of the form:

AnBnCn = {a^n b^n c^n : n ≥ 0}

Turing Machines
A Turing Machine to accept AnBnCn:

Turing Machines
A Turing machine to accept the language:

{p: p is a Java program that halts on input 0}

Languages and Machines

Rule of Least Power: “Use the least powerful language suitable for the given problem.”

Grammars, Languages, and Machines

Grammar --generates--> Language <--accepts-- Machine

Three Computational Issues
• Decision procedures

• Nondeterminism

• Functions on languages