1
• •
•
Big picture
2 DFAs
3
• •
• •
• •
Finite set of states: one start, some accept
DFA reads a string one character at a time, changing states by following the transition function
After reading the last char in the string, if the DFA is in an accept state it accepts and otherwise it rejects
The language of the DFA M is written L(M), and is the set of strings accepted by the DFA
A language (=set of strings) is regular if it is the language of some DFA Regular languages have lots of closure properties: if L1,L2 are regular,
then so is L1 ∩L2,L1 ∪L2,LC1 ,L∗1,… Non-regular languages
• Some languages cannot be accepted by DFAs
• The classic example: {0n1n | n ∈ N}
• but generally anything that requires “counting”
• Looks can be deceiving: {0n0n | n ∈ N} is regular!
1
Review
Peter Dixon April 30, 2021
Formal models of computation: finite-state automata, Turing machines
Limits of those models: non-regular and undecidable languages, uncom- putable functions
Tools for proving things: simulating one model in another, reductions between languages, Kolmogorov complexity bounds
• There are several techniques for proving languages aren’t regular: pump- ing lemma, myhill-nerode, and ordinal extensions
3.1 Myhill-Nerode
• • •
• •
• •
3.2
•
• • •
4
• • • • • •
An equivalence class is a set of things that are “equal” in some sense A DFA has equivalence classes
The eq. classes of a DFA are the set of strings that stop in the same state. Number of classes = number of states
A language has equivalence classes
The equivalence classes of a language are strings where any extension of the string is “the same”
i.e. x ≡ y if, no matter what z is, xz and yz are both in L or both not in L
If a language has infinite equivalence classes, it can’t have a DFA
Ordinal extensions
Define the first extension of a string x w.r.t a language L: the first string z such that xz ∈ L
The set of first extensions: take all x in Σ∗, figure out their first extension
If the set of first extensions is infinite, the language can’t be regular
Similar idea to Myhill-Nerode, but here you start with a single string, and look for the first thing that gets you into the language. In Myhill-Nerode, you start with a pair of strings, and look for any thing that puts one into the language and the other one not
Turing machines
WHAT IF dfa BUT had memory
Turing machines have an infinite tape that they can read and write on
Initially the tape contains the input string
Turing machines don’t halt until they decide to
TMs could accept, reject, or loop forever
Can also think about TMs as computing functions: whatever is on the tape when the TM halts is the output
2
4.1 Deciders vs Recognizers
• The language L(M) of a TM M is the set of strings accepted by M
• M is a recognizer (or accepter) for L(M)
• A language is recognizable/c.e./r.e. if it’s L(M) for some M
• If M also rejects every string not in L (which is equivalent to saying M halts on all strings), then M is a decider and L(M) is decidable
5 Undecidability
• Some languages cannot be decided by TMs
• Classic example: Halting problem {⟨M, x⟩ | M halts on input x}
• Also classic example: the complement of the halting problem {⟨M, x⟩ | M loops forever on input x}
• If L is undecidable, then LC is undecidable 5.1 Hardness
• Two important classes of undecidable problems: CE-hard and coCE-hard
• CE-hard means “if you could decide this, you could decide every CE lan-
guage”. Similar for coCE-hard
• The halting problem is the classic CE-hard language
• If L is CE-hard, then LC is coCE-hard
• L is CE-complete if L is CE-hard and also CE (example: halting prob- lem). Similarly for coCE-complete. If L is CE-complete then LC is coCE- complete.
• LisCEandcoCE⇔Lisdecidable
• L can’t be both CE-complete and coCE-complete (if any L was, then you
could decide the halting problem)
• There are problems that are CE-hard and coCE-hard, like the join of HP and HPcomplement (you proved this in hw)
3
5.2 Computability
• M computes a function f if on any input x, M halts with f(x) on the tape
• Simple examples: a TM that writes 0 over every symbol then halts com- putes the function f(x) = 0|x|
• A TM that halts without changing the tape computes the function f(x) = x
• A TM that erases the tape then halts computes the function f(x) = λ
• Some functions cannot be computed by TMs
• Classic example: f(⟨M,x⟩) =the number of steps that M(x) halts in (0 if it loops forever)
5.3 Reduction
• • •
• • •
• •
6
• •
Reduction is a way of changing which problem you’re looking at by turning it into a different problem
Languages A,B: A ≤M B means “A reduces to B” means there’s a computable function f s.t. x ∈ A ⇔ f(x) ∈ B
If you could solve B you could solve A: get x as input, compute f(x), run the solver for B on f(x)
Ifitaccepts,thenf(x)∈B,sox∈A
Ifnot,thenf(x)̸∈B,sox̸∈A
If A is CE-hard then B is CE-hard: if you could solve B, then you could solve A, and if you could solve A, then you could solve every CE-hard problem. So if you could solve B, you could solve every CE-hard problem
If B is CE then A is CE: if you could solve B, you could solve A; you actually can solve B so you actually can solve A
If B is CE-complete and A is coCE-complete, then you can’t reduce A to B or B to A
Kolmogorov complexity
You can make a Turing machine that starts on a blank tape and eventually stops and outputs something
You can do this to output any string x (by just making a bunch of tran- sitions for each character)
4
• You can make smaller TMs for some strings: you can print a lot of 0s with a loop (you have to be careful about breaking the loop somehow, of course)
• So there must be a “smallest” TM that prints x. C(x) is the size of the smallest TM
• We’ll measure size by length of binary representation, which will depend on exactly what binary representation you use – the optimality theorem says different representations only differ by a constant, though
• If you can make a machine that gets y as input and outputs x, then you can make a machine that gets no input and outputs x: run something that outputs y, then run the machine that turns y into x
• If f(y) = x, then C(x) ≤ C(y)+the size of the TM that computes f is
• Not so interesting for a single x, y, but pretty interesting if you look at a whole set of x and y – basically, any computable function can’t “change” the complexity very much
• C(x) is uncomputable
7 Nondeterminism
• Some additional power you can give to DFAs and TMs
• Can try multiple paths on a particular string (e.g. a DFA can have two
0-transitions from the same state), and accepts if any path accepts
• Doesn’t actually add any power to DFAs (NFAs can still only accept regular languages), but it is pretty convenient
• Doesn’t actually add any power to TMs (NTMs can still only decide decid- able languages and recognize c.e. languages), but it is pretty convenient
• (Not important for this course but) DOES add power to CFGs
• (Not important for this course but) NO ONE KNOWS if it can make TMs
run “significantly” faster (that’s P=NP)
7.1 Equivalence to determinism
• DFA = NFA: subset construction
• Given a NFA with states Q, make a DFA with 2|Q| states where each DFA
state corresponds to a combination of NFA states
• If NFA could choose to go to p or q, DFA goes to the combination p, q
• TM = NTM: same thing, actually
5
8
• • • •
Regular expressions
Another sort of computation: a pattern that matches a set of strings, and operations that combine patterns
Ex. @ matches every string, 1 matches the string 1, ∼ 1 matches every- thing but 1
The set of languages that can be matched by regular expressions is the set of regular languages
You can build a regular expression that mimics a DFA, and a NFA that mimics a regular expression
6