CMPSC 461: Programming Language Concepts Assignment 3 Solution
Problem 1 [8pt] Write down the set of strings recognized by the following regular expressions. If the set is infinite, write down the first 4 shortest elements.
a) (4pt) abc|(ed|f)|a {“abc”, “ed”, “f”, “a”}
b) (4pt) a*(ab|b)*
{“”, “a”, “b”, “ab”, . . . }
Problem 2 [10pt]
a) (5pt) Build a regular expression that captures any number in the following formats: 3.1415926
-212.45
126
1.9e10
For simplicity, assume only a nonnegative integer may appear after ‘e’. Use ‘\.’ for symbol ‘.’.
Solution: (-?)([1-9][0-9]*| 0)(\. [0-9]+)?(e[0-9]+)?
b) (5pt) Build a regular expression that captures all non-empty sequences of letters other than “aaa”. For your convenience, you may use a “not” operator that takes a set of letters as argument and matches any other letter. For instance, not(abc) matches any letter other than a, b and c. Use ‘.’ to match any letter.
Solution: (not(a).*) | a not(a).* | ε | aa (ε | a.+ | not(a).*)
Problem 3 [15pt] Consider the context free grammar:
S ::= + S S | ∗ S S | x
where x is a terminal for letter x, and a string “+x+++xxxx”.
a) (5pt) Write down the leftmost derivation for the string.
S⇒+SS⇒+xS⇒+x+SS⇒+x++SSS⇒+x+++SSSS⇒+x+++xSSS⇒+x++ +xxSS⇒+x+++xxxS⇒+x+++xxxx
b) (5pt) Draw the concrete parse tree for the string. S
+SS x+SS
+SSx +SSx xx
1/2
c) (5pt) Is this grammar ambiguous or unambiguous? If it is ambiguous, give an equivalent unambiguous grammar; otherwise, briefly describe how to implement a parser that scans a string once, and then decides if the string is derivable from this grammar.
It is unambiguous. Multiple answers will be accepted.
One implementation is to check the characters from left to right, and build the parse tree from top to bottom. For each character being scanned, there is only one unique way to expand the current right- most non-terminal node in the tree. The string is accepted when the tree is complete exactly when all characters are scanned.
Problem 4 [10pt] For each of the following grammars state whether it is ambiguous or unambiguous. If it is ambiguous give an equivalent unambiguous grammar. If it is unambiguous, state all the precedences and associativities enforced by the grammar. Assume Id generates identifiers.
a) E ::= E + F | F
F ::= F ∗ F | Id | (E)
Solution: ambiguous. E ::= E + F | F
F ::= F ∗ G | G
G ::= Id | (E)
b) E::=F∩E|F∪E|F F ::= F ∧ G | F ∨ G | G G ::= Id | (E)
Solution: unambiguous. ∩, ∪ are right-associative, ∧, ∨ are left-associative. ∧ and ∨ have higher prece- dence than ∩ and ∪.
Problem 5 [7pt] Consider the following grammar: G ::= S
S ::= A M
A ::= a E | b A A
M ::= S | ε
E ::= a B | b A | ε
B ::= b E | a B B
where a and b are terminals for letters a and b respectively. Define in English the language that the grammar generates.
Solution: by definition, G generates one to multiple A’s. A generates a string where a appears one more time than b. Hence, G generates all strings (of a and b) that contain more a than b.
Assignment 3 Solution, Cmpsc 461 2/2