Confluence
/ ! ” #
$
! Course Announcements 2019 January 2019 Calendar
Discussion of Final Exam
Erlang task due in course git reposit Haskell task due in course git reposi
, How to use git
Keeping Up With The Wiki
, OLD Course Announcements from J
, Prolog task due in course git reposit
! Ruby task due in course git reposito
How to start a program
Ruby Task Laundry List
The Making of Phase 1 of Ruby t The Making of Phase 2 of Ruby ta The Making of Phase 3 of Ruby ta
Software environment of CS3342 co
, Standard setup for course BitBucket
Warmup task due in course git repo
, Week 01: Jan 9 10 of 2019
, Week 02: Jan 16 17 of 2019
Week 03: Jan 23 24 of 2019
, Week 04: Jan 30 31 of 2019
, Week 05: Feb 6 7 of 2019
, Week 06: Feb 13 14 of 2019
, Ruby material Prolog Material Erlang Material
, Haskell Material
, Implementing Language Interpreters a
, Static Code Analysis and CASE Tools
, Misc
–
.
/… ⋆F &W ‘S (
/ Ruby task due in course git repository by end of day on 14 Feb 2019
The Making of Phase 1 of Ruby task due in course
git repository by end of day on 14 Feb 2019
Robert Webber 24 2019 thru April
What we want to do is write some Ruby code that manipulates regular expressions. However the regular expression notation in Chapter 1 of the textbook is complicated
and instead use capital letters to represent common character ranges of interest in language design and use lower case e to represent epsilon.
n 2018 # e represents epsilon
# L represents letters [a-zA-Z]
ry by en#d oDf draeypornes7eMntars20di19gits [0-9]
# Q represents question mark ? in input
ry by end of day on 21 Mar 2019
enough that you need Chapter 2 of the textbook to analyze it. So the first step is to invent a new regular expression notation. The first thing is to get rid of the character range notation
ory by end of day on 9 Apr 2019
)
y by end of day on 14 Feb 2019
# E represents exclamation mark ! in input
# P represents period . in input
# C represents comma , in input
# K represents colon : in input
ask due #in Mcourersperegsiternetpsosmitionruysbyseingdn o-f diany oinp1u4tFeb 2019 # A represents addition operator + in input
k due in course git repository by end of day on 14 Feb 2019
# B represents binary operators * / % ^ in input sk due in#coXursrepgriterseepnotsistorlyebfytepnadroefndtahyeosnis14(Feibn20i1n9put
# Y represents right parenthesis ) in input
mpute server used for marking
# S represents assignment operator = in input repositor#iesT represents less than operator < in input
# U represents greater than operator > in input
itory by end of day on 24 Jan 2019
# Z represents anything else
While this makes things simpler the tricky analysis is caused by the use of infix notation
https://en.wikipedia.org/wiki/Infix_notation for the regular expression operators. Instead we will use a prefix notation https://en.wikipedia.org/wiki/Polish_notation which is easier to process. Indeed we will use the list-prefix notation of Lisp
https://en.wikipedia.org/wiki/Lisp_programming_language#Lists . To keep life simple we will only support three regular expression operators
# . represents concatenation of one or more subtrees in regular expression
# + represents the concatenation of one or more copies of the first subtree
# | represents any one of the subtrees
Referring back to Working out a lexical analysis example our regular expression example d Compilers
[a-zA-Z][a-zA-Z0-9]* would now be written
(.L(|e(+(|LD))))
We will compute the meaning of the whole from the meaning of the parts and so the first step is to create the tree that corresponds to this formula. Note that these operators are not binary and so we are talking about a general tree i.e. an ordered rooted directed
tree https://en.wikipedia.org/wiki/Tree_graph_theory . The label of each vertex in the tree I refer to as it’s root. The children of each vertex I refer to as its subtrees. For our example the corresponding tree might be visualized
. L|
e+ |
LD
where | is an operator or and + has just one subtree – L e L and D are the leaves of this tree with no subtrees and period is the operator labeling the root of the entire tree.
So now the question becomes how can I tell if you do this right. Well one approach is to reconstruct the original regular expression from the tree you produce. In your Bitbucket repository under due190214 you should now find a directory app containing:
phase_1_helpers.rb phase_1_sample_tests.rb
In phase_1_helpers.rb we find the code
class TreeHolder
def initialize(tree)
@tree = tree
end
def to_s
subtrees_string = subtrees_to_s if subtrees_string == “”
@tree.root else
“(” + @tree.root + subtrees_string + “)”
end
end
private
def subtrees_to_s
result = “”
@tree.subtrees.each do | subtree |
result = result + (TreeHolder.new(subtree).to_s)
end
result end
end
So the nodes of the tree object you create must respond to the messages root label of that vertex and subtrees returning a list of the subtrees of that node. The subtrees are basically the same structure as a tree i.e. they too have a root and subtrees – for leaves subtrees returns the empty list. An example of how this can be used to test your code is shown in phase_1_sample_tests.rb
require_relative “phase_1_helpers”
require_relative “../lib/due190214”
@total = 0
@pass = 0
def assert(message, test)
@total = @total + 1 if test then
@pass = @pass + 1 else
puts message
end
end
@total = 0
@pass = 0
test_case = “(.(|LD)D(|LD(.DQ)))” assert(“can handle (.(|LD)D(|LD(.DQ)))”,
TreeHolder.new(PrefixToTree.new(test_case).to_tree).to_s == test_case puts @pass.to_s + ” passed out of ” + @total.to_s + ” tests.”
So the code that does the conversion should be in a class PrefixToTree. It takes in the regular expression string on new. It outputs the corresponding tree on to_tree . Note that I got my message just backwards in that I should be indicating that something failed to work rather than that it worked. What you want to do is copy my app directory into another directory such as app2 where you can change things the way you want them without worrying about my updates to the repository overwriting your work. Note that your solution to this phase and the rest of the assignment should be in the due190214/lib/due190214.rb file. Note that your code should pass any properly formed lisp prefix regular expression and not just the example shown above.
Note: this test script is rather simple. For example if the code being tested throws an exception then the tester exits.
* Marco Manuel + 68
Carolyn Owen
Is it possible to get further clarification on the lines
# + represents the concatenation of one or more copies of the
first subtree
# | represents any one of the subtrees
?
As I read it the + symbol means that there are one or more copies of the leftmost subtree and | is being used like we would see an or operator so leftmost rightmost or both subtrees. Is this correct?
Thanks! Carolyn
29 2019 Carolyn Owen
I believe that the part that’s giving me most trouble is the interpretation of
+
| LD
under this system.
29 2019
Robert Webber
Yes notationally that is a bit unfortunate. The | there is the or alternator and not a downward connector. So basically the subtree that + is being applied to is
| LD
which is the subtree that corresponds to the regular expression |LD. If I had put whitespace in my regex notation this might be easier to read as | L D – either way it indicates that the match must be against either a letter or a digit and the the original expression you presented is saying one or more characters each of which is either a letter or a digit.
When I refer to the operation being applied to a subtree I am really referring to the interpretation of the operator on the original regular expression when it is being expanded to see if it matches a string. We don’t actually end up copying subtrees or anything like that. Indeed the subtrees are just a way of conveniently clarifying the meaning of the regular expression in a hierarchical fashion that can then guide its conversion to an NFA.
29 2019 Robert Webber
Actually for phase 1 the meaning of the operators doesn’t really affect anything unless you are trying to do some error checking on the input – not required but it is often useful for catching mistakes in test data. The operator meanings become central to phase 2 The Making of Phase 2 of Ruby task due in course git repository by end of day on 14 Feb 2019 . While the notation is different the operator meanings are meant to track those given them in the textbook – just there they are infix rather than prefix and parenthesis carry a different meaning the prefix notation is essentially fully parenthesized with no redundant parentheses.
29 2019 Carolyn Owen
Ok thank you I think that does help my understanding. To be sure then in this step we are implementing a tree structure and the method to fill the tree PrefixToTree.
29 2019
Robert Webber
IF by `fill the tree’ you mean labelling the nodes of the tree with characters from the original regular expression so as to guide the interpretation of the tree as a representation of the original regular expression then yes.
29 2019 Robert Webber
Hmmm it might also be worth mentioning that PrefixToTree is a class and not a method. It is an example of a class that very much has a `single responsibilty’. The actual method that gets the work done is to_tree which is supposed to be defined in the class PrefixToTree.
29 2019 Carolyn Owen
Yes sorry I got myself a bit turned around there. Thank you for the quick replies! I think that clears up all of my questions.
29 2019 Logan Mitchell Gunn
In the file phase_1_sample_tests.rbyou have the prefix expression “. |LDD|LD.DQ”
Are you able to provide us with a visual representation of what this expression would look like as a tree for reference?
04 2019
Robert Webber
If we think of it growing left to right and then top down rather than top down and then left to right we see it is
. (|LD) D
(|LD(.DQ))
which in turn would expand to
.|L D
D |L
D (.DQ)
recalling | is or and . is concatenate which in turn would expand to
.|L D
D |L
D .D
Q
I’m not sure I fully understand how this tree is built. If we look at it top down is the following correct?
. |D
LD| LD.
DQ
Or should the bottom right subtree all be under the D.
ie. Is the second . operator on the same ‘level’ as the L and D notation or is it below the D
edit. And should | be considered a subtree of D?
08 2019
Robert Webber
The top most concatenation operator . has three subtrees – your drawing has two. Since D is not an operator in a properly formed regular expression it would never have a subtree. The subtree you indicate for D should actually be a sibling of D – to the right of D indicating that it is processed after D has been processed. I find this odd because you handled the same situation correctly when you expanded |LD.DQ
08 2019 Lin-Sheng Ding
So just to confirm visually from the top-down the tree should look like this?:
Kevin Tianhao Sun 5e14 Robert Webber
referring to the original .|LDD|LD.DQ your drawing is not correct. The |LD subtree should be on the left rather than the right where it now is. Similarly the |LD.DQ subtree should be on the right rather than where it is now on the left. For concatenate order matters. I don’t like the ordering of your |LD.DQ tree but | is commutative so the order doesn’t matter as much.
Lin-Sheng Ding 5 Lin-Sheng Ding
True I forgot about that aspect just wrote it out yesterday in rough so can see it visually and not take up space in my brain.
Much thanks for the response.
5
Jacob Abraham Neiman
I am having a strange error when i am attempting test. `subtrees_to_s’: undefined
method `subtrees’ for #
happening . I have tried googling the solution no results seem to help. I am wondering what is causing the issue.
Joshua Bowman 05 2019 Robert Webber
in subtrees_to_s subtrees is being sent to the tree that to_tree returned when to_tree was sent to PrefixToTree. you might find the discussion
in Ruby Task Laundry List useful.
05 2019
Joshua Bowman
I have reviewed the ruby task laundry list however I am still confused. I am also getting a similar error: undefined method `subtrees’ NoMethodError. Since my to_tree method was just returning an
array that makes sense to me that arrays do not have a subtrees method so it would be my assumption that I should make a tree class that has the methods: subtrees and root and then return an instance of this class from the to_tree method but are we suppose to just implement one class PrefixToTree so is it that to_tree should be returning an instance of the class prefixToTree? where am I going so wrong?
08 2019
Robert Webber
I was following you up til when you said “so is it that to_tree should be returning an instance of the class prefixToTree”. What in what was written leads you to the notion that to_tree is returning an instance of the class where it is defined? Sometimes that sort of thing happens such as the + method for integers but you also have the to_s method for integers that doesn’t return an integer. So what are you reading that you see as supporting the view expressed in this last bit quoted above?
08 2019
Joshua Bowman
I read that we can’t use require_relative and we are suppose to only use 1 file due190214.rb and I can’t put two classes in one file how can I make a tree class to make a tree object with methods
09 2019
Robert Webber
Why do you say you can’t put two classes in one file? My sample solution has many class definitions in the same file and I ran into no problems. I tend to define classes before I use them so I don’t know how Ruby handles out of order class definitions.
09 2019
Joshua Bowman
it’s because my class name did not start with a capital can we pretend this never happened.
09 2019
Robert Webber
Sure and I won’t tell you about my attempts to make . an operator in prolog.
04 2019 Ralph Barac
.
|
D
|
.
L
D
L
D
D
Q
2
Xiaoou Li
In ruby how to code decreased loop? such that
fori=10i>0i– in java 07 2019
09 2019
Ralph Barac
i += 1 or i -= 1 should work Ruby doesn’t seem to increment like some other
popular languages ie i++ or i– 07 2019
Xiaoou Li
Yes I know while loop is able to do that but how to do it using for
loop?
for i in 100…0 puts i
end
It’s not working
07 2019 Ralph Barac
for i in 1..10 do puts i
end
should work
07 2019 Robert Webber
Actually you don’t want for loops. each traverses arrays quite nicely and is much less error prone. See the discussion
in https://stackoverflow.com/questions/2032875/syntax-for-a-for-loop-in- ruby .
However there are many ways of doing things in Ruby as the discussion of decreasing loops in https://stackoverflow.com/questions/8926477/how-to- write-negative-loop-in-ruby-like-fori-index-i-0-i illustrates.
08 2019 Xiaoou Li
Does the input always start and end with the same pair of parentheses compare with .|LDD|LD.DQ in the file Is .|LDD|LD.DQ valid?
07 2019
Robert Webber
Your second example is not valid. If an operator is involved then the prefix lisp notion will always start with an open parenthesis followed by the operator followed by the parameters to that operator followed by a closed parenthesis.
You can see this if you look at the code in the to_s method in TreeHolder that I provided in this wiki page. The only way it can print something without starting with an open parenthesis is if the string representing all the subtrees is the empty string.
08 2019 Hao Ming Wang
Hi I was wondering if we can assume that the input is valid or do we need to check if it’s valid?
Thank you!
08 2019
Robert Webber
I think this issue was first raised in Re: The Making of Phase 1 of Ruby task due in course git repository by end of day on 14 Feb 2019 but it is clearer in the discussion of the requirements of the three phases in Ruby Task Laundry List where the requirements for each stage include a line along the lines of:
o. Note your code should be working for any re_string that corresponds to a properly formed regular expression not just the sample test cases provided.
08 2019 Matthew Costa
Hello Sir I’m wondering if the type of tree is important as long as you can traverse it and work with it consistently throughout the phases for example would an Expression tree be okay to use as the tree or is there another specific type your looking for.
Thanks.
08 2019
Robert Webber
I don’t know what you mean by an Expression tree but if you look at the requirements in Ruby Task Laundry List you will see that they are focussed around objects that respond to certain messages and don’t break the provided testing code. https://en.wikipedia.org/wiki/Duck_typing
08 2019 Hazem Moharram
hello
is the string “.A” a valid edge case? Notice that there is no opening bracket in the beginning i.e. I’m not asking about “.A”. And how are we supposed to handle errors in the test cases?
09 2019
Hazem Moharram
Please note that I’m assuming in the question above that there will be an error in the test case for example “.A”
09 2019
Robert Webber
It is an interesting assumption but I am curious what text I have written on the assignment you would see as supporting the assumption.
09 2019 Robert Webber
The issue of error handling was addressed in Re: The Making of Phase 1 of Ruby task due in course git repository by end of day on 14 Feb 2019 a comment posted yesterday referencing material posted earlier. Not sure what you mean by `valid edge case’. .A is not a properly formed regular expression in the notation being used for regular expressions in this assignment.
09 2019 Pu Huang
Hi Professor Webber
I wonder if we write our code in the helper or we create a ruby file and write the
code there.
09 2019
Robert Webber
See note 1 near top of Ruby task due in course git repository by end of day on 14 Feb 2019 .
09 2019 Nicholas Elder
@ Robert Webber is D a valid tree. Is .D P a valid tree? there is a space in there Is +DQ a valid tree?
10 2019
Robert Webber
As in my reply to you Re: Ruby task due in course git repository by end of day on 14 Feb 2019 space is not part of the notation.
+DQ is a valid tree but it would translate to the same NFA as +D as per the definition of how + works.
D I would not view as a valid tree. While the phase 1 tester is ok with it in phase 2 and 3 it doesn’t really work because D is not an operator see Re: The Making of Phase 1 of Ruby task due in course git repository by end of day on 14 Feb 2019
10 2019 Dennis Ljeti
@ Robert Webber Following your laundry list: subtrees returning something that behaves like an array of trees where a tree is something like what to_tree is returning
In my method subtrees in my Tree class whenever I attempt to populate the array with tree objects the program loops until the stack runs too deep. How should I approach the subtrees return?
10 2019
Robert Webber
If you look at the to_s method in TreeHolder you will see it is expecting that has it walks the tree you are building it expects to reach nodes that have no subtrees. If you are running out of stack on construction most likely you aren’t handling the construction of these leaf nodes right.
10 2019 Pu Huang
Hi Professor Webber
It seems to me that in the helper ruby code tree is presumably a defined class however it doesnt exist in the ruby default object types. I wanted to confirm that if we need to implement a tree class that has a root and a array of subtree in the helper method.
10 2019
Robert Webber
Right. See also Ruby Task Laundry List . As Ruby is a duck typed language https://en.wikipedia.org/wiki/Duck_typing I don’t care about the name of the class that to_tree is returning an instance of I just care about the messages it responds to.
10 2019 Nicholas Elder
@ Robert Webber I dont see a proper formal definition of what a valid re_string corresponding to this language is. Can you please provide a CFG or a larger set of sample expressions so we can understand what inputs to expect. I’ve read everything on here and don’t have a proper understanding still.
Right now the only instruction I see is “its a regular expression mapped to this special language format we have” I don’t think that suffices.
Pu Huang 10 2019 Robert Webber
Proper formal definitions are very rare in programming tasks. Indeed for stuff like this once you have the formal definition you no longer have any real need for a programmer as a computer can take the definition on to a working program – one such example being Prolog as we will soon see.
Anyway there is a pretty detailed explanation if you look at the accompanying wikipedia links:
So the first step is to invent a new regular expression notation. The first thing is to get rid of the character range notation and instead use capital letters to represent common character ranges of interest in language design and use lower case e to represent epsilon.
While this makes things simpler the tricky analysis is caused by the use of infix notation https://en.wikipedia.org/wiki/Infix_notation for the regular expression operators. Instead we will use a prefix notation https://en.wikipedia.org/wiki/Polish_notation which is easier to process. Indeed we will use the list-prefix notation of Lisp
https://en.wikipedia.org/wiki/Lisp_programming_language#Lists . To keep life simple we will only support three regular expression operators …
we have a bunch of examples scattered throughout the pages and discussion of this assignment
and of course we have to_s in TreeHolder that shows how to go from a tree back to a properly formatted regular expression
So basically it boils down to understanding prefix notation where operators have a known number of parameters and then seeing how Lisp uses parenthesis to handle the case where the number of parameters of an operator is unknown – both of which are discussed in the wikipedia links.
10 2019 Xiaoou Li
Does a-zA-Za-zA-Z0-9* match NIL? 10 2019
Robert Webber
If by NIL you mean the three letter identifier then yes it matches.
10 2019 Xiaoou Li
I mean NULL like nothing 10 2019
Robert Webber
If you mean the empty string like epsilon then the answer is no that regular expression requires at least one letter at the start of an identifier.
10 2019 Xiaoou Li
Lishan Huang Hi professor
this is the definition you give us
# . represents concatenation of one or more subtrees in
regular expression
and a-zA-Za-zA-Z0-9* is .L|e+|LD
which represents “L” and “|e+|LD” are two subtrees of root “.”
so the input empty string does match “|e+|LD” in the other word the empty string is one subtree in regular expression
10 2019
Robert Webber
If you look at the definition of concatenation in chapter 1 of the compiler design textbook you will find it doesn’t work the way you explain it above. `one or more’ here doesn’t mean you get to pick one or more from a bunch it means that the bunch might contain one or more. what you do with the bunch brings us back to the definition in chapter 1 of the compiler design textbook.
10 2019 Xiaoou Li
so .AB and .AB match the same thing in your definition
10 2019
Robert Webber
Neither of those actually exist. The way the concatenation of A and B is written is as .AB
10 2019
I add a line puts TreeHolder.newPrefixToTree.newtest_case.to_tree.to_s before the assert and the result shows that
o t
a o r
s
s
n
.|LDD|LD.DQ
)