⻚页⾯面 / … / Prolog task due in course git repository by end of day on 7 Mar 2019
the making of phase 1 of Prolog task due in course git
repository by end of day on 7 Mar 2019
由 Robert Webber创建, 最后修改于昨天4:11 下午
Below I show the testing code for phase 1 of the prolog task. Like the Ruby assignment, my sample testing code is in
app/tester.pl and your solution code is expected to be in lib/due190307.pl .
TOC For this document:
1) Running the assignment
2) Automating the testing of your prolog
3) New Regular Expression Notation
4) Phase 1 tester.pl code
4.1) Preliminary Comments on Task
4.2) The Testing Code
5) PHASE 1 LAUNDRY LIST:
6) Hint on Using Prolog with Phase 1 of this assignment
1) Running the assignment
Unlike Ruby, instead of running the testing code from the app directory, one brings together the testing and solution by typing:
gprolog –consult lib/due190307.pl –consult app/tester.pl
which puts you in the prolog interpreter where you can explore your code. solution, a session with the interpreter might look like:
| ?- test_count_successful_tests(N).
test_count_successful_tests(N).
N=2
(3 ms) yes
| ?- test_example_run_tests(S).
test_example_run_tests(S).
At the final stage of checking my phase 1
S = [[97],letter concat (letter star),my_succeed] ? ;
;
S = [[97],letter concat (letter star),my_succeed] ?
(2 ms) yes
| ?- exit.
exit.
uncaught exception: error(existence_error(procedure,exit/0),top_level/0)
| ?- quit.
quit.
uncaught exception: error(existence_error(procedure,quit/0),top_level/0) | ?- C-c C-c
Prolog interruption (h for help) ? e
e
Sigh, for some reason I frequently forget how to exit prolog – it turns out if you hit cntrl-c twice it brings up the interruption prompt to which e will exit (and h will remind you when you forget e).
2) Automating the testing of your prolog
Of course, hopefully by now you know the benefits of being able to automate your testing rather than retype it over and over. The following works:
echo ‘test_count_successful_tests(A).’ | gprolog –consult lib/due190307.pl –consul I found the following predicates useful while testing and so provide them to you:
test_count_successful_tests(N) which tells you how many of your tests succeeded
test_example_run_tests(Status) which shows you which tests succeeded
test_debug_run_tests(Status) which shows you which tests succeeded and what the NFA was that you built as well as what the input string was transformed to (this latter being done by my provided code).
As with the Ruby code, I had to write something that would match a string against an NFA.
test_nfa_match(String, nfa(start(State),X,Y)) matches a string against an NFA
test_nfa_match_helper(String, nfa(start(State),X,Y), State) makes test_nfa_match easier to write by adding a
specification of which state to start the match from (so matching substrings is like matching the whole string, just with different starts)
test_categorize converts strings to lists of categories
test_category converts a character to a category
test_category_helper converts all the characters to categories except those going to the category undefined (Z in the Ruby notation)
The actual test cases are presented like:
test_case(“a”,letter concat (letter star), my_succeed). In the Ruby assignment, this would have been equivalent to:
test_case = “(.L(|e(+L)))”
target = “a”
test_case_tree = PrefixToTree.new(test_case).to_tree
test_case_nfa = TreeToNFA.new(test_case_tree).to_nfa assert(“NFA can match a against (.L(|e(+L)))”,
NFAScanner.new(test_case_nfa).match(target))
In the test_case , my_fail is used when the match should fail and my_succeed is used when the match should succeed.
3) New Regular Expression Notation
You will notice that the regular expression notation has changed from the Ruby assignment. This is because Prolog handles the conversion from infix notation to a tree for us. Consider the following:
unix: cat directives.pl
:- op(300, yfx, ‘concat’).
:- op(500, yfx, ‘or’).
:- op(400, yf, ‘plus’).
:- op(400, yf, ‘star’).
:- op(400, yf, ‘optional’).
unix: gprolog –consult directives.pl
GNU Prolog 1.4.5 (64 bits)
…
| ?- X = letter concat ( letter star ), X = concat(Y).
X = letter concat ( letter star ), X = concat(Y).
no
| ?- X = letter concat ( letter star ), X = concat(Y,Z).
X = letter concat ( letter star ), X = concat(Y,Z).
X = letter concat (letter star)
Y = letter
Z = letter star
yes
| ?- X = letter concat ( letter star ), X = concat(Y,star(Z)).
X = letter concat ( letter star ), X = concat(Y,star(Z)).
X = letter concat (letter star)
Y = letter
Z = letter
(1 ms) yes
| ?- X = letter concat ( letter star ), X = concat(Y,Z), Z = star(W).
X = letter concat ( letter star ), X = concat(Y,Z), Z = star(W).
W = letter
X = letter concat (letter star)
Y = letter
Z = letter star
yes
so, basically, using the ops directives in directives.pl (which are also in the provided tester.pl), when I type
letter concat ( letter star )
prolog is automatically converting it into a tree notation of
concat(letter, star(letter))
The prolog tree notation looks a lot like the prefix lisp notation of the Ruby assignment except that the operator is just before the associated parenthesis contained parameters rather than being the first item in the parenthesis list. In prolog language this is a complex term. In our Ruby notation, . could take an unlimited number of parameters, but in our prolog notation being derived from infix, concat takes two parameters. However, prolog also handles associativity of binary operators so
| ?- X = a concat b concat c, X = concat(Y,Z).
X = a concat b concat c, X = concat(Y,Z).
X = a concat b concat c Y = a concat b
Z=c
(1 ms) yes
| ?- X = a concat b concat c, X = concat(concat(A,B),C).
X = a concat b concat c, X = concat(concat(A,B),C).
A=a
B=b
C=c
X = a concat b concat c
yes | ?-
You can use parenthesis as you normally would in the infix version and they will disappear in the complex term version (although they will have, in the process, guided the shape of the complex term).
The operators to be supported in the assignment are:
:- op(300, yfx, ‘concat’) this is . in the Ruby assignment restricted to binary for
:- op(500, yfx, ‘or’)
:- op(400, yf, ‘plus’)
:- op(400, yf, ‘star’)
this is | in the Ruby assignment restricted to binary for this is + in the Ruby assignment as a suffix operator bin this is * from the textbook as a suffix operator binding
:- op(400, yf, ‘optional’) this is ? from the textbook as a suffix operator binding The leaves of our regular expressions in this assignment will be
epsilon (e in Ruby assignment)
letter (L in Ruby assignment)
digit (D in Ruby assignment) white_space (new handling ” ” and “\n”) question_mark (Q in Ruby assignment) exclamation_mark (E in Ruby assignment) period (P in Ruby assignment)
comma (C in Ruby assignment)
colon (K in Ruby assignment)
semicolon (new handlling “;”)
minus_sign (M in Ruby assignment) plus_operator (A in Ruby assignment) binary_operators (B in Ruby assignment) left_parenthesis (X in Ruby assignment) right_parenthesis (Y in Ruby assignment) equal (S in Ruby assignment)
less_than (T in Ruby assignment)
greater_than (U in Ruby assignment)
undefined (Z in Ruby assignment)
Prolog’s operator notation takes care of creating trees for regular expressions for us, so phase 1 starts with considering how to convert a regular expression tree to an NFA. As with the ruby assignment, testing is done by seeing if the resulting NFAs except the same strings as the regular expressions would have.
4) Phase 1 tester.pl code tester.pl contains:
4.1) Preliminary Comments on Task
% note: all the predicates defined in this file start with the prefix test_
% you are not allowed to reference any of these predicates in your own
% solution. if test_ appears anywhere in your lib/due190307.pl file
% (even in a comment or string constant), it will be assumed you are
% trying to compromise the testing process and marked accordingly.
% A scanner will be defined as a list of scanner terms
% where a scanner term is scanner(Token, RegularExpression)
% where Token is the name the scanner returns if RegularExpression
% % % %
matches an input string. if there are multiple such matches, the Token in the first scanner term in the list to match is the one that should match.
where RegularExpression is a regular expression term defined later.
% from NFA definition in text:
% corresponds to term nfa(Start, States, Transitions)
% Start is a state
% States is a list of state terms
% Transitions is a list of next terms
% a state term has the form state(Name, Status)
% where Name is a name given the state
% Status is either
% not_accepting if it is not an accepting state
% Token if it is an accepting state, when which Token
% has been matched at this point.
% a next term has the form next(From, To, Category)
% where From is a state term in States
% To is a state term in States
% Category is a character category or the term epsilon
% the categories will be
% letter
% digit
% white_space
% question_mark
% exclamation_mark
% period
% comma
% colon
% semicolon
% minus_sign
% plus_operator
% binary_operators for * / % ^
% left_parenthesis
% right_parenthesis
% equal
% less_than
% greater_than
% undefined
% the regular expression operators will be:
% concat for concatenation as a binary infix operator
% plus for one or more copies of its parameter as a postfix operator
% star for zero or more copies of its parameter as a postfix operator
% opertional for zero or one copies of its parameter as a postfix operator
% or for from one or the other as a binary infix operator
% (they are suffixed with underscore to avoid conflicts with existing operators)
As noted in Re: the making of phase 1 of Prolog task due in course git repository by end of day on 7 Mar 2019 , These were notes I wrote before doing the implementation and so have drifted a bit from the actual testing code,
however, the nfa term still has three parts:
The first part however, is not a State, but the term start(State) where State is filled in with the `name’ of the start state. This structure is used by my test_nfa_match predicate to find the start of the automata that is being matched against.
The States list maps nicely to the book definition of an NFA. However, my testing code never checks directly to see what you put in there. Keeping a list of state(Name,Status) values there would work fine, but it is over- specification in the sense that you could store other structures there that could also work fine. The main thing is that somewhere there has to be the information that allows your accepting predicate to tell if a given state is an accepting state or not. And in phase 2 (and 3), for the accepting states, you also need to know what token type they are accepting.
The transitions() structure matters and is used by test_nfa_match_helper to explore paths thru the nfa.
As to the structure of a state name, all it has to be is something that = works with. In an instance of an nfa, I would expect it to be grounded, i.e., having no free variables.
4.2) The Testing Code
:- op(300, yfx, ‘concat’).
:- op(500, yfx, ‘or’).
:- op(400, yf, ‘plus’).
:- op(400, yf, ‘star’).
:- op(400, yf, ‘optional’).
test_categorize([], []).
test_categorize([H_in|T_in], [H_result| T_result]) :- test_category(H_in, H_result),
test_category(X,Y) :- test_category_helper(X,Y).
test_category(X,undefined) :- \+ test_category_helper(X,_).
test_category_helper(Char, letter) :- [ACode] = “a”, ACode @=< Char, [ZCode] = "z",
test_category_helper(Char, letter) :- [ACode] = "A", ACode @=< Char, [ZCode] = "Z",
test_category_helper(Char, digit) :- [ZeroCode] = "0", ZeroCode @=< Char, [NineCode]
test_category_helper(Char, white_space) :- [Char] = " ".
test_category_helper(Char, white_space) :- [Char] = "\n".
test_category_helper(Char, question_mark) :- [Char] = "?".
test_category_helper(Char, exclamation_mark) :- [Char] = "!".
test_category_helper(Char, period) :- [Char] = ".".
test_category_helper(Char, comma) :- [Char] = ",".
test_category_helper(Char, colon) :- [Char] = ":".
test_category_helper(Char, semicolon) :- [Char] = ";".
test_category_helper(Char, minus_sign) :- [Char] = "-".
test_category_helper(Char, plus_operator) :- [Char] = "+".
test_category_helper(Char, binary_operators) :- [Char] = "*".
test_category_helper(Char, binary_operators) :- [Char] = "/".
test_category_helper(Char, binary_operators) :- [Char] = "%".
test_category_helper(Char, binary_operators) :- [Char] = "^".
test_category_helper(Char, left_parenthesis) :- [Char] = "(".
test_category_helper(Char, right_parenthesis) :- [Char] = ")".
test_category_helper(Char, equal) :- [Char] = "=".
test_category_helper(Char, less_than) :- [Char] = "<".
test_category_helper(Char, greater_than) :- [Char] = ">“.
test_count_successful_tests(N) :- setof(A, test_example_run_tests(A), L), length(L,N
test_example_run_tests(Status) :-
test_case(InputString, RegularExpression, my_succeed),
Result = my_succeed,
regular_expression_to_nfa(RegularExpression, NFA),
test_categorize(InputString, TransformedString),
test_nfa_match(TransformedString, NFA),
Status = [InputString, RegularExpression, Result].
test_example_run_tests(Status) :-
test_case(InputString, RegularExpression, my_fail),
Result = my_fail,
regular_expression_to_nfa(RegularExpression, NFA),
test_categorize(InputString, TransformedString),
\+ test_nfa_match(TransformedString, NFA),
Status = [InputString, RegularExpression, Result].
test_debug_run_tests(Status) :-
test_case(InputString, RegularExpression, my_succeed),
Result = my_succeed,
regular_expression_to_nfa(RegularExpression, NFA),
test_categorize(InputString, TransformedString),
test_nfa_match(TransformedString, NFA),
Status = [InputString, RegularExpression, TransformedString, NFA,
test_debug_run_tests(Status) :-
test_case(InputString, RegularExpression, my_fail),
Result = my_fail,
regular_expression_to_nfa(RegularExpression, NFA),
test_categorize(InputString, TransformedString),
\+ test_nfa_match(TransformedString, NFA),
Status = [InputString, RegularExpression, TransformedString, NFA,
test_nfa_match(String, nfa(start(State),X,Y)) :- test_nfa_match_helper(String, nfa(s
test_nfa_match_helper(“”, NFA, State) :- epsilon_closure(State, NFA, Closure),
member(PossibleFinish, Closure),
accepting(NFA, PossibleFinish).
test_nfa_match_helper([H|T], nfa(X,Y, transitions(Transitions)), State) :-
member(next(State, NextState, H), Transitions),
test_nfa_match_helper(T, nfa(X,Y, transitions(Transitions)), NextState).
test_nfa_match_helper(String, nfa(X,Y,transitions(Transitions)), State) :-
epsilon_closure(State, nfa(X,Y,transitions(Transitions)), Closure),
member(NextState, Closure),
NextState \= State,
member(next(State, NextState, epsilon), Transitions),
test_nfa_match_helper(String, nfa(X,Y,transitions(Transitions)), NextState).
test_case(“a”,letter concat (letter star), my_succeed).
test_case(“a”,letter concat letter star, my_fail).
% test_case(“a”,(letter concat letter) star, my_fail).
% identical to previous test case after operator processing and so doesn’t % show up as distinct in count of tests passed.
5) PHASE 1 LAUNDRY LIST:
what you are writing in due190307.pl for phase 1 is the definition of the predicates
regular_expression_to_nfa(RegularExpression, NFA)
epsilon_closure(State, NFA, Closure)
accepting(NFA, State)
The value of RegularExpression will be bound before the predicate is invoked by the regular expression notation described earlier. Again what your definitions are supposed to do is to cause my code to do what it is supposed to do, i.e., successfully match on the NFA what would successfully match the RegularExpression and fail to match the NFA what would fail to match the RegularExpression.
accepting is true if State is an accepting state in NFA.
In the Ruby assignment, there were constraints on the NFA in terms of methods it had to respond to. Here the restrictions are on the shape of the complex term that represents an NFA. Specifically, it looks like:
nfa(start(State),Y,transitions(Transitions))
where Transitions is a list of next terms so that I can do:
member(next(State, NextState, H), Transitions)
member(next(State, NextState, epsilon), Transitions)
and Closure that epsilon_closure produces is also a list that supports:
member(NextState, Closure)
6) Hint on Using Prolog with Phase 1 of this assignment
Note, prolog builtins used in the test code are:
:-
,
op
=
\=
[]
[|]
[,]
% comments
\+
“string”
@=<
setof
length
member
Note the prolog builtins used in my solution code for phase 1 were:
:-
,
_
=
\=
[]
[,]
[|]
is evaluates complex term
+ if functor of complex term evaluates to addition 1 evaluates to 1
append
sort
member
findall sortof that leaves in duplicates
14 评论
The builtins are defined in http://www.gprolog.org/manual/gprolog.html . A lot of other stuff is in there too. Most of it bad for an assignment like this one, but useful for other more complicated tasks. Remember prolog is about separating logic from control and in this assignment there should be very little reason to specify control (hence most of the classic impure parts of prolog are not listed among the builtins above. But, as with the Ruby assignment, badly written code that follows the assignment restrictions and passes the test data will be marked as if it were well written code.
⽆无标签
Jake Ryan Nemiroff 发表:
Hi Professor, I just want to make sure I'm understanding properly.
Our task is to write the definitions for the following:
regular_expression_to_nfa(RegularExpression, NFA)
epsilon_closure(State, NFA, Closure)
accepting(NFA, State)
we do this by defining scanner, token, etc that you have mentioned above in the comments of your code?
and for the following: nfa(start(State),Y,transitions(Transitions)
was this just to illustrate the difference between what it would've looked like in ruby and prolog, or were we supposed to define this as well.
Thanks! Jake
Robert Webber 发表:
You are not being asked to define scanner, token, ? . Scanner shows up in phase 2. Those three
predicates are what you are being asked to define in phase 1.
nfa(... is missing a closing parenthesis. It is provided to show you the parts of the nfa complex term that the test_ code accesses. In Ruby, this would be like saying that an instance of NFA has to respond to methods start and transitions, ...
Ralph Barac 发表:
So in terms of Prolog, an NFA is a fact or facts in your knowledge base and start and transitions are
your queries on that knowledge base?
And more generally, regular_expressions_to_nfa builds up your knowledge base based on (obviously) the input RegEx?
Also, what is an example of a valid regular expression parameter for that predicate?
edit:
Also, for the test case given letter concat (letter star) where a= letter so the simplified regex is a.(a*)
would the NFA look like:
State 0 ----e--→State 1 ----a--->State 2—-e–→State 3 (accepting) <----e ---
Robert Webber 发表:
I don't know what you mean by it being `fact in the knowledge base'. The nfa is a computational result much like the result of appending two lists is another list that is passed back in a parameter but would be garbage collected when no longer in use. None of the prolog built-ins indicated do anything with a `knowledge base' at run time. The only knowledge base would be the rules provided, those you write, and the test cases. None of these is an NFA.
letter concat (letter star) is a valid regular expression in prolog with the provided ops settings. your code would see it as concat(letter,star(letter)) – the conversion from infix to prefix being done automatically by prolog.
The NFA is constructed from the regular expression with no reference to the string it will match against. So a would not appear in it, instead just letter.
The NFA you show (once the a/letter business is fixed) and assuming I am reading the back arrow right, would accept the regular expression indicated. However, the regular expression indicated has two operators and so if you were following the textbook construction, it would be more complicated than this. I.e., there would be a part of the machine representing letter and another part representing letter* and something gluing those two parts together.
Robert Webber 发表:
In prolog, you define predicates which are relations. In many ways they are similar to methods, but more
general.
In prolog, the things being passed around are called terms. The kind of terms that are called `complex terms' are most like objects. However, in prolog your don't `define' term structures. If one were to draw an analogy with object-oriented programming, in prolog you don't define term structures, but instead have predicates that construct them in various ways. In object-oriented languages, this is the difference between class-based languages and prototype-based languages. In the 7 Languages 7 Weeks textbook, Io is used as an example of prototype-based object-oriented language. Probably none of you will ever program in Io (I certainly never have). However, there is a common prototype-based object-oriented language that many of you have programmed in (and doubtless more will) called
JavaScript. http://raganwald.com/2013/02/10/prototypes.html
So nfa, start, transitions, and such are not something you define so much as a part of the protocol that my predicates use to communicate with your predicates.
John Taylor Jewell 发表:
Can you give some clarification what Token is? I do not understand what you mean by: "Token is the name the
scanner returns if RegularExpression matches the input string" What do you mean by "name"?
Robert Webber 发表:
As mentioned in Re: the making of phase 1 of Prolog task due in course git repository by end of day on 7 Mar 2019 , scanner and hence the related token don't become relevant until phase 2, which was put up Sunday the making of phase 2 of Prolog task due in course git repository by end of day on 7 Mar 2019 .
Basically we are aiming for automata that accept multiple token types like identifiers and floats all with just one automata. To do this, the accepting states need to have an associated label so one knows that when one is in this state one not just accepted, but accepted a particular token type such as an identifier.
John Taylor Jewell 发表:
"state term has the form state(Name, Status) where Name is a name given the state"
Does this mean Name is just an arbitrary value to uniquely identify a given state fact? For example, you could have state(1, not_accepting), state(2, not_accepting) , etc.
Robert Webber 发表:
There is no `state fact'. state is just one of the lower parts of a structure whose top level functor is nfa and
is passed among predicates.
Your reference is to comments in tester.pl. These were notes I wrote before doing the implementation and so have drifted a bit from the actual testing code, however, the nfa term still has three parts:
The first part however, is not a State, but the term start(State) where State is filled in with the `name' of the start state. This structure is used by my test_nfa_match predicate to find the start of the automata that is being matched against.
The States list maps nicely to the book definition of an NFA. However, my testing code never checks directly to see what you put in there. Keeping a list of state(Name,Status) values there would work fine, but it is over-specification in the sense that you could store other structures there that could also work fine. The main thing is that somewhere there has to be the information that allows your accepting predicate to tell if a given state is an accepting state or not. And in phase 2 (and 3), for the accepting states, you also need to know what token type they are accepting.
The transitions() structure matters and is used by test_nfa_match_helper to explore paths thru the nfa.
As to the structure of a state name, all it has to be is something that = works with. In an instance of an nfa, I would expect it to be grounded, i.e., having no free variables.
Robert Webber 发表:
I have now updated the discussion of tester.pl above to reflect these clarifications (and in passing broke section 4 into 4.1 and 4.2). I expect to include these comment clarifications/corrections in the next version of the tester.pl code uploaded.
Robert Webber 发表:
If you think about writing something like the append predicate, there are no `facts' involved. This assignment is like that, you are just defining predicates. The only facts in the whole thing are the test_cases in tester.pl. There could also be base case facts like append([],X,X). But there is no notion that the list elements are somehow facts floating around in the system and neither would the nfa or the states be facts.
John Taylor Jewell 发表: Question Part 1:
I am not sure I am following what you are saying: "There is no `state fact'. state is just one of the lower parts of a structure."
Are you implying there would be no rule of the form: state(Name, Status) :- .....
in our program that would generate various instances (facts) of the state predicate like state(0, not_accepting), state(1, not_accepting) ? If not, what do you then mean by: "There is no `state fact'.
state is just one of the lower parts of a structure." Question Part 2:
"The first part however, is not a State, but the term start(State) where State is filled in with the `name' of the start state. This structure is used by my test_nfa_match predicate to find the start of the automata that is being matched against."
So if I am reading this correctly, this means that:
For example if we had the start state of an NFA as: state(start_state, not_accepting). The nfa would be: nfa(start(start_state), Y, transitions(Transitions))
nfa(start(State),Y,transitions(Transitions))
Hazem Moharram 发表: Hello,
| ?- test_example_run_tests(S).
test_example_run_tests(S).
S = [[97],letter concat (letter star),my_succeed] ? ;
;
S = [[97],letter concat (letter star),my_succeed] ? (2 ms) yes
In your example of running test cases it shows that the list "[97]" matches "letter concat (letter star)" Is this an error in your test case?
Also this test case doesn't appear in the tester.pl.
In addition to that, shouldn't the first member of S be a string like "a" for example?
Robert Webber 发表:
If you look at the ASCII chart (man ascii on unix), you will see that 97 is the decimal character code for lower case a. So prolog views "a" as syntactic sugar for the list presented above.
采⽤用⼀一个授予Department of Computer Science, The University of Western Ontario的免费Atlassian Confluence Community License。今天评估Confluence。