3/29/2020 Random Text Generator
The Problem
Random Text Generator
Earlier in class we looked at simple grammars and production systems. This assignment uses grammars to generate somewhat random output (in contrast to a compiler which is a sentence recognizer).
The idea is that we can take a grammar like the one contained in this ‘grammar’ file where each string of the form
{
and ;
but ;
yet ;
}
Where:
each production is bracketed with {}
the first line in each production holds the name of the non-terminal left-hand-side of the production (i.e.,
each of the possible productions follows, terminated by a ‘;’ (though shown on separate lines this is not a requirement)
So the production above corresponds to
Your task is to write a text generator that performs the following actions:
Reads a grammar file (provided as a command-line argument)
Generates text by starting with the start symbol and generating output by replacing each non- terminal with a randomly chosen production for that non-terminal.
Obviously the production chosen may be a mixture of terminals and non-terminals so the process continues until there are no more non-terminals to process. In addition, we can weight the probability of a particular production being chosen by repeating it more than once in the list of productions for a particular non-terminal. Productions can be recursive, but must have at least one non-recursive form to prevent infinite loops.
Implementation Details
I have provided a Java class to parse these input files and build a ‘symbol table’ for you (here is the Javadoc if you are interested).
As to the process of actually generating text, this is rather easy to do. We could do this recursively, but that might be too inefficient for long texts so it’s best to simulate recursion via a stack. I leave the algorithm to you as a simple exercise…
The grammar files will be carefully constructed to insure that:
< and > symbols are only used to demarcate non-terminals,
non-terminals will always have identical case with no whitespace between the < and > symbols, and ‘;’ characters only appear as production terminators, never part of the text.
dhansen.cs.georgefox.edu/~dhansen/Classes/Languages/Protected/Assignments/generator.html 1/2
3/29/2020 Random Text Generator
Submit:
All other punctuation symbols and characters should be considered to be terminal symbols in the production.
Your program named Generate.java (do not submit my .java file)
Your grammar file (even if previously submitted alone). The convention is to name them with a .g suffix. Please use a unique descriptive filename to avoid naming conflicts with other students.
dhansen.cs.georgefox.edu/~dhansen/Classes/Languages/Protected/Assignments/generator.html 2/2