Assignment 7
Due: 11:00 AM on Wednesday, November 26, 2008
Question
An elementary translator between two dialects of the same language can
be specified by i) a set of mapping rules of the form "X" => "Y" indicating
that text X in dialect #1 maps onto text Y in dialect #2, ii) a (possibly
empty) list of interjections that are randomly inserted before
the beginnings of sentences in dialect #1 to create valid sentences in
dialect #2, and iii) a mechanism that applies (i) and (ii) above to a piece
of text from dialect #1 to create the corresponding piece of text in
dialect #2. The mechanism in (iii) first applies the rules in (i) in the
order in which they are specified to a sentence of dialect #1 text before
inserting a random interjection from the list in (ii) in front of that
translated sentence. Note that the list in (ii) is augmented by (iii) to
include the empty string, which simulates the condition in which no
interjection is added. Parts (i) and (ii)
can be stored in a dialect translation file,
in which the list in (ii) is specified by the command INTERJECTIONS = {"i1",
"i2", ..., "in"}. The interjection and rule commands may have an arbitrary
amount of blank or tab characters between their parts, and the file may
contain an arbitrary number of blank lines and comment lines (indicated by
initial hash marks (#)), all of which are ignored by the translation
mechanism in (iii).
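The mechanism in (iii) can be sketched roughly as follows. This is only one possible reading of the description, not a reference solution: the function name translate_sentence, the (X, Y) tuple representation of the rules, and the choice to treat X as literal text rather than as a regular expression are all assumptions made for illustration.

```python
import random
import re


def translate_sentence(sentence, rules, interjections):
    """Apply the rules in order, then put a random interjection in front.

    `rules` is a list of (X, Y) pairs (hypothetical representation);
    `interjections` is a possibly empty list of strings.
    """
    for source_text, target_text in rules:
        # Assumption: X is literal text, so it is escaped before matching.
        # The lambda keeps Y from being read as a replacement template.
        sentence = re.sub(re.escape(source_text),
                          lambda match: target_text,
                          sentence)
    # The list is augmented with the empty string, which simulates
    # the case in which no interjection is added.
    chosen = random.choice(list(interjections) + [""])
    return chosen + " " + sentence if chosen else sentence


pirate_rules = [("Hello", "Ahoy"), ("my", "me"), ("friend", "matey")]
print(translate_sentence("Hello my friend.", pirate_rules, ["Arr!", "Avast!"]))
```

Because the rules are applied in the order given, a later rule can rewrite text produced by an earlier one, so the order of rules in the file matters.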
Write and document a Python script ditrans.py which, given a
dialect translation file and a textfile in dialect #1 as command-line
arguments, prints each line of text as given and as translated into
dialect #2. The translation mechanism must be implemented using functions
from the re and random modules. You may assume that each line
in the given textfile corresponds to one or more sentences in dialect #1,
and that no sentence is broken across lines. In honor of
International Talk Like a Pirate Day, your script must work on datafiles
e2p1.txt, e2p2.txt, and text1.txt to produce the output given in
typescript-file ditrans.script (note that as interjections are
picked at random, the starts of your translated sentences will probably
differ from those in the provided script file).
You may assume that all dialect translation and text files are valid.
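Since a line may hold more than one sentence and an interjection goes in front of each, the line has to be cut into sentences at some point. A minimal sketch of one way to do this with re is given below. The assignment does not define where a sentence ends, so the rule used here (a sentence runs up to and including a run of '.', '!' or '?') and the name split_sentences are assumptions; abbreviations such as "Mr." would be split incorrectly under this rule.

```python
import re

# Assumption: a sentence is a run of non-terminator characters followed
# by zero or more of '.', '!' or '?', plus any trailing whitespace.
SENTENCE = re.compile(r"[^.!?]+[.!?]*\s*")


def split_sentences(line):
    """Return the sentences found in one line of text, without padding."""
    pieces = SENTENCE.findall(line)
    return [piece.strip() for piece in pieces if piece.strip()]


for sentence in split_sentences("Hello there. How are you? Fine!"):
    print(sentence)
```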
Hints
You may find it useful to recognize and parse the interjection-list and
rules in a dialect translation file using the regular-expression matching
functions in re before applying this list and rule-set (again using
re functions) to the provided text.
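As an illustration of this hint, the sketch below recognizes the three kinds of lines in a dialect translation file. The pattern and function names are made up for the example, and it assumes that each command sits on a single line and that X and Y contain no double-quote characters; neither assumption is stated in the assignment.

```python
import re

# Blank lines and comment lines (optional whitespace, then '#').
SKIP = re.compile(r"^\s*(#.*)?$")
# A rule command: "X" => "Y", with any blanks or tabs between the parts.
RULE = re.compile(r'^\s*"([^"]*)"\s*=>\s*"([^"]*)"\s*$')
# The interjection command: INTERJECTIONS = {...}; group 1 is the {...} text.
INTERJECTIONS = re.compile(r"^\s*INTERJECTIONS\s*=\s*(\{.*\})\s*$")


def parse_dialect_lines(lines):
    """Return (rules, interjection_text) from the lines of a dialect file.

    `rules` keeps the (X, Y) pairs in file order; `interjection_text` is
    the raw brace-delimited list, or None if the command never appears.
    """
    rules = []
    interjection_text = None
    for line in lines:
        if SKIP.match(line):
            continue
        rule = RULE.match(line)
        if rule:
            rules.append((rule.group(1), rule.group(2)))
            continue
        command = INTERJECTIONS.match(line)
        if command:
            interjection_text = command.group(1)
    return rules, interjection_text


sample = ['# pirate rules\n', '\n', '"hello"   =>\t"ahoy"\n',
          'INTERJECTIONS = {"Arr", "Avast"}\n', '"my" => "me"\n']
print(parse_dialect_lines(sample))
```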
You may find the eval() function useful for transforming the string
representation of the interjection-list into a list of strings.
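A small sketch of this idea follows. Note two deliberate departures from the hint, both choices made for this example only: the outer braces are swapped for square brackets first, because current Python versions read {"a", "b"} as a set and {} as an empty dictionary rather than as a list, and ast.literal_eval is used in place of eval() because it only accepts literals. On the valid files assumed by the assignment, eval() on the same bracketed string would give the same list. The function name interjection_list is hypothetical.

```python
import ast


def interjection_list(brace_text):
    """Turn text such as '{"i1", "i2"}' into the list ['i1', 'i2']."""
    inner = brace_text.strip()
    # Replace the outer { } with [ ] so that Python reads a list literal.
    as_list_literal = "[" + inner[1:-1] + "]"
    # literal_eval evaluates literals only; it is a stand-in for eval().
    return ast.literal_eval(as_list_literal)


print(interjection_list('{"Arr", "Avast ye"}'))
```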
You may find Diary script nameparse4.py useful in
implementing random interjection selection.
Submission
Please hand in printed copies of all of your Python script files.
You must also submit these files electronically using the
submit-assignment command.
Note that each script file must have the following comment
block at the top, where the X's are replaced with the appropriate
information, followed by a docstring briefly describing the program in that
script. For instance, my script for this assignment would
begin with the following comment block:
#########################################################
## CS 2500 (Fall 2008), Assignment #7 ##
## Script File Name: ditrans.py ##
## Student Name: Todd Wareham ##
## Login Name: harold ##
## MUN #: 8008765 ##
#########################################################
You do not have to develop your code on our CS departmental systems.
However, as your code will be run and tested on our CS departmental
systems as part of the assignment marking process,
you should ensure that your code runs correctly on at
least one of these systems.
- Nov 21, 12:10pm
Submission deadline for Assignment #7 extended to Wednesday,
Nov 26 (I know I said Monday, Nov 24, in class, but on further
thought, this gives people more time to come talk to me if
necessary).
- Nov 13, 9:15am
Assignment #7 posted.
Created: November 13, 2008
Last Modified: November 21, 2008