Assignment 7
Due: 11:00 AM on Wednesday, November 16, 2011
Question
Write and document a Python script PyText2.py which re-implements
the text comparison program described in Assignment #6
using the following functions:
- createCharSet(T): Return the sorted set of characters associated with text T.
- createWordSet(T): Return the sorted set of words associated with text T.
- createDigramMatrix(T): Return the nested-list representation of the digram matrix associated with text T.
- printCharSet(text, textName): Print the character set associated with the text named textName in the stored-text structure text.
- printWordSet(text, textName): Print the word set associated with the text named textName in the stored-text structure text.
- printDigramMatrix(text, textName): Print the digram matrix associated with the text named textName in the stored-text structure text.
- computeNgramVector(text, textName, n): Return a dictionary representation of the n-gram vector associated with the text named textName in the stored-text structure text.
- computeNgramSim(ngV1, ngV2): Return the similarity of the given dictionary representations ngV1 and ngV2 of n-gram vectors.
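As a rough illustration of the three create functions listed above, here is a minimal sketch. It rests on assumptions that are not stated in this handout: words are taken to be whitespace-separated tokens, a "sorted set" is returned as a sorted Python list, and entry [i][j] of the digram matrix is taken to count how often the i-th character of the sorted character set is immediately followed by the j-th. The definitions in Assignment #6 take precedence if they differ.

```python
def createCharSet(T):
    """Return the distinct characters of string T as a sorted list."""
    return sorted(set(T))


def createWordSet(T):
    """Return the distinct words of T as a sorted list.

    Assumption: a word is any whitespace-separated token.
    """
    return sorted(set(T.split()))


def createDigramMatrix(T):
    """Return a nested list M of digram counts for string T.

    Assumption: M[i][j] is the number of times the i-th character of
    createCharSet(T) is immediately followed by the j-th character.
    """
    chars = createCharSet(T)
    # Map each character to its row/column position in the matrix.
    index = {c: i for i, c in enumerate(chars)}
    matrix = [[0] * len(chars) for _ in chars]
    # Walk over every pair of adjacent characters in T.
    for first, second in zip(T, T[1:]):
        matrix[index[first]][index[second]] += 1
    return matrix
```

For example, under these assumptions createDigramMatrix("abab") has character set ['a', 'b'], and the adjacent pairs "ab", "ba", "ab" give the matrix [[0, 2], [1, 0]].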
All text-storage and comparison functionality for the dictionary representation of stored
texts must occur within these specified functions.
This re-implementation must also include all types of error-checking as specified in the
script-file below.
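The n-gram functions named in the list above might be sketched as follows. Several things here are assumptions made only for illustration: the stored-text structure is taken to be a dictionary mapping each text name directly to its raw string, n-grams are taken to be runs of n consecutive characters, and similarity is computed as cosine similarity. The exceptions raised are likewise just one possible style of error-checking; the required checks and the actual similarity measure are those given in Assignment #6 and in PyText2.script.

```python
import math


def computeNgramVector(text, textName, n):
    """Return a dict mapping each character n-gram to its count.

    Assumption: text[textName] is the raw string of the named text.
    """
    if textName not in text:
        raise KeyError("no stored text named " + repr(textName))
    if not isinstance(n, int) or n < 1:
        raise ValueError("n must be a positive integer")
    T = text[textName]
    vector = {}
    # Slide a window of width n across the text and count each n-gram.
    for i in range(len(T) - n + 1):
        gram = T[i:i + n]
        vector[gram] = vector.get(gram, 0) + 1
    return vector


def computeNgramSim(ngV1, ngV2):
    """Return the cosine similarity of two n-gram count dictionaries.

    Assumption: cosine similarity stands in for the course's measure.
    """
    dot = sum(count * ngV2.get(gram, 0) for gram, count in ngV1.items())
    norm1 = math.sqrt(sum(c * c for c in ngV1.values()))
    norm2 = math.sqrt(sum(c * c for c in ngV2.values()))
    if norm1 == 0 or norm2 == 0:
        # An empty vector has no direction; treat it as dissimilar.
        return 0.0
    return dot / (norm1 * norm2)
```

With these assumptions, two texts that share no n-grams score 0.0 and a text compared with itself scores 1.0 (up to floating-point rounding).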
Your script must work on datafiles nm1.dat, nm2.dat, nm3.dat, nm4.dat, tc1.dat, tc2.dat,
tc3.dat, tc4.dat, and com1.txt to produce the output given in typescript-file PyText2.script.
Hints
Submission
Please hand in printed copies of all of your Python script files.
You must also submit these files electronically using the
submit-assignment command.
Note that each script file must have the following comment
block at the top, where the X's are replaced with the appropriate
information, followed by a docstring briefly describing the program in that
script. For instance, my script for Question #1 of this assignment would
begin with the following comment block:
#########################################################
## CS 2500 (Fall 2011), Assignment #7 ##
## Script File Name: PyText2.py ##
## Student Name: Todd Wareham ##
## Login Name: harold ##
## MUN #: 8008765 ##
#########################################################
If you choose to base your script on any answer-scripts for previous
assignments, please note this in the docstring.
You do not have to develop your code on our CS departmental systems.
However, as your code will be run and tested on our CS departmental
systems as part of the assignment marking process,
you should ensure that your code runs correctly on at
least one of these systems.
- Nov 10, 8:35am
Fixed major errors in both Assignment #7 answer code and supplied script file
"PyText2.script".
- Oct 24, 10:10am
Assignment #7 posted.
Created: October 24, 2011
Last Modified: November 10, 2011