Computer Science 2500, Fall '08
Course Diary


Copyright 2008 by H.T. Wareham
All rights reserved


Week 1, Week 2, Week 3, Week 4, Week 5, (Class Exam #1 Notes),
Week 6, Week 7, Week 8, Week 9, (Class Exam #2 Notes),
Week 10, Week 11, Week 12, Week 13, (Final Exam Notes),
Week 14, (end of diary)


Week 1 [Lang, Sections 1 and 2; Class Notes]

Friday, September 5 (Lecture #1)
[Lang, Sections 1 and 2; LutzL, Sections 1-3; Class Notes]


Week 2 [Lang, Sections 2, 3.1.1, 3.1.2, 3.1.4, and 3.3.2 and Appendix B.2; Class Notes]

Monday, September 8 (Lecture #2)
[Lang, Section 2; Class Notes]

Wednesday, September 10 (Lecture #3)
[Lang, Sections 3.1.1, 3.1.4, and 3.3.2 and Appendix B.2; LutzL, Sections 9 and 14; Class Notes]

Friday, September 12 (Lecture #4)
[Lang, Sections 3.1.2 and 3.1.4; LutzL, Sections 9-12; Class Notes]


Week 3 [Lang, Sections 3.1.2 and 3.2.1-3.2.3; Class Notes]

Monday, September 15 (Lecture #5)
[Lang, Section 3.1.2; LutzL, Sections 12 and 13; Class Notes]

Wednesday, September 17 (Lecture #6)
[Lang, Sections 3.1.2 and 3.2.2; LutzL, Sections 4, 5, and 13; Class Notes]

Friday, September 19 (Lecture #7)
[Lang, Sections 3.2.1 and 3.2.3; LutzL, Section 6; Class Notes]


Week 4 [Lang, Sections 3.1.5, 3.2.3, and 3.2.6-3.2.8; Class Notes]

Monday, September 22 (Lecture #8)
[Lang, Sections 3.1.5, 3.2.3, and 3.2.6-3.2.8; LutzL, Sections 6 and 7; PyCook, Sections 3.12-3.14; Class Notes]

Wednesday, September 24 (Lecture #9)
[Lang, Sections 3.2.6-3.2.8; LutzL, Section 7; Class Notes]

Friday, September 26 (Lecture #10)
[Lang, Sections 3.2.6-3.2.8; LutzL, Section 7; Class Notes]


Week 5 [Lang, Sections 3.1.5, 3.2.4, and 3.2.10; Class Notes]

Monday, September 29 (Lecture #11)
[Lang, Sections 3.1.5 and 3.2.4; LutzL, Sections 8 and 9; Class Notes]

Wednesday, October 1

Wednesday, October 1 (Lecture #12)
[Lang, Section 3.2.4; LutzL, Sections 8 and 9; Class Notes]

Friday, October 3 (Lecture #13)
[Lang, Sections 3.2.4 and 3.2.10; LutzL, Sections 8 and 9; Class Notes]


Week 6 [Lang, Sections 3.2.4, 3.2.5, and 3.2.10; Class Notes]

Monday, October 6 (Lecture #14)
[Lang, Sections 3.2.4 and 3.2.10; LutzL, Sections 5, 8, and 9; Class Notes]

Wednesday, October 8

Friday, October 8 (Lecture #15)
[Lang, Sections 3.2.5; LutzL, Sections 5 and 8; Class Notes]


Week 7 [Lang, Section 3.2.5, 3.3, and 8.3.1-8.3.2; Class Notes]

Monday, October 13

Wednesday, October 15 (Lecture #16)
[Lang, Sections 3.2.5; LutzL, Sections 8 and 9; Class Notes]

Friday, October 17 (Lecture #17)
[Lang, Sections 3.3 and 8.3.1-8.3.2; LutzL, Sections 15-17; LutzP, Section 19; Class Notes]


Week 8 [Lang, Sections 3.3 and 3.4; Class Notes]

Monday, October 20 (Lecture #18)
[Lang, Section 3.3; LutzL, Sections 15-17; Class Notes]

Wednesday, October 22

Friday, October 24 (Lecture #19)
[Lang, Section 3.3; LutzL, Sections 15-21; Class Notes]


Week 9 [Lang, Sections 3.3 and 3.4; Class Notes]

Monday, October 27 (Lecture #20)
[Lang, Section 3.3; LutzL, Sections 15-21; Class Notes]

Tuesday, October 28

Wednesday, October 29 (Lecture #21)
[Lang, Section 3.4; Class Notes]

Friday, October 31 (Lecture #22)
[Lang, Section 3.4; Class Notes]


Week 10 [Lang, Sections 3.4 and 8.2; Class Notes]

Monday, November 3 (Lecture #23)
[Lang, Sections 3.4 and 8.2; PyNut, Section 9; Class Notes]

  • Basic Python II: Modules (Cont'd)
    • Accessing System Files and Directories: The os, shutil, and glob Modules (Cont'd)
      • Selective directory listing with glob
        • listdir() is well and good for listing all files in a directory. However, we often only want files of a particular type, e.g., Python scripts, files whose names start with capital letters.
        • glob(pat) in module glob returns a list of all files in the current working directory whose names match the pattern pat.
          • Patterns incorporate ordinary symbols and various pattern-specifiers, e.g., ? (any single character), * (0 or more characters), [x ... z] (any one of characters x .. y) [x-y] (any one of characters in Unicode / ASCII range x - y).
      • Example: Listing all readable files in the current directory with a specified extension (listDir2.py)
    • Pattern-Matching: The re Module
      • The elementary pattern-matching in glob would be a useful thing to have for string-processing in general. Such a general facility is provided by the re module, in which patterns are represented as regular expressions.
      • What is a regular expression?
        • A regular expression specifies a set of strings; if a given string s is in that set, s is said to match the regular expression.
        • At its most basic, a regular expression is a sequence of units, where each units specifies a choice of one or more things that are repeated some number of times.
          • Something (entity): Symbol
          • Choice:
            • . (any character except \n)
            • [s] (any character in string s)
            • [x-y] (any character between characters x and y inclusive in Unicode)
            • [^s] (any character not in string s)
            • \w (any word character, e.g., [a-aA-Z0-9])
            • \W (any non-word character, e.g., [^\w])
            • \d (any digit character, e.g., [0-9])
            • \D (any non-digit character, e.g., [^\d])
            • \s (any space character, e.g., [ \t\n])
            • \S (any non-space character, e.g., [^\s])
          • Repetition (quantifiers):
            • * (0 or more occurrences)
            • + (1 or more occurrences)
            • ? (0 or 1 occurrences, e.g., optional)
            • {m} (m occurrences)
            • {m,} (m or more occurrences)
            • {m,n} (m to n inclusive occurrences)
            Note that special status of symbols overridden inside square brackets ("a+" vs. "[a+]"). To avoid proliferation of backslashes used to create escape-versions of characters, use raw strings (r'...').
        • Example: Recognizing integers.
        • Example: Recognizing exponential numbers.

Wednesday, November 5

  • Class Exam #2

Friday, November 7

  • Instructor off-campus; lecture canceled

Week 11 [Lang, Section 8.2; Class Notes]

Monday, November 10

  • Instructor off-campus; lecture canceled

Wednesday, November 12 (Lecture #24)
[Lang, Section 8.2; PyNut, Section 9; Class Notes]

  • Basic Python II: Modules (Cont'd)
    • Pattern-Matching: The re Module (Cont'd)
      • What is a regular expression? (Cont'd)
        • Units can be built out of other units by grouping with parentheses; aside from aiding clarity, such groups can also be accessed by position going from left to right in the expression (\i, for i greater than or equal to 1), or even given names (which we will not get into in this course).
          • Such backreferences allow reference to previous matches later in pattern, i.e., "(\d+)X\1".
          • If you do not want parentheses to be interpreted as a group, use "(?: ...)".
        • Example: Recognizing (and breaking down) proper names.
        • In languages like Python, regular expressions are augmented to consider matches of one or more substrings inside a string.
          • Can specify where in string match must occur to be valid, e.g.,
            • \A (beginning of string or after \n)
            • ^ (beginning of string)
            • \Z (end of string or before \n)
            • $ (end of string)
          • If multiple matches are possible, can override greedy default (longest) to match shortest (trailing ? on quantifier).
        • Example: Recognizing name at start (end) of file vs. start (end) of any line in file.
        • Example: Recognizing XML-tagged entities
      • Regular expression matching in Python: The match object
        • Describes result of matching a particular regular expression against a particular string.
        • Variables:
          • string (string on which match was performed)
          • re (re-object used to make match)
          • pos (requested start-index of match)
          • endpos (requested finish-index of match in string)
        • Functions:
          • Matched-group characteristics:
            • group(gid=0) (returns string matched by group gid (whole string matched if no gid specified) or none if no match by group gid)
            • groups() (returns tuple of all strings matched by groups, with None if no match by group gid)
            • start(gid=0) (returns start-index of string matched by group gid (start of whole match if no gid specified) or -1 if not match by group gid)
            • end(gid=0) (returns finish-index of string matched by group gid (finish of whole match if no gid specified) or -1 if not match by group gid)
            • span(gid=0) (returns (m.start(gid), m.end(gid)))
          • Applying matched groups: expand(s) (return copy of s in which all backreferences to matched groups are replaced)
      • Applying regular expressions
        • Creating regular expression objects: r = re.compile(pattern {,flags})
          • A regular-expression object (re) has variable pattern giving the pattern-string from which it was created.
          • During compilation, can also specify various flags that modify interpretation of pattern, e.g.,
            • re.IGNORECASE (make match case-insensitive)
            • re.DOTALL (allow .-character in pattern to also match \n)
            • re.VERBOSE (ignores whitespace / #-comments in pattern)
            • re.MULTILINE (makes 6 and $ function ,like \A and \Z)

Friday, November 14 (Lecture #25)
[Lang, Section 8.2; PyNut, Section 9; Class Notes]

  • Basic Python II: Modules (Cont'd)
    • Pattern-Matching: The re Module (Cont'd)
      • Applying regular expressions (Cont'd)
        • Services on regular expression objects:
          • Apply re to string:
            • r.match(s, start=0, end=sys.maxint) (returns match-object for match of r to s starting at s-index start and finishing before s-index end, and None if no match of r in s)
            • r.search(s, start=0, end=sys.maxint) (returns match-object for match of r to s starting at or after s-index start and finishing before s-index end, and None if no match of r in s)
            • r.findall(s) (return list of (non-overlapping) substrings of s matched by r)
          • Manipulate string using re:
            • r.split(s) (return a list of substrings of s matched by non-overlapping matches of r in s; compare with s.split())
            • r.sub(repl, s) (if repl is string, return copy of s in which all matches with r are replaced by repl (with backreferences to groups in r in repl replaced appropriately); if repl is function-object that takes match-object as only parameter, return copy of s in which all matches with r are replaced with string returned by repl(m))
            • r.subn(repl, s) (returns 2-tuple (r.sub(repl, s), n) where n is number of matches of repl in s)
        • The services above are available through the re module itself, if the regular expression is given as a pattern to the function, e.g., re.sub(r, repl, s); however, these versions lack some of the functionality available in the re-object versions, e.g., cannot specify start / end positions for matches in string.
      • Example: Breaking exponential numbers into parts (expflt1.py)
      • Example: Extracting real-number values of exponential numbers (expflt2.py)
      • Example: Breaking proper names into parts (nameparse1.py) [names.txt]
      • Example: Rewriting proper names (Version #1: match-expand version) (nameparse2.py)
      • Example: Rewriting proper names (Version #2: re-sub string version) (nameparse3.py)
      • Example: Rewriting proper names (Version #3: re-sub function version) (nameparse4.py)
      • Example: Jazzing up of all proper names in an annotated file (jazzname.py) [aname.txt]
      • Example: Counting the number of proper names in an annotated file (countname.py)

Week 12 [Lang, Section 6; Class Notes]

Monday, November 17 (Lecture #26)
[Lang, Section 6; PyNut, Section 17; PyProg, Sections 8 and 9; Class Notes]

  • Basic Python II: Modules (Cont'd)
    • Went over answers to Class Exam #2.
    • GUI Development: The Tkinter Module
      • Tkinter as framework
        • Frameworks are OOP constructs which allow re-use of both code and design; are typically used to simplify the creation of specific complex applications, e.g., abstract data types, GUIs, Web servers.
        • General characteristics of frameworks:
          • Extendability (create entities by extending classes in framework)
          • Inversion of control (to use framework, only need to specify basic appearance and behavior of application -- framework controls actual creation and execution of specified mechanism)
      • Over the next several lectures, as we look at various GUI features in Tkinter, compare these with what you need / want to implement for Assignment #8 to figure out those features you need to master.

Wednesday, November 19 (Lecture #27)
[Lang, Section 6; PyNut, Section 17; PyProg, Sections 8 and 9; Class Notes]

  • Basic Python II: Modules (Cont'd)
    • GUI Development: The Tkinter Module
      • Core Tkinter GUI entities:
        • Containers (windows / panels)
        • Widgets: Widget appearance / behavior + layout in container
      • Keep in mind that the next several lectures are a very basic overview of Tkinter -- there are many more features in Tkinter than we will cover here, not only in terms of extra widgets and containers, but also extra parameters and methods for those widgets and containers we do describe. More details can be found on the Python Tkinter Wiki and in the Tkinter reference manual (thanks to Jason Gedge for pointing these out).
      • Focus today on single-container windows today; look next lecture at nested containers.
      • Generic single-window GUI script structure:

          from Tkinter import *
          
          root = Tk()
          
          ## Tkinter variable setup
          
          ## Callback functions
          
          ## GUI widgets
          
          root.mainloop()
          

      • Root Window setup
        • Create root window: root = Tk()
        • Customize appearance of window by calling methods relative to the created root window-object, e.g., root.title(S)
        • Once GUI is set up, trigger execution of GUI using root.mainloop() (see below).
      • Basic widgets:
        • Create widget-objects by calling various functions. First parameter of each of these functions is always the container in which the widget is placed; remaining parameters (typically specified in keyword-fashion) specify appearance and behavior of widget.
        • Information passed in and out of widgets via special Tkinter variables (IntVar(), StringVar(), DoubleVar(), BooleanVar()), which are manipulated using methods get() and set().
        • Information-display widgets:
          • Two Forms:
            • Label(parent, text="Text")
            • Label(parent, textvariable=svar)
            Former good for small (possibly multi-line, with newline-embedded text) static text displays, and latter good for dynamic text displays.
        • Information-entry widgets
          • Entry(parent, textvariable=svar)
            • Models single-line text-entry field.
            • Text associated with / entered into field is stored in string-variable svar.
          • Checkbutton(parent, variable=ivar, text="Text")
            • Models on/off button.
            • Integer-variable has value 1 (0) if button (not) pressed down with mouse click.
          • Radiobutton(parent, variable=type-var, value=type-val, text="Text")
            • Models one of a set of radio buttons, i.e., a set of buttons in which only one member can be pressed down at a time.
            • Group of radio buttons specified as set of radio-button widgets operating off the same variable type-var (which, as the name suggests, may be of any valid Tkinter variable type).
            • Variable type-var has value associated with currently depressed radio button in group.
          • Scale(parent, label="Text", variable=dvar, from_=dvalL, to=dvalU, tickinterval=dvalI, resolution=dvalR, showvalue=YES, orient=str)
            • Models entry of floating-point value by slider-scale in range dvalL to dvalU inclusive. Slider tick-interval is dvalI.
            • Double-variable dvar has value associated with slider-position as rounded to nearest floating-point number modulo resolution dvalR.
            • Orientation of slider can be 'horizontal' or 'vertical'.
        • Command-activation widgets:
          • We will only consider one such widget, the control-button.
          • Syntax: Button(parent, text="Text", command=func-name)
          • Elementary event-handling done inside this widget using the command parameter, i.e., function func-name is executed when button is pressed down.
            • Due to the framework-structure of Tkinter (in which control is handed by Python to Tkinter and events are handed back from Tkinter to Python for processing), event-handling is also known as callback processing and event-handler functions are known as callback functions.
            • Default is that callback functions have no parameters; if parameters are necessary, enclose callback function in a "helper" lambda function (making sure that parameters are interpreted correctly at call-time).
          • To exit GUI, may want to define button with callback function set to either root.quit (resume execution of script after root.mainloop() (see below)) or sys.exit (terminate GUI and script).

Friday, November 21 (Lecture #28)
[Lang, Section 6; PyNut, Section 17; PyProg, Sections 8 and 9; Class Notes]

  • Basic Python II: Modules (Cont'd)
    • GUI Development: The Tkinter Module (Cont'd)
      • Widget layout in container
        • Done by calling layout-method relative to each each widget.
        • Layout methods:
          • pack(expand={YES, NO}, fill={BOTH, X, Y}, side={TOP, BOTTOM, LEFT, RIGHT})
          • grid(row=int, column=int)
          Can implement exact placement by pixel-position using place(), but this is very complex to use -- pack() and grid() are usually preferable.
        • When using pack(), establish contents of top and bottom sides before adding contents of left and right sides; otherwise, horizontal extent of window may be misjudged by Tkinter and contents may be mixed up.
        • When using pack(), can control (to a degree) placement of widgets when window is resized using parameters expand and fill.
        • Example: Stacked radiobutton group
        • All positions in a specified grid need not be filled when using grid() -- will fill unused positions in with white space automatically.
        • Example: Compass radiobutton group
        • All positions in a specified grid need not be
        • Should not mix layout-types in a single container -- results may be unexpected.
      • Once all widgets set up (including Tkinter variables and callback functions) and configured to have the root window as their parent container, call root.mainloop() to hand control to Tkinter and trigger GUI creation and execution.
      • Example: Basic Tkinter GUI (pack()-layout, no-resizing) (GUI1.py)
      • Example: Basic Tkinter GUI (pack()-layout, automatic resizing) (GUI2.py)
      • Creating nested containers
        • Create nested containers using Frame()
        • Syntax: frame = Frame(parent)
        • Make sure that nested frame is assigned appropriate parent and is located (using pack(), grid(), or place()) appropriately in that parent-frame.
        • A useful idiom for structuring GUIs with nested frames is to define the root container and when defining each nested frame in turn, define that frame, define and pack all widgets in that frame (making sure that frame is the parent of each widget), and then place the nested frame appropriately in the parent, e.g.,

            ...
            root = Tk()
            ...
            tnf = Frame(root)
            
            WIDGET(tnf).pack(side=LEFT)
            WIDGET(tnf).pack(side=RIGHT)
            
            tnf.pack(side=TOP)
            
            bnf = Frame(root)
            
            WIDGET(tnf).grid(row=1, column=2)
            WIDGET(tnf).grid(row=2, column=1)
            WIDGET(tnf).grid(row=2, column=3)
            WIDGET(tnf).grid(row=3, column=2)
            
            bnf.pack(side=BOTTOM)
            ...
            

      • Example: Basic Tkinter GUI (mixed layout in nested frames) (GUI3.py)
      • Two general GUI operation modes:

        • Make and trigger GUI such that all input is accumulated by GUI, and when input is gathered, use callback to trigger root.quit to return to main program to do processing and output, e.g..

            from Tkinter import *
            
            ## General processing functions
            
            def runGUI():
            
                root = Tk()
            
                ## Tkinter variable setup
            
                ## Callback functions (GUI exit using root.quit)
            
                def makeGUI():
            
                    ## GUI widgets
            
                makeGUI()
                root.mainloop()
                return values
            
            ## Script main program
            
            values = runGUI()
            
            ## Process values returned from GUI
            

        • Make and trigger GUI such that all I/O and processing is handled via callback in GUI, and at end of session, use callback to trigger sys.exit(), e.g.,

            from Tkinter import *
            
            ## General processing functions
            
            def runGUI():
            
                root = Tk()
            
                ## Tkinter variable setup
            
                ## Callback functions (GUI exit using sys.exit)
            
                def makeGUI():
            
                    ## GUI widgets
            
                makeGUI()
                root.mainloop()
            
            ## Script main program
            
            runGUI()
            

        The former is good for adding pretty front-ends to existing scripts, while the latter is more suited to interactive sessions alternating I/O and processing.
      • Example: Basic front-end GUI (GUI4a.py)
      • Example: Basic interactive-session GUI (GUI4b.py)
      • Example: Extracting real-number values of exponential numbers (front-end GUI) (expfltGUI1.py)
      • Example: Extracting real-number values of exponential numbers (interactive-session GUI) (expfltGUI2.py)

Week 13 [Lang, Section 6; Class Notes]

Monday, November 24 (Lecture #29)
[Lang, Sections 4.1, 4.3.1, and 8.2; PyNut, Sections 9, 15, and 16; Class Notes]

  • Basic Python II: Modules (Cont'd)
    • General Numerical Processing: The math, cmath, random, and gmpy Modules
      • Many commonly-used mathematics functions are given in the math module; where applicable, versions of these functions for complex numbers are given in the cmath module.
        • These modules also have variables giving the values of mathematical constants e and pi (which, oddly enough, have the names e and pi).
      • The random module provides many functions associated with uniform distributions, e.g.,
        • seed(x=None) (sets seed to hashable object x; otherwise, sets seed to platform-specific source of randomness, e.g., system time (latter done automatically when random module is loaded)).
        • random() (returns a random float in the range 0 to 1 inclusive)
        • uniform(l, u) (returns a random float in the range l to u inclusive)
        • choice(S) (returns random element from sequence e S)
        • sample(S, k) (returns list of k randomly-selected elements from sequence S)
        • shuffle(S) (does in-place random shuffle of elements in mutable sequence S)
        The random module also offers these services relative to other commonly-used distributions, e.g., Gaussian, exponential; if you are manipulating such distributions, do consult the documentation on this module to see if what you need has already been written.
      • The gmpy module implements efficient arbitrary-precision integer and float types, as well as a rational-number type.
    • Efficient Manipulation of (Numerical) Multidimensional Arrays: The NumPy Module
      • As noted earlier in this course, nested lists in Python allow easy implementation of multidimensional numerical array processing; however, large-scale numerical processing done in this fashion is very slow. The flexibility of Python multidimensional lists (heterogeneous, non-contiguous memory storage, mutable) is purchased at the expense of processing efficiency!
      • The multidimensional array type ndarray underlying NumPy (by virtue of being homogeneous, immutable (sort of; see below), and based on a contiguous chunk of memory) regains efficiency at the expense of flexibility.
      • An ndarray s an n-dimensional array of fixed size in which each element is of a fixed array-specific numerical type. The number of dimensions is the ndarray's rank, and the number of elements along a particular dimension is that dimension's length. Each ndarray has the following associated variable-attributes:
        • shape: Tuple giving lengths of array dimensions.
        • ndim: Number of array dimensions.
        • size: Number of elements in array, i.e., product of shape-tuple elements.
        • dtype: Object describing numeric type of array elements, drawn from set {byte, int, float, complex, uint8, uint16, uint64, int8, int16, int32, int64, float32, float64, float96, complex64, complex 128, complex192}.
        • itemsize: Number of bytes required to store a single array element.
      • There is no ndarray literal. However, there are a variety of ways of creating ndarrays:
        • Create a one-dimensional array which is subsequently reshaped to have multiple dimensions (see below), e.g.,
          • arange(l, u, i): A one-dimensional ndarray with integer elements l through u inclusive relative to increment i.
          • linspace(l, u, n): A one-dimensional ndarray with n elements evenly spaced between l and u inclusive.
          Note that l, u, and i can be either integer or floating-point; however, given difficulties with trying to get exact floating-point quantities under fixed floating-point precision, linspace() is safer for generating floating-point sequences.
        • Create an ndarray from a nested-list representation L of a multidimensional array (array(L {, dtype}), where dtype is one of the numerical element-types described above).
        • Create a special-purpose ndarray with a specified shape and type via function stype(shape-tuple {, dtype}), where dtype is one of the numerical element-types described above and stype is one of ones (all ones), zeros (all zeroes), or empty (arbitrary-value).
      • ndarray shape can be modified using a.transpose() (return view of a with reversed shape-tuple of a), a.reshape(s) (return view of a with shape-tuple s, where produce of a.shape = product of s), a.resize(s) (reshape a according to s in-place), and a.ravel() (re-shape a in-place to one-dimensional array of elements in a in enumeration-order of a (rightmost index changes fastest)).
        • Note that views are not copies; are rather references to same areas of memory with different indexing-rules.

Monday, November 24

  • Final Exam Notes
    I'm still making up your final exam but I am now pretty sure of the format; marks may be adjusted among parts of the exam and there may be one more short programming question, but on the whole, what is below describes your final exam. The exam will be closed-book. It will be 120 minutes long and has a total of 120 marks (this is not coincidental; I have tried to make the number of marks for a question equivalent to the number of minutes it should take you to do it). There will be 4 questions:

    • Give short code-fragments (5 parts; 60 marks)
    • Describe the output of a Tkinter script (15 marks)
    • Write a full script (20 marks)
    • Write portions of a full script (2 parts; 25 marks)

    Topics include all material covered up to and including GUI design using the Tkinter module. You may also find the following of use:

    • Final exam (Winter 2008) (8 pages: PDF)
    • Answers to final exam (Winter 2008) (7 pages: PDF)

    I hope the above helps, and I wish you all the best of luck with this exam and your other exams.

Wednesday, November 26 (Lecture #30)
[Lang, Sections 4.1 and 4.3.1; PyNut, Section 16; Class Notes]

  • Basic Python II: Modules (Cont'd)
    • Efficient Manipulation of (Numerical) Multidimensional Arrays: The NumPy Module (Cont'd)
      • Access elements and slices of ndarrays using nested list indexing and slice syntax (a[i][j][k]) or collapsed version of same (a[i,j,k]), which is more efficient.
        • Can also extract list of arbitrary elements using a Boolean matrix B of the same shape as the operand (a[B]) or a Boolean expression on ndarray x which is evaluated element-wise on x (a[BExp], e.g., a[a < 10]) (see below).
        • Note that in ndarrays (unlike lists), slices do not produce copies, but are rather references to areas of memory. To create true copies, use a.copy().
      • Operators:
        • Symbolic: Standard Python arithmetic operators
          • Are applied element-wise to create matrices of same size (and upcasted type) as operands if argument-matrices of same size.
          • If operand matrices are not of same size, these matrices are augmented to be the same. This is called broadcasting, and as the rules of broadcasting are intricate, they will not be covered here.
          • Can do operations in-place, e.g., a += b.
          • Note that a * b does not give conventional matrix multiplication; need special function (see below).
        • Relational: Standard Python relational operators
          • Are applied element-wise to generate matrices of boolean values from operand matrices(which may themselves be augmented by broadcasting if necessary).
          • Can use these boolean matrices as indices (see above) or as input to matrix-to-scalar summary functions (see below).
        • Function:
          • Most math and cmath functions are available in forms that operate on ndarrays; f(a) returns copy of ndarray a as modified element-wise by function f().
          • dot(a, b) returns matrix resulting from conventional matrix-multiplication of a and b.
          • There is also a group of matrix-to-scalar functions that summarize matrices, e.g., min(), max(), sum(), prod().
          • Are many others ...
      • Syntax modifications:
        • Enumerate over rows / outermost ndarray index: for r in a: ...
        • Enumerate over elements of ndarray: for e in a.flat: ...
        • Enumerate over index-element pairs of ndarray: for ind, elm in ndenumerate(a): ...
          • As convenient as it is, ndenumerate(a) is much slower than enumeration via a.flat.
        • Display ndarray: print a
          • If space not sufficient to display full basic-matrix slice, will replace central elements of slice with dots to indicate missing elements.

Friday, November 28 (Lecture #31)
[Class Notes]

  • Basic Python II: Modules (Cont'd)
    • Efficient Manipulation of (Numerical) Multidimensional Arrays: The NumPy Module (Cont'd)
      • NumPy supplies special functions "a.dump(f)" and "a.load(f)" to write ndarrays to / read arrays from file f in space-efficient pickle format.
      • A special NumPy sub-type Matrix is supplied for high-speed 2-D ndarray operations. Note that under Matrix, the *-operator corresponds to matrix multiplication.
    • Matlab-like Plotting: The Matplotlib Module
      • Yet another framework-module like Tkinter; in this case, one sets up a specification of a graph to be plotted, and then one triggers the actual plotting and creation of either a plot-file or a display window. The specification mimics that used in the plotting functions implemented in the Matlab system.
      • As with Tkinter, keep in mind that this lecture is a very basic overview of Matplotlib -- there are many more features in Matplotlib than we will cover here, as well as extra parameters and functionality for those features that we do describe. More details can be found on the Matplotlib reference page.
      • General structure of a simple Matplotlib plotting script:

          from pylab import *
          
          ## Set up plot-data in 1-D list(s)
          
          ## Describe plotting surface characteristics
          
          ## Describing / creating plotted line(s)
          
          ## Display or save plotted figure
          

      • Setting up plot-data
        • Arrays of x/y-coordinates are stored internally in Matplotlib as 1-D NumPy arrays. However, the functions that use these co-ordinates will accept sequences (lists or tuples) and do appropriate conversions.
        • You can create NumPy arrays of x- or y-co-ordinates directly using arange(l, u{, i}); such arrays are useful when specifying the plotted points in x-data/function format (see below).
      • Describing a plot surface
        • Immediate attributes of the plot surface can be set by various functions, e.g.,
          • xlabel(s): Set (horizontal) x-axis label to s
          • ylabel(s): Set (vertically-rotated) y-axis label to s
          • title(s): Set title of plot (centered above plot) to s
        • Can also control the portion of the co-ordinates that are plotted using xlim(l, u) and ylim(l, u). If these are present, they must occur after the descriptions of individual plot lines (see below).
      • Describing and creating an individual line-plot
        • General syntax: (p =} plot(point-spec, line-spec)
        • The plotted x/y points can be specified in three ways:
          • y-data, e.g., plot(y): A y-coordinate list or array is given, and x-coordinates in the range 0,...,len(y)-1 are generated automatically and paired with the appropriate y-values.
          • x/y-data, e.g., plot(x, y): x- and y-coordinate lists or arrays are given and automatically paired in zip-fashion.
          • x-data/function, e.g., plot(x, f(x)): An x-coordinate NumPy array is given with a function f(), and y-coordinates are generated automatically using f() and paired with the appropriate x-values. The function may be expressed as a Python function object or a NumPy array-expression written in terms of x, e.g., x * x, (2 * x) + 1, 2 ** x.
        • Plot-line characteristics are expressed in terms of a three-region string in which the first, second, and third regions are codes for the requested line color, line style, and x/y-point marker style. The most commonly-used codes are as follows:
          • Line color: b (blue), g (green), r (red), c (cyan), m (magenta), y (yellow), k (black)
          • Line style: - (solid), -- (dashed), -. (dash-dot), : (dotted), null string (no line connecting point-markers)
          • x/y-point marker style: . (point), o (circle), ^ (triangle), s (square), D (diamond), p (pentagon), h (hexagon) + (plus-sign), x (cross)
      • A plot-window displaying the described plot is created using show() (this transfers control to the plot-window; on termination of this window, control is passed back to the plot-generating Python script). The plot can also be saved to a file using savefig(filename.ext), where ext specifies the format in which the plot is saved.
      • Example: Plotting a y-data line (mplot1.py)
      • Example: Plotting an x/y-data line (mplot2.py)
      • Example: Plotting an x-data/function line (mplot3.py)
      • Example: Generalized plotting of an x-data/function line (gmplot1.py) [gmplot1_1.dat, gmplot1_2.dat]
      • Advanced plotting
        • Multiple single-line plots on one surface: An n x m grid is implicitly specified using calls to subplot().
          • Prior to each plot() call, have a call subplot(n, m, i) which specifies the grid dimensions (rows x columns) and the index-position i in which that plot is placed (i is in the range 1, ..., m * n and indicate positions starting at the upper left-hand corner and moving left to right and down the rows to the lower right-hand corner).
          • Individual x- and y-labels of the sub-plots may be set by placing the appropriate xlabel() and ylabel() calls between the subplot() and plot() calls.
        • Multi-line plots: In this case, create a list P of n plot-objects via the appropriate calls to plot along with an n-length list L of strings (possibly containing embedded LaTeX code) describing the individual plot-linesa and call function legend(P, L, loc=loc-str), which generates a single plot in which all specified lines are plotted and a legend is placed on the plot in the position specified by loc-str, e.g., "upper right", "lower center", lower left".
      • Example: Sub-plotting several x-data/function lines on a 2 x 2 plot-surface grid (mplot4.py)
      • Example: Multi-plotting several x-data/function lines on a single plot surface (mplot5.py)
      • Example: Generalized multi-plotting of several x-data/function lines on a single plot-surface (gmplot2.py) [gmplot2_1.dat]
    • Two lessons can be drawn from the last three lectures:
      • If you are doing numerical processing, get familiar with the various numeric-processing libraries in Python.
      • Modules like gmpy and NumPy can be seen as temporary, given that the efficiency concerns that motivated their creation may be irrelevant or may not matter as much in future as computers get faster; however, they are certainly necessary now, to enlarge the potential user-base for Python to hard-core numerical processing folk (the Fortran / C / C++ Brigade).

Week 14 [Lang, Section 6; Class Notes]

Monday, December 1 (Lecture #32)
[Lang, Section 8.10 and Appendix B.4; PyCook, Section 8; PyNut, Section 18; Class Notes]

  • Basic Python II: Modules (Cont'd)
    • Testing, Debugging, and Optimizing Python Scripts: The doctest, timeit, profile, and pstats Modules
      • Testing, debugging, and optimizing are the activities underlying the Ordered Holy Trinity of Programming: Make it work, make it right, make it fast.
      • Systematic testing is typically done by making sure that a program produces correct answers relative to a specified set of test cases (if correctness is judged by a test case producing the same answer as that produced by a previous program thought to be correct, this is called regression testing).
      • Testing can be done relative to individual program units (typically functions) or the system as a whole; focus here on the former.
      • Simple unit testing: The doctest module
        • Good for testing test-cases that are simple outputs of functions.
        • To invoke, have as main program import of doctest and statement doctest.testmod(); this will locate all examples in doc-strings with associated outputs and automatically run examples and compare against outputs, flagging those that differ.
          • Make sure given outputs for examples in doc-strings are themselves correct! This can be ensured by cut-and-paste of examples and outputs from Python interpreter session.
        • Example: Testing number-string generation (numstr1.py)
      • Once you have isolated a problem by testing and can trigger it when necessary with one or more test cases, debugging is in order.
      • At heart, debugging is essentially interrogating various objects at specific points in a program run to see if they are what you think they should be. This is most simply done with print statements; however, there are several modules that allow more advanced forms of interrogation, e.g., inspect, pdb.
      • Much of Python is optimized already, meaning that code will often be fast enough. If you must optimize, do the following in order:

        • Make sure there is a speed problem, i.e., run speed benchmark tests.
        • Find out what parts of the code are taking the most time, and are hence worth optimizing (profiling).
        • Do large-scale optimization, i.e., choose better algorithms.
        • Do small-scale optimization, i.e., choose better statements/ constructs.

      • If programs or program portions are executed a large number of times, an ordinary wristwatch suffices to do gross benchmarking.
      • Distinguish several types of execution time:
        • elapsed time : wallclock time
        • system time : time spent by operating system doing I/O
        • user time : time spent processing data
        • CPU time : total execution time (system + user)
      • Precisely measuring execution time of code-segment: The os-times() function
        • To use this function, import os, call t0 = os.times() before the segment of interest, and t1 = os.times() after the segment.
        • Given t0 and t1, for that segment, elapsed time = t1[4] - t0[4], user time = t1[0] - t0[0], system time = t1[1] - t0[1], and CPU time = system time + user time.
        • Example: Timing number-string generation #1 (numstr2.py)
      • Precisely measuring CPU time of statement(s): The timeit module
        • To use this module after import, set up a Timer object by specifying one more statements and the setup-statements for those statements, and then calling timeit relative to the Timer object with the requested number of iterations of the statements.
        • Example: Timing number-string generation #2 (numstr3.py)
        • There is also a command-line version of timeit (see doc-string of timeit module and page 484 of PyNut for details).

Wednesday, December 3 (Lecture #33)
[Lang, Section 8.10 and Appendix B.4; PyCook, Section 8; PyNut, Section 18; Class Notes]

  • Basic Python II: Modules (Cont'd)
    • Testing, Debugging, and Optimizing Python Scripts: The doctest, timeit, profile, and pstats Modules (Cont'd)
      • Profiling code performance: The profile and pstats modules
        • The run-function in profile runs a particular command via exec() and stores profiling information on that command in a specified file; the information in such files may then be sorted and/or reduced prior to display using the Stats object in pstats.
        • To profile programs, you may find it useful to create a special main-program function that is callable via exec().
        • Example: A profiler for program matm3.py (adapted from matm2.py in Assignment #6, Question #2) (profile_matm3.py) [matm3.py, m1.mat, m2.mat, m3.mat, com1.txt]
      • More often than not, large-scale optimization via better choice of algorithms suffices to handle problems identified by profiling. If further optimization is necessary, the following are some common ways of obtaining additional speedup:

        • Use join() to create strings by repeated concatenation (O(n^2) to O(n)).
          • ... Though this only matters when the string-lists are of sufficient length! (try for n = 5, 50, 100, and 250 in numstr2.py and numstr3.py above). This is not surprising, given the leading constant and trailing terms hidden by O() notation.
        • Use "decoration" instead of special sort() comparators (5x speedup).
        • Avoid "from X import *" where possible.
        • Replace loops over lists with list comprehensions.
        • In multi-nested loops, "hoist" code that does not depends on inner loop indices to outer loops (this includes use of global variables).
        • Inline short functions
        • Avoid module prefixes for frequently-called functions by using "from X import Y".

        Before starting any time-consuming optimization, it is always worth doing a timing study to make sure that the effort is truly worthwhile in terms of potential increases in code performance (see repeated-string example above).
  • Course Evaluation Questionnare (CEQ)


References

  • Langtangen, H.P. (2008) Python Scripting for Computational Science (Third Edition). Texts in Computational Science and Engineering no. 3. Springer; Berlin. (Abbreviated above as Lang)
  • Loui, R.P. (2008) "In Praise of Scripting: Real Programming Pragmatism." IEEE Computer, 41(7), 22-26. [PDF]
  • Lutz, M. (2008) Learning Python (Third edition). O'Reilly. (Abbreviated above as LutzL)
  • Lutz, M. (2006) Programming Python (Third edition). O'Reilly. (Abbreviated above as LutzP)
  • Martelli, A. (2006) Python in a Nutshell (Second edition). O'Reilly. (Abbreviated above as PyNut)
  • Martelli, A., Ravenscroft, A.M., and Ascher, D. (2005) Python Cookbook (Second edition). O'Reilly. (Abbreviated above as PyCook)


(end of diary)


Created: September 2, 2008
Last Modified: December 1, 2008