Computer Science 2500, Winter '08
Course Diary


Copyright 2008 by H.T. Wareham
All rights reserved


Week 1, Week 2, Week 3, Week 4, Week 5, (Class Exam #1 Notes),
Week 6, Week 7, Week 8, Week 9, (Class Exam #2 Notes),
Week 10, Week 11, Week 12, Week 13, (Final Exam Notes),
(end of diary)


Week 1 [Lang, Sections 1 and 2 and Section 3.1.4; Class Notes]

Monday, January 7 (Lecture #1)
[Lang, Section 1; LutzL, Section 1; Class Notes]

Wednesday, January 9 (Lecture #2)
[Lang, Section 2; LutzL, Sections 2 and 3; Class Notes]

Friday, January 11 (Lecture #3)
[Lang, Section 3.1.4; LutzL, Sections, 9; Class Notes]


Week 2 [Lang, Sections 3.1.2 and 3.1.4; Class Notes]

Monday, January 14 (Lecture #4)
[Lang, Sections 3.1.2 and 3.1.4; LutzL, Sections 9 and 10; Class Notes]

Wednesday, January 16 (Lecture #5)
[Lang, Section 3.1.2; LutzL, Sections 12 and 13; Class Notes]

Friday, January 18 (Lecture #6)
[Lang, Section 3.1.2; LutzL, Sections 12 and 13; Class Notes]


Week 3 [Lang, Sections 3.1.1, 3.2.1-3.2.3, and 3.2.6-3.2.8; Class Notes]

Monday, January 21 (Lecture #7)
[Lang, Sections 3.1.1 and 3.2.1-3.2.3; LutzL, Sections 4-6 and 14; Class Notes]

Wednesday, January 23 (Lecture #8)
[Lang, Sections 3.2.1-3.2.3; LutzL, Sections 4-7; PyCook, Sections 3.12-3.14; Class Notes]

Friday, January 25


Week 4 [Lang, Sections 3.2.4 and 3.2.6-3.2.8; Class Notes]

Monday, January 28 (Lecture #9)
[Lang, Sections 3.2.6-3.2.8; LutzL, Section 7; Class Notes]

Wednesday, January 30 (Lecture #10)
[Lang, Sections 3.2.4 and 3.2.6-3.2.8; LutzL, Sections 7-9; Class Notes]

Friday, February 1 (Lecture #11)
[Lang, Section 3.2.4; LutzL, Sections 8 and 9; Class Notes]


Week 5 [Lang, Sections 3.2.4 and 3.2.10; Class Notes]

Monday, February 4 (Lecture #12)
[Lang, Sections 3.2.4 and 3.2.10; LutzL, Sections 8 and 9; Class Notes]

  • Basic Python I: Object-Types (Cont'd)
    • Sequences (Cont'd)
      • Operators: (Cont'd)
        • Function: (Cont'd)
          • Iteration over sequences: min(), max(), sum()
            • Assume unnested sequence; will ignore subsequences or crash in the presence of subsequences.
          • Combine sequences: zip()
          • Apply function to each element of sequence: map()
      • Syntax modifications:
        • Sequence assignment
        • for-in construct for iteration over sequences
        • String formatting allows multiple-target binding if right-hand side is sequence of appropriate length, i.e, number of targets must match number of elements in sequence.
      • List comprehensions
        • Are actually sequence comprehensions.
        • Form: [op(x) for x in S {if cond(x)}]
        • Can run much faster than loop-version, as it is executing a construct directly in the interpreter rather than being interpreted on a statement-by-statement basis.
      • List representations in Python, and why you should care
        • Lists are implemented as arrays of references; hence, changes made to an object via one variable and its reference also show up for all variables that reference that object.
        • This explains the occasional differing of returned boolean value for "==" (which checks value equality) and "is" (which checks reference equality). This also opens a whole can of worms with respect to storage of the other types we have looked at.
          • Numbers stored as objects with multiple references!
          • Note odd behavior of these two on strings -- short strings stored as objects (which will give True under both) but longer strings may be stored as lists of objecvts (and hence will differ).
        • On the whole , "==" is safest -- save "is" for those (few) situations when you really care about object- rather than value-equality.
        • How then, do you copy lists?
          • Call to list() function, i.e., l2 = list(l1) [shallow]
          • Slice-copy, i.e., l2 = l1[:] [shallow]
          • Deep-copy, i.e., l2 = copy.deepcopy(l2)

Monday, February 4

  • Class Exam #1 Notes
    I'm making up the first in-class exam. The exam will be closed-book. It will be 50 minutes long and has a total of 50 marks (this is not coincidental; I have tried to make the number of marks for a question equivalent to the number of minutes it should take you to do it). There will be 4 short programming questions. Topics include all material covered up to the last lecture before the exam. I hope the above helps, and I wish you all the best of luck with this exam.

Wednesday, February 6

  • Class Exam #1

Friday, February 8 (Lecture #13)
[LutzL, Section 5; Class Notes]

  • Basic Python I: Object-Types (Cont'd)
    • Sets
      • Sets are mutable unordered heterogeneous collections of hashable-type objects in which two objects of equal value cannot occur, i.e., sets in Python are not multisets.
        • "Hashable" essentially means that there is a hash-function associated with that type which can produce an index-value for any object of that type. These values are used for very fast lookup.
        • All immutable types are hashable.
      • At present in Python, there is no set literal (this may change in Python 3.0; see LutzL, p. 109); sets are are created by calling set() with a list (or an expression that produces a list) as an argument.
        • If a string is passed in as the argument to set(), it creates the set of all unique characters in the string!
      • Operators:
        • Symbolic: in (memership), | (union), & (intersection), - (difference), ^ (symmetric difference), == (set equality), less-than-equal-to (subset), greater-than-equal-to (superset), less-than (proper subset), greater-than (proper superset)
          • For any modification-operation x above, can have x=, which is useful if we want to replace one of the sets in that operation by the result of that operation, e.g., s1 ^= s2.
        • Function:
          • Set attributes: len
          • Set modification: add, remove (error if not present), discard (no error if not present), pop (remove and return random element), clear
          • Set combination: union, intersection, difference, symmetric_difference, update
          • Set comparison: issubset, issuperset
          • Set copy: copy
        • Syntax modifications:
          • for-in construct for iteration over sets
      • Sets are very useful for maintaining collections of distinct objects, e.g., list of distinct birds observed during an expedition.

Week 6 [Lang, Sections 3.2.5, 3.3, and 8.3.1-8.3.2; Class Notes]

Monday, February 11 (Lecture #14)
[Lang, Sections 3.2.5; LutzL, Section 8; Class Notes]

  • Basic Python I: Object-Types (Cont'd)
    • Dictionaries
      • Dictionaries are mutable unordered heterogeneous collections of key : value pairs. Values can be any object (including nested dictionaries or lists), but keys must be hashable-type objects. Each key : value pair is known as an item.
      • Literals: {key1 : value1, key2 : value2, ...} or {} (empty dictionary)
        • Can also create dictionaries using dict(), which takes either lists of 2-tuples (in which the first value is interpreted as the item-key and the second value is interpreted as the item-value) or in a special keyword-form, e.g., dict(name = 'bob', age = 45), in which each x = y expression is interpreted as an item whose key is x and value is y.
        • The 2-tuple version of dict() allows handy creation of dictionaries using zip().
      • Operators:
        • Symbolic: in, del
        • Function:
          • Dictionary attributes: len, items, values, keys, sorted, d[k] (error if key k not in d), d.get(k,{x})
            • If dictionaries are nested, can access lower-level elements by d[kl1][kl2] ... syntax (analogous to nested-list element access).
          • Dictionary modification: d[k] = v (add / update), pop, clear
          • Dictionary combination: update
          • Dictionary copy: copy
      • Syntax modifications:
        • for-in construct for iteration over dictionary keys.
      • Dictionaries have many uses:
        • Storage of non-numeric indexed quantities (see example below)
        • Record-structures
        • Sparse data structures (use tuples as keys!)
      • Example: Solution for Question #4 of Class Exam #1 using dictionaries (authorCount2.py)

Wednesday, February 13 (Lecture #15)
[Lang, Sections 3.3 and 8.3.1-8.3.2; LutzL, Sections 9 and 15; LutzP, Section 19; Class Notes]

  • Basic Python I: Object-Types (Cont'd)
    • Storing Persistent Objects I: Basic Techniques
      • How do we store the various object-types we've seen so far between program executions? This is typically done in files. For now, let's look at the simplest types of file storage, and leave discussion of advanced file-indexing and database-style access for later in the course.
      • Can write string-representations of objects obtained using str() or repr() to text files; these can be read back i.n either with user-written parsing or the eval() function.
        • With str(), the representation of a string may leave out quotes; in this case, repr() is safer
        • eval() will actually execute any Python command given in the string-argument; should be used with extreme care.
      • More compact string-reprresentations of types can be obtained using the pickle module.
        • To use, import the pickle module ("import pickle") and use "pickle.dump(X,fout)" to write an object X to an open textfile fout; to retreive that object from the subsequently opened textfile fin, use "X = pickle.load(fin)".
        • Can store multiple objects in one file; just have to make sure that you re-load them in the same order in which they were dumped.
        • Pickle uses a technique called serialization to create these string representations, which are ideal for transfering Python objects over the Internet.
  • Basic Python II: Functions
    • Why use functions?
      • Single parameterized occurences of commonly-used pieces of code, which can lead to fewer errors if modifications are required, e.g., reading in textfiles and converting them to lists of words.
      • Hides low-level details, which makes calling code more readable, e.g., hide pickling load and dump commands inside databased load and dump functions.
      • Allows specification of application-specific function libraries, which simplified application development, e.g., sparse 2D matrix manipulation.
    • Basic syntax: def f(x1, x2, ...): statements; return {something}
    • Interesting features of Python functions
      • Function parameter are not typed and can vary in number
        • Function parameter / call-argument matching can be done positionally or by parameter name (keyword form).
        • Non-keyword call-arguments matched positionally as far as possible, and remainder are placed in *X (if *X is included as a function parameter).
        • Keyword call-arguments not matched are placed in **X (if **X is included as a function parameter).
        • With overloaded / polymorphic operators, allows true multi-type functions (sort of like Java generics).
      • Function parameters can have default values set in the parameter list itself (analogue of keyword-form in function call).
      • Function parameter values set "by assignment"
        • Attempts to change immutable objects in a function will cause errors; hence, for all practical purposes, immutable objects are passed by value and mutable objects are passed by reference.

Friday, February 15

  • Lecture cancelled

Week 7 [Lang, Sections 3.2.5, 3.3, and 8.3.1-8.3.2; Class Notes]

Monday, February 18

  • Midterm break; no lecture

Wednesday, February 20

  • Midterm break; no lecture

Friday, February 22 (Lecture #16)
[Lang, Sections 3.3; LutzL, Sections 15-17; Class Notes]

  • Basic Python II: Functions (Cont'd)
    • Interesting features of Python functions (Cont'd)
      • Function return values are not typed and can vary in number
        • Any comma-separated list will be treated as a tuple; hence, even though technically one thing is returned, you can return any number of things (as items in that tuple).
      • Functions are objects
        • At most basic, this means functions can be assigned to variables and stored / passed around like other objects (which makes map() much more powerful, for instance).
        • Functions can now be defined anywhere in the code, even inside conditional statements or loops, , conditional definition of a function itself rather than implementing conditional behavior by conditional statements inside a function.
        • To trigger function-object f, use apply(f, pargs {, kargs}) or f(arg1, arg2, ..., argn) to trigger function object; apply is particularly useful if you do not know the number of arguments at coding-time.
          • The features of apply() are so convenient that they are being made part of the core language syntax, namely f(*parg, **karg), where parg and karg are a list and a dictionary, respectively. As apply() will be eliminated in future versions of python, should get used to using the new syntax.
          • Note how this new syntax is consistent with how excess arguments are handled in current Python function calls.
        • Can embed functions into lists and dictionaries to create jump tables, which specify (by index or key) the actions to be performed in a particular situation.
        • Critical to define functions used at top of file; if they are at the bottom, will not be accessible to main program.
        • What is main program now? For our purposes at the moment, it is the block of non-function code at the end. However, once we start creating scripts consisting purely of functions, this will have to be modified slightly.
    • Variable scope
      • Python maintains a hierarchy of variable namespaces; the sam,e variable-name may exist in multiple namespaces, each with a different associated ibject and type-interpretation.
      • The LEGB Rule.
        • Specifies namepsace-order in which Python looks for variable-interpretations (local (function), enclosing function (in reverse nesting order), global (module), built-in (Python language)).
      • Can be short-circuited with the "global" statement; however, you really shouldn't ...
    • Lambda functions
      • Syntax: lambda arg1, arg2, ..., argn: expression
      • Essentially, allows the definition of very short anonymous functions.
      • Why not just use a def?
        • Can appear anywhere an expression does, e.g., function-argument to function (like map), jump tables.
        • Allows functions to be defined closer to where they are used (code proximity).

Week 8 [Lang, Section 3.4; Class Notes]

Monday, February 25 (Lecture #17)
[LutzL, Sections 18-21; Class Notes]

  • Basic Python II: Modules
    • What Is (and Isin't) A Module in Python
      • A module is collection of variable-names and their associated objects; these variable-object pairs are known as attributes.
      • A module in Python can correspond not just to a Python script but to a collection of functions and/or data structures written in another languages such as C or Fortran that are accessed by Python scripts.
      • A module is more than an included library or a compile-time directive (in that it is an assignment-like statement that is executed) and less than a true OOP-style object or class (in that it does not implement the privacy portion of encapsulation, or force data in a module to be manipulated purely by functions in that module).
    • Accessing Module Attributes via Import
      • Three syntactic variants:
        • import X
        • from X import Y {as Z}
        • from X import *
      • The first variant makes all attributes Y of module X accessible by the syntax X.Y, the second adds attribute Y of X to the calling module's namespace such that it can be accessed directly as Y (or Z, if the as-clause is used), and the third adds all attributes Y in X to the calling module's namespace directly.
      • An import-statement does three things in order: finds the requested module, compiles it to bytecode (maybe), and executes its statements (from top to bottom).
        • Finding is done in local directory or under guidance of Python path list (stored in sys,path).
        • Compilation (to a .pyc bytecode file) is done if script does not contain a main program (see below) and .pyc file does not exist or changes have been made to file since previous .pyc creation.
        • Execution creates all functions and objects specified by the script.
        • Is this convenient? Lord yes. Is it time-consuming? Again, Lord yes. This is why, in situations where imports of the same module occur multiple times, e.g., the interactive interpreter environment, all three steps are only done on first import and subsequent imports only link to the established module-object.
          • Problematic if relying on value-initialization via import.
          • Can get around this (to a degree) with reload().
      • An import gives access to and the ability to change all imported attributes -- this cannot be overriden.
      • The from-versions actually invoke full imports, so they do not save time by selective import. Be careful using these (particularly from *), as they will overwrite the values of variables in the calling module with the same name as imported attributes.
        • With from *, can prevent import by either naming a variable with an initial underscore, or restricting the imported attributes to those in list __all__.
        • Note that this is not a private declaration, as stuff hidden in this manner can still be made accessible by a regular import statement, i.e., you can hide but you can still be run.
      • Imports of whole directories of modules at once also possible, and is desirable in larger Python systems -- however, we will not cover such package imports in this course.
    • Coding Modules
      • Use __main__ to delimit main-program code (by using the 'if __name__ == "__main__":' construct to delimit code that is run if the script is run stand-alone mode; consider using this in tandem with a main() function).
        • If module consists purely of functions that are not run in stand-alone mode, e.g., a math function library, use the main program to store module self-test code.
      • Use _X and __all__ to limit namespace pollution from imports.
      • Follow standard software engineering practice, e.g.,
        • Minimize coupling of modules via use of "global" objects/ data structures to pass information between modules.
        • Maximum coherence of modules by making sure attributes in a module have a common sensibly-defined purpose and that these attributes associated with this purpose are not split across multiple modules.

Wednesday, February 27 (Lecture #18)
[Lang, Section 3.4; Class Notes]

  • Basic Python II: Modules (Cont'd)
    • Given that we now know about how modules work in Python, let's spend the next few lectures looking at services provided by some of the standard Python modules.
      • To get a quick overview of variables and functions associated with a module X, from within the Python interpreter, import that module ("import X") and print its associated doc-string ("print X.__doc__").
    • Accessing Python interpreter internals: The sys Module
      • Attributes we have seen so far: argv, path, exit(), getrefcount()
      • Variables stdin, stdout, and stderr store the file-objects associated with where interpreter input comes from and interpreter output and error messages go. The defaults for these are the keyboard and terminal screen, respectively -- however, these can be changed, e,g., redirect error messages to a specific file.
        • The original version of each stream X is stored in __X__, and can be recovered; however, Python being Python, you can change these too ...
      • Example: Fun and games with stdin, stdout, and stderr ( sysRedirect.py)
    • Accessing System Files and Directories: The os, shutil, and glob Modules
      • The os module operates on an abstract file system which is a directory-tree with non-empty directories as internal nodes files (and empty directories) as leaves.
        • Can designate any directory as a current working directory (cwd).
        • Can designate directory-paths linking entities in the tree. Each such path can be cracked into a directory-path and an entity-name (with the latter being empty if the entity is a directory).
        • Each entity in the tree has a unique associated directory-path from the root-directory to that entity (absolute path).
        • Each entity in the tree has a unique associated directory-path from a designated cwd to that entity (relative path).
      • By operating on an abstract file-system whose path-specifics are stored as variables, e.g., path separator, the os module can be customized to allow generic file and directory access on many types of operating systems.
        • This in turn allow you to write operating-system invariant code! (provided, of course, that all file-mannipulation is done using os-module variables and functions).
      • Services provided:
        • Variables: name, curdir, pardir, sep, extsep
        • Functions:
          • Entity characteristics: stat(), listdir()
          • Change entity characteristics: chmod(), rename()
          • Create / delete entities: remove(), mkdir(), rmdir()
            • rmdir() only removes an empty directory; to remove all files and directories in a directory, use shutil.rmtree().
            • To copy files and directories, use copy(), copy2(), and copytree() from the shutil module.

Friday, February 29

  • Lecture cancelled

Week 9 [Lang, Sections 3.4 and 8.2; Class Notes]

Monday, March 3 (Lecture #19)
[Lang, Sections 3.4 and 8.2; PyNut, Section 9; Class Notes]

  • Basic Python II: Modules (Cont'd)
    • Accessing System Files and Directories: The os, shutil, and glob Modules (Cont'd)
      • The os.path sub-module provides additional services for manipulating paths themselves
        • Characteristics of entity reached by path: exists(), getsize(), getmtime(), isfile(), isdir(), islink()
        • Path characteristics: abspath(), dirname(), basename(), split()
        • Path construction: join()
      • Traversing a directory tree
        • If all you want to do is visit and perform the same operation on each file in each file in a directory tree (possibly accumulating results from each file in a variable or list as you go), use os.path.walk().
          • Usage: os.path.walk(root, myfunc, arg), where myfunc has the form myfunc(arg, dirname, files) such that dirname is the directory being examined and files is a list of all files in that directory.
        • If you want to do something more complex, code up a traversal yourself using your favorite recursive tree-traversal algorithm as a template (one such example is on p. 124 in Section 3.4.7 of Langtangen).
      • Selective directory listing with glob
        • listdir() is well and good for listing all files in a directory. However, we often only want files of a particular type, e.g., Python scripts, files whose names start with capital letters.
        • glob(pat) in module glob returns alist of all files in current working directory whose names match the pattern pat.
          • Patterns incorporate regular symbols and various pattern-specifiers, e.g., ? (any single character), * (0 or more characters), [x ... z] (any one of characters x .. y) [x-y] (any one of characters in Unicode / ASCII range x - y).
    • Pattern-Matching: The re Module
      • The elementary pattern-matching in glob would be a useful thing to have for string-processing in general. Such a general facility is provided by the re module, in which patterns are represented as regular expressions.
      • What is a regular expression?
        • A regular expression specifies a set of strings; if a string is in that set, it is said to match the regular expression.
        • At its most basic, a regular expression is a sequence of units, where each units specifies a choice of one or more things that are repeated some number of times.
          • Something (entity): Symbol
          • Choice: ., [], [^], \w, \W, \d, \D, \s, \S, |
          • Repetition (quantifiers): *, +, ?, {m}, {m,}, {m,n}
          Note that special status of symbols overriden inside square brackets ("a+" vs. "[a+]"). To avoid ploferation of backslashes to avoid escape-versions of characters, use raw srtrings (r'...').

Wednesday, March 5 (Lecture #20)
[Lang, Section 8.2; PyNut, Section 9; Class Notes]

  • Basic Python II: Modules (Cont'd)
    • Pattern-Matching: The re Module (Cont'd)
      • What is a regular expression? (Cont'd)
        • Units can be built out of other units by grouping with parentheses; aside from aiding clarity, such groups can also be accessed by position going from left to right in the expression (\i, for i greater than 1), or even given names (which we will not get into in this course).
          • Allows one to refer to previous matches later in pattern, i.e., "(\d+)X\1".
        • In languages like Python, regular expressions are augmented to consider matches of one or more substrings inside a string.
          • Can specify where in string match must occur to be valid (^ / \A, $ / \Z).
          • If multiple matches are possible, can override greedy default (longest) to match shortest (trailing ? on quantifier).
      • Regular expression matching in Python: The match object
        • Describes result of matching a particular regular expression against a particular string.
        • Variables: string, re, pos, endpos
        • Functions:
          • Matched-group characteristics: group(), groups(), end(), start(), span(*)
          • Applying matched groups: expand()

Wednesday, March 5

  • Class Exam #2 Notes
    I've made up the second in-class exam. The exam will be closed-book. It will be 50 minutes long and has a total of 50 marks (this is not coincidental; I have tried to make the number of marks for a question equivalent to the number of minutes it should take you to do it). There will be 3 short programming questions. Topics include all material covered up to the last lecture this week. I hope the above helps, and I wish you all the best of luck with this exam.

Friday, March 7

  • Lecture cancelled

Week 10 [Lang, Sections 4.1, 4.3.1, and 8.2; Class Notes]

Monday, March 10 (Lecture #21)
[Lang, Sections 4.1, 4.3.1, and 8.2; PyNut, Sections 9, 15, and 16; Class Notes]

  • Basic Python II: Modules (Cont'd)
    • Pattern-Matching: The re Module (Cont'd)
      • Applying regular expressions
        • Services on regular expression objects:
          • Apply re to string (return match-object): match(), search()
          • Apply re to string (return list of (non-overlapping) matched substrings): findall()
          • Manipulate string using re: split(), sub(), subn()
        • Creating regular expression objects: compile()
          • A regular-expression object (re) has variable pattern giving the pattern-string from which it was created.
          • During compilation, can also specify various flags that modify interpretation of pattern, e.g., IGNORECASE, DOTALL, VERBOSE, MULTILINE.
        • The services above are available through the re module itself, if the regular expression is given as a pattern to the function; however, these versions lack some of the functionality available in the re-object versions, e.g., cannot specify start / end positions foir matches in string.
    • General Numerical Processing: The math, cmath, random, and gmpy Modules
      • Many commonly-used mathematics functions are given in the math module; where applicable, versions of these functions for complex numbers are given in the cmath module.
        • These modules also have variables giving the values of mathematical constants e and pi (which, oddly enough, have the names e and pi).
      • The random module provides many functions associated with uniform distributions, e.g., random(), sample(), uniform(), shuffle(), seed().
        • The random module also offers threse services relative to other commonly-used distributions, e.g., Gaussian, exponential; if you are manipulating such distributions, do consult the documentation on this module to see if what you need has already been written.
      • The gmpy module implements efficient arbitrary-precision integer and float types, as well as a rational-number type.
    • Efficient Manipulation of (Numerical) Multidimensional Arrays: The NumPy Module
      • As the last few course assignments have shown, nested lists in Python allow easy implementation of multidimensional numerical array processing; however, large-scale numerical processing done in this fashion is very slow.
        • The flexibility of Python multidimensional lists (heterogenous, non-contigous memory storage, mutable) is purchased at the expense of processing efficiency!
      • The multidimensional array type ndarray underlying NumPy (by virtue of being homogenous and immutable (sort of; see below) and based on a contiguous chunk of memory) regains efficiency at the expense of flexibility.
        • Modules like NumPy can be seen as temporary, given that their concerns over efficiency may not matter so much as computers get faster; however, they are certainly necessary now, in enlarging the potential user-base for Python to hard-core numerical processing folk (the Fortran / C / C++ Brigade).

Wednesday, March 12

  • Class Exam #2

Friday, March 14

  • Lecture cancelled

Week 11 [Lang, Sections 4.1 and 4.3.1; Class Notes]

Monday, March 17 (Lecture #22)
[Lang, Sections 4.1 and 4.3.1; PyNut, Section 16; Class Notes]

  • Basic Python II: Modules (Cont'd)
    • Efficient Manipulation of (Numerical) Multidimensional Arrays: The NumPy Module (Cont'd)
      • Array characteristics: shape, size, dtype, itemsize, tolist() / flat
      • Creating arrays: array() / reshape()

Wednesday, March 19

  • Instructor sick: Lecture cancelled

Friday, March 21

  • University closed: No lecture

Week 12 [Lang, Sections 4.1, 4.3.1, and 6; Class Notes]

Monday, March 24

  • Instructor sick: Lecture cancelled

Wednesday, March 26 (Lecture #23)
[Lang, Sections 4.1 and 4.3.1; PyNut, Section 16; Class Notes]

  • Basic Python II: Modules (Cont'd)
    • Efficient Manipulation of (Numerical) Multidimensional Arrays: The NumPy Module (Cont'd)
    • Matlab-like Plotting: The Matplotlib Module

Friday, March 28 (Lecture #24)
[Lang, Section 6; PyNut, Section 17; PyProg, Sections 8 and 9; Class Notes]

  • Basic Python II: Modules (Cont'd)
    • GUI Development: The Tkinter Module
      • Tkinter / Matplotlib as frameworks
      • Core GUI entities / activities:
        • Containers (windows / frames)
        • Widgets: Widget appearance / behavior + layout in container
      • Focus today on single-container windows today; look next lecture at nested containers.
      • Window setup: Tk()
      • Basic widgets:
        • Information-display widgets: Label
        • Information-entry widgets
          • Information stored in special Tkinter variables (IntVar(), StringVar(), DoubleVar(), BooleabVar()), which are manipulated using get() and set().
          • Several types: Entry, Checkbutton, Radiobutton, Scale
        • Command-activation widgets: Button
          • Elementary event-handling done inside this widget using the command option.
            • Due to frameowrk-structure of Tkinter (in which control is handed by Python to Tkinter and events are handed back from Tkinter to Python for processing), event-handlig is also known as callback processing and eventhandlers are known as callback functions.
          • Command takes as an argument a function with no parameters; if parameters must be added, enclose this function in a "helper" lambda function (making sure that parameters are interpreted correctly at call-time).
      • Widget layout in container done by each widget
        • Layout types: pack(), grad(), place()
        • Should not mix layout-types in a single container; moreover, with pack(), should be careful about mixing widgets side-types -- results may be unexpected.
      • Example: Basic Tkinter GUI #1 (sg1.py)
        • Note that once container and widgets specified, command root.mainloop() hands control to the window, and once this terminates, so does the script. Next lecture, we will look at how to modify this to allow a richer and more productive interaction between the GUI and the rest of the enclosing Python script.

Week 13 [Lang, Sections 6 and 8.10 and Appendix B.4; Class Notes]

Monday, March 31 (Lecture #25)
[Lang, Section 6; PyNut, Section 17; PyProg, Sections 8 and 9; Class Notes]

  • Basic Python II: Modules (Cont'd)
    • GUI Development: The Tkinter Module (Cont'd)
      • When using pack(), can control (to a degree) placement of widgets when window is resized using options expand and fill.
      • Example: Basic Tkinter GUI #2 (sg2.py)
      • Can create nested containers using Frame()
        • Make sure that on creation of nested frame, is assigned appropriate parent and that it is located (using pack(), grid(), or place()) appropriately in that parent-frame.
      • Two general GUI operation modes:

        • Make and trigger GUI such that all input is accumulated by GUI, and when input is gathered, use callback to trigger root.quit to return to main program to do processing and output.
        • Make and trigger GUI such that all I/O and processing is handled via callback in GUI, and at end of session, use callback to trigger sys.exit.

        The former is good for adding pretty front-ends to existing scripts, while the latter is more suited to interactive sessions alternating I/O and processing.
      • Example: Front-end / Interactive Session GUI (sg3.py)
        • Implements both operation modes discussed above.

Wednesday, April 2 (Lecture #26)
[Lang, Section 8.10 and Appendix B.4; PyCook, Section 8; PyNut, Section 18; Class Notes]

  • Basic Python II: Modules (Cont'd)
    • Testing, Debugging, and Optimizing Python Scripts: The doctest, timeit, profile, and pstats Modules
      • Testing, debugging, and optimizing are the activites underlying the Ordered Holy Trinity of Programming: Make it work, make it right, make it fast.
      • Systematic testing is typically done by making sure that a program produces correct answers relative to a specified set of test cases (if correctness is judged by a test case producing the same answer as that produced by a previous program thought to be correct, this is called regression testing).
      • Testing can be done relative to individual program units (typically functions) or the system as a whole; focus here on the former.
      • Simple unit testing: The doctest module
        • Good for testing test-cases that are simple outputs of functions.
        • To invoke, have as main program import of doctest and statement doctest.testmod(); this will locate all examples in doc-strings with associated outputs and automatically run examples and compare against outputs, flagging those that differ.
          • Make sure given outputs for examples in doc-strings are themselves correct! This can be ensured by cut-and-paste of examples and outputs from Python interpreter session.
        • Example: Testing number-string generation (numstr1.py)
      • Once you have isolated a problem by testing and can trigger it when necessary with one or more test cases, debugging is in order.
      • At heart, debugging is essentially interrogating various objects at specific points in a program run to see if they are what you think they should be. This is most simply done with print statements; however, there are several modules that allow more advanced forms of interrogation, e.g., inspect, pdb.
      • Much of Python is optimized already, meaning that code will often be fast enough. If you must optimize, do the following in order:

        • Make sure there is a speed problem, i.e., run speed benchmark tests.
        • Find out what parts of the code are taking the most time, and are hence worth optimizing (profiling).
        • Do large-scale optimization, i.e., choose better algorithms.
        • Do small-scale optimization, i.e., choose better statements/ constructs.

      • If programs or program portions are executed a large number of times, an ordinary wristwatch suffices to do gross benchmarking.
      • Distinguish several types of execution time:
        • elapsed time : wallclock time
        • system time : time spent by operating system doing I/O
        • user time : time spent processing data
        • CPU time : total execution time (system + user)
      • Precisely measuring execution time of code-segment: The os-times() function
        • To use this function, import os, call t0 = os.times() before the segment of interest, and t1 = os.times() after the segment.
        • Given t0 and t1, for that segment, elapsed time = t1[4] - t0[4], user time = t1[0] - t0[0], system time = t1[1] - t0[1], and CPU time = system time + user time.
        • Example: Timing number-string generation #1 (numstr2.py)
      • Precisely measuring CPU time of statement(s): The timeit module
        • To use this module after import, set up a Timer object by specifying one more statements and the setup-statements for those statements, and then calling timeit relative to the Timer object with the requested number of iterations of the statements.
        • Example: Timing number-string generation #2 (numstr3.py)
        • There is also a command-line version of timeit (see doc-string of timeit module and page 484 of PyNut for details).
      • Profiling code performance: The profile and pstats modules
        • The run-function in profile runs a particular command via exec() and stores profiling information on that command in a specified file; the information in such files may then be sorted and/or reduced prior to display using the Stats object in pstats.
        • To profile programs, you may find it useful to create a special main-program function that is callable via exec().
        • Example: A profiler for program s2m.py from Assignment #5 (profile_s2m.py)

Friday, April 4 (Lecture #27)
[Lang, Section 8.10 and Appendix B.4; PyCook, Section 8; PyNut, Section 18; Class Notes]

  • Basic Python II: Modules (Cont'd)
    • Testing, Debugging, and Optimizing Python Scripts: The doctest, timeit, profile, and pstats Modules (Cont'd)
      • More often than not, large-scale optimization via better choice of algorithms suffices to handle problems identified by profiling. If further optimization is necessary, the following are some common ways of obtaining additional speedup:

        • Use join() to create strings by repeated concatenation (O(n^2) to O(n)).
          • ... Though this only matters when the string-lists are of sufficient length! (try for n = 5, 50, 100, and 250 in numstr2.py and numstr3.py above). This is not suprising, given the leading constant and trailing terms hidden by O() notation.
        • Use "decoration" instead of special sort() comparators (5x speedup).
        • Avoid "from X import *" where possible.
        • Replace loops over lists with list comprehensions.
        • In multi-nested loops, "hoist" code that does not depends on inner loop indices to outer loops (this includes use of global variables).
        • Inline short functions
        • Avoid module prefixes for frequently-called functions by using "from X import Y".

        Before starting any time-consuming optimization, it is always worth doing a timing study to make sure that the effort is truly worthwhile in terms of potential increases in code performance (see repeated-string example above).

Monday, April 7

  • Final Exam Notes
    I've finished making up your final exam. The exam will be closed-book. It will be 120 minutes long and has a total of 120 marks (this is not coincidental; I have tried to make the number of marks for a question equivalent to the number of minutes it should take you to do it). There will be 7 short programming questions split over four questions (that is, two questions will have several small programs associated with them). Topics include all material covered up to the second in-class exam. You do not have to use any modules we have covered in class to answer the exam questions; however, the use of certain modules may lead to more concise code. Also keep in mind that in addition standard frequently-used features of Python, we have looked at some standard stuff that has more specialized uses, e.g., sets, list comprehensions, else-clauses for loops.

    I hope the above helps, and I wish you all the best of luck with this exam and your other exams.


References

  • Langtangen, H.P. (2006) Python Scripting for Computational Science (Second Edition). Texts in Computational Science and Engineering no. 3. Springer; Berlin. (Abbreviated above as Lang)
  • Lutz, M. (2008) Learning Python (Third edition). O'Reilly. (Abbreviated above as LutzL)
  • Lutz, M. (2006) Programming Python (Third edition). O'Reilly. (Abbreviated above as LutzP)
  • Martelli, A. (2006) Python in a Nutshell (Second edition). O'Reilly. (Abbreviated above as PyNut)
  • Martelli, A., Ravenscroft, A.M., and Ascher, D. (2005) Python Cookbook (Second edition). O'Reilly. (Abbreviated above as PyCook)


(end of diary)


Created: January 9, 2008
Last Modified: April 7, 2008