Computer Science 2500, Fall '11
Course Diary

Copyright 2011 by H.T. Wareham
All rights reserved

Week 1, Week 2, Week 3, Week 4, Week 5, (Class Exam #1 Notes),
Week 6, Week 7, Week 8, Week 9, (Class Exam #2 Notes),
Week 10, (Final Exam Notes),
Week 11, Week 12, Week 13, (end of diary)

Week 1 [LutzL, Sections 1-3; Class Notes]

Wednesday, September 7 (Lecture #1)
[LutzL, Sections 1-3; Class Notes]

Went over course outline.
Introduction to Python
- Python is a (primarily interpreted) scripting language, unlike traditional compiled languages like C, C++, and Java.
- Why Python?
  - Easy to learn and use: Python hides messy low-level details of programming such as variable (type) declaration, memory management, and construction and manipulation of advanced abstract data types like sequences and dictionaries, and lets users focus on programming what they need to do. Some claim this leads to shorter code that is more likely to be correct than that produced in other languages like Java and C (Loui (2008), p. 23).
  - Powerful: Advanced data types like sequences, sets, and dictionaries as well as operations on these data types are part of the core syntax, enabling easy manipulation of nested heterogeneous data. Object-oriented programming with classes, methods, and inheritance is also built in, but need only be used when necessary cf. Java. Custom applications supported by easily-accessible modules, e.g., numerical computing, graphical user interfaces (GUI).
  - Mixable: Built for easy interfacing with code written in other programming languages. This is useful for both creating wrappers around legacy code and developing new applications which exploit the strengths of different programming languages (Python as glue).
  - Free and well-supported: Python is open-source and has a very active development community. In addition to supporting core language evolution, this community is also very active in developing modules for particular applications, e.g., computational biology, numerical computing, natural language processing.
- The major downside of current versions of Python (as with scripting languages in general) is runtime and memory efficiency. However, given increasing computer speed and memory availability, special-purpose Python modules, e.g., NumPy, integrated code components from high-speed and memory-efficient languages like C, and various optimization techniques, these are not insurmountable problems.
- Invoking Python on Unix
  - Using the interpreter (interactive mode).
    - Type "python" and then enter code to run after the ">>>" Python command prompt.
    - Allows quick testing of Python code fragments; can also import larger pieces of code from system files and run them inside the interpreter.
  - Executing Python scripts (program mode)
    - Given a piece of Python code in system file X.py that takes command-line arguments a1, a2, ..., type "python X.py a1 a2 ..." to run X.py on those arguments.
    - Note implicit invocation of interpreter via initial "python" in the command above.
  - In this course, we will focus on running Python by executing scripts as described above. There are other manners of running Python on Unix, e.g., directly executable / compiled Python scripts, as well as under other operating systems such as Windows; for details, see LutzL, Chapters 2 and 3 and Appendix A.
- What Python can do for you: Some sample scripts
  - Example: Printing the square of 10 (sq.py).
    - Note two forms of documentation: line-comments (lines starting with "#") and docstrings (set of lines surrounded by triple quotes).
- This script is simple, but very limited. What if we want to take the squares of one or more command-line arguments? What about squaring numbers stored in a file? And how about some error-checking? We will look at how to handle these issues (and much more) in the next lecture.
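The script sq.py itself is not reproduced in this diary. The following is only a minimal sketch of what such a script might look like, showing the two forms of documentation mentioned above; the variable names are assumptions made for illustration.

```python
"""
sq.py (hypothetical sketch): print the square of 10.

This triple-quoted string is the script's docstring.
"""

# A line-comment: everything after the hash-mark is ignored.
number = 10
result = number * number

# With a single parenthesized argument, this prints the same text
# under the Python 2 print statement and the Python 3 print() function.
print(result)
```

Saved as sq.py, this would be run with "python sq.py".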

Friday, September 9 (Lecture #2)
[Class Notes]

Introduction to Python (Cont'd)
- What Python can do for you: Some sample scripts (Cont'd)
  - Example: Printing the square of a command-line argument (sqarg1.py).
    - Uses system list-variable sys.argv
      - sys.argv[0]: Python script filename
      - sys.argv[i], i > 0: command-line argument i.
    - Note conversion using int() -- this means all command-line arguments are strings by default.
    - Note continuation of command over two lines using "\"-character at end of 1st line.
  - Example: Printing the square of a command-line argument (error-checking) (sqarg2.py).
    - Uses len(sys.argv) (number of elements in list sys.argv).
    - Note that statement-blocks associated with control structure statements like conditionals and loops are denoted by indenting. It takes a bit of getting used to, but this actually turns out to be a lot safer than the traditional begin/end and brace-pair statement-block delimiters.
  - Example: Printing the squares of all command-line arguments (no error-checking) (sqargs1.py).
    - Note interesting for-loop -- is operating over a (sub-)list!
    - What happens if there are no command-line arguments? If a command line argument is not a number?
      - If there are no arguments, the command-line argument sub-list is empty and nothing is done, which is OK. However, if a command-line argument is not an integer, interesting things happen (try entering "python sqargs1.py a").
  - Example: Printing the squares of all command-line arguments (with error-checking) (sqargs2.py).
    - Uses try-except construct; note how much simpler this is than error-handling in other languages, e.g., Java.
  - Example: Printing the squares of all numbers in a single-column file (sqfile1.py) [sqfile1.dat].
    - Once again, the for-loop is iterating over a list -- however, in this case, it is the list of lines in a file.
    - Note how much simpler Python file I/O is than in other languages, e.g., Java.
  - Example: Printing the squares of all numbers in a file (sqfile2.py) [sqfile2_1.dat, sqfile2_2.dat].
    - Note how much simpler splitting lines into space/tab-separated arguments is in Python than Java.
  - Example: Counting the number of occurrences of the word "line" in a file (woc1.py) [woc.dat].
    - Note use of backslash to produce double quotes in output.
    - Note that simple string-comparison operators are built into Python cf. string comparison via compareTo() method in Java.
  - Example: Counting the number of case-sensitive and -insensitive occurrences of a command-line specified word in a file (woc2.py)
    - Note slightly different syntax for intermediate conditional clauses.
  - Example: Multiple transformations of 4-column datafile (i.e., delete 1st column, swap 2nd and 4th columns, scale 3rd column by factor) (filetran.py) [filetran.dat].
    - Note list-assignment statement -- have you ever seen anything like this before?
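The sample scripts named above are not reproduced in this diary. As a rough illustration of the ideas in the sqargs examples (sys.argv, int() conversion, a for-loop over a sub-list, and try-except), here is a sketch under the assumption that one output line is wanted per argument; the function name squares and the message wording are hypothetical.

```python
import sys


def squares(args):
    """Return one output line per command-line argument in args."""
    lines = []
    for arg in args:
        try:
            n = int(arg)    # command-line arguments arrive as strings
            lines.append("%d squared is %d" % (n, n * n))
        except ValueError:
            # int() raises ValueError for a string such as "a"
            lines.append("'%s' is not an integer" % arg)
    return lines


if __name__ == "__main__":
    # sys.argv[0] is the script filename; the slice skips it.
    for line in squares(sys.argv[1:]):
        print(line)
```

With no arguments the sub-list sys.argv[1:] is empty, so the loop body never runs and nothing is printed.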
Having seen some of the neat things Python can do (in remarkably little (and eminently comprehensible) code), let's get a grounding in the fundamentals of Python, starting with documentation and I/O.

Week 2 [LutzL, Sections 9, 10, 12, and 15; Class Notes]

Monday, September 12 (Lecture #3)
[LutzL, Sections 9 and 15; Class Notes]

Basic Python I: Documentation
- The remainder of any line after a hash-mark ("#") is a comment; use this for in-line comments and comment-blocks in the code.
- Convention for scripts is to have triple-quoted string (docstring) at top of file with details on command-line arguments and anything else worth saying about the script, e.g., who wrote it, what it does. Such docstrings can be accessed by Python documentation software; this will be discussed more later in the course, when we get to defining functions and modules.
- Within the Python interpreter, to view the docstring associated with an imported program or function X, type "print X.__doc__".
Basic Python I: Text I/O
- Two types of I/O in Python: text and binary. Focus on text in this course (though binary will be mentioned briefly when we discuss object file storage in Python later in the course).
- Text I/O in Python is line-oriented.
- Can do text I/O wrt keyboard and screen in Python using raw_input() and print.
  - print Format: print arg1, arg2, ... {,}
    - Note that args can be string literals or values of any type in Python (which will automatically be converted to the appropriate strings).
    - print automatically adds a line-return; this can be suppressed using a trailing comma.
  - Function raw_input() returns a line of input from the keyboard with the final line-return stripped off; if a string-literal is given as a function parameter, that string is printed as a prompt.
  - Example: Echo uppercase version of keyboard input to screen (echo.py)
  - As of Python 3.0, print has been replaced by function print() and raw_input() has been replaced by input(); for details, see pages 297-302 and 49-50 in LutzL.
- Wrt files, we can read the lines in a textfile X very nicely now with the for line in open(X): construct -- however, for file writing and appending and the more complex types of file reading, we need to look at general file I/O methods.
  - Note that in pre-3.0 versions of Python, open() in the for-open() construct can be replaced by file(); though the latter is arguably more readable, it will no longer be legal in 3.0+ versions of Python, and hence the former should be used.
- File access commands:
  - f = open(filename) / f = open(filename, "r"): Open file for reading.
  - f = open(filename, "w"): Open file for writing
    - Careful! If file filename already exists, will erase contents and re-open for writing.
  - f = open(filename, "a"): Open existing file for writing to end of that file, i.e., appending.
  - f.close()
    - Python automatically closes all files when a script finishes, so you don't have to do this; however, it is still a good habit to clean up after yourself.
  - f.flush()
    - Write I/O buffer to file
  - open() actually has a third buffer-size argument; setting this to 0 means (in case of writing) given string is immediately written out (can be handy in long-running programs which, if they crash, can lose buffer contents -- however, doing this means your program loses the speed benefits of buffered I/O).
- Note access to a file-object in these commands; this is one of the inbuilt types in Python. We'll look at more of these types (namely, strings and the various types of numbers) a week or so from now.
- File read commands:
  - line = f.readline(): Read and return line from file.
    - Returns null / 0-length string "" at end of file.
  - lines = f.readlines(): Read and return file as list of lines.
  - text = f.read(): Read and return file as a string (which may contain multiple lines).
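The file access and read commands above can be tried out together as follows. This is a sketch only: the scratch file name demo.txt and the temporary directory are assumptions, and the code is written to behave the same under Python 2.6+ and 3.x.

```python
import os
import tempfile

# Hypothetical scratch file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

f = open(path, "w")       # "w" creates the file (or erases an existing one)
f.write("first line\n")
f.write("second line\n")
f.close()

f = open(path, "a")       # "a" appends to the end of the file
f.write("third line\n")
f.close()

f = open(path)            # default mode is "r" (reading)
first = f.readline()      # one line, terminator included
rest = f.readlines()      # the remaining lines, as a list
at_end = f.readline()     # "" signals end of file
f.close()

f = open(path)
whole = f.read()          # the entire file as one string
f.close()
```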

Wednesday, September 14 (Lecture #4)
[LutzL, Sections 9, 10, and 12; Class Notes]

Basic Python I: Text File I/O (Cont'd)
- All the preceding talk of lines in files raises the following question: what is a line in Python?
  - A line is a character-string with a special terminator-character (or characters).
  - Line-terminators in Python include '\n' (newline), '\r' (carriage return), and '\r\n' (Microsoft end-of-line); end-of-file is not a character but is signalled by a returned empty string.
  - When Python reads a line from a file, it reads characters until it encounters a terminator and it includes that terminator at the end of the returned line (can get rid of this with rstrip(); this is particularly handy when using print, which adds its own newline).
- Example: Print file to screen using for line in file(X): construct (cat0.py)
- Example: Print file to screen using readline() (cat1.py)
  - Note use of while-loop
  - Python does not allow assignment statements inside conditions; hence, cannot use classic C / Java construct while (line = f.readline()) ...
- Example: Print file to screen using readlines() (cat2.py)
- Example: Print file to screen using read() (cat3.py)
- File write commands:
  - f.write(s): Write string s to file.
  - print >> f, e: Write string specified by expression e to file f.
- Until we look at string formatting commands later in this course, the print >> f command is a great shortcut for printing stuff to files.
- Example: File-copy using readlines() and write() (copy1.py)
- Example: File-copy using readlines() and print >> (copy2.py)
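The cat and copy scripts named above are not shown in this diary. The sketch below combines their main ideas (a readline() while-loop, the empty-string end-of-file test, rstrip(), and write()) under assumed file names; copy_lines is a hypothetical helper. It uses f.write() rather than print >> f so that it also runs under Python 3.

```python
import os
import tempfile


def copy_lines(src, dst):
    """Copy text file src to dst line by line; return the stripped lines."""
    seen = []
    fin = open(src)
    fout = open(dst, "w")
    line = fin.readline()
    while line != "":                # "" means end of file
        fout.write(line)             # line still carries its terminator
        seen.append(line.rstrip())   # drops the terminator and trailing whitespace
        line = fin.readline()
    fin.close()
    fout.close()
    return seen


# Small demonstration on hypothetical scratch files.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "in.txt")
dst = os.path.join(workdir, "out.txt")
f = open(src, "w")
f.write("alpha\nbeta\n")
f.close()
stripped = copy_lines(src, dst)
```

Because assignment is not allowed inside a condition, the readline() call appears twice: once before the loop and once at the bottom of the loop body.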
Basic Python I: Control Structures
- General notes
  - Syntax: header: block
  - Note that a header-line always terminates with a full colon (":").
  - A block is a sequence of statements, all indented the same.
  - Control structures can be nested -- indenting can vary by block, but it must be the same within each block.
  - Can have multiple statements on one line -- separate them by a semi-colon.
    - Do not need statement terminator semicolons -- Python will let you do this, but it is not part of the Python aesthetic, i.e., Python is not C / C++ / Java.
  - How can you have one statement over multiple lines?
    - Old way: Use backslashes at end of lines to be continued.
    - New (and preferred) way: Enclose the statement in parentheses, or let it be part of an expression continued via a comma, e.g., a print-statement argument list, or inside square brackets, e.g., a list-specification.
- Conditional branching: The if-elif-else statement
  - Syntax: if (condition): block {elif (condition2): block2 ...} {else: blockN}
  - Conditions are Boolean expressions, i.e., expressions (unlike arithmetic expressions) whose values are either true or false.
    - Consists of one or more Boolean expressions connected by logical operators (not, and, or).
    - Boolean expressions can be literals (True or False; more generally, non-zero values count as true and zero as false) or relational expressions, e.g., s < 10, tag == "AUT".
    - Lutz recommends that parentheses should only be used where absolutely necessary -- however, this can compromise code readability, and is in my opinion an example of pushing the Python aesthetic too far.
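As an illustration of the if-elif-else syntax and of conditions built with logical operators, here is a small sketch. The function name letter_grade and the grade cut-offs are assumptions made up for this example, and it is wrapped in a function (a topic covered later in the course) only so that it is easy to try with different values.

```python
def letter_grade(mark):
    """Map a numeric mark to a letter; the cut-offs are illustrative only."""
    if (mark < 0) or (mark > 100):
        return "?"        # out-of-range mark
    elif mark >= 80:
        return "A"
    elif mark >= 65:
        return "B"
    elif mark >= 50:
        return "C"
    else:
        return "F"
```

The clauses are tested from top to bottom and only the first one whose condition is true has its block executed.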

Friday, September 16 (Lecture #5)
[LutzL, Sections 12 and 13; Class Notes]

Basic Python I: Control Structures (Cont'd)
- Conditional branching: The if-elif-else statement (Cont'd)
  - Example: Classifying food items (version #1) (produce1.py)
  - Example: Computing numeric to letter grades (version #1) (grades1.py) [grades.dat]
  - Example: Computing numeric to letter grades (version #2) (grades2.py)
    - Checks for out-of-range grades and uses nice range-check syntax construct.
  - Example: Computing numeric to letter grades (version #3) (grades3.py)
    - Implements efficient version (to account for classes in which most grades are low).
  - One can construct arbitrarily complex conditional branchings using (possibly nested) if-elif-else statements. However, general conditional branching is sometimes overkill. For example, a common (and simple) structure in programs is to set a variable to one of two values depending on whether a specified condition is true or false, e.g., setting x to a if the condition holds and to b otherwise.
    To handle this case succinctly, Python 2.5 introduced the conditional expression, which has the form x = a if condition else b.
- Deterministic looping: The for statement (the partial story)
  - Syntax: for x in iterable-thingie: block
  - By iterable-thingie, we mean anything that is a set of elements that you can iterate over. We already know we can do it with lists and textfiles. As we will see in lectures to come, this is also true for many other structures in Python, e.g., strings, dictionaries.
  - This is all well and good; however, suppose I actually want a for-loop over an index, going from a lower to an upper point by some increment?
    - Try for x in range({lower,} upper + 1 {, increment}) (if you are dealing with integers).
    - Example: Classifying food items (version #2) (produce2.py)
    - Example: Evaluating nested summation expressions (evalsum.py)
    - A for-loop over a floating-point index is best coded as a while-loop.
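The conditional expression, the chained range-check, and the range()-based index loop can be sketched as below. All names and values are assumptions chosen for illustration.

```python
mark = 72

# Conditional expression: one of two values, chosen by a condition.
status = "pass" if mark >= 50 else "fail"

# Chained relational operators give a compact range-check.
in_range = 0 <= mark <= 100

# Index loop from lower to upper (inclusive) in steps of increment.
lower, upper, increment = 1, 10, 3
visited = []
for x in range(lower, upper + 1, increment):
    visited.append(x)

# A loop over a floating-point index, written as a while-loop.
t = 0.0
steps = 0
while t < 1.0:
    steps += 1
    t += 0.25
```

range() stops just before its upper argument, which is why upper + 1 is needed to include the upper bound itself.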

Week 3 [LutzL, Sections 4, 5, 7, and 13; Class Notes]

Monday, September 19 (Lecture #6)
[LutzL, Sections 4, 5, and 13; Class Notes]

Basic Python I: Control Structures (Cont'd)
- Deterministic looping using the for-loop is all well and good -- however, what if you want to stop partway through a loop's execution? Unlike C or Java, modification of a loop index-variable cannot be used to terminate execution, and you cannot add a condition to the for-loop.
- Example: Finding an element in a list (Version #0: for-loop) (find0.py)
- In these cases, we need the traditional while-loop (which, given the excessive flexibility of for-loops in C and Java and the resulting potential for unreadable code, is perhaps a good thing).
- Conditional looping: The while statement (the partial story)
  - Syntax: while condition: block
  - Useful if you don't know how many times you are looping, or if you want to stop partway through a process; implement the latter using a Boolean variable and a more complex while-termination condition.
  - Example: Finding an element in a list (Version #1) (find1.py)
- This approach to partial-looping is well-known and standard, but it is wordy and hence prone to implementation error. Can we do better?
- Loop-execution modification: The break and continue statements
  - The break and continue statements are very restricted versions of the dread goto statement, and are arguably the last gasp of goto in modern programming language design.
  - The continue statement inside a loop causes execution to ignore the statements below the continue in the loop body and jump directly into the next loop iteration -- can save a level of if-nesting.
  - Example: Counting and summing the positive non-zero integers in a file (Version #1) (sumfile1.py) [sumfile1.dat]
  - The break statement inside a loop causes execution to jump to the first statement after the loop, i.e., it stops loop iteration cold -- in a while-loop, can save a level of if-nesting and extra conditions in the loop-condition designed to stop iteration partway through (and in the case of for-loops, is the only thing that can stop iteration short of sys.exit()).
    - In the case of nested loops, a break only terminates the innermost loop in which it is nested -- the other loops still continue on.
  - Example: Finding an element in a list (Version #2) (find2.py)
  - The break and continue statements are handy, but they make code less readable and there are valid alternatives in the language -- hence, they probably shouldn't be used that much and when they are, they should be used with care.
- Often when using break to stop a loop, we need to know afterwards whether or not break was executed during the loop, e.g., was the requested item found in the list or not? As we saw above, this can be simulated with a Boolean variable and some additional if-logic. Can we do better?
- Looping (the full story): The else clause
  - The block associated with a loop else-clause only executes if the loop terminated normally, i.e., without a break being executed.
  - Example: Finding an element in a list (Version #3a: while-loop) (find3a.py)
  - Example: Finding an element in a list (Version #3b: for-loop) (find3b.py)
  - Perhaps this is how we get rid of the break statement for once and for all -- however, should elimination of one piece of special-purpose language syntax be done by introducing yet another piece of special-purpose language syntax? This is something for you future script-language designers to mull over.
- Doing nothing at all: The pass statement
  - Yes, this is the statement that does nothing at all -- why bother?
    - To satisfy Python syntax, e.g., catching an exception and doing nothing about it.
    - As a placeholder for code that will be written later.
  - Example: Counting and summing the positive non-zero integers in a file (Version #2) (sumfile2.py) [sumfile2.dat]
  - Like break and continue, a handy statement on occasion, but it probably shouldn't be used that much, and when it is, it should be used with care.
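The find and sumfile scripts are not reproduced here. The sketch below shows, under assumed names, the three techniques discussed above: a while-loop with a Boolean flag, a for-loop with break and a loop else-clause, and continue. It works on in-memory lists rather than files to stay short.

```python
def find_with_flag(items, target):
    """While-loop with a Boolean flag; returns the index or -1."""
    i = 0
    found = False
    while (i < len(items)) and (not found):
        if items[i] == target:
            found = True
        else:
            i += 1
    if found:
        return i
    return -1


def find_with_else(items, target):
    """For-loop with break and a loop else-clause."""
    for i in range(len(items)):
        if items[i] == target:
            result = "found at index %d" % i
            break
    else:
        # Reached only if the loop finished without executing break.
        result = "not found"
    return result


def sum_positive(values):
    """Count and sum the positive values, skipping the rest via continue."""
    count = 0
    total = 0
    for v in values:
        if v <= 0:
            continue        # jump straight to the next iteration
        count += 1
        total += v
    return count, total
```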
Basic Python I: Object-Types
- General Notes
  - In Python, types are associated with objects, not variables, e.g., What is "Bob"?
  - It is critical to make distinctions between objects, references (to objects), and variables (which hold references to objects), particularly when dealing with more complex types like lists and dictionaries.
  - There are ways of determining the type of a variable X, e.g., type(X), isinstance(X, type); however, this is frowned upon in Python.
- None
  - Corresponds to an undefined value.
  - As variables do not have types, useful if you want to see if a variable has been assigned a value yet.

Wednesday, September 21 (Lecture #7)
[LutzL, Sections 4 and 5; Class Notes]

Basic Python I: Object-Types (Cont'd)
- Numbers
  - Is actually a collection of types:
    - Integers (32-bit ("short") / infinite digit (long))
    - Floating-Point (64-bit double precision)
    - Complex Numbers, e.g., 5 + 4j
      - For complex number c, use c.real and c.imag to access the real and imaginary parts of c, respectively.
    - Boolean (True (non-zero), False(0))
  - Long integers particularly handy in certain applications.
  - Example: Counting the number of unrooted binary trees on n leaves (Version #1) (numbtree1.py)
  - Literals
    - Regular integer, e.g., -7, 0, 146
    - Long integer, e.g., -9999999991119999L, 100000000000000000002L
    - Regular floating-point, e.g., -0.727, 54.7, 0.0
    - Scientific (exponential) notation, e.g., -7e+0, 4.55e+200, 1.1111e-27
  - Operators
    - Symbolic: N1 + N2, N1 - N2, N1 * N2, N1 / N2, N1 ** N2 (exponentiation), N1 % N2 (modulus)
      - For any operator x above, can have x=, which is useful if we want to replace one of the operands in that operation by the result of that operation, e.g., x += y.
      - Beware! Division (/) in pre-3.0 versions of Python is argument-dependent (classic division) -- if float on either side, does true division but if integers on both sides, does floor division, e.g., 5 / 2 = 2, 5 / -2 = -3.
      - Phased in special floor-division operator (//) as of version 2.2; in 3.0+ versions, / will do true division regardless of arguments (see LutzL, pp. 117-121 for details).
        
        To start using Python 3.0+ division in Python 2.6 scripts, put the statement from __future__ import division at the top of your program.
    - Relational: Standard set from C / C++ / Java + is
      - is-operator checks reference-equality as opposed to value-equality; this often gives the same answer for numbers (e.g., small integers) but should not be relied upon, and we will see clear differences later on.
      - Note that relational operators can be chained.
    - Functions: abs(N), divmod(N1, N2), pow(N1, N2), round(N)
      - Note double functionality of round(); can round to closest integer (round(N)) or to nearest number of digits (round(N, numdigits))
        
        The latter does not truncate digits if the result cannot be realized in fixed-length floating point representation, e.g., round(1.464,1) = 1.5 but round(1.426, 2) = 1.42999999999 ...
      - Other functions are available in special-purpose math modules; we will talk more about these later in the course.
  - Numerical Conversions
    - Implicit: When two numeric arguments of different type are in an expression, arguments are converted where possible such that no information (magnitude / # digits) is lost, i.e., integer goes to long integer goes to floating-point goes to complex. Otherwise, explicit conversion is required.
    - Explicit: complex(N), float(N), int(N), long(N), bool(N)
      - When using int() or long() to convert strings, can use optional second radix argument, e.g., int("101") = 101 but int("101", 2) = 5.
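A few of the numeric operators, functions, and conversions above can be checked directly. This sketch avoids the plain integer / integer case (whose result differs between Python 2 and 3) by using // and a float operand, so it should give the same values under both; the variable names are assumptions.

```python
floor_pos = 5 // 2            # floor division: 2
floor_neg = 5 // -2           # floors toward negative infinity: -3
true_quotient = 5 / 2.0       # a float operand forces true division: 2.5

quotient, remainder = divmod(17, 5)   # (3, 2)
big = 2 ** 100                # exceeds 32 bits; handled as a long integer

chained = 1 < 2 < 3           # relational operators can be chained

c = 5 + 4j
parts = (c.real, c.imag)      # real and imaginary parts as floats

mixed = 3 + 0.5               # integer implicitly converted to float
from_decimal = int("101")     # 101
from_binary = int("101", 2)   # radix 2: 5
```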

Friday, September 23 (Lecture #8)
[LutzL, Sections 5 and 7; PyCook, Sections 3.12-3.14; Class Notes]

Basic Python I: Object-Types (Cont'd)
- Numbers (Cont'd)
  - In addition to the inbuilt numeric types, Python can support others via import -- most important (and to my knowledge, only current) one of these is Decimal.
    - Stores numbers as strings of decimal digits rather than in binary.
    - Excellent for representing fixed-precision currency calculations or, more importantly, representing quantities that cannot be represented exactly in binary, e.g., 0.1 + 0.1 + 0.1 - 0.3 = 5.5511151231257827e-17 in Python floating-point arithmetic.
    - Not used in general because of speed penalty (typically more than 1000x slower than corresponding floating-point applications).
  - Example: A simple adding machine (Version #1: floating-point) (addmach1.py)
    - Note use of raw_input() function to grab user input line from the keyboard.
  - Example: A simple adding machine (Version #2: Decimal) (addmach2.py)
- Strings
  - Strings are immutable sequences of 8-bit (ASCII) / 16-bit (Unicode) characters.
  - Literals
    - Single-quoted -- can contain double quotes / special characters indicated by backslash.
    - Double-quoted -- can contain single quotes / special characters indicated by backslash.
    - Triple double-quoted -- can contain any type of quote / can continue over several lines.
    - Unicode (u'...').
    - Raw (r'...') -- backslash disabled.
      - Useful for embedding special characters, e.g., \t, \n in strings for subsequent printing and interpretation.
  - Create regular and Unicode strings using str() and unicode().
  - String treated as list of characters -- can access individual positions (x[i]) or obtain substrings or subsequences by slices.
    - Slices operate like traditional for-loop indices.
      - x[i:j] -- substring starting at index i and ending just before index j.
      - Negative indices start at the end of the string and work backwards.
      - x[i:j:k] -- subsequence starting at index i (increment by k) and ending just before index j.
        
        Good for subsequences (x[::3] (every third element), x[0::2] (odd-position elements), x[1::2] (even-position elements)) or reversing strings (x[-1::-1]).
    - Strings are immutable -- hence, cannot change string element, e.g., s[2] = 'x' will give an error.
  - Example: Finding a substring in a file (Version #1: slices) (findsub1.py)
  - There are so many string operators in Python that you really should look around and see if there's one that does what you want before you write any string-operation yourself.
  - Operators
    - Symbolic: S1 + S2 (concatenation), S * N (repetition), S1 in S2 (substring detection)
    - Example: Creating repetitions of a given string (Version #1: +) (stringreps1.py)
    - Example: Creating repetitions of a given string (Version #2: *) (stringreps2.py)
    - Example: Finding a substring in a file (Version #2: in) (findsub2.py)
    - Relational: Standard set from C / C++ / Java + is
      - Relational comparisons done relative to lexicographic (phone book) order.
      - is-operator checks reference-equality as opposed to value-equality; this can vary depending on string-length, because internal Python optimization stores short strings as references to a single object but long strings as separate objects (LutzL, p. 187).
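The slice forms and symbolic string operators described above can be tried on a short sample string. The string and variable names below are assumptions for illustration.

```python
s = "abcdefgh"

middle = s[2:5]          # from index 2 up to (not including) index 5
tail = s[-3:]            # negative index counts back from the end
every_third = s[::3]
odd_positions = s[0::2]  # 1st, 3rd, 5th, ... characters
even_positions = s[1::2] # 2nd, 4th, 6th, ... characters
backwards = s[-1::-1]    # reversed copy

repeated = "ab" * 3              # repetition
joined = "ab" + "cd"             # concatenation
has_cd = "cd" in s               # substring detection
ordered = "apple" < "banana"     # lexicographic comparison

# Strings are immutable: assigning to a position raises TypeError.
try:
    s[2] = "x"
    changed = True
except TypeError:
    changed = False
```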

Week 4 [LutzL, Sections 7-9; Class Notes]

Monday, September 26 (Lecture #9)
[LutzL, Section 7; Class Notes]

Basic Python I: Object-Types (Cont'd)
- Strings (Cont'd)
  - Operators (Cont'd)
    - Function: (Cont'd)
      - S1.find(S2[, start [,end]]): returns starting index of S2 in S1 or -1 if S2 not found in S1.
      - S1.count(S2): returns number of occurrences of S2 in S1.
      - S.replace(old,new [, count])
      - Example: Replacing all occurrences of word x in a file with word y (replaceword.py) [replaceword.dat]
        
        Try python replaceword.py replaceword.dat frog X with X = "dog", "cat", and "car" for progressively weirder haikus.
      - Note that as strings are immutable, any of these functions that appear to change a string actually return changed copies of that string!
      - S.split({string})
      - Example: Extract and print employee last names in a data file (lastname.py) [lastname.dat]
      - S.partition(string), S.rpartition(string)
      - S.strip(), S.rstrip()
      - S.capitalize()
      - S.lower(), S.upper()
      - S.isdigit(), S.isalpha(), S.islower(), S.isupper(), S.isspace()
      - Example: Determine the case of all letters in a file (filecase.py) [fcLower.dat, fcUpper.dat, fcMixed.dat]
      - S.startswith(string-tuple), S.endswith(string-tuple)
      - ... and many others ...
      - Many of these functions are also available in the string module; however, this module will vanish in Python 3.0 (as all its functions are now part of the Python language proper), so you should not write any more code using it (and should change any code you inherit to not use it).
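Several of the string methods listed above are exercised in the sketch below. The sample strings are made up for this example (they are not the contents of the course data files), and the variable names are assumptions.

```python
haiku = "old pond, a frog jumps in"

where = haiku.find("frog")            # starting index of "frog"
missing = haiku.find("toad")          # -1 when not found
how_many = haiku.count("o")
swapped = haiku.replace("frog", "dog")   # returns a changed copy

words = "a b\tc".split()              # split on runs of whitespace
fields = "x,y,z".split(",")           # split on a given separator
pieces = "Smith, Jane".partition(", ")   # (before, separator, after)

both = "  padded \n".strip()          # strip both ends
right = "  padded \n".rstrip()        # strip the right end only

cap = "python".capitalize()
digits = "42".isdigit()
is_source = "data.py".endswith((".py", ".txt"))   # tuple of suffixes
```

Note that haiku itself is unchanged after the replace() call, since strings are immutable.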

Wednesday, September 28 (Lecture #10)
[LutzL, Section 7; Class Notes]

Basic Python I: Object-Types (Cont'd)
- Strings (Cont'd)
  - Syntax modifications
    - Strings are iterable-thingies; hence, can use them in for-loops and iterate over the characters from front to back, e.g. for c in line: ...
    - Example: Counting the number of times a particular character occurs in a string (Version #2: find) (numchar1.py)
    - Example: Counting the number of times a particular character occurs in a string (Version #1: for-loop) (numchar2.py)
  - String formatting expressions
    - Analogous to string-construction using printf in C and C++.
    - Form: "... %x ..." % argument-tuple
    - String x in the above can take on a wide variety of values, indicating not only the type to be printed but, perhaps more importantly, its formatting in terms of digits, spacing, and justification.
    - Commonly-used types: d (decimal), i (integer), f (float), e (exponential/scientific notation), s (string), c (character)
    - Formats (L = field length, D = number of digits after the decimal point):
      - decimal: %{+}{-|0}{L}d
      - integer: %{+}{-|0}{L}i
      - float: %{+}{-|0}{L}.{D}f
      - exponential: %{+}{-|0}{L}.{D}e
      - string: %{-}{L}s
    - Example: Counting the number of unrooted binary trees on n leaves (Version #2) (numbtree2.py)
      - Displays as long as value can be stored as a float.
      - Need way to compactly display really long integers using scientific notation. Not yet in Python (another sign of numeric handling in transition), but we can simulate it in code.
    - Example: Counting the number of unrooted binary trees on n leaves (Version #3) (numbtree3.py)[Courtesy of Jason Gedge]
  - String formatting method calls (LutzL, pp. 183-193)
    - Introduced in Python 2.6.
    - Has more Python-particular manner of doing string formatting; however, is not clearly better to use in current version.
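The formatting expressions above can be checked on a few sample values, with one string formatting method call for comparison. The values and variable names are assumptions for illustration.

```python
padded = "%5d" % 42              # right-justified in a field of length 5
left = "%-5d|" % 42              # left-justified
zeros = "%05d" % 42              # zero-padded
signed = "%+d" % 42              # explicit sign
fixed = "%8.3f" % 3.14159        # field length 8, 3 digits after the point
sci = "%.2e" % 12345.678         # exponential notation
text = "%-6s|" % "ab"            # left-justified string

# Several values are supplied as a tuple.
by_expression = "%s has %d items" % ("list", 3)

# The equivalent string formatting method call (Python 2.6 and later).
by_method = "{0} has {1} items".format("list", 3)
```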

Friday, September 30 (Lecture #11)
[LutzL, Sections 8 and 9; Class Notes]

Basic Python I: Object-Types (Cont'd)
- Sequences
  - All sequences are heterogeneous, i.e., the types of the elements can be different (and even be other sequences (see below)).
  - Changeable sequences are called lists, and immutable sequences are called tuples.
  - Literals
    - Empty list: []
    - List: [x1, x2, ..., xn]
    - Empty tuple: ()
    - Tuple: (x1, x2, ..., xn)
      - Single-element tuple denoted (x1,) (to avoid interpretation as x1 with surrounding parentheses).
    - Can be split across multiple lines (and frequently are).
    - Can include lists as elements, creating nested lists; however, if specified as variables, can have unexpected problems (see notes on list representation below).
  - Type conversion: Can convert a list to a tuple using tuple(), and can convert a tuple to a list using list().
  - Can use slice-syntax (see notes on Strings above) to access elements and sublists.
    - In the case of lists, can use slices to change elements or sublist via assignment -- moreover, the lists on the two sides of the assignment do not need to be the same length! (but they must both be lists (including strings))
    - Multiple indices allow access to nested-sequence elements.
    - Example: 2-D integer matrices as doubly-nested lists of integers.
    - Example: sparse 2-D integer matrices as doubly-nested lists.
    - Example: sparse 2-D binary matrices as lists of tuples.
    - Example: employee records as nested lists.
    - Note that two different errors can occur when list indices are out of range -- IndexError (if there is a valid sublist at the requested index-level but the index is out of range in that sublist) and TypeError (if there is no sublist at the requested index-level).
  - Operators:
    - Symbolic: L1 + L2, L * N, X in L, del L[Slice]
      - Note that + and * concatenate the top-level elements of their argument-lists rather than nesting the argument-lists themselves; can get around this by enclosing argument-lists in square brackets (thus effectively adding a level of nesting).
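
A minimal sketch of the slice-assignment, nested-indexing, and concatenation notes above. The variable names are illustrative and not taken from the class examples, and the print() calls assume Python 3.

```python
# Slice assignment: the two sides need not have the same length.
L = [1, 2, 3, 4, 5]
L[1:3] = ["a", "b", "c"]       # replace 2 elements with 3
print(L)                       # [1, 'a', 'b', 'c', 4, 5]

# 2-D integer matrix as a doubly-nested list; multiple indices reach elements.
M = [[1, 2, 3],
     [4, 5, 6]]
print(M[1][2])                 # 6

# Two different errors for bad indices into nested lists.
errors = []
try:
    M[5][0]                    # M is a list, but it has no index 5
except IndexError:
    errors.append("IndexError")
try:
    M[0][0][0]                 # M[0][0] is an integer, not a sublist
except TypeError:
    errors.append("TypeError")
print(errors)                  # ['IndexError', 'TypeError']

# + and * join top-level elements; extra brackets add a level of nesting.
flat = [1, 2] + [3, 4]         # [1, 2, 3, 4]
nested = [[1, 2]] + [[3, 4]]   # [[1, 2], [3, 4]]
repeated = [1, 2] * 2          # [1, 2, 1, 2]
print(flat, nested, repeated)
```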

Week 5 [LutzL, Sections 8 and 9; Class Notes]

Monday, October 3

Class Exam #1 Notes
I've finished making up the first in-class exam. The exam will be closed-book. It will be 50 minutes long and has a total of 50 marks (this is not coincidental; I have tried to make the number of marks for a question equivalent to the number of minutes it should take you to do it). There will be one question worth 22 marks which involves writing two short fragments of Python code and two questions worth 14 marks apiece which involve the writing of two short Python scripts. Topics include all material covered up to and including last Wednesday's lecture. You may find the following of some help:
- Class exam #1 (Fall 2010) (4 pages: PDF)
- Answers for class exam #1 (Fall 2010) (4 pages: PDF)
- Class exam #1 (Fall 2009) (4 pages: PDF)
- Answers for class exam #1 (Fall 2009) (3 pages: PDF)
I hope the above helps, and I wish you all the best of luck with this exam.

Monday, October 3 (Lecture #12)
[LutzL, Sections 8 and 9; Class Notes]

Basic Python I: Object-Types (Cont'd)
- Sequences (Cont'd)
  - Operators (Cont'd):
    - Symbolic: (Cont'd)
      - Note that in operates in a "shallow" manner, in that it will only search in the immediate list-elements, and not dive into sublists.
      - Example: Search in integer-list
      - Example: Search in sparse 2D binary matrix list, cf., sparse 2D integer matrix list.
    - Relational: Standard set from C / C++ / Java + is
      - Relational comparisons done relative to lexicographic (phone book) order, invoking type-appropriate relational operator results for individual element-pairs; moreover, if list-elements are both lists, comparison is done recursively, i.e., list comparison is deep!
      - is-operator checks reference-equality as opposed to value-equality; does not depend on list length, cf. strings.
    - Function
      - Sequence attribute: len(L)
      - Change list:
        
        L.append(X), e.g., L.append(7), L.append([7, 8]) (append item X to end of list L)
        L1.extend(L2) (append elements of list L2 to the end of list L1)
        L.pop() (remove and return last element of L)
        
        Treats L as stack, with end of list being top of stack.
        
        L.index(X) (return index of first occurrence of X in L if X is in L and ValueError exception otherwise)
        
        Note inconsistency with behavior of find() in strings.
        
        L.count(X) (returns number of occurrences of X in L (shallow))
        L.sort(), L.reverse() (see below)
        Note that append(), extend(), pop(), sort(), and reverse() change list L in place, and all of these except pop() return None; hence, assigning the result to L is not necessary (and indeed dangerous, as it erases contents of L).
      - Return changed copy of sequence: sorted(S)
      - Note that index() and reverse() are shallow (in that they operate only on the top-level elements in a list), but sort() and sorted() are deep in that the comparison used to order the top-level elements operates in a recursive manner, diving into sublists as necessary. The latter holds because the relational operators on lists operate in a deep manner!
      - Example: Printing entries in a sparse 2D integer matrix by co-ordinate order.
      - Example: Sorting list of employee records by an arbitrary field (Version #1).
      - Combine sequences into list of tuples: zip(S1, S2, ...)
      - Iterate a specific function over a sequence: min(S), max(S), sum(S)
        
        All these functions are shallow.
        As Python allows comparison of arbitrary types, min() and max() will produce results (albeit unexpected ones) for nested sequences. However, sum() returns TypeError on nested sequences.
      - Iterate arbitrary functions over sequences: map(F, S)
      - Example: Converting a line into a list of integers.
      - Creating a string from a sequence: Str.join(S-Str)
      - Example: Converting a list of integers into a line.
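
The list operations from this lecture can be tried out with a sketch like the one below. The data are made up for illustration, and the code assumes Python 3, where map() and zip() are lazy and need list() to produce actual lists.

```python
L = [3, 1, 2]
L.append([7, 8])               # adds ONE element, which is itself a list
L.extend([7, 8])               # adds TWO elements
print(L)                       # [3, 1, 2, [7, 8], 7, 8]
print(8 in [1, [8]])           # False: in does not dive into sublists
top = L.pop()                  # treats end of list as top of stack
print(top, L.count(7))         # 8 1  (count is shallow as well)

# In-place methods return None, so never assign their result back to the list.
nums = [3, 1, 2]
result = nums.sort()
print(result, nums)            # None [1, 2, 3]

# index() raises ValueError on a miss, whereas string find() returns -1.
try:
    nums.index(99)
    missing = False
except ValueError:
    missing = True
print(missing, "abc".find("z"))   # True -1

# Ordering of nested lists uses deep (recursive) comparison.
pairs = sorted([[2, 1], [1, 9], [1, 3]])
print(pairs)                   # [[1, 3], [1, 9], [2, 1]]

# zip(), map(), and join(): line -> list of integers -> line.
zipped = list(zip("ab", [1, 2]))        # [('a', 1), ('b', 2)]
line = "10 20 30"
ints = list(map(int, line.split()))     # [10, 20, 30]
back = " ".join(map(str, ints))         # '10 20 30'
print(zipped, ints, sum(ints), back)
```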

Wednesday, October 5

Class Exam #1

Friday, October 7

Lecture cancelled

Week 6 [LutzL, Sections 8, 9, and 14; Class Notes]

Monday, October 10

Midterm break; no lecture

Wednesday, October 12 (Lecture #13)
[Class Notes]

Went over answers for Class Exam #1.

Friday, October 14 (Lecture #14)
[LutzL, Sections 8, 9, and 14; Class Notes]

Basic Python I: Object-Types (Cont'd)
- Sequences (Cont'd)
  - Syntax modifications:
    - Iteration over sequences: for X in L: ...
    - Sequence assignment: X1, X2, ... = L
      - List L may be nested; however, number of top-level elements in L must be the same as the number of variables on the left-hand side.
      - If Python 3.0+, the right-hand side can be longer than the left-hand side if the left-hand side includes a single starred variable; those elements not assigned to other variables on the left-hand side will be placed in a list which is assigned to that starred variable.
    - String formatting over tuple of arguments: Str % T, e.g., "X%02d%5s" % (2, "spam")
      - Length of tuple T must exactly equal the number of targets in Str.
  - List comprehensions
    - Are actually sequence comprehensions.
    - Syntax: [op(x) for x in S {if cond(x)}]
      - x can itself be a list, as long as op operates on lists, e.g., [(x, y + 1) for x, y in L if y >= 2].
    - Example: Sorting list of employee records by an arbitrary field (Version #2).
    - Example: Counting the number of occurrences of a command-line specified word in a file (woc3.py)
    - Example: Counting and summing the positive non-zero integers in a file (sumfile3.py)
    - Can run much faster than loop-version, as it is executing a construct directly in the interpreter rather than being interpreted on a statement-by-statement basis.
  - List generators
    - Though they are fast, list comprehensions may be costly in space as they must generate the whole list before any subsequent operation applied to that list is invoked.
    - In situations where memory space is at a premium, convert a list comprehension to a list generator by changing the enclosing square brackets to parentheses. This generator will only create the list elements one at a time, saving space, and can be used anywhere that the list comprehension was simply iterated over once, e.g., in a for-loop or as an argument to sum() (a generator cannot be indexed, sliced, or reused).
  - List representations in Python, and why you should care
    - Lists are implemented as arrays of references; hence, changes made to an object via one variable and its reference also show up for all variables that reference that object, e.g., if L1 = [1, 2, 3] and L2 = [4, L1, 6], executing L1[1] = 7 changes both L1 and L2.
    - How then, do you copy lists?
      - Call to list() function, i.e., L2 = list(L1) [shallow]
      - Slice-copy, i.e., L2 = L1[:] [shallow]
      - Deep-copy, i.e., L2 = copy.deepcopy(L1) [requires import of copy module]
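
The sequence-assignment, comprehension, generator, and copying notes from this lecture can be sketched as follows. All names and data are illustrative; the starred assignment requires Python 3, as noted above.

```python
import copy

# Sequence assignment, including the Python 3 starred form.
first, *rest = [1, 2, 3, 4]
print(first, rest)                       # 1 [2, 3, 4]

# List comprehension over a list of pairs, with a filter.
L = [("a", 1), ("b", 2), ("c", 5)]
bumped = [(x, y + 1) for x, y in L if y >= 2]
print(bumped)                            # [('b', 3), ('c', 6)]

# Same computation as comprehension (whole list built) and generator (one element at a time).
total_list = sum([n * n for n in range(5)])
total_gen = sum((n * n for n in range(5)))
print(total_list, total_gen)             # 30 30

# Lists hold references: a change made through L1 shows up inside L2.
L1 = [1, 2, 3]
L2 = [4, L1, 6]
L1[1] = 7
print(L2)                                # [4, [1, 7, 3], 6]

# Shallow copies share nested objects; a deep copy does not.
shallow = list(L2)                       # L2[:] behaves the same way
deep = copy.deepcopy(L2)
L1[0] = 99
print(shallow[1][0], deep[1][0])         # 99 1
```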

Week 7 [LutzL, Sections 8 and 9; Class Notes]

Monday, October 17 (Lecture #15)
[Class Notes]

Basic Python I: Object-Types (Cont'd)
- Sets
  - Sets are mutable unordered heterogeneous collections of hashable-type objects in which two objects of equal value cannot occur, i.e., sets in Python are not multisets. Frozensets are immutable sets.
    - "Hashable" essentially means that there is a hash-function associated with that type which can produce an index-value for any object of that type. These values are used for very fast lookup.
    - All immutable types are hashable; note that this includes tuples (provided their elements are themselves hashable) but not lists (and frozensets but not sets).
  - Prior to Python 3.0, sets are created by calling set() or frozenset() with a list (or an expression that produces a list) as an argument.
    - If a string is passed in as the argument to set() or frozenset() , it creates the set of all unique characters in the string!
    - Python 3.0+ introduces set literals using curly braces, e.g., S = {1, 3, "Bob", (1, 4)} (note that a list such as [1, 4] cannot be a set element, as lists are not hashable).
  - Type conversion: Can convert a set to a frozenset using frozenset(), and can convert a frozenset to a set using set().
  - Operators:
    - Symbolic: X in S (membership), S1 | S2 (union), S1 & S2 (intersection), S1 - S2 (difference), S1 ^ S2 (symmetric difference = (union of S1 and S2) - (intersection of S1 and S2))
      - For any binary set operator x above (|, &, -, ^), can have x=, which is useful if we want to replace one of the sets in that operation by the result of that operation, e.g., s1 ^= s2.
    - Relational: Standard set from C / C++ / Java + is
      - Relational comparisons other than == and != compute subset-relations (and in the case of the strict less-than and greater-than operators, proper subset relations).
      - is-operator checks reference-equality as opposed to value-equality; does not depend on set size, cf. strings.
  - Function:
    - Set attributes: len(S) (number of elements in S)
    - Set modification: S.add(X), S.remove(X) (return KeyError if X not in S), S.discard(X) (no error returned if X not in S), S.pop() (remove and return arbitrary element from S; return KeyError if S is empty), S.clear() (remove all elements from S)
    - Set combination (compute and return result): S1.union(S2), S1.intersection(S2), S1.difference(S2), S1.symmetric_difference(S2)
    - Set combination (compute and store result in S1): S1.update(S2) (union of S1 and S2)
      - Note that S1 cannot be a frozenset.
    - Set comparison (return boolean): S1.issubset(S2), S1.issuperset(S2)
    - Set copy (return copy of S1): S2 = S1.copy()
    - Sort set: sorted(S) (returns sorted list of set elements)
      - As sets are heterogeneous, list organized by set-element type, with numbers (integers and floats, all interpreted as floats) followed by strings and tuples.
      - Handling of complex numbers in sets causes problems for sorted; this may be indicative of a Python type in transition (like long integers).
  - Syntax modifications:
    - for-in construct for iteration over sets
      - As order of elements in set is not predictable, elements will come out in unpredictable order (but it is the same order one sees when the set is printed).
  - Set comprehensions (available in Python 3.0+)
    - Syntax: {op(x) for x in S {if cond(x)}}
  - Sets are very useful for maintaining collections of distinct objects occurring in another collection, e.g., list of birds observed during a particular time-interval.
  - Example: Measuring textfile similarity wrt distinct character content (charsim.py)
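
A small sketch of the set operations above, using the bird-observation idea as made-up data. It builds sets with set() rather than literals so that the construction matches the pre-3.0 notes; print() syntax assumes Python 3.

```python
seen_monday = set(["robin", "crow", "gull", "crow"])   # duplicate "crow" collapses
seen_tuesday = set(["gull", "puffin"])

both = seen_monday & seen_tuesday          # intersection
either = seen_monday | seen_tuesday        # union
only_monday = seen_monday - seen_tuesday   # difference
one_day = seen_monday ^ seen_tuesday       # symmetric difference
print(sorted(either))                      # ['crow', 'gull', 'puffin', 'robin']

# <= tests subset, < tests proper subset.
print(both <= seen_monday, both < both)    # True False

# A string argument yields the set of its distinct characters.
chars = set("hello")
print(len(chars))                          # 4

# discard() is silent on a missing element; remove() raises KeyError.
seen_tuesday.discard("owl")
try:
    seen_tuesday.remove("owl")
    remove_failed = False
except KeyError:
    remove_failed = True

# Lists are not hashable, so they cannot be set elements; tuples can.
try:
    set([[1, 4]])
    list_allowed = True
except TypeError:
    list_allowed = False
with_tuple = set([(1, 4), 3])
print(remove_failed, list_allowed, (1, 4) in with_tuple)   # True False True
```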

Wednesday, October 19 (Lecture #16)
[LutzL, Section 8; Class Notes]

Basic Python I: Object-Types (Cont'd)
- Dictionaries
  - Dictionaries are mutable unordered heterogeneous collections of key : value pairs. Values can be any object (including nested dictionaries or lists), but keys must be hashable-type objects. Each key : value pair is known as an item.
  - Literals: {key1 : value1, key2 : value2, ...} or {} (empty dictionary)
    - Can also create dictionaries using dict(), which takes either a list of 2-tuples (in which the first value is interpreted as the item-key and the second value is interpreted as the item-value) or a special keyword-form, e.g., dict(name = 'bob', age = 45), in which each x = y expression is interpreted as an item whose key is x and value is y.
    - The 2-tuple version of dict() allows handy creation of dictionaries using zip().
  - Operators:
    - Symbolic: K in D (key membership), del D[K] (removes entry with key K in D; returns KeyError if K not key in D)
    - Relational: Standard set from C / C++ / Java + is
      - Relational comparisons done relative to sorted lexicographic (phone book) order on key-value pairs, invoking type-appropriate relational operator results for individual element-pairs; moreover, if keys are both tuples and/or values are sequences, comparison is done in a deep manner.
      - is-operator checks reference-equality as opposed to value-equality; does not depend on dictionary size, cf. strings.
    - Function:
      - Dictionary attributes:
        
        len(D) (number of key-value pairs in D)
        D.items() (returns list of key-value pairs as tuples)
        D.keys() (returns list of keys of items in D)
        D.values() (returns list (not set) of values of items in D)
        sorted(D) (returns sorted list of keys of items in D)
        D[K] (returns value of item with key K in D if such an item in D and KeyError otherwise)
        D.get(K {,Def}) (like D[K] except if item with key K not in D, returns None if default-value argument not present and Def otherwise)
      - Dictionary modification:
        
        D[K] = V (add item (K,V) to D if no item with key K in D, and update value of item with key K to V otherwise)
        D.pop(K) (removes item with key K in D and returns value associated with item if item with key K in D and returns KeyError otherwise)
        D.clear() (remove all items in D)
      - Dictionary combination: D1.update(D2) (add all items in D2 to D1; if an item in D2 has the same key as an item in D1, replace the value of that item in D1 with the value associated with that key in D2)
      - Dictionary copy: D.copy() (shallow)
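
The dictionary operations listed above can be exercised with a sketch such as this one. The keys and values are made up; note that under Python 3 (assumed here), items(), keys(), and values() return views rather than lists, so sorted() or list() is applied where a list is wanted.

```python
# Two ways of building the same dictionary.
D = dict(zip(["name", "age"], ["bob", 45]))    # from 2-tuples via zip()
same = dict(name="bob", age=45)                # keyword form
print(D == same)                               # True

# Lookup with and without a default.
print(D.get("job"), D.get("job", "none"))      # None none
try:
    D["job"]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Add, update, and combine.
D["job"] = "dev"                  # new item
D["age"] = 46                     # existing key: value replaced
D.update({"age": 47, "city": "Gander"})
print(sorted(D))                  # ['age', 'city', 'job', 'name']

# Remove items.
popped = D.pop("city")            # returns 'Gander'
del D["job"]
items = sorted(D.items())
print(popped, items)              # Gander [('age', 47), ('name', 'bob')]
```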

Friday, October 21 (Lecture #17)
[LutzL, Sections 8 and 9; Class Notes]

Basic Python I: Object-Types (Cont'd)
- Dictionaries (Cont'd)
  - If dictionaries are nested, can access lower-level elements by d[kl1][kl2] ... syntax (analogous to nested-list element access); indeed, we can mix levels of lists and dictionaries as long as we use appropriate indices or keys at the appropriate nested-structure level to access elements.
  - Example: Employee record-storage with arbitrary employee attributes via dictionary (keyed on employee ID) of dictionaries (keyed on employee attribute).
  - Syntax modifications:
    - for-in construct for iteration over dictionary keys: for K in D: ...
      - As order of items in dictionary is not predictable, elements will come out in unpredictable order (but it is the same order one sees when the dictionary is printed).
      - To iterate over sorted list of keys, use for K in sorted(D): ...
      - Example: Sparse n-dimensional matrix storage via dictionary keyed on n-tuples.
  - Dictionaries have many uses:
    - Sparse data structures
    - Record-structures
    - Management of records indexed by non-integers
  - Example: Computing and printing list of author names in decreasing order by publication count (Version #1: Lists) (authorCount1.py) [authorCount.dat]
  - Example: Computing and printing list of author names in decreasing order by publication count (Version #2: Dictionaries and List-Comprehensions) (authorCount2.py)
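
The three uses of dictionaries named above can be sketched as follows. This is not the code of the class examples (authorCount2.py and the others are not reproduced here); the records, coordinates, and author names are invented for illustration, and print() syntax assumes Python 3.

```python
# Record-structures: dictionary (keyed on employee ID) of dictionaries
# (keyed on attribute); records may carry different attributes.
employees = {
    101: {"name": "Ann", "dept": "CS"},
    205: {"name": "Raj", "dept": "Math", "office": "B12"},
}
print(employees[205]["dept"])            # Math

# Sparse data structures: 3-dimensional matrix keyed on coordinate tuples;
# entries that are absent are treated as zero.
sparse = {(2, 1, 0): 7, (0, 0, 1): 5}
print(sparse.get((1, 1, 1), 0))          # 0
ordered = [(coord, sparse[coord]) for coord in sorted(sparse)]
print(ordered)                           # [((0, 0, 1), 5), ((2, 1, 0), 7)]

# Records indexed by non-integers: publication counts per author,
# listed in decreasing order of count.
authors = ["Lee", "Kim", "Lee", "Roy", "Kim", "Lee"]
counts = {}
for a in authors:
    counts[a] = counts.get(a, 0) + 1
ranked = sorted([(n, a) for a, n in counts.items()], reverse=True)
print(ranked)                            # [(3, 'Lee'), (2, 'Kim'), (1, 'Roy')]
```
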
Storing Persistent Objects (Part I)
- How do we store the various object-types we've seen so far between program executions? This is typically done in files. For now, let's look at the simplest types of file storage, and leave discussion of advanced file-indexing and database-style access for later in the course.
- Can write string-representations of objects obtained using str() or repr() to text files; these can be read back in either with user-written parsing or the eval() function.
  - With str(), the representation of a string may leave out quotes; in this case, repr() is safer.
  - eval() will actually execute any Python command given in the string-argument; should be used with extreme care.
- More compact string-representations of types can be obtained using the pickle module.
  - To use, import the pickle module (import pickle) and use pickle.dump(X,fout) to write an object X to an open textfile fout; to retrieve that object from the textfile fin subsequently reopened for reading, use X = pickle.load(fin).
  - Can store multiple objects in one file; just have to make sure that you re-load them in the same order in which they were dumped.
  - pickle uses a technique called serialization to create these string representations, which are ideal for transferring Python objects over the Internet.
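
A minimal sketch of both storage approaches. The file name and record are hypothetical. One assumption to flag: under Python 3 (used here), pickle files must be opened in binary mode ("wb" / "rb"), whereas the notes above describe the text-file usage of the Python 2 version.

```python
import os
import pickle
import tempfile

record = {"name": "bob", "scores": [90, 85], "id": ("CS", 2500)}

# str() drops the quotes around a string; repr() keeps them.
print(str("spam"), repr("spam"))        # spam 'spam'

# Round trip through a string representation.
text = repr(record)
restored = eval(text)                   # eval() runs arbitrary code: trusted input only
print(restored == record)               # True

# Round trip through pickle; objects come back in the order they were dumped.
path = os.path.join(tempfile.mkdtemp(), "objects.pkl")
with open(path, "wb") as fout:
    pickle.dump(record, fout)
    pickle.dump([1, 2, 3], fout)
with open(path, "rb") as fin:
    first = pickle.load(fin)
    second = pickle.load(fin)
print(first == record, second)          # True [1, 2, 3]
```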

Week 8 [LutzL, Sections 16-19 and 21-22; Class Notes]

Monday, October 24 (Lecture #18)
[LutzL, Sections 16-18; Class Notes]

Basic Python II: Functions (Cont'd)
- Why use functions?
  - Implementing recursion, e.g., searching a nested list.
  - Single parameterized occurrences of commonly-used pieces of code, which can lead to fewer errors if modifications are required, e.g., reading in textfiles and converting them to lists of words.
  - Hides low-level details, which makes calling code more readable, e.g., hide pickling load and dump commands inside database load and dump functions.
  - Allows specification of application-specific function libraries, which simplifies application development, e.g., sparse 2D matrix manipulation.
- Basic syntax: def f(x1, x2, ...): statements; return {something}
- Critical to define functions before the code that calls them is executed, i.e., at top of file; if they are at the bottom, they will not yet exist when the main program runs.
- What is our main program now? For our purposes at the moment, it is the block of non-function code at the end. However, once we start creating scripts consisting purely of functions, this will have to be modified slightly.
- Example: Recursively searching a nested list (deepfindEx.py)
- Example: Summing the integers in a file (no range check) (sumfunc1.py)
- Example: Summing the integers in a file (range check #1) (sumfunc2.py)
- Interesting features of Python functions
  - Function parameters are not typed and can vary in number
    - Function parameter / call-argument matching can be done positionally or by parameter name (keyword form).
    - Non-keyword call-arguments matched positionally as far as possible, and remainder are placed in *X (if *X is included as a function parameter).
    - Keyword call-arguments not matched are placed in **X (if **X is included as a function parameter).
    - Can interpolate positional arguments into a function call with a *List list-variable and keyword arguments with a **Dict dictionary-variable.
    - With overloaded / polymorphic operators, allows true multi-type functions (sort of like Java generics).
  - Example: Summing the integers in one or more files (range check #1) (sumfunc3.py)
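
The function features from this lecture can be sketched as below. The functions deep_find(), describe(), and double() are hypothetical illustrations and are not the contents of deepfindEx.py or the sumfunc scripts; print() syntax assumes Python 3.

```python
def deep_find(target, items):
    """Return True if target occurs anywhere in the nested list items."""
    for item in items:
        if item == target:
            return True
        # Recurse into sublists, which the in operator would not do.
        if isinstance(item, list) and deep_find(target, item):
            return True
    return False


def describe(first, *rest, **options):
    """Show where positional and keyword call-arguments end up."""
    return first, rest, options


def double(x):
    """Untyped parameter: works for any type that supports *."""
    return x * 2


print(8 in [1, [2, [8]]], deep_find(8, [1, [2, [8]]]))   # False True

# Excess positional arguments go to *rest, unmatched keywords to **options.
print(describe(1, 2, 3, sep=","))       # (1, (2, 3), {'sep': ','})

# The same stars unpack a list and a dictionary at the call site.
args = [1, 2]
opts = {"sep": ";"}
print(describe(*args, **opts))          # (1, (2,), {'sep': ';'})

print(double(3), double("ab"), double([1]))   # 6 abab [1, 1]
```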

Wednesday, October 26 (Lecture #19)
[LutzL, Sections 16-18; Class Notes]

Basic Python II: Functions (Cont'd)
- Interesting features of Python functions
  - Function parameters can have default values set in the parameter list itself (analogue of keyword-form in function call).
  - Example: Summing the integers in one or more files (range check #2) (sumfunc4.py)
  - Function parameter values set "by assignment"
    - Attempts to change immutable objects in a function will cause errors; hence, for all practical purposes, immutable objects are passed by value and mutable objects are passed by reference.
  - Function return values are not typed and can vary in number
    - Any comma-separated list will be treated as a tuple; hence, even though technically one thing is returned, you can return any number of things (as items in that tuple).
  - Example: Processing the integers in one or more files (range check #2) (sumfunc5.py)
  - Functions are objects
    - At most basic, this means functions can be assigned to variables and stored / passed around like other objects (which makes map() much more powerful, for instance).
    - Functions can now be defined anywhere in the code, even inside conditional statements or loops, e.g., conditional definition of a function itself rather than implementing conditional behavior by conditional statements inside a function.
    - To trigger function-object f, use apply(f, pargs {, kargs}) or f(arg1, arg2, ..., argn); apply is particularly useful if you do not know the number of arguments at coding-time.
      - The features of apply() are so convenient that they are part of the core language syntax, namely f(*parg, **karg), where parg and karg are a list and a dictionary, respectively. As apply() is deprecated in Python 2.x and removed in Python 3.0, should get used to using this syntax.
      - Note how this new syntax is consistent with how excess arguments are handled in current Python function calls.
    - Can embed functions into lists and dictionaries to create jump tables, which specify (by index or key) the actions to be performed in a particular situation.
    - Example: A simple adding machine (Version #3: Decimal + jump table) (addmach3.py)
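
A sketch of default parameter values, multiple return values, passing "by assignment", and a jump table. The function names are hypothetical and this is not the code of sumfunc4.py, sumfunc5.py, or addmach3.py; print() syntax assumes Python 3.

```python
def in_range_sum(values, low=0, high=100):
    """Return (count, total) of the values lying in [low, high]."""
    kept = [v for v in values if low <= v <= high]
    return len(kept), sum(kept)          # comma-separated values form one tuple


count, total = in_range_sum([5, 250, 40])                # defaults used
count2, total2 = in_range_sum([5, 250, 40], high=300)    # default overridden by keyword
print(count, total, count2, total2)      # 2 45 3 295


def modify(n, items):
    n = n + 1            # rebinds the local name only; caller's integer is untouched
    items.append(n)      # changes the caller's list in place


x = 1
lst = []
modify(x, lst)
print(x, lst)            # 1 [2]


def add(a, b):
    return a + b


def sub(a, b):
    return a - b


# Jump table: functions stored as dictionary values, selected by key.
jump = {"+": add, "-": sub}
f = jump["-"]
print(f(7, 2), jump["+"](*[7, 2]))       # 5 9
```
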
- Variable scope
  - Python maintains a hierarchy of variable namespaces; the same variable-name may exist in multiple namespaces, each with a different associated object and type-interpretation.
  - The LEGB Rule.
    - Specifies namespace-order in which Python looks for variable-interpretations (local (function), enclosing function (in reverse nesting order), global (module), built-in (Python language)).
  - Can be short-circuited with the global statement; however, you really shouldn't ...
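
The LEGB lookup order can be observed with a short sketch like the following. All names are illustrative; print() syntax assumes Python 3.

```python
x = "global"


def outer():
    x = "enclosing"

    def inner_with_local():
        x = "local"
        return x                 # L: found in the function's own namespace

    def inner_without_local():
        return x                 # E: found in the enclosing function

    return inner_with_local(), inner_without_local()


def reads_global():
    return x                     # G: found at module level


def uses_builtin():
    return len("abc")            # B: len is found in the built-in namespace


counter = 0


def bump():
    global counter               # assignment below now targets the module-level name
    counter += 1


bump()
print(outer(), reads_global(), uses_builtin(), counter)
# ('local', 'enclosing') global 3 1
```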

Friday, October 28 (Lecture #20)
[LutzL, Sections 16-19 and 21-22; Class Notes]

Basic Python II: Functions (Cont'd)
- Lambda expressions
  - Syntax: lambda arg1, arg2, ..., argn: expression
  - Essentially, allows the definition of very short anonymous functions.
  - Why not just use a regular function?
    - Can appear anywhere an expression does, e.g., function-argument to function (like map()), jump tables.
    - Allows functions to be defined closer to where they are used (code proximity).
  - Example: A simple adding machine (Version #4: Decimal + jump table + lambda expressions) (addmach4.py)
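
A minimal sketch of lambda expressions in the two roles mentioned above (function-arguments and jump tables). It is not the code of addmach4.py; the names ops and apply_op are hypothetical, and list(map(...)) assumes Python 3.

```python
# Jump table whose entries are defined right where they are used.
ops = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}


def apply_op(symbol, a, b):
    """Look up the operation for symbol and trigger it."""
    return ops[symbol](a, b)


print(apply_op("*", 6, 7))                       # 42

# Lambda as the function-argument to map().
doubled = list(map(lambda n: 2 * n, [1, 2, 3]))
print(doubled)                                   # [2, 4, 6]

# Lambda as the key function for sorted(): order records by their second field.
records = [("bob", 45), ("al", 30)]
by_age = sorted(records, key=lambda r: r[1])
print(by_age)                                    # [('al', 30), ('bob', 45)]
```
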
Basic Python II: Modules
- What Is (and Isn't) A Module in Python
  - A module is a collection of variable-names and their associated objects; these variable-object pairs are known as attributes.
  - A module in Python can correspond not just to a Python script but to a collection of functions and/or data structures written in other languages such as C or Fortran that are accessed by Python scripts.
  - A module is more than an included library or a compile-time directive (in that it is an assignment-like statement that is executed) and less than a true OOP-style object or class (in that it does not implement the privacy portion of encapsulation, or force data in a module to be manipulated purely by functions in that module).
- Accessing Module Attributes via Import
  - Three syntactic variants:
    - import X
    - from X import Y {as Z}
    - from X import *
  - The first variant makes all attributes Y of module X accessible by the syntax X.Y, the second adds attribute Y of X to the calling module's namespace such that it can be accessed directly as Y (or Z, if the as-clause is used), and the third adds all attributes Y in X to the calling module's namespace directly.
  - An import-statement does three things in order: finds the requested module, compiles it to bytecode (maybe), and executes its statements (from top to bottom).
    - Finding is done in local directory or under guidance of Python path list (stored in sys.path).
    - Compilation (to a .pyc bytecode file) is done if the file is being imported rather than run as the main program (see below) and .pyc file does not exist or changes have been made to file since previous .pyc creation.
    - Execution creates all functions and objects specified by the script.
    - Is this convenient? Lord yes. Is it time-consuming? Again, Lord yes. This is why, in situations where imports of the same module occur multiple times, e.g., the interactive interpreter environment, all three steps are only done on first import and subsequent imports only link to the established module-object.
      - Problematic if relying on value-initialization via import.
      - Can get around this (to a degree) with reload().
  - An import gives access to and the ability to change all imported attributes -- this cannot be overridden.
- Module Coding Guidelines
  - The from-versions actually invoke full imports, so they do not save time by selective import. Be careful using these (particularly from *), as they will overwrite the values of variables in the calling module with the same name as imported attributes.
    - With from *, can prevent import by either naming a variable with an initial underscore, or restricting the imported attributes to those in list __all__, e.g., __all__ = ["x", "y2", "procFile"].
    - Note that this is not a private declaration, as stuff hidden in this manner can still be made accessible by a regular import statement, i.e., you can hide but you can still be run.
  - Imports of whole directories of modules at once also possible, and is desirable in larger Python systems -- however, we will not cover such package imports in this course.
  - Use __main__ to delimit main-program code, i.e., use the if __name__ == "__main__": construct to delimit code that is run only if the script is run in stand-alone mode; consider using this in tandem with a main() function.
    - If module consists purely of functions that are not run in stand-alone mode, e.g., a math function library, use the main program to store module self-test code.
  - Example: Processing the integers in one or more files (module + main program) (sumfunc6a.py [module], sumfunc6b.py [main program])
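
The import behaviour described in this lecture can be observed with the sketch below, which writes a small throwaway module to a temporary directory and imports it. The module name demo_mod_2500 and its contents are hypothetical (this is not sumfunc6a.py). It assumes Python 3, where reload() lives in importlib rather than being a built-in.

```python
import importlib
import os
import sys
import tempfile

source = '''"""Hypothetical demo module."""
__all__ = ["visible"]      # only this name is exported by from ... import *
visible = 1
_hidden = 2

def main():
    return "self-test ran"

if __name__ == "__main__":  # false on import, so main() is not run here
    main()
'''

moddir = tempfile.mkdtemp()
with open(os.path.join(moddir, "demo_mod_2500.py"), "w") as f:
    f.write(source)
sys.path.insert(0, moddir)          # finding is guided by sys.path

import demo_mod_2500
first = demo_mod_2500
import demo_mod_2500                # second import only links to the existing module-object
print(first is demo_mod_2500, demo_mod_2500.__name__)   # True demo_mod_2500

# A regular import still reaches names hidden from from ... import *.
print(demo_mod_2500._hidden)        # 2

namespace = {}
exec("from demo_mod_2500 import *", namespace)
print("visible" in namespace, "_hidden" in namespace, "main" in namespace)
# True False False

# Imported attributes can be changed; reload re-executes the module's statements.
demo_mod_2500.visible = 99
importlib.reload(demo_mod_2500)
print(demo_mod_2500.visible)        # 1
```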

Week 9 [LutzL, Sections 16-19 and 21-22; Class Notes]

Monday, October 31

Class Exam #2 Notes
I've made up the second in-class exam. This exam will be closed-book. It will be 50 minutes long and has a total of 50 marks (this is not coincidental; I have tried to make the number of marks for a question equivalent to the number of minutes it should take you to do it). There will be four questions, two worth 10 marks apiece and two worth 15 marks apiece, all of which involve writing fragments of Python code. Topics include all material covered up to and including Lecture #18. You may also find the following of some help:
- Class exam #2 (Fall 2010) (4 pages: PDF)
- Answers for class exam #2 (Fall 2010) (2 pages: PDF)
- Class exam #2 (Fall 2009) (4 pages: PDF)
- Answers for class exam #2 (Fall 2009) (2 pages: PDF)
I hope the above helps, and I wish you all the best of luck with this exam.

Monday, October 31 (Lecture #21)
[Class Notes]

Basic Python II: Modules (Cont'd)
- Module Coding Guidelines (Cont'd)
  - Associate docstrings with each attribute of importance (module, data-structure, function) by placing docstring immediately after attribute-definition.
    - To get a quick overview of variables and functions associated with a module X, from within the Python interpreter, import that module (import X) and print its associated doc-string (print X.__doc__). One can display the docstring associated with any attribute of that module similarly (print X.Y.__doc__).
    - Alternatively, to get a nicely-formatted description of all docstrings associated with a module X, import the module and use help(X).
  - Use _X and __all__ to limit namespace pollution from imports.
  - Follow standard software engineering practice, e.g.,
    - Minimize the coupling of modules that arises from using "global" objects / data structures to pass information between modules.
    - Maximize coherence of modules by making sure attributes in a module have a common sensibly-defined purpose and that the attributes associated with this purpose are not split across multiple modules.
- Given that we now know about how modules work in Python, let's spend the next few lectures looking at services provided by some of the standard Python modules.
- Accessing Python interpreter internals: The sys Module
  - Attributes we have seen so far: argv, path, exit(), getrefcount()
  - Variables stdin, stdout, and stderr store the file-objects associated with where interpreter input comes from and interpreter output and error messages go. The defaults for these are the keyboard and terminal screen, respectively -- however, these can be changed, e.g., redirect error messages to a specific file.
    - The original version of each stream X is stored in __X__, and can be recovered; however, Python being Python, you can change these too ...
  - Example: Fun and games with stdin, stdout, and stderr (sysRedirect.py)
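
A minimal sketch of docstring access and stream redirection (not the contents of sysRedirect.py). It redirects sys.stdout to an in-memory buffer rather than a file, which is an illustrative choice, and assumes Python 3's io.StringIO and print() function.

```python
import io
import sys


def greet():
    """Print a greeting on standard output."""
    print("hello")


buffer = io.StringIO()
saved = sys.stdout            # the original stream is also kept in sys.__stdout__
sys.stdout = buffer           # from here on, print() writes into the buffer
try:
    greet()
finally:
    sys.stdout = saved        # always restore the previous stream

captured = buffer.getvalue()
print(repr(captured))         # 'hello\n'
print(greet.__doc__)          # Print a greeting on standard output.
```
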
- Accessing System Files and Directories: The os, shutil, and glob Modules
  - The os module operates on an abstract file system which is a directory-tree with non-empty directories as internal nodes and files (and empty directories) as leaves.
    - Can designate any directory as a current working directory (cwd).
    - Can designate directory-paths linking entities in the tree. Each such path can be cracked into a directory-path and an entity-name (with the latter being empty if the entity is a directory).
    - Each entity in the tree has a unique associated directory-path from the root-directory to that entity (absolute path).
    - Each entity in the tree has a unique associated directory-path from a designated cwd to that entity (relative path).
  - By operating on an abstract file-system whose path-specifics are stored as variables, e.g., path separator, the os module can be customized to allow generic file and directory access on many types of operating systems.
    - This in turn allows you to write operating-system invariant code! (provided, of course, that all file-manipulation is done using os-module variables and functions).
  - Services provided by os:
    - Variables:
      - name (name of operating system)
      - curdir (string denoting current directory; "." under Linux)
      - pardir (string denoting parent of current directory; ".." under Linux)
      - sep (string denoting directory-path separator; "/" under Linux)
      - extsep (string denoting filename-extension separator; "." under Linux)
    - Functions:
      - Entity characteristics:
        
        Entities specified by path-strings.
        access(P, {os.R_OK, os.W_OK, os.X_OK}) (returns True if P accessible in requested manner and False otherwise)
        listdir(P) (returns list of all files and directories in directory P, including "invisible" dot-files)
      - Change entity characteristics:
        
        chmod(P, mode) (reset access-permissions of P to mode)
        rename(oldP, newP) (rename entity oldP as newP; corresponds to a mv command in Linux)
      - Create / delete entities: remove(P), mkdir(P), rmdir(P)
        
        rmdir() only removes an empty directory; to remove all files and directories in a directory, use shutil.rmtree(P).
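
The os services listed above can be tried safely inside a temporary directory, as in this sketch. The directory and file names are made up; print() syntax assumes Python 3.

```python
import os
import shutil
import tempfile

root = tempfile.mkdtemp()                # scratch area, removed at the end
sub = os.path.join(root, "data")         # joined with os.sep for this system
os.mkdir(sub)

old = os.path.join(sub, "a.txt")
with open(old, "w") as f:
    f.write("hi\n")

new = os.path.join(sub, "b.txt")
os.rename(old, new)                      # like mv under Linux
listing = os.listdir(sub)
readable = os.access(new, os.R_OK)
print(listing, readable)                 # ['b.txt'] True

# rmdir() refuses to remove a directory that still has contents.
try:
    os.rmdir(sub)
    rmdir_failed = False
except OSError:
    rmdir_failed = True

shutil.rmtree(root)                      # removes the directory and everything in it
print(rmdir_failed, os.path.exists(root))   # True False
```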

Wednesday, November 2

Class Exam #2

Friday, November 4 (Lecture #22)
[PyNut, Section 9; Class Notes]

Went over answers to Class Exam #2.
Basic Python II: Modules (Cont'd)
- Accessing System Files and Directories: The os, shutil, and glob Modules (Cont'd)
  - The os.path sub-module provides additional services for manipulating paths themselves
    - Characteristics of entity reached by path: exists(P), getsize(P), getmtime(P), isfile(P), isdir(P), islink(P)
    - Path characteristics: abspath(P) (absolute path of P), dirname(P) (non-terminal directory path), basename(P) (terminal entity-name), split(P) (returns pair (dirname(P), basename(P)))
    - Path construction: join(P1, P2, ...) (given directories and optional terminal file as separate arguments, constructs path of entities in sequence separated by sep; given a list L of such entities, use join(*L))
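
A short sketch of the os.path services above, using a made-up relative path so that nothing needs to exist on disk. Note that os.path.join() takes the path components as separate arguments, so a list of components is unpacked with *. print() syntax assumes Python 3.

```python
import os

p = os.path.join("proj", "src", "main.py")     # components joined with os.sep

# split() returns the same two pieces as dirname() and basename().
head, tail = os.path.split(p)
print(head == os.path.dirname(p), tail == os.path.basename(p))   # True True
print(tail)                                    # main.py

# A list of components is unpacked into separate arguments.
parts = ["proj", "src", "main.py"]
same = os.path.join(*parts) == p
print(same)                                    # True

# abspath() anchors a relative path at the current working directory.
absolute = os.path.abspath(p)
print(os.path.isabs(p), os.path.isabs(absolute))   # False True

# Characteristics of the entity a path reaches (here, the current directory).
here = os.curdir
print(os.path.exists(here), os.path.isdir(here), os.path.isfile(here))
# True True False
```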

Week 10 [Class Notes]

Monday, November 7

No lecture; instructor sick

Wednesday, November 9 (Lecture #23)
[PyNut, Section 9; Class Notes]

Basic Python II: Modules (Cont'd)
- Accessing System Files and Directories: The os, shutil, and glob Modules (Cont'd)
  - Example: Listing all readable files in the current directory (listDir1.py)
  - Traversing a directory tree
    - If all you want to do is visit and perform the same operation on each file in a directory tree (possibly accumulating results from each file in a variable or list as you go), use os.path.walk().
      - Usage: os.path.walk(root, myfunc, arg), where myfunc has the form myfunc(arg, dirname, files) such that dirname is the directory being examined and files is a list of all files in that directory.
    - If you want to do something more complex, code up a traversal yourself using your favorite recursive tree-traversal algorithm as a template (one such example is on p. 124 in Section 3.4.7 of Langtangen (2008)).
  - Example: Listing all readable files in the directory tree rooted at the current directory (listDirTree.py)
  - To copy files and directories, use shutil.copy(oldP, newP), shutil.copy2(oldP, newP), and shutil.copytree(oldP, newP).
    - copy() gives the new copy fresh last-access / modification times while copy2() preserves those of the original.
  - Selective directory listing with glob
    - listdir() is well and good for listing all files in a directory. However, we often only want files of a particular type, e.g., Python scripts, files whose names start with capital letters.
    - glob(pat) in module glob returns a list of all files in the current working directory whose names match the pattern pat.
      - Patterns incorporate ordinary symbols and various pattern-specifiers, e.g., ? (any single character), * (0 or more characters), [x ... y] (any one of characters x ... y), [x-y] (any one of characters in Unicode / ASCII range x - y).
  - Example: Listing all readable files in the current directory with a specified extension (listDir2.py)
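The tree traversal and selective listing described above can be sketched as follows. This is not one of the course scripts: it is written for Python 3 and uses os.walk(), the generator-based relative of os.path.walk() (the latter exists only in Python 2), so it is a swapped-in technique rather than the call described in lecture. The function name list_tree and the file names are hypothetical.

```python
import glob
import os
import shutil
import tempfile

def list_tree(root):
    """Return the sorted paths of all files in the directory tree rooted at root."""
    found = []
    # os.walk() yields one (dirpath, dirnames, filenames) triple per directory.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            found.append(os.path.join(dirpath, name))
    return sorted(found)

# Build a small throwaway tree so that the sketch is self-contained.
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "sub"))
for rel in ("a.py", "b.txt", os.path.join("sub", "c.py")):
    open(os.path.join(root, rel), "w").close()

all_files = list_tree(root)
# glob() only looks in the directory named in the pattern, not in subdirectories.
py_files = sorted(glob.glob(os.path.join(root, "*.py")))

print([os.path.relpath(f, root) for f in all_files])
print([os.path.basename(f) for f in py_files])    # ['a.py']

shutil.rmtree(root)    # remove the whole non-empty tree
```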
- Pattern-Matching: The re Module
  - The elementary pattern-matching in glob would be a useful thing to have for string-processing in general. Such a general facility is provided by the re module, in which patterns are represented as regular expressions.
  - What is a regular expression?
    - A regular expression specifies a set of strings; if a given string s is in that set, s is said to match the regular expression.
    - At its most basic, a regular expression is a sequence of units, where each unit specifies a choice of one or more things that are repeated some number of times.
      - Something (entity): Symbol
      - Choice:
        
        . (any character except \n)
        [s] (any character in string s)
        [x-y] (any character between characters x and y inclusive in Unicode)
        [^s] (any character not in string s)
        \w (any word character, i.e., [a-zA-Z0-9_])
        \W (any non-word character, i.e., [^\w])
        \d (any digit character, i.e., [0-9])
        \D (any non-digit character, i.e., [^\d])
        \s (any whitespace character, i.e., [ \t\n\r\f\v])
        \S (any non-space character, i.e., [^\s])
      - Repetition (quantifiers):
        
        * (0 or more occurrences)
        + (1 or more occurrences)
        ? (0 or 1 occurrences, i.e., optional)
        {m} (m occurrences)
        {m,} (m or more occurrences)
        {m,n} (m to n inclusive occurrences)
        Note that the special status of symbols is overridden inside square brackets (a+ vs. [a+]). To avoid proliferation of backslashes used to create escape-versions of characters, use raw strings (r'...').
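A few of the choice and repetition specifiers above, as a minimal Python sketch. The patterns and test strings are made up for illustration, and findall() (covered in a later lecture) is used here only to show what each pattern matches.

```python
import re

# Raw strings pass backslashes through to the re module unchanged.
word   = re.compile(r"\w+")           # one or more word characters
phone  = re.compile(r"\d{3}-\d{4}")   # exactly 3 digits, a hyphen, exactly 4 digits
colour = re.compile(r"colou?r")       # the u is optional
plus_q = re.compile(r"a+")            # + as quantifier: one or more a's
plus_c = re.compile(r"[a+]")          # + inside brackets: a single a or + character

print(word.findall("two words"))            # ['two', 'words']
print(phone.findall("call 555-1234 now"))   # ['555-1234']
print(colour.findall("color colour"))       # ['color', 'colour']
print(plus_q.findall("aaa+a"))              # ['aaa', 'a']
print(plus_c.findall("aaa+a"))              # ['a', 'a', 'a', '+', 'a']
```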

Thursday, November 10

Final Exam Notes
I am making up your final exam now; things may change a bit, but I'm pretty sure of the general format. The exam will be closed-book. It will be 120 minutes long and has a total of 120 marks (this is not coincidental; I have tried to make the number of marks for a question equivalent to the number of minutes it should take you to do it). There will be 3 questions:
- Give short code-fragments and functions (4 parts; 60 marks)
- Describe the output of a Tkinter script (15 marks)
- Write functions associated with an example system (4 parts; 45 marks)
Topics include all material covered up to and including GUI design using the Tkinter module. You may also find the following of use:
- Final exam (Fall 2008) (11 pages: PDF)
- Answers to final exam (Fall 2008) (5 pages: PDF)
- GUI code for Question #2 on final exam (Fall 2008) (tf_GUI_F08.py)
- Final exam (Fall 2009) (9 pages: PDF)
- Answers to final exam (Fall 2009) (4 pages: PDF)
- GUI code for Question #2 on final exam (Fall 2009) (tf_GUI_F09.py)
- Final exam (Fall 2010) (9 pages: PDF)
- Answers to final exam (Fall 2010) (4 pages: PDF)
- GUI code for Question #2 on final exam (Fall 2010) (tf_GUI_F10.py)
I hope the above helps, and I wish you all the best of luck with this exam and your other exams.

Friday, November 11

Remembrance Day; no lecture

Week 11 [Class Notes]

Monday, November 14 (Lecture #24)
[PyNut, Section 9; Class Notes]

Basic Python II: Modules (Cont'd)
- Pattern-Matching: The re Module (Cont'd)
  - What is a regular expression? (Cont'd)
    - Example: Recognizing integers.
    - Example: Recognizing floating-point numbers.
    - Example: Recognizing exponential numbers.
    - Units can be built out of other units by grouping with parentheses; aside from aiding clarity, such groups can also be accessed by position going from left to right in the expression (\i, for i greater than or equal to 1), or even given names (which we will not get into in this course).
      - Such backreferences allow reference to previous matches later in pattern, e.g., (\d+)X\1.
      - If you do not want parentheses to be interpreted as a group, use (?: ...).
    - Example: Recognizing (and breaking down) proper names.
    - In languages like Python, regular expressions are augmented to consider matches of one or more substrings inside a string.
      - Can specify where in string match must occur to be valid, e.g.,
        
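        \A (beginning of string only)
        ^ (beginning of string; under re.MULTILINE, also just after each \n)
        \Z (end of string only)
        $ (end of string or just before a final \n; under re.MULTILINE, also just before each \n)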
      - If multiple matches are possible, can override greedy default (longest) to match shortest (trailing ? on quantifier).
    - Example: Recognizing name at start (end) of file vs. start (end) of any line in file.
    - Example: Recognizing XML-tagged entities
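The position specifiers, non-greedy quantifiers, and backreferences discussed in this lecture can be illustrated with a short sketch. The strings and patterns are invented for illustration and are not the in-class examples.

```python
import re

text = "name: Ann\nname: Bob"

# ^ matches only at the start of the string unless re.MULTILINE is given.
print(re.findall(r"^name: (\w+)", text))                  # ['Ann']
print(re.findall(r"^name: (\w+)", text, re.MULTILINE))    # ['Ann', 'Bob']
# \A matches only at the very start of the string, whatever the flags.
print(re.findall(r"\Aname: (\w+)", text, re.MULTILINE))   # ['Ann']

tagged = "<b>one</b> and <b>two</b>"
print(re.findall(r"<b>(.*)</b>", tagged))     # greedy: ['one</b> and <b>two']
print(re.findall(r"<b>(.*?)</b>", tagged))    # non-greedy: ['one', 'two']

# \1 must repeat exactly what group 1 matched.
print(bool(re.search(r"(\d+)X\1", "12X12")))  # True
print(bool(re.search(r"(\d+)X\1", "12X34")))  # False

# (?: ...) groups without creating a numbered group, so only one group is reported.
print(re.findall(r"(?:Mr|Ms)\. (\w+)", "Mr. Smith, Ms. Jones"))   # ['Smith', 'Jones']
```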

Wednesday, November 16 (Lecture #25)
[PyNut, Section 9; Class Notes]

Basic Python II: Modules (Cont'd)
- Pattern-Matching: The re Module (Cont'd)
  - Regular expression matching in Python: The match object
    - Describes result of matching a particular regular expression against a particular string.
    - Variables:
      - string (string on which match was performed)
      - re (re-object used to make match)
      - pos (requested start-index of match)
      - endpos (requested finish-index of match in string)
    - Functions:
      - Matched-group characteristics:
        
        group(gid=0) (returns string matched by group gid (whole matched substring if no gid specified) or None if no match by group gid)
        groups() (returns tuple of all strings matched by groups, with None if no match by group gid)
        start(gid=0) (returns start-index of string matched by group gid (start of whole match if no gid specified) or -1 if no match by group gid)
        end(gid=0) (returns finish-index of string matched by group gid (finish of whole match if no gid specified) or -1 if no match by group gid)
        span(gid=0) (returns (m.start(gid), m.end(gid)))
      - Applying matched groups: expand(s) (return copy of s in which all backreferences to matched groups are replaced)
  - Applying regular expressions
    - Creating regular expression objects: r = re.compile(pattern {,flags})
      - A regular-expression object (re) has variable pattern giving the pattern-string from which it was created.
      - During compilation, can also specify various flags that modify interpretation of pattern, e.g.,
        
        re.IGNORECASE (make match case-insensitive)
        re.DOTALL (allow .-character in pattern to also match \n)
        re.VERBOSE (ignores whitespace / #-comments in pattern)
        re.MULTILINE (makes ^ and $ also match at the start and end of each line, i.e., just after and just before each \n)
    - Services on regular expression objects:
      - Apply re to string:
        
        r.match(s, start=0, end=sys.maxint) (returns match-object for match of r to s starting at s-index start and finishing before s-index end, and None if no match of r in s)
        r.search(s, start=0, end=sys.maxint) (returns match-object for match of r to s starting at or after s-index start and finishing before s-index end, and None if no match of r in s)
        r.findall(s) (return list of (non-overlapping) substrings of s matched by r)
      - Manipulate string using re:
        
        r.split(s) (return a list of the substrings of s that lie between non-overlapping matches of r in s, i.e., matches of r act as separators; compare with s.split())
        r.sub(repl, s) (if repl is string, return copy of s in which all matches with r are replaced by repl (with backreferences to groups in r in repl replaced appropriately); if repl is function-object that takes match-object as only parameter, return copy of s in which all matches with r are replaced with string returned by repl(m))
        r.subn(repl, s) (returns 2-tuple (r.sub(repl, s), n) where n is number of matches of r in s that were replaced)
    - The services above are available through the re module itself, if the regular expression is given as a pattern to the function, e.g., re.sub(pattern, repl, s); however, these versions lack some of the functionality available in the re-object versions, e.g., cannot specify start / end positions for matches in string.
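The match-object and re-object services above might be exercised as in the following sketch. The exponential-number pattern here is a simplified, hypothetical one (it requires a decimal point and an exponent) and is not the pattern developed in class.

```python
import re

# Simplified, hypothetical pattern: signed mantissa with decimal point, then exponent.
exp_num = re.compile(r"([-+]?\d+\.\d+)[eE]([-+]?\d+)")

m = exp_num.search("mass = 6.02e23 units")
print(m.group())                  # 6.02e23
print(m.groups())                 # ('6.02', '23')
print(m.span(1))                  # (7, 11)
print(m.expand(r"\1 x 10^\2"))    # 6.02 x 10^23

# match() must succeed at the start index; search() may succeed anywhere after it.
print(exp_num.match("6.02e23") is not None)    # True
print(exp_num.match("mass = 6.02e23"))         # None

comma = re.compile(r"\s*,\s*")
print(comma.split("a , b,c"))        # ['a', 'b', 'c']
print(comma.sub("; ", "a , b,c"))    # a; b; c
print(comma.subn("; ", "a , b,c"))   # ('a; b; c', 2)

# repl may also be a function that takes the match-object and returns a string.
plain = exp_num.sub(lambda mo: str(float(mo.group())), "6.02e2 and 1.5e-1")
print(plain)                         # 602.0 and 0.15
```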

Friday, November 18 (Lecture #26)
[PyNut, Section 9; Class Notes]

Basic Python II: Modules (Cont'd)
- Pattern-Matching: The re Module (Cont'd)
  - Example: Breaking exponential numbers into parts (expflt1.py)
  - Example: Extracting real-number values of exponential numbers (expflt2.py)
  - Example: Breaking proper names into parts (nameparse1.py) [names.txt]
  - Example: Rewriting proper names (Version #1: match-expand version) (nameparse2.py)
  - Example: Rewriting proper names (Version #2: re-sub string version) (nameparse3.py)
  - Example: Rewriting proper names (Version #3: re-sub function version) (nameparse4.py)
  - Example: Jazzing up of all proper names in an annotated file (jazzname.py) [aname.txt]
  - Example: Counting the number of proper names in an annotated file (countname.py)

Week 12 [Class Notes]

Monday, November 21 (Lecture #27)
[PyNut, Section 17; PyProg, Sections 8 and 9; Class Notes]

Basic Python II: Modules (Cont'd)
- GUI Development: The Tkinter Module
  - Tkinter as framework
    - Frameworks are OOP constructs which allow re-use of both code and design; are typically used to simplify the creation of specific complex applications, e.g., abstract data types, GUIs, Web servers.
    - General characteristics of frameworks:
      - Extendability (create entities by extending classes in framework)
      - Inversion of control (to use framework, only need to specify basic appearance and behavior of application -- framework controls actual creation and execution of specified mechanism)
  - Over the next several lectures, as we look at various GUI features in Tkinter, compare these with what you need / want to implement for Assignment #9 to figure out those features you need to master.
  - Core Tkinter GUI entities:
    - Containers (windows / panels)
    - Widgets: Widget appearance / behavior + layout in container
  - Keep in mind that the next several lectures are a very basic overview of Tkinter -- there are many more features in Tkinter than we will cover here, not only in terms of extra widgets and containers, but also extra parameters and methods for those widgets and containers we do describe. More details can be found on the Python Tkinter Wiki and in the Tkinter reference manual (thanks to Jason Gedge for pointing these out).
  - Focus today on single-container windows; look next lecture at nested containers.
  - Generic single-window GUI script structure:
  - Root Window setup
    - Create root window: root = Tk()
    - Customize appearance of window by calling methods relative to the created root window-object, e.g., root.title(S)
    - Once GUI is set up, trigger execution of GUI using root.mainloop() (see below).
  - Basic widgets:
    - Create widget-objects by calling various functions. First parameter of each of these functions is always the container in which the widget is placed; remaining parameters (typically specified in keyword-fashion) specify appearance and behavior of widget.
    - Information passed in and out of widgets via special Tkinter variables (IntVar(), StringVar(), DoubleVar(), BooleanVar()), which are manipulated using methods get() and set().
    - Information-display widgets:
      - Two Forms:
        
        Label(parent, text="Text")
        Label(parent, textvariable=svar)
        Former good for small (possibly multi-line, with newline-embedded text) static text displays, and latter good for dynamic text displays.
    - Information-entry widgets
      - Entry(parent, textvariable=svar)
        
        Models single-line text-entry field.
        Text associated with / entered into field is stored in string-variable svar.
      - Checkbutton(parent, variable=ivar, text="Text")
        
        Models on/off button.
        Integer-variable ivar has value 1 (0) if button (not) pressed down with mouse click.
      - Radiobutton(parent, variable=type-var, value=type-val, text="Text")
        
        Models one of a set of radio buttons, i.e., a set of buttons in which only one member can be pressed down at a time.
        Group of radio buttons specified as set of radio-button widgets operating off the same variable type-var (which, as the name suggests, may be of any valid Tkinter variable type).
        Variable type-var has value associated with currently depressed radio button in group.
      - Scale(parent, label="Text", variable=dvar, from_=dvalL, to=dvalU, tickinterval=dvalI, resolution=dvalR, showvalue=YES, orient=str)
        
        Models entry of floating-point value by slider-scale in range dvalL to dvalU inclusive. Slider tick-interval is dvalI.
        Double-variable dvar has value associated with slider-position as rounded to nearest floating-point number modulo resolution dvalR.
        Orientation of slider can be 'horizontal' or 'vertical'.

Wednesday, November 23 (Lecture #28)
[PyNut, Section 17; PyProg, Sections 8 and 9; Class Notes]

Basic Python II: Modules (Cont'd)
- GUI Development: The Tkinter Module (Cont'd)
  - Basic widgets: (Cont'd)
    - Command-activation widgets:
      - We will only consider one such widget, the control-button.
      - Syntax: Button(parent, text="Text", command=func-name)
      - Elementary event-handling done inside this widget using the command parameter, i.e., function func-name is executed when button is pressed down.
        
        Due to the framework-structure of Tkinter (in which control is handed by Python to Tkinter and events are handed back from Tkinter to Python for processing), event-handling is also known as callback processing and event-handler functions are known as callback functions.
        Default is that callback functions have no parameters; if parameters are necessary, enclose callback function in a "helper" lambda function (making sure that parameters are interpreted correctly at call-time).
      - To exit GUI, may want to define button with callback function set to either root.quit (resume execution of script after root.mainloop() (see below)) or sys.exit (terminate GUI and script).
  - Widget layout in container
    - Done by calling layout-method relative to each widget.
    - Layout methods:
      - pack(expand={YES, NO}, fill={BOTH, X, Y}, side={TOP, BOTTOM, LEFT, RIGHT})
      - grid(row=int, column=int)
      Can implement exact placement by pixel-position using place(), but this is very complex to use -- pack() and grid() are usually preferable.
    - When using pack(), establish contents of top and bottom sides before adding contents of left and right sides; otherwise, horizontal extent of window may be misjudged by Tkinter and contents may be mixed up.
    - When using pack(), can control (to a degree) placement of widgets when window is resized using parameters expand and fill.
    - Example: Stacked radiobutton group
    - All positions in a specified grid need not be filled when using grid() -- will fill unused positions in with white space automatically.
    - Example: Compass radiobutton group
    - Should not mix layout-types in a single container -- results may be unexpected.
  - Once all widgets set up (including Tkinter variables and callback functions) and configured to have the root window as their parent container, call root.mainloop() to hand control to Tkinter and trigger GUI creation and execution.
  - Example: Basic Tkinter GUI (pack()-layout, no-resizing) (GUI1.py)
  - Example: Basic Tkinter GUI (pack()-layout, automatic resizing) (GUI2.py)
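A single-window GUI along the lines described in the last two lectures could be sketched as follows. This is not GUI1.py or GUI2.py; the names make_greeting and build_gui are hypothetical. It assumes Python 3, where the module is named tkinter (Tkinter in Python 2), and the import is placed inside build_gui() only so that the helper function can be used on a machine without a display.

```python
def make_greeting(name, loud):
    """Build the text shown by the GUI; kept separate so it can be checked without a display."""
    text = "Hello, %s!" % (name or "world")
    return text.upper() if loud else text

def build_gui():
    # Python 3 module name; under Python 2 this would be "import Tkinter as tk".
    import tkinter as tk

    root = tk.Tk()
    root.title("Greeter")

    # Tkinter variables carry information in and out of the widgets.
    name = tk.StringVar()
    loud = tk.IntVar()
    result = tk.StringVar()

    def greet():
        # Callback with no parameters; reads and writes the Tkinter variables.
        result.set(make_greeting(name.get(), loud.get()))

    # Top-to-bottom contents are packed before left / right contents.
    tk.Label(root, text="Name:").pack(side=tk.TOP)
    tk.Entry(root, textvariable=name).pack(side=tk.TOP, fill=tk.X, expand=tk.YES)
    tk.Checkbutton(root, variable=loud, text="Loud").pack(side=tk.TOP)
    tk.Label(root, textvariable=result).pack(side=tk.TOP)
    tk.Button(root, text="Greet", command=greet).pack(side=tk.LEFT)
    tk.Button(root, text="Quit", command=root.quit).pack(side=tk.RIGHT)
    return root

# To run the GUI on a machine with a display: build_gui().mainloop()
```

Keeping the processing in an ordinary function and letting the callback only move values between Tkinter variables makes the processing testable on its own.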

Friday, November 25 (Lecture #29)
[PyNut, Section 17; PyProg, Sections 8 and 9; Class Notes]

Basic Python II: Modules (Cont'd)
- GUI Development: The Tkinter Module (Cont'd)
  - Creating nested containers
    - Create nested containers using Frame()
    - Syntax: frame = Frame(parent)
    - Make sure that nested frame is assigned appropriate parent and is located (using pack(), grid(), or place()) appropriately in that parent-frame.
    - A useful idiom for structuring GUIs with nested frames is to define the root container and when defining each nested frame in turn, define that frame, define and pack all widgets in that frame (making sure that frame is the parent of each widget), and then place the nested frame appropriately in the parent, e.g.,
  - Example: Basic Tkinter GUI (mixed layout in nested frames) (GUI3.py)
  - Two general GUI operation modes:
    - Make and trigger GUI such that all input is accumulated by GUI, and when input is gathered, use callback to trigger root.quit to return to main program to do processing and output, e.g.,
    - Make and trigger GUI such that all I/O and processing is handled via callback in GUI, and at end of session, use callback to trigger sys.exit(), e.g.,
    The former is good for adding pretty front-ends to existing scripts, while the latter is more suited to interactive sessions alternating I/O and processing.
  - Example: Basic front-end GUI (GUI4a.py)
  - Example: Basic interactive-session GUI (GUI4b.py)
  - Example: Extracting real-number values of exponential numbers (front-end GUI) (expfltGUI1.py)
  - Example: Extracting real-number values of exponential numbers (interactive-session GUI) (expfltGUI2.py)
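The nested-frame idiom and the front-end operation mode can be combined in a sketch like the one below. It is not GUI3.py or GUI4a.py; build_form, to_celsius, and the temperature-conversion task are hypothetical stand-ins. As above, Python 3's tkinter module name is assumed and the import is kept inside the function so the processing code runs without a display.

```python
def build_form():
    import tkinter as tk    # "Tkinter" under Python 2

    root = tk.Tk()
    value = tk.DoubleVar()
    unit = tk.StringVar(value="C")

    # Nested frame 1: define frame, lay out its widgets with grid(), then place frame.
    top = tk.Frame(root)
    tk.Label(top, text="Temperature:").grid(row=0, column=0)
    tk.Entry(top, textvariable=value).grid(row=0, column=1)
    top.pack(side=tk.TOP)

    # Nested frame 2: a different layout type is fine because it is a different container.
    mid = tk.Frame(root)
    for u in ("C", "F"):
        tk.Radiobutton(mid, variable=unit, value=u, text=u).pack(side=tk.LEFT)
    mid.pack(side=tk.TOP)

    # root.quit ends mainloop() and returns control to the calling script.
    tk.Button(root, text="Done", command=root.quit).pack(side=tk.BOTTOM)
    return root, value, unit

def to_celsius(value, unit):
    """Processing done by the main program after the GUI has gathered its input."""
    return value if unit == "C" else (value - 32.0) * 5.0 / 9.0

# Front-end mode, on a machine with a display:
#   root, value, unit = build_form()
#   root.mainloop()
#   print(to_celsius(value.get(), unit.get()))
```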

Week 13 [Class Notes]

Monday, November 28 (Lecture #30)
[PyNut, Section 16; Class Notes]

Basic Python II: Modules (Cont'd)
- General Numerical Processing: The math, cmath, random, and gmpy Modules
  - Many commonly-used mathematics functions are given in the math module; where applicable, versions of these functions for complex numbers are given in the cmath module.
    - These modules also have variables giving the values of mathematical constants e and pi (which, oddly enough, have the names e and pi).
  - The random module provides many functions associated with uniform distributions, e.g.,
    - seed(x=None) (sets seed to hashable object x; otherwise, sets seed to platform-specific source of randomness, e.g., system time (latter done automatically when random module is loaded)).
    - random() (returns a random float r in the range 0 <= r < 1, i.e., 1 is excluded)
    - uniform(l, u) (returns a random float in the range l to u inclusive)
    - choice(S) (returns random element from sequence S)
    - sample(S, k) (returns list of k randomly-selected elements from sequence S)
    - shuffle(S) (does in-place random shuffle of elements in mutable sequence S)
    The random module also offers these services relative to other commonly-used distributions, e.g., Gaussian, exponential; if you are manipulating such distributions, do consult the documentation on this module to see if what you need has already been provided.
  - The gmpy module implements efficient arbitrary-precision integer and float types, as well as a rational-number type.
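A brief sketch of the math, cmath, and random services above, written for Python 3. The seed value 2500 is arbitrary; fixing it only makes a run repeatable, and the particular random values produced are not shown because they depend on the Python version.

```python
import cmath
import math
import random

print(math.sqrt(16.0))     # 4.0
print(cmath.sqrt(-1))      # 1j (math.sqrt(-1) would raise ValueError)
print(math.pi, math.e)

random.seed(2500)                  # arbitrary fixed seed, for repeatable runs
r = random.random()                # 0.0 <= r < 1.0
u = random.uniform(5.0, 10.0)      # float between 5.0 and 10.0
deck = list(range(10))
hand = random.sample(deck, 3)      # 3 distinct elements; deck itself is unchanged
random.shuffle(deck)               # reorders deck in place
pick = random.choice(deck)         # one element of deck
print(r, u, hand, deck, pick)
```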
- Efficient Manipulation of (Numerical) Multidimensional Arrays: The NumPy Module
  - As noted earlier in this course, nested lists in Python allow easy implementation of multidimensional numerical array processing; however, large-scale numerical processing done in this fashion is very slow. The flexibility of Python multidimensional lists (heterogeneous, non-contiguous memory storage, mutable) is purchased at the expense of processing efficiency!
  - The multidimensional array type ndarray underlying NumPy (by virtue of being homogeneous, immutable (sort of; see below), and based on a contiguous chunk of memory) regains efficiency at the expense of flexibility.
  - An ndarray is an n-dimensional array of fixed size in which each element is of a fixed array-specific numerical type. The number of dimensions is the ndarray's rank, and the number of elements along a particular dimension is that dimension's length. Each ndarray has the following associated variable-attributes:
    - shape: Tuple giving lengths of array dimensions.
    - ndim: Number of array dimensions.
    - size: Number of elements in array, i.e., product of shape-tuple elements.
    - dtype: Object describing numeric type of array elements, drawn from set {byte, int, float, complex, uint8, uint16, uint64, int8, int16, int32, int64, float32, float64, float96, complex64, complex128, complex192}.
    - itemsize: Number of bytes required to store a single array element.
  - There is no ndarray literal. However, there are a variety of ways of creating ndarrays:
    - Create a one-dimensional array which is subsequently reshaped to have multiple dimensions (see below), e.g.,
      - arange(l, u, i): A one-dimensional ndarray with elements l, l + i, l + 2i, ... up to but not including u.
      - linspace(l, u, n): A one-dimensional ndarray with n elements evenly spaced between l and u inclusive.
      Note that l, u, and i can be either integer or floating-point; however, given difficulties with trying to get exact floating-point quantities under fixed floating-point precision, linspace() is safer for generating floating-point sequences.
    - Create an ndarray from a nested-list representation L of a multidimensional array (array(L {, dtype}), where dtype is one of the numerical element-types described above).
    - Create a special-purpose ndarray with a specified shape and type via function stype(shape-tuple {, dtype}), where dtype is one of the numerical element-types described above and stype is one of ones (all ones), zeros (all zeroes), or empty (arbitrary-value).
  - An ndarray a's shape can be modified using a.transpose() (return view of a with reversed shape-tuple of a), a.reshape(s) (return view of a with shape-tuple s, where product of elements of a.shape = product of elements of s), a.resize(s) (reshape a according to s in-place), and a.ravel() (return one-dimensional array, a view where possible, of elements in a in enumeration-order of a (rightmost index changes fastest)).
    - Note that views are not copies; are rather references to same areas of memory with different indexing-rules.
  - Access elements and slices of ndarrays using nested list indexing and slice syntax (a[i][j][k]) or collapsed version of same (a[i,j,k]), which is more efficient.
    - Can also extract list of arbitrary elements using a Boolean matrix B of the same shape as the operand (a[B]) or a Boolean expression BExp on ndarray a which is evaluated element-wise on a (a[BExp], e.g., a[a == 10]) (see below).
    - Note that in ndarrays (unlike lists), slices do not produce copies, but are rather references to areas of memory. To create true copies, use a.copy().
  - Operators:
    - Symbolic: Standard Python arithmetic operators
      - Are applied element-wise to create matrices of same size (and upcasted type) as operands if argument-matrices of same size.
      - If operand matrices are not of same size, these matrices are augmented to be the same. This is called broadcasting, and as the rules of broadcasting are intricate, they will not be covered here.
      - Can do operations in-place, e.g., a += b.
      - Note that a * b does not give conventional matrix multiplication; need special function (see below).
    - Relational: Standard Python relational operators
      - Are applied element-wise to generate matrices of boolean values from operand matrices (which may themselves be augmented by broadcasting if necessary).
      - Can use these boolean matrices as indices (see above) or as input to matrix-to-scalar summary functions (see below).
    - Function:
      - Most math and cmath functions are available in forms that operate on ndarrays; f(a) returns a copy of ndarray a as modified element-wise by function f().
      - dot(a, b) returns matrix resulting from conventional matrix-multiplication of a and b.
      - There is also a group of matrix-to-scalar functions that summarize matrices, e.g., min(), max(), sum(), prod().
      - Are many others ...
  - Syntax modifications:
    - Enumerate over rows / outermost ndarray index: for r in a: ...
    - Enumerate over elements of ndarray: for e in a.flat: ...
    - Enumerate over index-element pairs of ndarray: for ind, elm in ndenumerate(a): ...
      - As convenient as it is, ndenumerate(a) is much slower than enumeration via a.flat.
    - Display ndarray: print a
      - If space not sufficient to display full basic-matrix slice, will replace central elements of slice with dots to indicate missing elements.
  - NumPy supplies special functions a.dump(f) and load(f) to write ndarrays to / read ndarrays from file f in space-efficient pickle format.
  - A special NumPy sub-type matrix is supplied for high-speed 2-D ndarray operations. Note that under matrix, the *-operator corresponds to matrix multiplication.
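Several of the ndarray features above (creation, reshaping, views versus copies, Boolean indexing, element-wise versus matrix multiplication) in one short sketch. It assumes NumPy is installed and imported under the conventional alias np, and uses Python 3 print syntax; the array contents are arbitrary.

```python
import numpy as np

a = np.arange(0, 6)               # elements 0..5; the upper bound 6 is excluded
b = a.reshape((2, 3))             # view of a with shape-tuple (2, 3)
print(b.shape, b.ndim, b.size)    # (2, 3) 2 6

b[0, 0] = 10                      # a view shares memory with a ...
print(a[0])                       # ... so a[0] is now 10

c = b.copy()                      # a true copy is independent of b
c[0, 0] = 0
print(b[0, 0])                    # still 10

big = b[b > 3]                    # Boolean expression used as index
print(big)                        # the elements 10, 4, 5

squares = b * b                   # element-wise, not matrix multiplication
product = np.dot(b, b.transpose())    # conventional (2 x 3) by (3 x 2) product
print(squares)
print(product)

ticks = np.linspace(0.0, 1.0, 5)  # 5 evenly spaced values, both ends included
print(ticks)
print(b.sum(), b.max())           # matrix-to-scalar summaries: 25 10
```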
- Two lessons can be drawn from the above:
  - If you are doing numerical processing, get familiar with the various numeric-processing libraries in Python.
  - Modules like gmpy and NumPy can be seen as temporary, given that the efficiency concerns that motivated their creation may be irrelevant or may not matter as much in future as computers get faster; however, they are certainly necessary now, to enlarge the potential user-base for Python to hard-core numerical processing folk (the Fortran / C / C++ Brigade).
- Matlab-like Plotting: The Matplotlib Module
  - Matplotlib is another framework-module like Tkinter; in this case, one sets up a specification of a graph to be plotted, and then one triggers the actual plotting and creation of either a plot-file or a display window. The specification mimics that used in the plotting functions implemented in the Matlab system.
  - As with Tkinter, keep in mind that this lecture is a very basic overview of Matplotlib -- there are many more features in Matplotlib than we will cover here, as well as extra parameters and functionality for those features that we do describe. More details can be found on the Matplotlib reference page.
  - General structure of a simple Matplotlib plotting script:
  - Setting up plot-data
    - Arrays of x/y-coordinates are stored internally in Matplotlib as 1-D NumPy arrays. However, the functions that use these co-ordinates will accept sequences (lists or tuples) and do appropriate conversions.
    - You can create NumPy arrays of x- or y-co-ordinates directly using arange(l, u{, i}); such arrays are useful when specifying the plotted points in x-data/function format (see below).
  - Describing a plot surface
    - Immediate attributes of the plot surface can be set by various functions, e.g.,
      - xlabel(s): Set (horizontal) x-axis label to s.
      - ylabel(s): Set (vertically-rotated) y-axis label to s.
      - title(s): Set title of plot (centered above plot) to s.
    - Can also control the portion of the co-ordinates that are plotted using xlim(l, u) and ylim(l, u). If these are present, they must occur after the descriptions of individual plot lines (see below).
  - Describing and creating an individual line-plot
    - General syntax: {p =} plot(point-spec, line-spec)
    - The plotted x/y points can be specified in three ways:
      - y-data, e.g., plot(y): A y-coordinate list or array is given, and x-coordinates in the range 0,...,len(y)-1 are generated automatically and paired with the appropriate y-values.
      - x/y-data, e.g., plot(x, y): x- and y-coordinate lists or arrays are given and automatically paired in zip-fashion.
      - x-data/function, e.g., plot(x, f): An x-coordinate NumPy array is given with a function f(), and y-coordinates are generated automatically using f() and paired with the appropriate x-values. The function may be expressed as a Python function object or a NumPy array-expression written in terms of x, e.g., x * x, (2 * x) + 1, 2 ** x.
    - Plot-line characteristics are expressed in terms of a three-region string in which the first, second, and third regions are codes for the requested line color, line style, and x/y-point marker style. The most commonly-used codes are as follows:
      - Line color: b (blue), g (green), r (red), c (cyan), m (magenta), y (yellow), k (black)
      - Line style: - (solid), -- (dashed), -. (dash-dot), : (dotted), null string (no line connecting point-markers)
      - x/y-point marker style: . (point), o (circle), ^ (triangle), s (square), D (diamond), p (pentagon), h (hexagon), + (plus-sign), x (cross)
  - A plot-window displaying the described plot is created using show() (this transfers control to the plot-window; on termination of this window, control is passed back to the plot-generating Python script). The plot can also be saved to a file using savefig(filename.ext), where ext specifies the format in which the plot is saved.
  - Example: Plotting a y-data line (mplot1.py)
  - Example: Plotting an x/y-data line (mplot2.py)
  - Example: Plotting an x-data/function line (mplot3.py)
  - Example: Generalized plotting of an x-data/function line (gmplot1.py) [gmplot1_1.dat, gmplot1_2.dat]

Wednesday, November 30 (Lecture #31)
[PyCook, Section 8; PyNut, Section 18; Class Notes]

Basic Python II: Modules (Cont'd)
- Matlab-like Plotting: The Matplotlib Module (Cont'd)
  - Advanced plotting
    - Multiple single-line plots on one surface: An n x m grid is implicitly specified using calls to subplot().
      - Prior to each plot() call, have a call subplot(n, m, i) which specifies the grid dimensions (rows x columns) and the index-position i in which that plot is placed (i is in the range 1, ..., m * n and indicates positions starting at the upper left-hand corner and moving left to right and down the rows to the lower right-hand corner).
      - Individual x- and y-labels of the sub-plots may be set by placing the appropriate xlabel() and ylabel() calls between the subplot() and plot() calls.
    - Multi-line plots: In this case, create a list P of n plot-objects via the appropriate calls to plot() along with an n-length list L of strings (possibly containing embedded LaTeX code) describing the individual plot-lines, and call function legend(P, L, loc=loc-str), which generates a single plot in which all specified lines are plotted and a legend is placed on the plot in the position specified by loc-str, e.g., "upper right", "lower center", "lower left".
  - Example: Sub-plotting several x-data/function lines on a 2 x 2 plot-surface grid (mplot4.py)
  - Example: Multi-plotting several x-data/function lines on a single plot surface (mplot5.py)
  - Example: Generalized multi-plotting of several x-data/function lines on a single plot-surface (gmplot2.py) [gmplot2_1.dat]
- Testing, Debugging, and Optimizing Python Scripts: The doctest, timeit, profile, and pstats Modules
  - Testing, debugging, and optimizing are the activities underlying the Ordered Holy Trinity of Programming: Make it run, make it right, make it fast.
  - Systematic testing is typically done by making sure that a program produces correct answers relative to a specified set of test cases (if correctness is judged by a test case producing the same answer as that produced by a previous program thought to be correct, this is called regression testing).
  - Testing can be done relative to individual program units (typically functions) or the system as a whole; focus here on the former.
  - Simple unit testing: The doctest module
    - Good for test cases whose expected results are simple outputs of functions.
    - To invoke, have the main program import doctest and execute the statement doctest.testmod(); this locates all examples with associated outputs in doc-strings, runs the examples automatically, and compares the results against the given outputs, flagging those that differ.
      - Make sure given outputs for examples in doc-strings are themselves correct! This can be ensured by cut-and-paste of examples and outputs from a Python interpreter session.
    - Example: Testing number-string generation (numstr1.py)
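A minimal sketch of the doctest pattern described above, written for Python 3. The function number_string() is a hypothetical stand-in and is not taken from numstr1.py; only doctest.testmod() and the doc-string example format come from the module itself.

```python
import doctest


def number_string(n):
    """Return the integers 1..n as a single space-separated string.

    The examples below are what doctest runs and checks.

    >>> number_string(3)
    '1 2 3'
    >>> number_string(1)
    '1'
    >>> number_string(0)
    ''
    """
    return " ".join(str(i) for i in range(1, n + 1))


if __name__ == "__main__":
    # Locates every doc-string example in this module, runs it, and
    # reports only the examples whose output differs from what is given.
    doctest.testmod()
```

Running the script prints nothing when every example passes; running it with the -v command-line flag lists each example as it is tried.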
  - Once you have isolated a problem by testing and can trigger it when necessary with one or more test cases, debugging is in order.
  - At heart, debugging is interrogating various objects at specific points in a program run to see if they are what you think they should be. This is most simply done with print statements; however, there are several modules that allow more advanced forms of interrogation, e.g., inspect, pdb.
  - Much of Python is optimized already, meaning that code will often be fast enough. If you must optimize, do the following in order:
    - Make sure there is a speed problem, i.e., run speed benchmark tests.
    - Find out what parts of the code are taking the most time, and are hence worth optimizing (profiling).
    - Do large-scale optimization, i.e., choose better algorithms.
    - Do small-scale optimization, i.e., choose better statements/constructs.

Friday, December 2 (Lecture #32)
[PyCook, Section 8; PyNut, Section 18; Class Notes]

Basic Python II: Modules (Cont'd)
- Testing, Debugging, and Optimizing Python Scripts: The doctest, timeit, profile, and pstats Modules (Cont'd)
  - Profiling code performance: The profile and pstats modules
    - If programs or program portions are executed a large number of times, an ordinary wristwatch suffices to do gross benchmarking.
    - Distinguish several types of execution time:
      - elapsed time : wallclock time
      - system time : time spent by the operating system on behalf of the program, e.g., doing I/O
      - user time : time spent processing data
      - CPU time : total execution time (system + user)
    - Precisely measuring execution time of a code-segment: The os.times() function
      - To use this function, import os, call t0 = os.times() before the code-segment of interest, and t1 = os.times() after the code-segment.
      - Given t0 and t1, for that code-segment, elapsed time = t1[4] - t0[4], user time = t1[0] - t0[0], system time = t1[1] - t0[1], and CPU time = system time + user time.
      - Example: Timing number-string generation #1 (numstr2.py)
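The os.times() recipe above can be sketched as follows. The function number_string_concat() and the value of n are assumptions chosen for illustration and are not taken from numstr2.py; note also that on some platforms (e.g., Windows) not every field of os.times() is filled in, so the elapsed value may be reported as zero there.

```python
import os


def number_string_concat(n):
    """Build '1 2 ... n' by repeated string concatenation (hypothetical)."""
    s = ""
    for i in range(1, n + 1):
        s += str(i) + " "
    return s.rstrip()


t0 = os.times()                         # snapshot before the code-segment
result = number_string_concat(200000)   # the code-segment being timed
t1 = os.times()                         # snapshot after the code-segment

user_time = t1[0] - t0[0]               # time spent processing data
system_time = t1[1] - t0[1]             # time spent in the operating system
cpu_time = user_time + system_time      # total CPU time
elapsed_time = t1[4] - t0[4]            # wallclock time

print("user %.2fs  system %.2fs  CPU %.2fs  elapsed %.2fs"
      % (user_time, system_time, cpu_time, elapsed_time))
```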
    - Precisely measuring execution time of statement(s): The timeit module
      - To use this module after import, set up a Timer object by specifying one or more statements and the setup-statements for those statements, and then call timeit() on the Timer object with the requested number of iterations of the statements.
      - Example: Timing number-string generation #2 (numstr3.py)
      - There is also a command-line version of timeit (see doc-string of timeit module and page 484 of PyNut for details).
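A minimal sketch of the Timer-object usage described above, comparing two ways of building a string. The statements, the setup-statement, and the iteration count are assumptions chosen for illustration and are not taken from numstr3.py.

```python
import timeit

# Setup-statement: executed once, before the timed statements.
setup = "n = 1000"

# Timed statements, given as source-code strings.
concat_stmt = """
s = ''
for i in range(n):
    s += str(i)
"""
join_stmt = "s = ''.join([str(i) for i in range(n)])"

runs = 200   # requested number of iterations of each statement

# timeit() returns the total time in seconds for all iterations.
concat_seconds = timeit.Timer(concat_stmt, setup).timeit(number=runs)
join_seconds = timeit.Timer(join_stmt, setup).timeit(number=runs)

print("concatenation: %.4f s for %d runs" % (concat_seconds, runs))
print("join():        %.4f s for %d runs" % (join_seconds, runs))
```

The command-line form of the second measurement would be along the lines of python -m timeit -s "n = 1000" "s = ''.join([str(i) for i in range(n)])".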
      - The run()-function in profile runs a particular command via exec() and stores profiling information on that command in a specified file; the information in such files may then be sorted and/or reduced prior to display using the Stats object in pstats.
      - To profile programs, you may find it useful to create a special main-program function that is callable via exec().
      - Example: A profiler for program PyText3.py (adapted from PyText2.py in Assignment #7) (profile_PyText3.py) [PyText3.py, nm1.dat, nm2.dat, nm3.dat, nm4.dat, tc1.dat, tc2.dat, tc3.dat, tc4.dat, com1.txt]
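The profile/pstats workflow above can be sketched as follows for Python 3. This is not profile_PyText3.py: the functions number_string() and main() and the statistics filename are hypothetical. The sketch uses profile.runctx(), the variant of run() that takes explicit namespaces, so that it does not depend on main() being defined in the top-level script.

```python
import io
import os
import profile
import pstats
import tempfile


def number_string(n):
    """Hypothetical workload: build '1 2 ... n' with join()."""
    return " ".join([str(i) for i in range(1, n + 1)])


def main():
    """Special main-program function, so the whole run is one command."""
    return [len(number_string(n)) for n in range(1, 200)]


stats_path = os.path.join(tempfile.mkdtemp(), "main.prof")

# Run the command and store the profiling information in stats_path.
# profile.run("main()", stats_path) does the same when main() is
# defined in the top-level script.
profile.runctx("main()", {"main": main}, {}, stats_path)

# Load the stored information, then sort and reduce it before display.
report = io.StringIO()
stats = pstats.Stats(stats_path, stream=report)
stats.strip_dirs().sort_stats("cumulative").print_stats(10)
print(report.getvalue())
```

Sorting by "cumulative" puts the functions that account for the most total time first, which is usually the quickest way to see what is worth optimizing; the cProfile module offers the same interface with lower overhead.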
    - More often than not, large-scale optimization via better choice of algorithms suffices to handle problems identified by profiling. If further optimization is necessary, the following are some common ways of obtaining additional speedup:
      - Use join() instead of repeated concatenation to create strings (O(n^2) to O(n)).
        
        ... Though this only matters when the string-lists are of sufficient length! (try for n = 5, 50, 100, and 250 in numstr2.py and numstr3.py above). This is not surprising, given the leading constant and trailing terms hidden by O() notation.
      - Use "decoration" instead of special sort() comparators (5x speedup).
      - Avoid from X import * where possible.
      - Replace loops over lists with list comprehensions.
      - In multi-nested loops, "hoist" code that does not depend on inner loop indices to outer loops (this includes use of global variables).
      - Inline short functions.
      - Avoid module prefixes for frequently-called functions by using from X import Y.
      Before starting any time-consuming optimization, it is always worth doing a timing study to make sure that the effort is truly worthwhile in terms of potential increases in code performance (see repeated-string example above).
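Several of the small-scale optimizations listed above can be sketched side by side. All function names and data here are hypothetical, and no timings are claimed; each pair simply shows the slower construct and its suggested replacement producing the same result.

```python
import math


# 1. join() instead of repeated concatenation.
def digits_concat(n):
    s = ""
    for i in range(n):
        s += str(i)                  # may copy the string built so far
    return s


def digits_join(n):
    return "".join([str(i) for i in range(n)])   # pieces joined in one pass


# 2. "Decoration" instead of a special sort() comparator: pair each item
#    with its sort key, sort the pairs, then strip the keys off again.
words = ["banana", "fig", "pear"]
decorated = [(len(w), w) for w in words]         # decorate
decorated.sort()                                 # plain sort on the pairs
by_length = [w for (_, w) in decorated]          # undecorate


# 3. Hoisting plus a list comprehension.
def scale_slow(rows, factor):
    out = []
    for row in rows:
        for x in row:
            out.append(x * math.sqrt(factor))    # sqrt redone per element
    return out


def scale_fast(rows, factor):
    root = math.sqrt(factor)                     # hoisted: computed once
    return [x * root for row in rows for x in row]
```

In current Python versions, sorted(words, key=len) performs the decoration step internally, although it keeps items with equal keys in their original order rather than sorting them by the item itself.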
- Course Evaluation Questionnaire (CEQ)

References

Langtangen, H.P. (2008) Python Scripting for Computational Science (Third edition). Texts in Computational Science and Engineering no. 3. Springer; Berlin.
Loui, R.P. (2008) "In Praise of Scripting: Real Programming Pragmatism." IEEE Computer, 41(7), 22-26.
Lutz, M. (2009) Learning Python (Fourth edition). O'Reilly. (Abbreviated above as LutzL)
Lutz, M. (2006) Programming Python (Third edition). O'Reilly. (Abbreviated above as LutzP)
Martelli, A. (2006) Python in a Nutshell (Second edition). O'Reilly. (Abbreviated above as PyNut)
Martelli, A., Ravenscroft, A.M., and Ascher, D. (2005) Python Cookbook (Second edition). O'Reilly. (Abbreviated above as PyCook)

Created: June 28, 2011 Last Modified: November 18, 2011
