Computer Science 3711 / Engineering 5891, Winter '02
Course Diary
Copyright 2003 by H.T. Wareham
All rights reserved
Week 1,
Week 2,
Week 3,
Week 4,
Week 5,
Week 6,
Week 7,
(Midterm Exam Notes),
Week 8,
Week 9,
Week 10,
Week 11,
Week 12,
Week 13,
Week 14,
(Final Exam Notes),
(end of diary)
Thursday, January 10 (Lecture #1)
[Sections 1.1-1.2, 2.1]
- Introduction to course; dates and conventions.
- Problems and Algorithms.
- Algorithms as technology.
- Desirable properties of a measure of algorithm efficiency:
- Function of input size.
- Machine independent.
- Mathematically tractable, i.e., can be derived
(relatively) easily for any algorithm.
- Useful for assessing individual and comparing groups of
algorithms.
- For most of this course, we will focus on efficiency wrt
algorithm running time; however, we will occasionally mention
other measures, i.e., space (computer memory).
- Time complexity
- The road from actual running time to asymptotic worst-case time
complexity -- necessary lies (or rather, abstractions):
|
|
|
|
T(n) |
|
O(g(n)) |
| Actual |
|
Abstract |
|
(Worst-Case) |
|
Asymptotic |
| Running |
-----> |
Running |
-----> |
Time Complexity |
-----> |
Worst-Case |
| Time |
|
Time |
|
|
|
Time Complexity |
|
|
Instruction |
|
Input Size / |
|
Asymptotic |
|
|
counts |
|
Worst Case |
|
Notation |
- Let's look at various concepts and techniques mentioned above
in more detail.
- Input size.
- Isolate a small set of parameters that describe
amount of memory used to store input.
- Hard to define formally; acquired informally via
experience doing time complexity analyses.
- We will assume where possible that we are dealing with
small numbers in the input that fit inside a single
computer-memory "word"; hence, the size of numbers in
the input need not be a parameter in the input size.
- Counting instructions
- Basic instructions in a program can have a wide range
of running times. Can indicate this in an analysis
by assigning a separate constant for the running time of
each kind of instruction; is accurate, but very
cumbersome.
- We will assume that each kind of basic instruction runs
in 1 time unit. Is not accurate but does simplify
analyses; moreover, can be justified by our ultimate
focus on asymptotic time complexity.
- We will assume a RAM model of memory in which all
stored elements are accessible in 1 time unit. Is not
accurate, as we ignore memory hierarchy; however,
does simplify analyses.
Tuesday, January 15 (Lecture #2)
[Sections 2.1-2.2,3.1]
- Time Complexity (Cont'd)
- Counting instructions (cont'd)
- Each type of control-structure has its own rule for
counting instructions.
- Two general methods to deriving a time complexity
function for an algorithm:
- Compute counts for individual instructions and
add together to derive function.
- "Grow" function by starting at innermost level
of nesting in algorithm and working outwards.
- Which method you use depends on the intricacy of
the given algorithm.
- while-loop rule: #iter(body + 1) + 1
- Example: Simple while-loop.
- for-loop rule: #iter(body + 2) + 2
- Example: Simple for-loop.
- Example: Simple doubly-nested for-loop.
- Exact time complexity functions desirable; however,
will most often deal with worst-case time complexity
functions.
- Worst-case time complexity functions are useful for
a variety of reasons:
- Worst-case is upper bound, which is mentally
useful.
- Worst-case may often occur in practice.
- Average-case may often be equivalent to
worst-case.
- Worst-case assumption is easiest to deal with
mathematically; can also on occasion simplify
analyses.
- Example: Doubly-nested for-loop with
inner-loop dependency.
- if-then-else-loop rule:
max(if-body, else-body) + 1
- Example: for-loop with nested
if.
- Example: Some simple algorithms and
their time complexities
(tc_examples_1.txt)
Thursday, January 17 (Lecture #3)
[Sections 2.2,3.1]
- Time Complexity (Cont'd)
- Asymptotic notation
- Used to smooth "raw" time complexity functions.
- Asymptotic upper bounds: Big-Oh notation
(O(g(n)).
- Example: T(n) = 3n^2 + 4n + 5 is O(n^2)
(big_Oh.txt)
- One could derive the required constant c by fancy
algebra; I prefer to set c to the sum of the absolute
values of the coefficients -- it's very dirty but it's
also quick.
- Example: T(n) = 3n^2 + 4n + 5 is not O(n)
- Try to avoid ludicrously loose upper bounds.
- Example: T(n) = 2n is O(n^57)
- Relationship between informal "T(n) is O(g(n))" definition
and "O(g(n)) as the set of all g(n)-upper-bounded functions"
definition given in textbook; keep the latter in mind,
but the former is colloquial CS usage. May stem from
algorithm-centric and math-centric views of asymptotic
notation, respectively.
- Asymptotic lower bounds: Omega notation
(OMEGA(g(n)).
- Example: T(n) = 3n^2 + 4n + 5 is OMEGA(n^2)
- Example: T(n) = 3n^2 + 4n + 5 is not OMEGA(n^3)
- Try to avoid ludicrously loose lower bounds.
- Example: T(n) = 2n^57 is OMEGA(n)
- Asymptotic exact bounds: Theta notation
(THETA(g(n)).
- Three caveats on using asymptotic notation:
- How you interpret asymptotic notation depends on how
the function T(n) was derived. If T(n) is an exact
value for all inputs of a size n, then can say
T(n) = O(g(n)) (OMEGA(g(n))) [THETA(g(n))] means
"the running time is upper (lower) [exactly] bounded
by (g(n)"; however, if T(n) is merely an upper bound
on the running time (as has been the case in all of our
examples), must be more careful in interpreting
OMEGA and THETA statements.
- Though it is in a sense an abuse of notation, you
will frequently see O(1) or THETA(1) used to indicate
an unspecified constant in various expressions.
- The rules about combining separate asymptotic-notation
expressions are tricky; be VERY careful if you ever
do this in your math -- it might be better off to
write out the full constant-bounds and work with those,
so you don't make mistakes.
- Example: Searching: the linear and binary search
algorithms (tc_searching.txt)
Tuesday, January 22
- University closed an account of blizzard; lecture not held.
- The lecture that would have been given today will be given
Thursday (Jan 24). Until we miss more lectures, I have no
plans to schedule a makeup lecture.
Thursday, January 24 (Lecture #4)
[Sections 2.3,4.1]
- Time Complexity (Cont'd)
- Example: Sorting: the selection and insertion sort
algorithms (tc_sorting.txt)
- Example: Sorting: The merge sort algorithm
- Use recurrences to model running times of recursive
algorithms -- but what are recurrences and how do we
solve them?
- Example: Recurrence for Fibonacci numbers
- Example: Recurrence for running time of Mergesort
algorithm.
- The recursion tree method (VERY nice way of "visually"
solving divide-and-conquer recurrences)
(Section 4.2)
- Example: Using the recursion tree method to show
that the Mergesort algorithm runs in O(n log n) time
(Section 2.3.2).
- Recurrences actually useful for a lot of things -- it
would be nice to look further into general methods for
solving them.
- Example: The recurrence for the number of nodes in
a complete binary tree of height n >= 0.
- Dealing with recurrences
- Simplifications of recurrences in context of asymptotic
worst-case bounds.
- Deriving upper bounds on recurrences
- Substitution method (mathematical induction in disguise)
(Section 4.1)
- Example: T(n) = 2T(n - 1) + 1 <= 2^{n + 1} - 1
Tuesday, January 29 (Lecture #5)
[Sections 4.1-4.2; Appendix A; Class Notes]
- Time Complexity (Cont'd)
- Dealing with recurrences
- Deriving upper bounds on recurrences
- Substitution method (mathematical induction in disguise)
(Section 4.1)
- Example: T(n) = 2T(n - 1) + 1 <= 2^{n + 1} - 1
- Example: T(n) = 2T(FL(n/2)) + 1 <= n - 1
- Oops! This fails when we try to show it
for n = 1. Why? The inductive hypothesis
is too strong!
- Example: T(n) = 2T(FL(n/2)) + 1 <= 2n - 1
- Example: T(n) = 2T(n/2) + cn <= cn(log2 n) + cn
- This is all fine and good for verifying guesses;
however, where do we get the guesses?
- Recursion tree method (diagrammatic expansion of
recurrence into summation and solution of summation) (Section 4.2)
- Example: T(n) = 2T(n/2) + cn
- Example: T(n) = 3T(n/4) + cn^2
- To solve these, you still need to be able to
solve or at the very least upper bound summation
expressions.
- Dealing with Summations (Appendix A)
- Three algebraic rules for simplifying
summations.
- First rule of solving / bounding summations:
Look it up!
- Simple summations: linear, quadratic, cube
- Geometric-series summations
- Example: T(n) = 3T(n/4) + cn^2 (Cont'd)
- If you are really comfortable with summations and
good at formulating them given the terms of
a series, there is a final method for dealing
with recurrences.
- Iteration method ("plain" expansion of
recurrence into summation and solution of summation)
- Example: T(n) = 2T(n/2) + cn
- Example: T(n) = 3T(n/4) + cn^2
Thursday, January 31 (Lecture #6)
[Class Notes]
- Time Complexity (Cont'd)
- A review of some tricks for deriving big-Oh time
complexity expressions without deriving detailed
T(n) expressions (BOTECs: Back Of The
Envelope Calculations).
- Often, simply multiplying out the number of iterations
for each loop in the deepest chain of nested loops yields
a near-optimal big-Oh expression.
- Example: Two nested for-loops, for-bodies
composed of elementary statements.
- Example: Two nested for-loops, for-bodies
composed of procedure calls of unspecified time
complexity.
- In cases like this one, it is best to derive general
expression with unknown procedure time complexities
as variables. This is a useful habit to get into,
especially
for cases where the unknown procedures are generic
data structure access operations and we want
to evaluate the cost of an algorithm relative to
various data structures.
- Example: for-loop with nested if,
for- and if-bodies and
if-conditional expression composed of
procedure calls of unspecified time complexity.
- In cases like this one, can derive loose big-Oh
expressions by (as we did in previous lectures)
assuming if always evaluates to true; however,
if we have a function describing how often
the if is true, we can derive a tighter expression
along the lines sketched in the previous example.
- Example: Mergesort (again).
- The relationship between a recursive algorithm and
the recurrence describing that algorithm's running time.
- The relationship between the recurrence describing the
running time of a recursive algorithm and the
various portions of the diagram constructed via
the recursion tree method.
- Remember: If the recurrence describing the running
time of a recursive algorithm is too bizarre,
e.g., multiple base and/or inductive cases,
simplify the recurrence -- we are typically only
interested in asymptotic upper bounds.
- Example: Simplifying the recurrence describing
the running time of a simple recursive algorithm for
computing Fibonacci numbers.
- Algorithm Design Techniques
- Combinatorial optimization problems: What are they?
- Each solution associated with an instance has
a cost, and we want the solutions of optimal
(min/max) cost over the whole solution space.
- Each instance has combinatorial, i.e.,
exponential in instance size, solution
space.
- Example: The Knapsack problem (KNAPSACK)
Knapsack
Input: A set U of items, an item-size
function s(), an item-value function v(), and
a positive integer B.
Output: The subsets U' of U such that the
sum of the sizes of the items in U' is less
than or equal to B and the sum of the values of
the items in U' is maximized over the solution
space.
- Each subset U' can be modeled as a binary vector
of length |U|; hence, there are 2^{|U|} - 1
possible solutions for any instance of KNAPSACK.
Tuesday, February 5
- University closed an account of blizzard; lecture not held.
- The lecture that would have been given today will be given
Thursday (Feb 7). I will also schedule a makeup lecture
at that time.
Thursday, February 7 (Lecture #7)
[Sections 9.1-9.7, B&B; Class Notes]
- Algorithm Design Techniques (Cont'd)
- Combinatorial optimization problems (Cont'd)
- Example: Longest Common Subsequence (LCS)
Longest common subsequence
Input: Two strings s, s' over an alphabet
Sigma.
Output: All strings s'' over Sigma such that s'' is
a subsequence of both s and s' and the length of
s'' is of is maximized over the solution space.
- Each subsequence s'' can be modeled as a binary
vector of length min(|s|,|s'|); hence, there are
2^{min(|s|,|s'|)} possible solutions for any
instance of LCS.
- Example: Minimum Spanning Tree (MST)
Minimum spanning tree
Input: An edge-weighted graph G = (V, E, w).
Output: All spanning trees T of G, i.e., trees
that connect all vertices in G, such that the sum
of the weights of the edges in T is minimized
over the solution space.
- By Cayley's Theorem, a graph G = (V, E) has
O(|V|^{|V| - 2}) spanning trees; hence, there are
O(|V|^{|V| - 2}) possible solutions for any
instance of MST.
- Can solve such problems by constructing each
solution and examining that solution for optimality;
however, this is inefficient given the exponential
sizes of the solution spaces. Can we do better?
- Combinatorial Solution-Space Trees
- The most basic approach to solving combinatorial
optimization problems.
- As a first step, simplify generation of solutions
for a problem instance by using a combinatorial
solution-space tree (CST).
In such a tree, internal nodes are partial
solutions and the full solutions labeling the
leaves comprise the solution space.
- Example: A CST for an instance of KNAPSACK
in which |U| = 4.
- Can traverse such a tree and examine all of its
leaves using depth-first search (DFS) or
breadth-first search (BFS).
- As a second step, decrease space requirements by
operating on an implicit CST, i.e., generate tree
nodes as necessary during search.
- Example: Algorithm for DFS-based traversal
of implicit CST (minimization version).
DFS-U(i, sol)
if (i == n + 1)
if (viable (sol))
return(cost(sol))
else
return(INFINITY)
else
return(min(DFS-U(i + 1, sol),
DFS-U(i + 1, sol U item[i])))
- Initial call: DFS-U(1, empty set)
- Note that n is the number of items in the
set in the instance; hence "i == n + 1" means
that you have reached a leaf in the CST.
- Example: Algorithm for BFS-based traversal
of implicit CST (minimization version).
BFS-U()
Q = {(1, empty set)}
best = INFINITY
while not empty(Q)
(i, sol) = pop(Q)
if (i == n + 1)
if (viable(sol))
best = min(best, cost(sol))
else
push(Q,(i + 1, sol))
push(Q,(i + 1, sol U item[i]))
return(best)
- As a third step, where possible, use information
about partial solution viability and potential
optimality to terminate search in unfeasible
subtrees of the CST (dynamic pruning).
- Traditionally, DFS + dynamic pruning is known as
backtracking and BFS + dynamic pruning
is known as branch and bound.
- Backtracking typically used; however, if search
tree does not have finite depth, branch and
bound is preferable.
- Example: Viability and potential optimality
conditions for KNAPSACK, LCS, and MST.
- Viability (continue if answer is "yes")
- Do items in partial solution fit in
knapsack? (KNAPSACK)
- Is partial solution string a subsequence of
both given sequences? (LCS)
- Do edges in partial solution form a forest?
(MST)
- Potential optimality (continue if answer
is "yes")
- Does sum of values of items in partial
solution plus sum of values of all remaining
items that could be added exceed the
value of the best solution seen so far?
(KNAPSACK)
- Does the length of the partial solution
string plus the length of the string of
all remaining characters that could be
added exceed the length of the best
solution string seen so far? (LCS)
- Is the sum of the weights of the edges in
the partial solution less than the
cost of the best solution found so far?
(MST)
- Example: Algorithm for DFS-based traversal
of implicit CST + dynamic pruning (minimization version).
DFS-C(i, sol, best)
if (not viable(sol)) or (best < cost(sol))
return(INFINITY)
else if (i == n + 1)
best = min(best, cost(sol))
return(cost(sol))
else
return(min(DFS-C(i + 1, sol, best),
DFS-C(i + 1, sol U item[i], best)))
- Initial call: DFS-C(1, empty set, INFINITY)
- Note that variable BEST is passed by reference,
e.g., it is a pointer to an integer; if it is
passed by value, the value of BEST is not
saved from any of the leaf-evaluations.
Example: Algorithm for BFS-based traversal
of implicit CST + dynamic pruning (minimization version).
BFS-C()
Q = {(1, empty set)}
best = INFINITY
while not empty(Q)
(i, sol) = pop(Q)
if (viable(sol) and (cost(sol) < best)
if (i == n + 1)
best = min(best, cost(sol))
else
push(Q,(i + 1, sol))
push(Q,(i + 1, sol U item[i]))
return(best)
Such tricks can drastically reduce actual running
time for CST traversal; however, all of these
algorithms are still asymptotically exponential time
in the worst case (you still might have to visit all
the nodes in the tree).
The CST approach to solving combinatorial optimization
problems assumes nothing about the structure of
the solution space for an instance. If we did have
such knowledge, can we do better?
Tuesday, February 12 (Lecture #8)
[Sections 15.3-15.4]
- Algorithm Design Techniques (Cont'd)
- Dynamic Programming
- Combinatorial Search Tree: Tree of partial solutions.
- Recursive Decomposition: Tree of recursive calls,
each of which solves a subproblem of smaller size.
- Characteristics of recursive decomposition:
- Can decompose instance of problem into smaller
instance of the same problem, i.e.,
subproblems.
- At any point in the computation, may be several
choices for decomposition.
- Solutions to subproblems are independent,
and can hence do not "interfere" when they are
recombined to produce a solution to the
original instance.
- Problems that are solvable by recursive decomposition
have the optimal substructure property,
i.e, optimal solution for instance of
problems composed of optimal solutions for
subproblems.
- Divide-and-Conquer is an example of recursive
decomposition when each subproblem occurs at most
once in the recursive-calls tree. However, what
happens when subproblems may occur more than once?
- Example:: Computing Fibonacci numbers
- Recurrence
- Recursive algorithm
- Recursive-calls tree
- Memoized (recursive) algorithm
- Dynamic programming (iterative) algorithm
- If there are a polynomial number of possible
subproblems, can store these in a table. If we
are really lucky, we can use the recurrence to solve
these subproblems in a "bottom up" rather than a
"top down fashion", solving the smallest subproblems
first and solving progressively larger subproblems
until we solve the original problem instance.
- Dynamic programming (DP) => table-driven, bottom-up
recursive decomposition algorithms!
- Example: DP algorithm for Longest Common
Subsequence (LCS)
- Recurrence (recursive + base cases)
- D(i,j) = length of longest common
subsequence of first i characters of
s and first j characters of s'.
- DP algorithm
- Example run: s = cbab and s' = abacab
- Matrix fill-in (including backpointers)
for deriving length of longest common
subsequence.
- Rectangular (m + 1) x (n + 1) matrix;
fill-in proceeds from cell (1,1)
row-wise, column-wise, or
anti-diagonal-wise.
- Fill-in of an individual cell
involves consulting filled-in
cells in the 3-cell "overhang" to
left and behind of that cell.
- Traceback for deriving longest common
subsequences.
Thursday, February 14 (Lecture #9)
[Sections 15.2]
- Algorithm Design Techniques (Cont'd)
- Dynamic Programming (Cont'd)
- Example: DP algorithm for Longest Common
Subsequence (LCS) (Cont'd)
- Time complexity of matrix fill-in: (# cells) x (#choices per cell) x
(max # subproblems per choice) = (mn) x (1) x
O(1) = 2mn = O(mn).
- Space complexity: # cells = (((m + 1) x (n + 1)) = O(mn) = O(n^2)
- Deriving an arbitrary longest common subsequence
given a filled-in matrix takes O(m + n) time;
however, enumerating all longest common
subsequences may require exponential time!
- Example: DP algorithm for Matrix Chain
Parenthesization (MCP)
- Review of 2-matrix multiplication: O(n^3) time
(n = max matrix dimension)
- 3-matrix multiplication: Parenthesization of
matrix "chain" for 2-matrix multiplication
can drastically affect the number of
individual element multiplications.
- Recurrence (recursive + base cases)
- m(i,j) = minimum number of element
multiplications required to multiply
matrices i through j in the given chain.
- Note that this is the first recurrence
we've seen that involves multiple
subproblem decomposition choices.
- DP algorithm
- Example run: matrix chain is A_1 A_2 A_3 A_4,
where A_1 = 10 x 5, A_2 = 5 x 20, A_3 =
50 x 5, and A4 = 5 x 20. Thus, n = 4 and
dimension list p = (10,5,20,5,20)
- Matrix fill-in (including backpointers)
for deriving minimum number of
multiplications relative to optimal
matrix chain parenthesization.
subsequence.
- Ragged-bottom pyramid / B1 bomber
shape matrix; fill-in proceeds
from left to right in ascending
rows.
- Fill-in of an individual cell
involves consulting filled-in
cells in the "boomerang" hanging
off that cell.
- Traceback for deriving optimal matrix chain
parenthesization.
- Time complexity of matrix fill-in: (# cells) x (#choices per cell) x
(max # subproblems per choice) = (n(n - 1)/2) x O(n) x
O(1) = O(n^3).
- Space complexity: # cells = (n(n - 1)/2) = O(n^2)
Friday, February 15 (Lecture #10)
[Sections 16.1 and 28.2; Section 8.4, B&B]
Binary search is a particularly interesting example of
a D&C algorithm in which there is, at each point in the
computation, at most 1 choice and 1 subproblem to be
solved. This supposes a particular simple subproblem
structure, in which a choice can be made before
the subproblem is solved, i.e., the choice can be
made in a locally optimal (greedy) manner. Are there
other such algorithms?
Greedy Algorithms
- Example: The Activity Selection problem
- Problem statement
- DP algorithm
Monday, February 18
- Midterm Exam Notes
I've finished making up the upcoming midterm exam.
The exam will be closed-book.
It will be 60 minutes long and has a total of 60 marks
(this is not coincidental; I have tried to make the number
of marks for a question equivalent to the number of minutes
it should take you to do it). There
may still be some minor adjustments of marks allotted to
particular questions and some deletions of sub-parts in
various questions, but the current layout is as follows:
- Asymptotic notation (3 parts) (9 marks)
- Asymptotic worst-case time complexity (3 parts)
(9 marks)
- Recurrences (2 parts) (12 marks)
- Algorithm design techniques (2 questions, 2 parts
each) (14 marks)
- Dynamic programming (2 parts) (16 marks)
Things that will not be covered are
the substitution method, detailed derivation of
"raw" worst-case time complexity functions, very complex
algorithms like the DP algorithm for matrix-chain
parenthesization, or greedy algorithms. However, all else
is fair game. The questions on the last two assignments
(the simpler ones, anyway) are guides to what I have in
mind; you may also want to look back in your notes to anytime
I said "This would / has in the past made a good exam
question" :) as well as last year's CS 3711 / ENGI 5891 exams:
I hope the above helps, and I wish you all the best of luck
with this midterm.
Tuesday, February 19 (Lecture #11)
[Sections 16.1-16.2]
- Algorithm Design Techniques (Cont'd)
- Greedy Algorithms (Cont'd)
- Example: Versions of Knapsack that allow
Greedy and DP algorithms.
Thursday, February 21
Tuesday, February 26
- Midterm, break: no lecture
Thursday, February 28 (Lecture #12)
[Sections 6.1-6.3; Class Notes]
- Algorithm Design Techniques (Cont'd)
- Data structures
- The techniques we've seen so far use properties
of solution spaces to organize efficient searches
of these spaces; in many cases, we have lowered
time complexities from exponential to low-order
polynomial.
- Lower bound on time complexity of any algorithm for
a problem is typically linear (must at least read
all the input). To get as close as possible to this
bound, need to choose data structures that provide
best time complexities for operations required by
an algorithm.
- Example: Dynamic sets
- Variety of data structures for implementing
dynamic sets:
|
Data Structure |
Time Complexity (Worst Case) |
| INSERT |
SEARCH |
DELETE (key) |
DELETE (pos) |
MIN |
Generic Sorting |
| Array (Unsorted) |
O(n) |
O(n) |
O(n) |
O(n) |
O(n) |
O(n^2) |
| Linked list (Unsorted) |
O(1) |
O(n) |
O(n) |
O(1) |
O(n) |
O(n^2) |
| Array (Sorted) |
O(n) |
O(log n) |
O(n) |
O(n) |
O(1) |
O(n^2) |
| Linked list (Sorted) |
O(n) |
O(n) |
O(n) |
O(1) |
O(1) |
O(n^2) |
| Search tree |
O(n) |
O(n) |
O(n) |
O(1) |
O(n) |
O(n^2) |
| Balanced search tree |
O(log n) |
O(log n) |
O(log n) |
O(log n) |
O(log n) |
O(n log n) |
| Binary heap |
O(log n) |
O(n) |
O(n) |
O(log n) |
O(1) |
O(n log n) |
- Binary search trees
- Structured such that for any node x
in the subtree with subtrees L and R,
key(l) <= key(x) < key(r) where key(y)
is the key labelling node y and l and r
are nodes in L and R, respectively.
- This property has nice consequences:
- Minimum key at leftmost node in tree.
- Maximum key at rightmost node in tree.
- Search for a particular key is very
easy!
- Time complexity of operations tied to
tree height, i.e., number of levels of
nodes in tree; though height can be as
low as log n for n nodes, can be n
in worst case => all operations require
O(n) time!
- Balanced binary search trees, e.g.,
AVL trees, red-black trees
- Employ relatively sophisticated insertion
and deletion operations that automatically
rebalance tree and guarantee O(log n)
height => all operations require O(log n)
time!
- To obtain good time complexities for all set
operations can require fairly intricate
algorithmic machinery, e.g.,
balanced search trees. However, if only
require good times for a subset of these
operations, can do much better, e.g.,
heaps.
- (Binary) Heaps
- Structured such that for any node x
in the subtree with subtrees L and R,
either key(l), key(r) <= key(x) (max-heap)
or key(l), key(r) >= key(x) (min-heap).
- Note that is a weakened version of the
binary search tree property!
- Store keys in array such that no node
pointers needed to implement tree; under
this scheme, heap is nearly complete
binary tree with height O(log n).
- This property has some nice consequences:
- Minimum key at root node in tree
(min-heap).
- Maximum key at root node in tree
(max-heap).
- This doesn't seem like much compared to
binary search trees, as there is no
O(log n) search algorithm for heaps;
however, the weakened property does allow
for O(log n) insertion and pos-deletion
operations!
- Trades off time complexity of several
operations for ludicrously simple and
efficient algorithms for the remaining
operations.
Tuesday, March 5 (Lecture #13)
[Sections 6.4-6.5, 21.1; Class Notes]
- Algorithm Design Techniques (Cont'd)
- Data structures (Cont'd)
- Example: Dynamic sets (Cont'd)
- (Binary) Heaps (Cont'd)
- Insert(key) operation
- Delete(pos) operation
- As heap property is weaker version of
binary search tree property, cannot
exploit heap property to do fast search on
heaps; however, as weaker properties are
easier to maintain during changes to a
data structure, gives simple and efficient
algorithms for insert and delete on
heaps!
- BUILD-MAX-HEAP algorithm
- Takes unordered list and does in situ
"trickle down" operation / HEAPIFY
moving back from position n/2 to
1 to make a heap.
- Naive analysis predicts O(n log n)
running time; however, is actually
O(n)!
- Priority queues
- Require subset of dynamic set
operations, e.g.,
insert(key), min(), extract-min().
- Used in many greedy algorithms,
e.g,, Prim's algorithm
for minimum spanning trees.
- Example: Disjoint sets
- Instead of maintaining a dynamic set,
consider maintaining a dynamic partition
of a (fixed) set, i.e., a
collection of disjoint sets whose union is
a particular set.
- View set elements as pointers into sets.
- Operations:
- Make-set(x): Create set with element x.
- Find-set(x): Return the representative
element of the set containing x.
- Union(x,y): Create the union-set of the
sets containing x and y.
- Useful in a variety of algorithms, e.g.,
Kruskal's algorithm for minimum spanning trees,
determining the connected components of an
undirected graph.
Thursday, March 7 (Lecture #14)
[Sections 21.2-21.3, 22.1; Class Notes]
- Algorithm Design Techniques (Cont'd)
- Data structures (Cont'd)
- Example: Dynamic sets (Cont'd)
- Variety of data structures for implementing
disjoint sets:
|
Data Structure |
Time Complexity (Worst Case) |
| MAKE-SET |
FIND-SET |
UNION |
| List (naive) |
O(1) |
O(n) |
O(1) |
| List (rep-ptr) |
O(1) |
O(1) |
O(n) |
| Forest (naive) |
O(1) |
O(n) |
O(1) |
| Forest (rep-ptr) |
O(1) |
O(1) |
O(n) |
| Forest (UR + PC) |
O(1) |
O(n) |
O(1) |
- In tree representation of sets, nodes are
set elements and we only need to travel
from any node to the root => store tree as
set of node-parent relations!
- Consider forest augmented by union-by-rank (UR)
and path compression (PC) heuristics.
- Implement dynamic rebalancing of trees
in forest; effectively collapse them
to O(1) depth over a sequence of
disjoint set operations.
- Individual operation time complexities
not improved, but can show that
any sequence of m operations
on a base set of n elements can be done in
O(m alpha(n)) time, where alpha(n) <= 4
for any application =>
Heuristic-augmented forest allows
constant-time disjoint set
operations!.
- In general, an efficient algorithm is a collaboration
between an appropriate algorithm design technique
and one or more data structures whose operation-sets
are optimally efficient for the subset of operations
required by the algorithm.
- Need good knowledge of both algorithm design
techniques and data structures to create efficient
algorithms.
- There is actually a third component to efficient algorithm
design -- namely, knowing (or at least knowing where you can
find out about) standard algorithms for basic problems.
- You may often find when you phrase a problem of interest
abstractly enough that what you want to solve has either
already been solved, i.e., it is a standard
problem, or it can be solved using calls to subroutines
that solve standard problems.
- Algorithmic analogue of software reuse -- why reinvent the
(algorithmic) wheel?
- Many basic problems are concerned with structures composed of
objects and their relationships, and hence can be modeled very
nicely by graphs.
- Vertex = object
- Relationship between objects = edge between vertices.
- Strength of relationship = edge weight.
- Costs or values of object = vertex weight.
- Two-way relationship = undirected edge.
- One-way relationship = directed edge.
- Example: Problems 1(c) and (d), Assignment #1.
- Graph algorithms
- Representations of graphs (Section 23.1)
- Two basic choices:
- Adjacency list
- Adjacency matrix
- Adjacency list more compact for sparse graphs,
i.e., graphs such that |E| << |V|(|V| - 1)/2;
however, as they store only one bit per adjacency,
adjacency matrices better for non-sparse graphs.
- Common operations on graphs:
- next-adj(v): Return next vertex in enumeration
of vertices adjacent to v.
- adj(v, v'): Return TRUE if v is adjacent to v'
in G and FALSE otherwise.
- Representations differ in terms of the efficiency of
these operations.
|
Data Structure |
Time Complexity (Worst Case) |
| NEXT-ADJ |
ADJ |
| Adjacency List |
O(1) |
O(|V|) |
| Adjacency Matrix |
O(|V|) |
O(1) |
- As with any data structure, need to optimize graph
representation to algorithm in terms of number of
type of operations used.
- Handed back midterm exam and briefly went over some of
the answers. I plan to do up and post full answers on-line
sometime soon; watch this space for details.
Tuesday, March 12 (Lecture #15)
[Chapter 23 and Section 22.2-22.3]
- Graph algorithms (Cont'd)
- Graph traversal
- Graph traversal useful in two ways:
- Visit all nodes in a graph.
- Impose annotation / structure on the vertices
and/or edges of a graph that has subsequent
uses.
- Two types of graph traversal: breadth-first search
(BFS) and depth-first search (DFS).
- Breadth-first search on general graphs
- Intuition:
Exploration of shells of vertices at successively
larger distances from a selected point.
- Basic BFS algorithm:
BFS(G, s)
for each vertex u in G - {s} do
color[u] = white
color[s] = gray
Q = {}
ENQUEUE(Q, s)
while Q is not empty do
u = DEQUEUE(Q)
for each v in adj[u] do
if color[v] = white then
color[v] = gray
ENQUEUE(Q,v)
color[u] = black
- Interpretation of colors:
- black = fully explored (node + children)
- gray = partially explored (node only)
- white = unexplored
- Visualize as ink spreading from a central point
outwards along edges until it has reached all
vertices in a graph.
- Analysis of running time
- Running time = |Q|(T(adj) + O(1)) + |E|(O(1))
= |V|(T(adj) + O(1)) + |E|(O(1))
- Adjacency list: O(|V| + |E|) time
- Adjacency matrix: O(|V|^2 + |E|) time
- Though |E| <= |V|)|V| - 1)/2 = O(|V|^2),
often keep |E|-term in expressions to
show running time relative to sparse
graphs.
- Did not analyze algorithm in terms of
rigorously-derived T(n) function --
worked less formally and directly in terms
of big-Oh behavior of individual algorithm
components.
- Did not analyze algorithm by nested control
structure -- had to reason about behavior
of algorithm as a whole to derive certain
quantities, e.g., total length of
queue over execution of algorithm.
- Full BFS algorithm (page 532):
BFS(G, s)
for each vertex u in G - {s} do
color[u] = white
** d[u] = INFTY
** pi[u] = NIL
color[s] = gray
** d[s] = 0
** pi[s] = ni;
Q = {}
ENQUEUE(Q, s)
while Q is not empty do
u = DEQUEUE(Q)
for each v in adj[u] do
if color[v] = white then
color[v] = gray
** d[v] = d[u] + 1
** pi[v] = u
ENQUEUE(Q,v)
color[u] = black
- What are the pi and d arrays for?
- BFS defines a search tree based on all
vertices of G that is rooted at the start
vertex s; the pi-array defines the
node-parent relations in this tree.
- This search tree is structured such that
the length of the path from s to u in this
tree is the length of the shortest path
(in terms of # edges) between s and u
in G; the d-array contains these
shortest-path length values.
- Intuition behind proof of correctness for
d-values: If there were a
shorter path from s to v via some
vertex u, during exploration of children
of u, v would have previously been
colored gray!
- Can use BFS to derive shortest paths
between a specified vertex s and all other
vertices in a graph.
- Depth-first search on general graphs
- Interpretation of colors:
- black = fully explored (node + descendants)
- gray = partially explored (node only)
- white = unexplored
- Analysis of running time
- Running time = |V|(O(1)) + |V|(T(DFS-visit))
= O(|V|) + |V|(T(adj) + O(1))
= O(|V|) + |V|(T(adj)
- Adjacency list: O(|V| + |E|) time
- Adjacency matrix: O(|V|^2) time
- Once again, used aggregate instead of
line-by-line analysis.
- Note that BFS as stated in textbook will only visit
vertices in same component as start-vertex s; DFS
algorithm visits all vertices, and if there are
multiple components, will create DFS forest, i.e.
, a collection of depth-first search trees, one per
component.
Thursday, March 14 (Lecture #16)
[Chapter 23 and Sections 22.3-2.4]
- Graph algorithms (Cont'd)
- Graph traversal (Cont'd)
- Depth-first search on general graphs (Cont'd)
- What are the pi, d, and f arrays for?
- Instead of a search tree, we now have a
search forest; the pi-array defines
the node-parent relations in this
forest.
- The d- and f-values indicate discovery and
finishing times of vertex-processing
during depth-first search.
- Informal argument for why d[u] < d[v] < f[v] < f[u]
if u is an ancestor of v in a depth-first search
tree; formal proof given
in Parenthesis Theorem (Theorem 22.7, pages 543, 545)
and Corollary 22.8 (page 545).
- Classification of edges relative to a DFS-derived
search forest (tree / back / forward / cross edges).
- Can classify edges as tree,
back. or forward / cross during depth-first
search as a function of new-vertex color; can
then differentiate forward and cross edges
using vertex discovery times => constant
time edge-classification test!!
- Application: Testing graph acyclicity
- Do depth-first search, and if there are no back
edges, graph is acyclic! (Lemma 22.11).
- Application: Topological sorting
- Useful in deriving event-orderings that are
consistent with a given set (graph) of
pairwise event-orderings.
- Can be done very simply using a slightly modified
depth-first search -- yet another use of vertex
finishing times!
- Minimum Spanning Trees (MST)
- A very useful property of minimum spanning trees:
Given a subset A of E that is part of some minimum
spanning tree for G and a partition of vertex-set
V of a graph G into sets U and V - U such that no
edge has one endpoint in U and the other in V - U,
each minimum-weight edge in G linking U and V - U
must be in some minimum
spanning tree for G. Such an edge is said to be
a safe edge relative to A.
- Generic MST algorithm (page 563):
GENERIC-MST(G, w)
A = {}
while A does not form a spanning tree do
Find an edge (u,v) that is safe relative to A
A = A u {(u,v)}
return A
- Under appropriate interpretations of A, U,
and V - U, the property above underlies both
of the classic minimum spanning tree algorithms
considered below.
- Kruskal's algorithm
- Start with trivial forest of G that just includes the
vertices in V; add edges greedily, minimum-weight
first, growing minimum spanning forest that eventually
coalesces into a minimum spanning tree for G.
- Algorithm:
MST-KRUSKAL(G, w)
A = {}
for each vertex v in V do
MAKE-SET(v)
sort the edges in E in nondecreasing order by weight
for each edge in (u,v) taken in nondecreasing order by weight do
if FIND-SET(u) <> FIND-SET(v) then
A = A u {(u,v)}
UNION(u,v)
return(A)
- Interpret U as minimum spanning subtree and V - U
as union of remainder of minimum spanning forest.
- Analysis of running time
- Running time =
|V|(T(MAKE-SET)) + 2|E|(T(FIND-SET)) + |E|(T)UNION)) + SORT(|E|)
= |V|(T(MAKE-SET)) + 2|E|(T(FIND-SET)) + |E|(T)UNION)) + |E| log |E|
= |V|(T(MAKE-SET)) + 2|E|(T(FIND-SET)) + |E|(T)UNION)) + |E| log |V|
- Running time relative to various data
structures for disjoint sets:
|
Data Structure |
Time Complexity (Worst Case) |
| MAKE-SET |
FIND-SET |
UNION |
Kruskal's Algorithm |
| List (naive) |
O(1) |
O(n) |
O(1) |
O(|E||V|) |
| List (rep-ptr) |
O(1) |
O(1) |
O(n) |
O(|E||V|) |
| Forest (naive) |
O(1) |
O(n) |
O(1) |
O(|E||V|) |
| Forest (rep-ptr) |
O(1) |
O(1) |
O(n) |
O(|E||V|) |
| Forest (UR + PC) |
O(1) |
O(n) |
O(1) |
O(|E| log |V|) |
Prim's algorithm
- Start with a single vertex in G; add edges greedily,
minimum-weight first, growing minimum spanning tree
for successively larger subsets of V until you
have a minimum spanning tree for G.
- Algorithm:
MST-PRIM(G,w,r)
for each u in V do
key[u] = INFTY
pi[i] = nil
key[r] = 0
Q = V
while Q is not empty do
u = EXTRACT-MIN(Q)
for each v in adj[u] do
if v in Q and w(u,v) < key[v] then
pi[v] = v
key[v] = w(u,v)
Interpret U as minimum spanning subtree and V - U
as vertices in V that are not already in spanning
subtree.
Tuesday, March 19 (Lecture #17)
[Sections 23.2 and 24.3]
- Graph algorithms (Cont'd)
- Minimum Spanning Trees (MST) (Cont'd)
- Prim's algorithm
- Analysis of running time
- Running time =
T(BUILD) + |V|T(EXTRACT-MIN)) +
|E|(T(DECREASE-KEY))
- Running time relative to various data
structures for priority queues:
|
Data Structure |
Time Complexity (Worst Case) |
| BUILD |
EXTRACT-MIN |
DECREASE-KEY |
Prim's Algorithm |
| Array (Unsorted) |
O(n) |
O(n) |
O(n) |
O(|E||V|) |
| Linked list (Unsorted) |
O(n) |
O(n) |
O(n) |
O(|E||V|) |
| Array (Sorted) |
O(n log n) |
O(n) |
O(n) |
O(|E||V|) |
| Linked list (Sorted) |
O(n log n) |
O(1) |
O(n) |
O(|E||V|) |
| Search tree |
O(n) |
O(n) |
O(n) |
O(|E||V|) |
| Balanced search tree |
O(n) |
O(log n) |
O(log n) |
O(|E| log |V|) |
| Binary heap |
O(n) |
O(log n) |
O(log n) |
O(|E| log |V|) |
- Note that Prim's algorithm is somewhat like BFS;
both are expanding a frontier outwards from
an initial vertex in the graph,
but while BFS expands this frontier uniformly
by distance from the start point, the expansion
by Prim is relative the the smallest-weight
edge extending into the frontier.
- Note that Kruskal's and Prim's algorithms are, in a
broad sense, converses of each other -- by adding
minimum-weight edges,
Kruskal's algorithm coalesces a (initially trivial)
spanning forest for the given graph G into a spanning tree
for G and Prim's algorithm extends spanning tree for
successively larger subgraphs of G until it has a spanning
tree for G.
- Shortest Paths in Edge-Weighted Graphs
- Has many applications depending on how you interpret
the edge-weights, e.g., distances, costs.
- Can solve by applying edge-expansion and
ordinary BFS; however, this yields an exponential
time algorithm. Can we do better?
- Two types of problems:
- Single-source shortest paths
- All-pairs shortest paths
- Can further break down algorithms based on whether
or not they detect negative-weight cycles. Such
cycles are problematic because they render
shortest-paths estimates meaningless, e.g.,
can make length of path arbitrarily short by
appropriate number of traversals around the cycle.
- Properties of shortest paths:
- Triangle inequality: d(s,v) <= d(s,u) + w(u,v)
- Shortest path has at most |V| - 1 edges,
i.e., can eliminate negative-, zero-,
and positive-weight cycles from shortest paths.
- Shortest paths satisfy the optimal substructure
property, i.e., given a shortest path
from s to v and a vertex u on this path, the
subpaths s->u and u->v are also shortest paths;
this means that there are many nice algorithmic
possibilities, i.e., DP, D&C, Greedy.
- The Relax operation
- Algorithm (pages 585-586):
Initialize-Single-Source(G, s)
for each vertex v in adj[v] do
d[v] = INFTY
pi[v] = nil
d[s] = 0
Relax(u, v, w)
if (d[v] > d[u] + w(u,v)) then
d[v] = d[u] + w(u,v)
pi[v] = u
- The relax operation is trivial; however,
sequences of relaxations can be quite powerful.
- All algorithms we consider for the next 1 1/2
lectures will essentially differ only in the
the order in which edges are relaxed and the
number of times each edge is relaxed.
- Single-source shortest paths algorithms
- Dijkstra's algorithm
- This algorithm essentially applies relaxations in
a systematic matter to grow a set of vertices with
known shortest-path distances outwards from the
source vertex to encompass the whole graph. In
particular, relaxations are applied to vertices
(or rather, the adjacencies of those vertices)
as they are added to this set and can be seen as
extending out from the frontier of this region into
the remainder of the graph.
- Algorithm:
Dijkstra(G, w, s)
Initialize-Single-Source(G, s)
S = {}
Q = V[G]
while Q is not empty do
u = EXTRACT-MIN(Q)
S = S u {u}
for each vertex v in adj[u] do
Relax(u, v, w)
- Analysis of running time
- Running time
= T(INITIALIZE) + |V|T(EXTRACT-MIN)) + |E|T(RELAX))
= T(INITIALIZE) + |V|T(EXTRACT-MIN)) + |E|T(DECREASE-KEY))
- Runs in O(|E| log |V|) time relative to a
binary heap.
- Cannot detect negative cycles, and will produce
erroneous results if one or more such cycles
are present.
- Doesn't Dijkstra's algorithm look an awful lot
like Prim's algorithm for minimum
spanning trees? It should!
Try writing the code for the initialization and
relaxation procedures into Dijkstra's
algorithm, eliminating the S-lines (which are
not being used), and then comparing with Prim's
algorithm -- oddly enough, you'll see that
relaxation is a generalization of the key-value
adjustment done by Prim on lines 9 and 11,
e.g.,
Dijkstra-Rewrite(G, w, s)
for each u in V do
d[u] = INFTY
pi[u] = nil
d[s] = 0
Q = V
while Q not empty do
u = EXTRACT-MIN(Q)
for each v in adj[u] do
if d[u] + w(u,v) < d[v] then
d[v] = d[u] + w(u, v)
pi[v] = u
Thursday, March 21 (Lecture #18)
[Sections 24.1 and 25.1]
- Graph algorithms (Cont'd)
Tuesday, March 26 (Lecture #19)
[Sections 25.2 and 34.1]
- Graph algorithms (Cont'd)
- Shortest Paths in Edge-Weighted Graphs (Cont'd)
- All-pairs shortest paths algorithms (Cont'd)
- The Floyd-Warshall algorithm
- Running time = O(|V|^3)
- Note that both of these algorithms are technically
operating on a 3-D matrix of size O(|V|^3); however,
as we only need
to save adjacent 2-D "slices" of this matrix,
we actually use O(|V|^2) space.
- Note that the running times given for the three
algorithms above assume that the input graph is
represented as an adjacency matrix -- if the graph
is instead represented as an adjacency list, the
running times of each algorithm increase by a
a multiplicative factor of |V|!
Consider some examples of how to solve problems using standard
graph algorithms.
Example: Solving Problem 1(c), Assignment #1 with
a MST subroutine (one call, instance / solution
transformation required):
PROB1C(G, w, B)
for each edge e in E' do
w'(e) = B - w(e)
T = MST(G, w')
return(T)
Example: Solving Problem 1(d), Assignment #1 with
a BFS subroutine (O(|V|) calls, no instance / solution
transformation required):
PROB1D(G, S)
for each vertex v in S do
SP(v) = BFS(G, v)
return(union of all SP(v))
Example: Solving the single-source shortest paths
problem with a BFS subroutine (one call, instance /
solution transformation required):
SS-SP(G, w, s)
V' = V
E' = {}
for each edge (x,y) in E do
add vertices {v_xy1,..., v_xy(w(x,y) - 1)} to V'
add edges (x,v_xy1),
(v_xyi,v_xy(i + 1)) for 1 <= i <= w(x,y) - 2,
(v_xy(w(x,y) - 1),y) to E'
SP(s) = BFS(G', s)
return(SP(s))
Note that as long as all transformations involved and the
invoked subroutine run in polynomial time, the whole
process runs in polynomial time. For example, the first
two examples above yield polynomial-time algorithms but
the third example yields an exponential-time algorithm.
Such a process can be viewed as relating problems in the sense
that if A reduces to B, B is at least as computationally
difficult as A -- indeed, it
seems to rank problems by relative
computational difficulty. Remember this -- it will come
in handy later.
Using the techniques we've talked about so far in this course,
we can often derive a polynomial-time algorithm for a problem
of interest and hence prove that this problem is solvable in
polynomial time.
However, it is also often the case that we run across problems
for which we cannot find a polynomial time algorithm. Are we
just not thinking hard enough or is it really the case that
there isn't a polynomial-time algorithm for that problem?
That is, how can we prove that a problem is not
solvable in polynomial time?
Proving Polynomial-Time Unsolvability
- Helps to review our conceptions of polynomial-time
solvability and computational problems.
- Polynomial-time solvability
- A good standard of algorithm efficiency for both
practical and theoretical reasons.
- Polynomials grow slowly as a function of n;
therefore, polynomial-time algorithms can
solve larger instances in a given amount of
time than non-polynomial-time, e.g.,
exponential time, algorithms.
- Polynomial-time solvability is stable, as many
reasonable models of computation are
equivalent under polynomial time, i.e.,
an algorithm relative to one model can be
simulated on another model with at most a
polynomial-factor change in running time.
- Polynomial time may seem to encompass too much
territory to be a true measure of algorithm
efficiency, e.g., O(n^{1761}) time
algorithms; however, in practice, algorithms
tend to be low-order polynomial time (which is
very odd when you think about it).
- Computational problems
- Problems as relations between inputs and outputs.
- Classify problems by types of output:
- Decision problem: Returns Boolean value
(typically the answer to a question about
the optimal solution for a given instance of
a problem).
- Cost / Evaluation problem: Returns a number
(typically the cost of an optimal solution for
a given instance of a problem).
- Example / Solution problem: Returns a structure
(typically an optimal solution for
a given instance of a problem).
- Example: Some computational problems that
we will be looking out (handout)
[DVI/
PDF/
PostScript]
Vertex Cover (VC)
Input: An undirected graph G = (V,E) and an integer k > 0.
Question: Is there a vertex cover of G of size at most k,
i.e, is there a subset V' \subseteq V such that |V'| <= k
and for all edges (u,v) \in E, at least one of u and v is in V'?
Vertex Cover Cost (VC-C)
Input: An undirected graph G = (V,E).
Output:} The size of the smallest vertex cover of G.
Vertex Cover Example (VC-E)
Input: An undirected graph G = (V,E).
Output: One of the smallest vertex covers of G.
Clique
Input: An undirected graph G = (V,E) and an integer k > 0.
Question: Is there a clique in G of size at least k, i.e.,
is there a subset V' \subseteq V, |V'| >= k, such that for all
u, v \in V', (u,v) \in E?
Subset sum (SS)
Input: A set S of integers and an integer k \geq 0.
Question: Is there a subset S' of S whose elements sum to k?
Steiner tree in graphs (STG)
Input: An undirected graph G = (V,E), a set V' \subseteq V,
and an integer k > 0.
Question: Is there a tree in G that connects all vertices in
V' and contains at most k edges?
- Can also classify problems by how input and output
are represented -- in terms of abstract entities
(abstract problems) or as bit-encodings
of those entities (concrete problems).
- Proofs operate on abstract problems; algorithms
operate on concrete problems.
- Abstract and concrete problems are related by
encodings, i.e., specifications of how
abstract entities are encoded into bit-strings.
- The standard encoding, e.g, all numbers are
stored in binary, all graphs have their vertices
labelled with the integers {1,..,|V|}, etc.
- Mirrors encoding in conventional computers.
- Can distort running time by specifying a
pathologically terse or padded encoding, which in
turn artificially compresses or expands the
input size, e.g., unary encoding of integers,
encoding of edges of weight w as paths with w
edges.
- Focus on encodings that are polynomially-related
to the standard encoding, i.e., encodings
that produce instance-encodings whose sizes are
within a polynomial factor of that created under
the standard encoding; such encodings provide
a solid foundation for polynomial-time solvability
that is anchored to encodings used in everyday
computing!
- Two types of arguments for polynomial-time unsolvability of
a problem:
- Absolute (based on examining the problem in isolation)
- Lower bound theory / information-theoretic arguments.
- Relative (based on examining a problem in relation to
other problems)
- Focus in this course is on the relative approach.
Thursday, March 28 (Lecture #20)
[Sections 34.2-34.3]
Example: Class EXPTIME is the set of all decision
problems that have exponential-time algorithms.
Decision versions of all problems we've looked at so far
are in EXPTIME.
P is properly included in EXPTIME.
Example: Class NP is the set of all decision
problems that have nondeterministic polynomial-time
algorithms, i.e., problems such that candidate
solutions can be verified in polynomial time.
Decision versions of all problems we've looked at so far
are in NP.
Example: VERTEX COVER is in NP courtesy of the
O(|E||V'|) time verification algorithm that determines
if every edge in G has an endpoint in the given candidate
vertex cover V'.
Example: KNAPSACK is in NP courtesy of the
O(|U'|) time verification algorithm that determines if
the summed size of the items in candidate solution U' is
<= B and the summed value of the items in U' is >= k.
Example: STEINER TREE IN GRAPHS is in NP courtesy
of the polynomial time verification algorithm that
determines if the specified set of candidate solution
tree edges has
size <= k and forms a tree in G that connects all
vertices in V'.
P is certainly included in NP; however. it is not known
if this inclusion is proper (hence, the "P = NP?"
conjecture). Still, if EXPTIME-complete problems are
not useful to us, NP-complete problems might be.
Now, all we need is one EXPTIME- or NP-hard problem
to start the ball rolling ...
Tuesday, April 2 (Lecture #21)
[Sections 34.3-34.4; Class Notes]
- Proving Polynomial-Time Unsolvability (Cont'd)
- One strategy for obtaining an initial C-hard problem X
for class C is to create a generic reduction from any
problem in C to X. How might one do this, though?
- Such a generic reduction is relatively easy to construct
relative to a class C of decision problems relative to
some type of language-recognition machine.
- Equivalence of decision problems and language
acceptance problems (map decision problem to
language of bitstrings consisting of
instance-encodings of all instances for which
answer is "yes").
- All problems in a class C of decision problems
relative to a language-recognition-machine M are
reducible to the following problem (which is also
typically itself in class C):
M-Acceptance
Input: Machine m of type M, input x.
Question: Does m accept, i.e.,
produce output "yes", on
input x?
- Example: P is the set of all decision problems
whose associated languages of "yes"-instances can be
recognized by a polynomial-time deterministic Turing
machine (DTM).
- Example: EXPTIME is the set of all decision
problems whose associated languages of "yes"-instances
can be recognized by a exponential-time deterministic
Turing machine (EDTM).
- Hence, we have one polynomial-time unsolvable problem --
EDTM-ACCEPTANCE! However, there are many problems of
interest for which one cannot construct reductions from
EDTM-ACCEPTANCE. Perhaps these problems are not
EXPTIME-hard; need something a bit less difficult,
computationally-speaking.
- Example: NP is the set of all decision problems
whose associated languages of "yes"-instances can be
recognized by a polynomial-time nondeterministic Turing
machine (NDTM).
- Nondeterministic TM may have arbitrary-choice
conditional statements; machine computes "yes" on
an input if there is some set of choices at
these arbitrary-choice conditionals that yields
an accepting computation for the given string.
- Arbitrary-choice conditionals = way of looking at all
possible solutions for a given instance within
an NDTM! Hence, nicely encodes notion of
polynomial-time verification.
- NDTM ACCEPTANCE is NP-complete!
- NDTM ACCEPTANCE may appear as unnatural a problem as EDTM
ACCEPTANCE at first glance; however, it is not.
- Two routes to useful NP-complete problems:
- Historical route
- NTDM ACCEPTANCE many-one reduces to CNF SATISFIABILITY
(Cook (1971))
- CNF SATISFIABILITY many-one reduces to 3-CNF
SATISFIABILITY (Karp (1972))
- The real genius of Karp (1972) was to show reductions from
CNF SATISFIABILITY and 3-CNF SATISFIABILITY to 18
problems of interest drawn from graph theory, scheduling,
and network design. This initial list of 18 has since
blossomed to several thousand (see the Appendix of
Garey and Johnson (1979) for a small but nonetheless
significant list of known NP-hard problems).
- Note that one of Karp's reductions (MINIMUM COVER to
STEINER TREE IN GRAPHS) is wrong (Question 2(a),
Assignment 5); however, with a slight modification,
the general idea works (Question 2(b),
Assignment 5).
- Stresses the need for proofs of correctness for
reductions.
- Textbook route
- NDTM ACCEPTANCE many-one reduces to CIRCUIT
SATISFIABILITY (Lemma 34.6)
Circuit Satisfiability
Input: A circuit C.
Question: Is there a set of input-values for
C that
makes C output true on its output line?
- CIRCUIT SATISFIABILITY many-one reduces to GENERAL
SATISFIABILITY (Theorem 34.9)
General Satisfiability
Input: A combinational logic formula F.
Question: Is there a satisfying assignment
for F?
- GENERAL SATISFIABILITY many-one reduces to 3-CNF
SATISFIABILITY (Theorem 34.10)
- By transitivity of reductions, all problems above are
NP-hard. This thus makes for the second (and usually
far easier) strategy for proving a problem X C-hard:
- Strategy #1: Give a generic reduction from every
problem Y in C to X.
- Strategy #2: Give a reduction from known C-hard
problem Y to X.
- Example: 3-CNF SATISFIABILITY many-one reduces to CLIQUE
(Theorem 34.11)
- Example: 3-CNF SATISFIABILITY many-one reduces to SUBSET SUM
(Theorem 34.15)
- Example: CLIQUE many-one reduces to VERTEX COVER
(Lemma 3.1, Garey and Johnson (1979))
- Relies on a nice graph-theoretic characterization
of cliques and vertex covers on the same set of
vertices.
- Define the complement Gc = (V,Ec) of a graph
G = (V,E) as the graph on V with edge-set
Ec = {(u,v) | (u,v) is not in E}.
- Consider vertices not in a vertex cover V' for
graph G = (V,E); none of these vertices are
connected by an edge in E (else, one of them
would have to be in the vertex cover). Hence,
there is an edge between each pair of these
vertices in Gc, and indeed these vertices form a
clique in Gc. Hence, V' is a vertex cover in G iff
V - V' is a clique in Gc!
- Given an instance (G,k) of CLIQUE, the reduction
constructs an instance (Gc,|V| - k) of VERTEX COVER.
This reduction is correct by the property noted
above.
- As VERTEX COVER many-one reduces to STEINER TREE IN
GRAPHS, SUBSET SUM many-one reduces to KNAPSACK (see
previous lecture), and both VERTEX COVER and SUBSET SUM
are NP-hard, STEINER TREE IN GRAPHS and KNAPSACK are
NP-hard! Which means that these problems are not
solvable in polynomial time!
... Doesn't it? ...
- We don't know whether or not P is properly
contained in NP. Hence, there must always be a
qualifying clause attached to an interpretation of an
NP-hardness result -- namely, an NP-hard problem X is
not solvable in polynomial time unless P = NP.
- To this day, it is not known if P = NP or not -- however,
the best
evidence that P != NP is that no-one has yet found a
polynomial-time algorithm for an NP-hard problem.
- All this is fine and good; however, the inconvenient
fact remains that even if a problem is NP-hard, we
still need to be able to solve it.
- Are there alternatives to exponential-time
optimal-solution algorithms for NP-hard problems? Yes,
there are! To see why, we need to understand
exactly what an NP-hardness result means:
if a problem is NP-hard
then there is no deterministic algorithm
that gives optimal solutions
for all instances of the problem
in polynomial time
unless P = NP.
- Note, however, that there may still be usable
algorithms that violate one or more of the
italicized clauses above, e.g.,
polynomial-time randomized optimal-solution
algorithms, polynomial-time deterministic
approximate-solution algorithms.
- There are many such options; we will look at several
of them in our remaining lectures.
Thursday, April 4 (Lecture #22)
[Section 35.1; Class Notes]
- Dealing with Polynomial-Time Unsolvability
- Polynomial-time randomized algorithms
- Algorithms that are allowed to make random choices; in
conjunction with assumption of various distributions
(typically uniform) on the space of inputs, can
derive algorithms with an expected probability of
producing a correct (optimal) answer or an expected
running time.
- Two types of algorithms:
- Monte Carlo: Running time fixed,
expected probability of producing a correct
(optimal) answer (you don't know if you'll
win).
- Las Vegas: Expected running time,
produces correct (optimal) answer (you don't
know when you'll win).
- Typically used to give faster algorithms for problems
already known to be solvable in low-order polynomial
time, e.g., an O(|V| + |E|) expected time
Las Vegas algorithm for deriving minimum spanning
trees (Section 10.3, Motwani (Motwani and
Raghavan (1995)).
- Also useful for certain problems not known to be
in P, e.g., the Miller-Rabin Monte Carlo
algorithm for primality testing (Section 31.8).
- Polynomial-time bounded-cost approximation algorithms
Tuesday, April 9 (Lecture #23)
[Class Notes]
- Dealing with Polynomial-Time Unsolvability (Cont'd)
- Polynomial-time bounded-cost approximation algorithms
- Example: A 2-approximation algorithm for
KNAPSACK (Garey and Johnson (1979), page 135;
Section 16.2).
- Without loss of generality, assume that each
item in U has size <= B, i.e., each
item can individually fit in the knapsack.
APPROX-KNAPSACK(U,s,v,B)
Order items in u in decreasing ratio of value/size, i.e.,
v(u1)/s(u1) >- v(u2)/s(u2) >= ... >= v(un)/s(un)
U' = {}
for i = 1 to n do
if size(U') + s(ui) <= B then
U' - U' u {ui}
max = 1
for i = 2 to n do
if v(u_max) < v(ui) then
max = i
if (value(U') > v(u_max))
return(U')
else
return({u_max})
- Very few problems have additive approximation
algorithms.
- Example: If P != NP then there is no
polynomial time approximation algorithm with
an additive polynomial factor p(|I|) for KNAPSACK
(Theorem 6.6, Garey and Johnson (1979)).
- Given an instance I of KNAPSACK, create a new
instance I' such that v'(ui) =
(p(|I|) + 1)v(ui). It is obvious that C_O(I') =
(p(|I|) + 1)C_O(I). As all solution values
for I' are multiples of p(|I|) + 1 and
solution values are integers, C_O(I') -
C_A(I') <= p(|I|) = 0. As any solution for I'
is also a solution for I, the approximation
algorithm actually produces an optimal solution
for KNAPSACK in polynomial time, which (as
KNAPSACK is NP-complete) implies that P = NP.
- Few problems have FPTAS or PTAS -- KNAPSACK is
one of them (Garey and Johnson (1979),
pages 135-137).
- Use appropriately-defined reductions and complexity
classes to show that certain types of approximation
algorithms do not exist for certain problems,
e.g., use of MAX SNP-hardness relative to
polynomial-time L-reducibility to show that
VERTEX COVER and STEINER TREE IN GRAPHS do not
have PTAS unless P = NP (see Chapter 5,
Wareham (1993) and references).
- Fixed-parameter algorithms
(Downey and Fellows (1999), Fellows (2001))
- An algorithm whose non-polynomial running time
terms are expressed in terms of aspects of the
input that are of very small size in practice.
- Such algorithms are of interest, as many
problems that are polynomial-time unsolvable in
general occur in restricted versions in practice.
- Example: 3-D Robot Motion Planning
3-D Robot Motion Planning (3DRMP)
Input: A jointed robot R, a set of obstacles
O embedded in some 3-D environment E,
and initial and final positions p_I
and p_F of R in E.
Question: Is there a sequence of robot
motions that move R from p_I to p_F
without colliding with any obstacles
in O?
- Is PSPACE-hard for arbitrary robots and
environments. However, robots are seldom
arbitrary -- for example, the number of
joints in a robot arm <= 7 and the number of
joints in a robot hand <= 20.
- Is there a non-polynomial-time algorithm for
3DRMP whose non-polynomial terms are expressed
as functions of the number of joints in
the robot?
- Two key notions:
- A parameterized (decision) problem has
instances of the form (x,k) where x is the
main part and k is the parameter.
- A parameterized problem is fixed-parameter
tractable if it has an algorithm whose
running time is O(f(k)x^c) where f() is an
arbitrary function and c is some constant
independent of k.
- Example: (k)-VERTEX COVER is the parameterized
version of VERTEX COVER in which the size k of the
vertex cover is the parameter.
- Example: (k)-CLIQUE is the parameterized
version of CLIQUE in which the size k of the clique
is the parameter.
- Example: (d)-3DRMP is the parameterized
version of 3DRMP in which d, the number of joints
in the robot, is the parameter.
- Show fixed-parameter tractability of a parameterized
problem by giving a fixed-parameter algorithm for
that problem.
- Example: (k)-VERTEX COVER is fixed-parameter
tractable.
FP-VERTEX-COVER(G,k)
L = {({{},G)}
VC = {}
for i = 1 to k do
for each item(S',G') in L do
select arbitrary edge e = (u,v) in E'
if G - {u} is empty then
VC = VC u {S' u {u}}
else
L = L u {({S' u {u}}, G - {u})}
if G - {v} is empty then
VC = VC u {S' u {v}}
else
L = L u {({S' u {v}}, G - {v})}
return(VC)
- Implicitly creates binary search tree of
depth k and spends O(|V| + |E|) time
processing each node in that tree; hence,
algorithm runs in O(2^k(|V| + |E|)) time.
- Show that parameterized problems are not
fixed-parameter tractable by showing these
problems complete or hard for the appropriate
classes of problems that are either known or
strongly conjectured to properly include FPT,
the class of all fixed-parameter tractable
parameterized problems.
- Example: (k)-CLIQUE is W[1]-complete and
hence is not fixed-parameter tractable unless
FPT = W[1].
- Example: (d)-3DRMP is W[SAT]-hard and
hence is not fixed-parameter tractable unless
FPT = W[SAT] (Cesati and Wareham (1995)).
- Parameters can be combinations of aspects, so single
fixed-parameter tractability or intractability
results relative to particular parameters are only
spurs to further research ...
The Future of Algorithm Analysis:
Experimental Algorithmics (Moret (2001))
- Worst-case time complexity has served us well; however,
there are undeniable problems:
- Asymptotic behavior may only matter at absurdly large input
sizes.
- Worst-case behavior may only hold for a very small
and infrequently-encountered set of inputs.
- Absurdly large constants may be hidden in the
big-Oh notation.
- Deriving bounds on running time is very difficult for
complex algorithms.
- Machine model underlying asymptotic worst-case
complexity is inadequate.
- Need to balance theoretical analysis with implementation
of and well-designed experiments on the behavior of
algorithms => experimental algorithmics!
- Example: Experimental study of MST algorithms.
- Examined a variety of algorithms across a variety of
platforms (computers / operating systems /
compilers).
- Accounted for platform-dependent effects
by comparing algorithm on each platform against
simple linear-time graph-traversal algorithm with
same memory-access patterns, e.g., graph
traversal.
- Prim's algorithm with heap priority queue
performed best => when algorithms get sufficiently
complex, theoretical logarithmic-factor improvements
do not matter in practice.
- Implementation of Prim's algorithm sped up 10x over
course of experiments.
- Benefits of experimental algorithmics:
- Better assessment of relative utility of algorithms.
- Better implementations of algorithms.
- Improved theories for assessing algorithm efficiency ...
... Which brings us back, tired but hopefully wiser, to where
we started in the first lecture of this course ...
To travel is to go far.
To go far is to return.
Saturday, April 13
- Final Exam Notes
I've finished making up the upcoming final exam.
The exam will be closed-book.
It will be 120 minutes long and has a total of 120 marks
(this is not coincidental; I have tried to make the number
of marks for a question equivalent to the number of minutes
it should take you to do it). There
may still be some minor adjustments of marks allotted to
particular questions and some deletions of sub-parts in
various questions, but the current layout is as follows:
- Asymptotic notation / Asymptotic worst-case
time complexity (2 parts, 6 sub-parts) (33 marks)
- Dynamic programming (1 part) (14 marks)
- Graph algorithms (2 parts) (22 marks)
- Algorithm design (1 part) (25 marks)
- Proving polynomial-time
unsolvability (2 parts, 6 sub-parts) (26 marks)
Things that will not be covered are solving
recurrences, detailed derivation of "raw" worst-case time
complexity functions, combinatorial solution-space trees, or
very complex algorithms like the DP algorithm for matrix-chain
parenthesization. However, all else is fair game. You should
look over and make sure you understand all assignment answers;
you may also want to look back in your notes to anytime
I said "This would make / has in the past made a good exam
question", "Does this algorithm run in polynomial time?", and
"I really find this particular aspect of this algorithm
fascinating." The various midterm exams and last year's final exam
may also be of help:
I wish you all the best of luck with both this final exam and
all of your other final exams.
References
Unless otherwise stated, all references above are to Cormen
et al. (2001).
- Brassard, G. and Bratley, P. (1996) Fundamentals of
Algorithmics. Prentice-Hall; Englewood Cliffs, NJ.
[Abbreviated above as B&B]
-
Cesati, M. and Wareham, H.T. (1995) "Parameterized Complexity
Analysis in Robot Motion Planning." In Proceedings of the 25th IEEE
International Conference on Systems, Man, and Cybernetics: Volume 1.
IEEE Press; Los Alamitos, CA. 880-885.
(PostScript/6 pages)
- Cook, S. (1971) The complexity of theorem proving procedures.
In Proceedings of the Third Annual ACM Symposium on Theory of
Computing. ACM Press; New York. 151-158.
- Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C. (2001)
Introduction to Algorithms. The MIT Press; Cambridge, MA.
[Abbreviated above as CLRS]
- Downey, R.G. and Fellows, M.R. (1999) Parameterized
Complexity. Springer-Verlag; Berlin.
(
description)
- Fellows, M.R. (2001) "Parameterized Complexity: New Developments
and Research Frontiers". Manuscript.
(PostScript/22 pages)
- Garey, M.R. and Johnson, D.S. (1979) Computers and
Intractability: A Guide to the Theory of NP-Completeness.
W.H. Freeman; San Francisco, CA.
- Karp, R.M. (1972) Reducibility among combinatorial problems. In
R.E. Miller and J.W. Thatcher (eds.) Complexity of Computer
Computations. Plenum Press. 85-103.
- Moret, B.M.E. (2001) "Towards a discipline of experimental
algorithmics." To appear, DIMACS Monograph in Discrete
Mathematics and Theoretical Computer Science. AMS Press.
( PostScript/13 pages)
- Motwani, R. and Raghavan, P. (1995) Randomized Algorithms.
Cambridge University Press.
- Wareham, H.T. (1993) On the Computational Complexity of
Inferring Evolutionary Trees. M.Sc. thesis. Technical Report
no. 9301, Department of Computer Science, Memorial University of
Newfoundland, March 1993.
(
Abstract (text);
Document (PostScript/101 pages)
)
Created: January 15, 2002
Last Modified: April 13, 2002