Computer Science 3711, Winter '03
Course Diary
Copyright 2003 by H.T. Wareham
All rights reserved
Week 1
Week 2
Week 3
Week 4
Week 5
Week 6
Week 7
Week 8
Week 9
(Midterm Exam Notes)
Week 10
Week 11
Week 12
Week 13
Week 14
(Final Exam Notes)
(end of diary)
Thursday, January 9 (Lecture #1)
[Sections 1.1-1.2]
- Introduction to course; dates and conventions.
- Background: Problems and Algorithms
- Computational Problems
- A computational problem is a relation between inputs and
outputs.
- Types of computational problems.
- By type of output:
- Decision problem: Returns a Boolean value
(typically the answer to a question about
the optimal solution for a given instance of
a problem).
- Cost / Evaluation problem: Returns a number
(typically the cost of an optimal solution for
a given instance of a problem).
- Example / Solution problem: Returns a structure
(typically an optimal solution for
a given instance of a problem).
- Example: Versions of the Shortest Path
problem:
Shortest Path (Unweighted) [Decision]
Input: A graph G = (V,E), two
vertices u and v in V, and a positive integer k.
Question: Is there a path P in G from u to v such
that the number of edges in P is <= k?
Shortest Path (Unweighted) [Evaluation]
Input: A graph G = (V,E) and two
vertices u and v in V.
Output: The length of the path in G from u to v
with the smallest possible number of edges.
Shortest Path (Unweighted) [Solution]
Input: A graph G = (V,E) and two
vertices u and v in V.
Output: A path P in G from u to v such that
the number of edges in P is the smallest possible.
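- The Solution version above can be solved with breadth-first search;
a minimal sketch (the adjacency-list dict encoding and function name
are illustrative assumptions, not from the diary):

```python
from collections import deque

def shortest_path(graph, u, v):
    """Return a minimum-edge path from u to v in an unweighted graph
    (adjacency-list dict), or None if no path exists."""
    parent = {u: None}
    q = deque([u])
    while q:
        x = q.popleft()
        if x == v:
            # Reconstruct the path by walking parent links back to u.
            path = []
            while x is not None:
                path.append(x)
                x = parent[x]
            return path[::-1]
        for y in graph[x]:
            if y not in parent:
                parent[y] = x
                q.append(y)
    return None
```

- BFS reaches vertices in order of distance from u, so the first path
found to v has the fewest edges; the same routine also answers the
Decision version (is len(path) - 1 <= k?) and the Evaluation version
(return len(path) - 1).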
- By entities in I/O relation:
- Discrete problem: Manipulates integers
and/or discrete structures.
- Numerical problem: Manipulates
(arbitrary-precision) real
numbers in addition to integers and/or
discrete structures.
- This course will focus on various discrete problems.
In particular, we will be looking at a number of
discrete problems in which we are looking for a
substructure of a given structure that is optimal
relative to some function (combinatorial
optimization problems).
- Guidelines for defining computational problems
- Strike balance between abstraction, formalization,
and phrasing in terms of stated entities -- the
former exposes the mathematical structures
underlying a problem, the latter keeps the problem
relevant.
- Abstraction is useful as it may show that your problem
is the same as one that has already been solved
=> The first rule of algorithm design is don't <=
- Example: Bob's Trucking Company.
- Examine contents of all depots in a shipping network
=> graph traversal (assuming shipping network can be
modeled as a graph in which vertices are depots and
edges are routes).
- Minimizing costs associated with deliveries =>
shortest paths (# edges / summed edge-lengths)
Shortest Path (Unweighted)
Input: A graph G = (V,E) and two
vertices u and v in V.
Output: A path P in G from u to v such that
the number of edges in P is the smallest possible.
Shortest Path (Weighted)
Input: An edge-weighted graph G = (V,E,W) and two
vertices u and v in V.
Output: A path P in G from u to v such that
the sum of the weights of the edges in P is the
smallest possible.
- Maximizing reimbursements associated with deliveries
=> longest paths (# edges / summed edge-lengths)
Longest Path (Unweighted)
Input: A graph G = (V,E) and two
vertices u and v in V.
Output: A path P in G from u to v such that
the number of edges in P is the largest possible.
Longest Path (Weighted)
Input: An edge-weighted graph G = (V,E,W) and two
vertices u and v in V.
Output: A path P in G from u to v such that
the sum of the weights of the edges in P is the
largest possible.
- Maximizing the value of items loaded on a truck with
a specified maximum capacity, where truck must carry
all or none of each item => 0/1 Knapsack
0/1 Knapsack
Input: A set U of items, an item-size
function s(), an item-value function v(), and
a positive integer B.
Output: A subset U' of U such that the
sum of the sizes of the items in U' is less
than or equal to B and the sum of the values of
the items in U' is the largest possible over the
solution space.
- Minimizing the number of emergency aid depots
required in a shipping network such that at least
one of the endpoints of each route has such a
depot => Vertex Cover
Vertex Cover
Input: An undirected graph G = (V,E).
Output: A subset V' \subseteq V such that
for all edges (u,v) \in E, at least one of u and v is
in V', and the size of V' is the smallest possible.
- Algorithms
- An algorithm for a problem is a finite sequence of
instructions relative to a particular computer-model
that, given an input for that problem, computes the
corresponding output.
- Types of algorithms.
- By type of computer-model:
- Instruction execution mode: Deterministic /
Randomized / Nondeterministic
- Number of processors: Serial (single processor)
/ Parallel (multiple processors)
- By type of computation: Discrete / Numerical.
- This course will focus on deterministic serial
discrete algorithms.
- Algorithms as technology: Algorithms are at
least as (and sometimes more) important to efficient
computing as traditional electronic technologies,
e.g., microprocessors, memory chips.
Tuesday, January 14 (Lecture #2)
[Sections 2.1-2.2 and 3.1]
- Background: Time Complexity
- Desirable properties of a measure of algorithm efficiency:
- Function of input size.
- Machine independent.
- Mathematically tractable, i.e., can be derived
(relatively) easily for any algorithm.
- Useful for both assessing individual and comparing groups
of algorithms.
- For most of this course, we will focus on efficiency wrt
algorithm running time; however, we will occasionally mention
other measures, i.e., space (computer memory).
- The road from actual running time to asymptotic worst-case time
complexity -- necessary lies (or rather, abstractions):
Actual          Abstract          (Worst-Case)      Asymptotic
Running  --->   Running    --->   Time       --->   Worst-Case
Time            Time T(n)         Complexity        Time Complexity
                                                    O(g(n))

    Instruction       Input Size /      Asymptotic
      counts           Worst Case        Notation
- Let's look at various concepts and techniques mentioned above
in more detail.
- Input size.
- Isolate a small set of parameters that describe
amount of memory used to store input.
- Hard to define formally; acquired informally via
experience doing time complexity analyses.
- We will assume where possible that we are dealing with
small numbers in the input that fit inside a single
computer-memory "word"; hence, the size of numbers in
the input need not be a parameter in the input size.
- Counting instructions
- Basic instructions in a program can have a wide range
of running times. Can indicate this in an analysis
by assigning a separate constant for the running time of
each kind of instruction; is accurate, but very
cumbersome.
- We will assume that each kind of basic instruction runs
in 1 time unit. Is not accurate but does simplify
analyses; moreover, can be justified by our ultimate
focus on asymptotic time complexity.
- We will assume a RAM model of memory in which all
stored elements are accessible in 1 time unit. Is not
accurate, as we ignore memory hierarchy; however,
does simplify analyses.
- Each type of control-structure has its own rule for
counting instructions.
- Two general methods to deriving a time complexity
function for an algorithm:
- Compute counts for individual instructions and
add together to derive function.
- "Grow" function by starting at innermost level
of nesting in algorithm and working outwards.
- Which method you use depends on the intricacy of
the given algorithm.
- while-loop rule: #iter(body + 1) + 1
- for-loop rule: #iter(body + 2) + 2
- Example: Doubly-nested for-loop.
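- The for-loop rule can be sanity-checked on the doubly-nested
example; a minimal sketch (reading "#iter(body + 2) + 2" as
#iter x (body + 2) + 2, with every simple statement costing 1 time
unit, per the assumptions above):

```python
def counted_double_loop(n):
    # Instrumented doubly-nested for-loop: charge 1 time unit each
    # time the innermost simple statement executes.
    count = 0
    for i in range(n):
        for j in range(n):
            count += 1           # the loop "body"
    return count

def rule_count(n):
    # Apply the for-loop rule inside-out.
    inner = n * (1 + 2) + 2      # inner loop: body costs 1
    outer = n * (inner + 2) + 2  # outer loop: its body is the inner loop
    return outer                 # = 3n^2 + 4n + 2
```

- The instrumented count (n^2) captures only the leading term of the
rule's 3n^2 + 4n + 2; both are O(n^2).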
- Programs involving loops and simple statements
naturally give time complexity functions such
that every input of a particular size runs in
the same time; unfortunately, once we allow
if-then-else conditional statements,
different inputs of the same size may run in
different times, e.g., sorting algorithms
on sorted and unsorted lists of the same length.
- Need to summarize set of running times for each
input size s by a single number. Several
options:
- Best-case, e.g., lowest running time
over all inputs of size s.
- Average-case, e.g., average running time
relative to some probability distribution
over all inputs of size s.
- Worst-case, e.g., highest running time
over all inputs of size s.
- Worst-case time complexity functions are useful for
a variety of reasons:
- Worst-case is upper bound, which is mentally
useful.
- Worst-case may often occur in practice.
- Average-case may often be equivalent to
worst-case.
- Worst-case assumption is easiest to deal with
mathematically; can also on occasion simplify
analyses.
- if-then-else-statement rule:
max(if-body, else-body) + 1
- Example: Doubly-nested for-loop with
inner-loop if-then-else.
- Example: Some simple algorithms and
their time complexities
(tc_examples_1.txt)
- Example: Searching: the linear and binary search
algorithms (tc_searching.txt)
- Example: Sorting: the selection and insertion sort
algorithms (tc_sorting.txt)
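- Since tc_searching.txt is not reproduced here, a sketch of the two
search algorithms (illustrative code, not the course handout): linear
search examines up to n elements, while binary search halves the
range each iteration, so its loop runs O(log n) times.

```python
def linear_search(a, key):
    """Linear search: O(n) worst case."""
    for i, x in enumerate(a):
        if x == key:
            return i
    return -1

def binary_search(a, key):
    """Iterative binary search on a sorted list; returns an index of
    key in a, or -1 if key is absent."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if a[mid] == key:
            return mid
        elif a[mid] < key:
            lo = mid + 1        # key can only be in the upper half
        else:
            hi = mid - 1        # key can only be in the lower half
    return -1
```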
- Asymptotic notation
- Used to smooth "raw" time complexity functions.
- Asymptotic upper bounds: Big-Oh notation
(O(g(n))).
- Example: T(n) = 3n^2 + 4n + 5 is O(n^2)
(big_Oh.txt)
- One could derive the required constant c by fancy
algebra; I prefer to set c to the sum of the
positive coefficients -- it's very dirty but it's
also quick.
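- The coefficient-sum trick can be checked numerically; a sketch for
T(n) = 3n^2 + 4n + 5 with c = 3 + 4 + 5 = 12 and n0 = 1 (the code is
only a spot check over a finite range, not a proof):

```python
def T(n):
    return 3 * n * n + 4 * n + 5

# With c = 12 (sum of the positive coefficients) and n0 = 1,
# T(n) <= c * n^2 for all n >= n0: each term k * n^j with j <= 2
# is at most k * n^2 once n >= 1.
assert all(T(n) <= 12 * n * n for n in range(1, 10001))
```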
Thursday, January 16 (Lecture #3)
[Section 3.1; Class Notes]
- Background: Time Complexity (Cont'd)
- Asymptotic notation (Cont'd)
- Asymptotic upper bounds: Big-Oh notation
(O(g(n))).
- Example: T(n) = 3n^2 + 4n + 5 is not O(n)
(big_Oh.txt)
- Try to avoid ludicrously loose upper bounds.
- Example: T(n) = 2n is O(n^57)
- Relationship between informal "T(n) is O(g(n))" definition
and "O(g(n)) as the set of all g(n)-upper-bounded functions"
definition given in textbook; keep the latter in mind,
but the former is colloquial CS usage. May stem from
algorithm-centric and math-centric views of asymptotic
notation, respectively.
- Asymptotic lower bounds: Omega notation
(OMEGA(g(n))).
- Example: T(n) = 3n^2 + 4n + 5 is OMEGA(n^2)
- Example: T(n) = 3n^2 + 4n + 5 is not OMEGA(n^3)
- Try to avoid ludicrously loose lower bounds.
- Example: T(n) = 2n^57 is OMEGA(n)
- Asymptotic exact bounds: Theta notation
(THETA(g(n))).
- Three caveats on using asymptotic notation:
- How you interpret asymptotic notation depends on how
the function T(n) was derived. If T(n) is an exact
value for all inputs of a size n, then can say
T(n) = O(g(n)) (OMEGA(g(n))) [THETA(g(n))] means
"the running time is upper (lower) [exactly] bounded
by g(n)"; however, if T(n) is merely an upper bound
on the running time (as has been the case in all of our
examples), must be more careful in interpreting
OMEGA and THETA statements.
- Though it is in a sense an abuse of notation, you
will frequently see O(1) or THETA(1) used to indicate
an unspecified constant in various expressions.
- The rules about combining separate asymptotic-notation
expressions are tricky; be VERY careful if you ever
do this in your math -- you may be better off
writing out the full constant-bounds and working
with those, so you don't make mistakes.
- Guidelines for deriving asymptotic worst-case time
complexity functions (non-recursive algorithms)
- If you have a worst-case time complexity function
on hand, the leading term (minus coefficient) is
usually the asymptotic worst-case time complexity
function.
- Example: Though the selection and insertion
sort algorithms run in T(n) = 4n^2 - 2 and T(n) =
5n^2 - 2n - 1 time, respectively, both algorithms
run in O(n^2) time.
- As constants associated with different loops and
different numbers of embedded simple statements
vanish into the leading constant c in the definition
of asymptotic notation, can often derive a
near-optimal asymptotic worst-case time complexity
function by simply multiplying out the number of
iterations for each loop in the deepest chain of
nested loops.
- Example: As both the insertion and selection
sort algorithms consist of a pair of nested loops,
each of which has at most n iterations, both
algorithms run in O(n * n) = O(n^2) time.
- If you are ever confronted with an algorithm which
uses operations of unspecified time complexity,
build a parameterized time-complexity expression
and solve this expression as necessary when you are
given the time complexities of those operations.
- Example: Two nested for-loops,
for-bodies composed of procedure calls of
unspecified time complexity.
- The usual way to handle nested if-then-else
statements is to assume the largest sub-case
always executes. However, if you have a function
describing how often the conditional is true, you
may be able to do better, e.g., derivation
of O(|V| + |E|) time complexity for breadth-first
graph traversal algorithm.
- Example: for-loop with nested
if, for- and if-bodies and
if-conditional expression composed of
procedure calls of unspecified time complexity.
Tuesday, January 21 (Lecture #4)
[Sections 2.3, 4.1-4.2, and Appendix A]
- Background: Time Complexity (Cont'd)
- Deriving Time Complexities of Recursive Algorithms
- Example: Sorting: The merge sort algorithm
- Use recurrences to model running times of recursive
algorithms (amongst other things).
- Recurrence = base case(s) +
recursive (inductive) case
- Example: Recurrence for Fibonacci numbers
- Example: The recurrence for the number of nodes in
a complete binary tree of height n >= 0.
- Example: Recurrence for running time of Mergesort
algorithm.
- The relationship between a recursive algorithm and
the recurrence describing that algorithm's running time.
- Dealing with recurrences
- Simplifications of recurrences in context of asymptotic
worst-case bounds.
- Deriving upper bounds on recurrences
- Substitution method (mathematical induction in disguise)
(Section 4.1)
- Example: T(n) = 2T(n - 1) + 1 <= 2^{n + 1} - 1
- Example: T(n) = 2T(FL(n/2)) + 1 <= n - 1
- Oops! This fails when we try to show it
for n = 1. Why? The inductive hypothesis
is too strong!
- Example: T(n) = 2T(FL(n/2)) + 1 <= 2n - 1
- Example: T(n) = 2T(n/2) + cn <= cn(log2 n) + cn
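- These guessed bounds can be spot-checked by computing the
recurrence directly; a sketch for T(n) = 2T(FL(n/2)) + 1 with
T(1) = 1 (the base case is an assumption, since the diary does not
state one):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    # T(n) = 2T(floor(n/2)) + 1, T(1) = 1
    return 1 if n == 1 else 2 * T(n // 2) + 1

# The bound 2n - 1 holds everywhere in the tested range ...
assert all(T(n) <= 2 * n - 1 for n in range(1, 5000))
# ... but the too-strong guess n - 1 already fails at n = 1.
assert T(1) > 1 - 1
```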
- This is all fine and good for verifying guesses;
however, where do we get the guesses?
- Recursion tree method (diagrammatic expansion of
recurrence into summation and solution of summation) (Section 4.2)
- Example: The Mergesort algorithm runs in
O(n log n) time (Section 2.3.2).
- The relationship between the recurrence describing the
running time of a recursive algorithm and the
various portions of the diagram constructed via
the recursion tree method.
- Example: T(n) = 3T(n/4) + cn^2
- To solve these, you still need to be able to
solve or at the very least upper bound summation
expressions.
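- The recursion-tree bound for this example (level i costs
c n^2 (3/16)^i, a geometric series summing to at most (16/13) c n^2,
so T(n) = O(n^2)) can also be spot-checked; a sketch with c = 1 and
an assumed base case T(n) = 1 for n <= 1:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    # T(n) = 3T(floor(n/4)) + n^2, T(n) = 1 for n <= 1
    return 1 if n <= 1 else 3 * T(n // 4) + n * n

# The level costs shrink geometrically by 3/16, so 2n^2 is a safe
# upper bound, comfortably above the limiting constant 16/13.
assert all(T(n) <= 2 * n * n for n in range(1, 5000))
```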
- Dealing with Summations (Appendix A)
- Three algebraic rules for simplifying
summations.
- First rule of solving / bounding summations:
Look it up!
- Simple summations: linear, quadratic, cube
- Geometric-series summations
Thursday, January 23 (Lecture #5)
[Appendix A; Class Notes]
- Background: Time Complexity (Cont'd)
- Deriving Time Complexities of Recursive Algorithms (Cont'd)
- Dealing with recurrences (Cont'd)
- Deriving upper bounds on recurrences (Cont'd)
- Dealing with Summations (Appendix A) (Cont'd)
- Example: T(n) = 3T(n/4) + cn^2 (Cont'd)
- If you are really comfortable with summations and
good at formulating them given the terms of
a series, there is a final method for dealing
with recurrences.
- Iteration method ("plain" expansion of
recurrence into summation and solution of summation)
- Example: T(n) = 2T(n/2) + cn
- Example: T(n) = 3T(n/4) + cn^2
- Guidelines for deriving asymptotic worst-case time
complexity functions (recursive algorithms)
- If the recurrence describing the running
time of a recursive algorithm is too bizarre,
e.g., multiple base and/or inductive cases,
simplify the recurrence -- we are typically only
interested in asymptotic upper bounds.
- Example: Simplifying the Fibonacci number
recurrence.
- Algorithm Design Techniques
- Combinatorial optimization problems redux
- Each solution associated with an instance of a
combinatorial optimization problem has
a cost, and we want the solutions of optimal
(min/max) cost over the whole solution space.
- Any time you have a problem of the form "Find
the best X" where X is some discrete structure,
chances are you're dealing with a combinatorial
optimization problem.
- Distinguish several types of solutions:
- Candidate solutions = all possible solutions to
the instance, e.g., all possible
edge-sequences of a given graph that start with
vertex u and end with vertex v.
- Viable solutions = all potentially optimal
candidate solutions, e.g., all
edge-sequences of a given graph that are paths
between vertices u and v.
- Optimal solutions = those viable solutions that
have optimal value relative to the cost
function over the set of viable solutions,
e.g., all paths in a given graph between
vertices u and v that have the
smallest number of edges over the space of
all such paths.
- Each such problem instance has a combinatorial,
i.e., exponential in instance size, candidate
(and often
viable) solution space -- the job of an algorithm
is to find the optimal solutions in this space
(of which there will always be at least one).
- Example: The 0/1 Knapsack problem
- Candidate solutions = subsets of U.
- Viable solutions = subsets of U whose summed size
is <= B.
- Optimal solutions = subsets of U whose summed size
is <=B and whose summed value is maximum over the
space of viable solutions.
- Each subset U' can be modeled as a binary vector
of length |U| (v_i = 1 (0) => item i is (not) in
U'); hence, there are 2^{|U|}
candidate solutions for any instance of 0/1 Knapsack.
- Sometimes, you can't generate viable solutions --
you have to generate all candidate solutions and
check viability along the way.
- Example: The Longest Common Subsequence problem
Longest common subsequence (LCS)
Input: Two strings s, s' over an alphabet
Sigma.
Output: All strings s'' over Sigma such that s'' is
a subsequence of both s and s' and the length of
s'' is maximized over the solution space.
- Candidate solutions = all possible strings over
alphabet Sigma of length <= min(|s|,|s'|).
- Viable solutions = all subsequences of the shorter
given string.
- Optimal solutions = all subsequences of the shorter
given string that are also subsequences of the
longer given string and have the maximum length over
the space of viable solutions.
- Each subsequence s'' can be modeled as a binary
vector of length min(|s|,|s'|) (v_i = 1 (0) =>
symbol i is (not) included in the subsequence);
hence, there are
O(2^{min(|s|,|s'|)}) viable solutions for any
instance of LCS (note that this worst case only
occurs if each symbol in s and s' only occurs once).
- Sometimes, you can generate viable solutions
up front -- this depends in part on your ingenuity.
- Example: The Minimum Spanning Tree problem
Minimum spanning tree (MST)
Input: An edge-weighted graph G = (V, E, w).
Output: All spanning trees T of G, i.e., trees
that connect all vertices in G, such that the sum
of the weights of the edges in T is minimized
over the solution space.
- Candidate solutions = all (|V| - 1)-sized subsets of
the edges of G.
- Viable solutions = all (|V| - 1)-sized subsets of the
edges of G that are spanning trees of G.
- Optimal solutions = all spanning trees of G that have
minimum summed edge-weight over the space of
viable solutions.
- By Cayley's Theorem, a graph G = (V, E) has
O(|V|^{|V| - 2}) spanning trees (the worst case
here is when G is a complete graph); hence, there
are O(|V|^{|V| - 2}) viable solutions for any
instance of MST.
- Sometimes even the set of viable solutions is
dauntingly large ...
- Can solve combinatorial optimization problems by
looking at all (candidate/viable) solutions and
saving those of the optimal cost (which we can
determine on the fly by saving those solutions
with the best cost we've seen so far at any point
in the enumeration of all solutions).
- This is inherently inefficient given the exponential
sizes of the (candidate/viable) solution spaces
involved. Can we do better?
- We most certainly can. Consider MST; though the
set of viable solutions for each instance of
MST is potentially exponentially large, we can
solve all instances of MST in
low-order polynomial time.
- How well we can do algorithm-wise depends on
how much of the (candidate/viable) solution
space our algorithm can ignore. Over the next
8 lectures, we'll examine a set of techniques
for doing this whose applicability depends on
some fairly simple properties of problem
solution-spaces.
Tuesday, January 28 (Lecture #6)
[Class Notes]
- Algorithm Design Techniques (Cont'd)
- Combinatorial Solution-Space Trees (Sections 9.1-9.7, B&B)
- The most basic approach to solving combinatorial
optimization problems.
- As a first step, simplify generation of solutions
for a problem instance by using a combinatorial
solution-space tree (CST).
In such a tree, internal nodes are partial
solutions and the full solutions labeling the
leaves comprise the solution space.
- Example: A CST for an instance of 0/1 Knapsack
in which |U| = 4.
- Example: A CST for an instance of LCS
in which x = "play" and y = "copay".
- Can traverse such a tree and examine all of its
leaves using depth-first search (DFS) or
breadth-first search (BFS).
- As a second step, decrease space requirements by
operating on an implicit CST, i.e., generate tree
nodes as necessary during traversal.
- Example: Algorithm for DFS-based traversal
of implicit CST (minimization version).
DFS-U(i, sol)
if (i == n + 1)
if (viable (sol))
return(cost(sol))
else
return(INFINITY)
else
return(min(DFS-U(i + 1, sol),
DFS-U(i + 1, sol U item[i])))
- Initial call: DFS-U(1, empty set)
- Note that n is the number of items in the
set in the instance; hence "i == n + 1" means
that you have reached a leaf in the CST.
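- DFS-U can be instantiated directly; a sketch applying it to the
Vertex Cover problem from Lecture #1 (the graph encoding and helper
names are illustrative assumptions):

```python
INFINITY = float("inf")

def min_vertex_cover_size(vertices, edges):
    """Exhaustive DFS over the implicit CST: at level i we decide
    whether vertices[i - 1] is in the cover.  Exponential time:
    all 2^n leaves are examined."""
    n = len(vertices)

    def viable(sol):
        # A solution is viable if every edge has an endpoint in sol.
        return all(u in sol or v in sol for (u, v) in edges)

    def cost(sol):
        return len(sol)

    def dfs_u(i, sol):
        if i == n + 1:                   # reached a leaf of the CST
            return cost(sol) if viable(sol) else INFINITY
        return min(dfs_u(i + 1, sol),                      # exclude item i
                   dfs_u(i + 1, sol | {vertices[i - 1]}))  # include item i

    return dfs_u(1, frozenset())
```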
- Example: Algorithm for BFS-based traversal
of implicit CST (minimization version).
BFS-U()
Q = {(1, empty set)}
best = INFINITY
while not empty(Q)
(i, sol) = pop(Q)
if (i == n + 1)
if (viable(sol))
best = min(best, cost(sol))
else
push(Q,(i + 1, sol))
push(Q,(i + 1, sol U item[i]))
return(best)
- Modifications:
- To make the above work for maximization
problems, change min to max and INFINITY
to -INFINITY.
- To obtain optimal solutions, the simplest way
is to run a second tree-traversal that
takes the optimal cost produced previously
as a parameter and stores / prints all
viable solutions with that cost.
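- Applying the maximization modification to BFS-U gives a traversal
for 0/1 Knapsack; a sketch (the encoding of items as (size, value)
pairs is an assumption):

```python
from collections import deque

def knapsack_bfs(items, B):
    """BFS over the implicit CST for 0/1 Knapsack: min becomes max
    and INFINITY becomes -INFINITY, per the modifications above.
    sol holds the indices of the included items."""
    n = len(items)
    best = float("-inf")
    q = deque([(1, frozenset())])
    while q:
        i, sol = q.popleft()
        if i == n + 1:
            size = sum(items[j][0] for j in sol)
            value = sum(items[j][1] for j in sol)
            if size <= B:                    # viability check at the leaf
                best = max(best, value)
        else:
            q.append((i + 1, sol))           # exclude item i
            q.append((i + 1, sol | {i - 1})) # include item i
    return best
```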
- This is all well and good; however, we are still
looking at every node in the tree, and as this
tree is of exponential size, our algorithm runs
in exponential time. Can we do better?
- As a third step, where possible, use information
about partial solution viability and potential
optimality to terminate search of non-viable
subtrees of the CST (dynamic pruning).
- Traditionally, DFS + dynamic pruning is known as
backtracking and BFS + dynamic pruning
is known as branch and bound.
- Backtracking typically used; however, if search
tree does not have finite depth, branch and
bound is preferable.
- Dynamic pruning works in CST because all
"full" candidate solutions that incorporate
a particular partial solution are
leaves in the subtree rooted at the node
corresponding to that partial solution; hence,
non-viability of a partial solution implies the
non-viability of all solutions in the subtree
rooted at the node corresponding to that
partial solution!
Thursday, January 30 (Lecture #7)
[Class Notes]
- Algorithm Design Techniques (Cont'd)
- Combinatorial Solution-Space Trees (Cont'd)
- Example: Viability and potential optimality
conditions for 0/1 Knapsack, LCS, and MST.
- Viability (continue if answer is "yes")
- Do items in partial solution fit in
knapsack? (0/1 Knapsack)
- Is partial solution string a subsequence of
both given sequences? (LCS)
- Do edges in partial solution form a forest?
(MST)
- Potential optimality (continue if answer
is "yes")
- Does sum of values of items in partial
solution plus sum of values of all remaining
items that could be added exceed the
value of the best solution seen so far?
(0/1 Knapsack)
- Does the length of the partial solution
string plus the length of the string of
all remaining characters that could be
added exceed the length of the best
solution string seen so far? (LCS)
- Is the sum of the weights of the edges in
the partial solution less than the
cost of the best solution found so far?
(MST)
- Example: Algorithm for DFS-based traversal
of implicit CST + dynamic pruning (minimization version).
DFS-C(i, sol, best)
if (not viable(sol)) or (best < cost(sol))
return(INFINITY)
else if (i == n + 1)
best = min(best, cost(sol))
return(cost(sol))
else
return(min(DFS-C(i + 1, sol, best),
DFS-C(i + 1, sol U item[i], best)))
- Initial call: DFS-C(1, empty set, INFINITY)
- Note that variable best is passed by reference,
i.e., it is a pointer to an integer; if it is
passed by value, the value of best is not
saved from any of the leaf-evaluations.
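- A sketch of DFS-C instantiated for Vertex Cover, with best held in
a one-element list to mimic the pass-by-reference noted above (a
Python idiom, not from the diary); here a partial cover can always be
extended to a full one, so viability is only checked at leaves and
the pruning comes from the cost test:

```python
INFINITY = float("inf")

def min_vertex_cover_size_pruned(vertices, edges):
    """DFS-C for Vertex Cover: the cost test best < cost(sol) prunes
    subtrees that cannot beat the best cover found so far, since the
    cost (cover size) of a partial solution never decreases."""
    n = len(vertices)
    best = [INFINITY]          # 1-element list mimics pass-by-reference

    def covers(sol):
        return all(u in sol or v in sol for (u, v) in edges)

    def dfs_c(i, sol):
        if best[0] < len(sol):             # cost pruning
            return INFINITY
        if i == n + 1:                     # leaf: check viability
            if not covers(sol):
                return INFINITY
            best[0] = min(best[0], len(sol))
            return len(sol)
        return min(dfs_c(i + 1, sol),
                   dfs_c(i + 1, sol | {vertices[i - 1]}))

    return dfs_c(1, frozenset())
```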
- Example: Algorithm for BFS-based traversal
of implicit CST + dynamic pruning (minimization version).
BFS-C()
Q = {(1, empty set)}
best = INFINITY
while not empty(Q)
(i, sol) = pop(Q)
if (viable(sol)) and (cost(sol) < best)
if (i == n + 1)
best = min(best, cost(sol))
else
push(Q,(i + 1, sol))
push(Q,(i + 1, sol U item[i]))
return(best)
- Example: An implementation of a DFS-based CST
algorithm for solving 0/1 Knapsack in C-language
[knapCST.c],
[knapCST.script]
- Such tricks can drastically reduce actual running
time for CST traversal; however, all of these
algorithms are still asymptotically exponential time
in the worst case (you still might have to visit all
the nodes in the tree).
- The CST approach to solving combinatorial optimization
problems assumes nothing about the structure of
the solution space for an instance. If we did have
such knowledge, can we do better?
There are other types of computation trees besides
CST. There are, for example, the trees associated
with recursive computations, in which nodes represent
individual recursive algorithm calls on subproblems of
smaller size. Such a recursive-call tree is also
called a recursive decomposition.
Characteristics of recursive decomposition:
- Can decompose instance of problem into smaller
instance of the same problem, i.e.,
subproblems.
- At any point in the computation, may be several
choices for decomposition.
Problems that are solvable by recursive decomposition
have the optimal substructure property,
i.e., an optimal solution for an instance of
the problem can be constructed using optimal
solutions for instances of that problem that are
of smaller size (subproblems).
Our remaining algorithm design techniques will be
applicable if a problem's recursive decomposition
satisfies additional properties.
Divide-and-Conquer (D&C)
- Applies if a problem has a recursive decomposition in
which each subproblem occurs only once, i.e.,
the subproblems do not repeat / overlap.
- Will typically only be efficient if the depth of
recursion is logarithmically bounded, thus ensuring
that the total number of recursive calls is
polynomially bounded.
- Example: Merge sort
- Example: Binary search
- Example: Fast exponentiation (a^n)
- Example: Strassen's matrix multiplication
algorithm (Section 28.2)
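- A sketch of the fast-exponentiation example: halving the exponent
gives one subproblem of half the size, so the recursion depth (and
the number of multiplications) is O(log n):

```python
def fast_pow(a, n):
    """Compute a^n for integer n >= 0 by divide-and-conquer."""
    if n == 0:
        return 1
    half = fast_pow(a, n // 2)   # one subproblem of half the size
    if n % 2 == 0:
        return half * half       # a^n = (a^(n/2))^2
    return half * half * a       # odd n: one extra factor of a
```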
Tuesday, February 4 (Lecture #8)
[Sections 15.3-15.4; Class Notes]
- Algorithm Design Techniques (Cont'd)
If we are lucky, there are a polynomial number of
distinct subproblems. If we
are really lucky, we can use the recurrence to solve
these subproblems in a "bottom up" rather than a
"top down" fashion, solving the smallest subproblems
first and solving progressively larger subproblems
until we solve the original problem instance.
Dynamic programming (DP) => table-driven, bottom-up
recursive decomposition algorithms!
Example: DP algorithm for Longest Common
Subsequence (LCS)
- Recursive decomposition: three cases.
- Recurrence (recursive + base cases)
- D(i,j) = length of longest common
subsequence of first i characters of
s and first j characters of s'.
- Example run: s = cbab and s' = abacab
- Matrix fill-in (including backpointers)
for deriving length of longest common
subsequence.
- Rectangular (m + 1) x (n + 1) matrix;
fill-in proceeds from cell (1,1)
row-wise, column-wise, or
anti-diagonal-wise.
- Fill-in of an individual cell
involves consulting filled-in
cells in the 3-cell "overhang" to
left and behind of that cell.
Thursday, February 6 (Lecture #9)
[Sections 15.3]
- Algorithm Design Techniques (Cont'd)
- Dynamic Programming (Cont'd)
- Example: DP algorithm for Longest Common
Subsequence (LCS) (Cont'd)
- Traceback for deriving longest common
subsequences.
- DP algorithm (table fill-in + traceback)
- Time complexity of matrix fill-in: (# cells) x (# choices per cell) x
(max # subproblems per choice) = O(mn) x O(1) x
O(1) = O(mn)
- Space complexity: # cells = (m + 1) x (n + 1) = O(mn)
- Deriving an arbitrary longest common subsequence
given a filled-in matrix takes O(m + n) time;
however, enumerating all longest common
subsequences may require exponential time!
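- The fill-in and traceback can be sketched as follows, returning one
arbitrary longest common subsequence per the O(m + n) traceback noted
above (illustrative code, not the course handout):

```python
def lcs(s, t):
    """Dynamic-programming LCS: O(mn) fill-in, O(m + n) traceback
    recovering one longest common subsequence."""
    m, n = len(s), len(t)
    # D[i][j] = length of an LCS of the first i characters of s
    # and the first j characters of t.
    D = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s[i - 1] == t[j - 1]:
                D[i][j] = D[i - 1][j - 1] + 1
            else:
                D[i][j] = max(D[i - 1][j], D[i][j - 1])
    # Traceback from cell (m, n), re-deriving the choice made per cell.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if s[i - 1] == t[j - 1]:
            out.append(s[i - 1])
            i, j = i - 1, j - 1
        elif D[i - 1][j] >= D[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))
```

- On the example run from Lecture #8 (s = cbab, s' = abacab) the LCS
length is 3.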
Tuesday, February 11
- University closed due to blizzard; lecture canceled.
Thursday, February 13 (Lecture #10)
[Sections 15.2]
- Algorithm Design Techniques (Cont'd)
- Dynamic Programming (Cont'd)
- Example: DP algorithm for Matrix Chain
Parenthesization (MCP)
- Review of 2-matrix multiplication: O(n^3) time
(n = max matrix dimension)
- 3-matrix multiplication: Parenthesization of
matrix "chain" for 2-matrix multiplication
can drastically affect the number of
individual element multiplications.
- Recurrence (recursive + base cases)
- m(i,j) = minimum number of element
multiplications required to multiply
matrices i through j in the given chain.
- Note that this is the first recurrence
we've seen that involves multiple
subproblem decomposition choices.
- DP algorithm
- Example run: matrix chain is A_1 A_2 A_3 A_4,
where A_1 = 10 x 5, A_2 = 5 x 20, A_3 =
20 x 5, and A_4 = 5 x 20. Thus, n = 4 and
dimension list p = (10,5,20,5,20)
- Matrix fill-in (including backpointers)
for deriving minimum number of
multiplications relative to optimal
matrix chain parenthesization.
- Ragged-bottom pyramid / B1 bomber
shape matrix; fill-in proceeds
from left to right in ascending
rows.
- Fill-in of an individual cell
involves consulting filled-in
cells in the "boomerang" hanging
off that cell.
- Traceback for deriving optimal matrix chain
parenthesization.
- Time complexity of matrix fill-in: (# cells) x (#choices per cell) x
(max # subproblems per choice) = (n(n - 1)/2) x O(n) x
O(1) = O(n^3).
- Space complexity: # cells = (n(n - 1)/2) = O(n^2)
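- A sketch of the MCP fill-in; processing chains in order of
increasing length is one of the valid fill-in orders described above
(the function and variable names are illustrative assumptions):

```python
def matrix_chain(p):
    """m[i][j] = minimum element multiplications to compute
    A_i ... A_j, where A_k is p[k-1] x p[k].  O(n^3) fill-in;
    s[i][j] records the best split point (backpointer)."""
    n = len(p) - 1
    INF = float("inf")
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):           # chain length
        for i in range(1, n - length + 2):
            j = i + length - 1
            m[i][j] = INF
            for k in range(i, j):            # split point choice
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:
                    m[i][j], s[i][j] = q, k
    return m[1][n], s
```

- On the example run above (p = (10,5,20,5,20)), the optimal cost is
1750 element multiplications, splitting as (A_1 (A_2 A_3)) A_4.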
Tuesday, February 18 (Lecture #11)
[Sections 15.3 and 16.1; Class Notes]
- Algorithm Design Techniques (Cont'd)
- Dynamic programming and divide-and-conquer algorithms
have enabled us to make many apparently difficult
problems solvable in low-order polynomial time. Can we
do still better? In certain cases, it seems that we can.
- Consider binary search; it is a particularly interesting
example of a D&C algorithm in which there is, at each point in the
computation, at most 1 choice and 1 subproblem to be
solved. In this case, the recursive computation collapses
to a path of recursive calls from the "root" problem to a
"leaf" subproblem.
- The subproblem structure for binary search is
unusually simple, in that a choice can be made before
the subproblem is solved, i.e., the choice can be
made in a top-down, locally optimal (greedy) manner.
Compare this with divide-and-conquer and dynamic
programming algorithms, which solve subproblems in
a bottom-up manner and make choices only after all
necessary subproblems have been solved.
- Greedy Algorithms
- Example: The Activity Selection problem
- Problem statement
- DP algorithm
Thursday, February 20 (Lecture #12)
[Section 16.1; Class Notes]
- Algorithm Design Techniques (Cont'd)
- Greedy Algorithms (Cont'd)
- Example: The Activity Selection problem
- Can we do better? In particular, do we have
to evaluate all subproblem-pairs induced by
a choice ak from S(i,j), or can
we get away with a single ak-choice and
subproblem?
- It turns out that we can -- pick the
activity am from S(i,j) with
the earliest finishing time -- we are then
left with the subproblem S(m,j)!
(Theorem 16.1).
- This gets even better -- note that instead of
solving all S(i,j) in the manner
described above, we start with S(0,n+1)
and solve directly!
- Greedy algorithm (iterative)
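The iterative greedy algorithm can be sketched in Python (a sketch; the function name and the sample activities are mine): sort by finishing time, then repeatedly take the first activity compatible with everything chosen so far.

```python
# Greedy activity selection: always pick the compatible activity
# with the earliest finishing time.
def greedy_activity_select(activities):
    """activities is a list of (start, finish) pairs."""
    chosen = []
    last_finish = float("-inf")
    for start, finish in sorted(activities, key=lambda a: a[1]):
        if start >= last_finish:      # compatible with all chosen so far
            chosen.append((start, finish))
            last_finish = finish
    return chosen

acts = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9), (6, 10), (8, 11)]
print(greedy_activity_select(acts))
```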
Tuesday, February 25
- Midterm Break; no lecture
Thursday, February 27 (Lecture #13)
[Sections 23.1-23.2; Class Notes]
- Algorithm Design Techniques (Cont'd)
- Greedy Algorithms (Cont'd)
- Example: The Minimum Spanning Tree problem
- DP algorithm.
- Based on problems of the form P[G =(V,E,w),n] =
the cost of the minimum spanning forest
for G that has n edges (note that what
we want is P[G, |V| - 1]).
- We can imagine a recurrence for P[G,n] in
which we add all possible edges to
solution-forests for P[G, n-1] that connect
two formerly distinct components in the forest
and select those edges that result in
minimum-cost forests on n edges. To do this,
we need to associate the solution-forests
explicitly with P[G,n], i.e., we cannot
reconstruct them after matrix fill-in by
traversing backpointers.
- The resulting recurrence has a double
minimization over solution-forests for
P[G, n - 1] and edges that can be added to
these forests, respectively.
- Can we do better? In particular, do we have to
evaluate all of the subproblems induced by all edge
choices at each point, or can we get away with a
single edge-choice and subproblem?
- Consider the following very useful property of
minimum spanning forests: Given such a
forest for a graph G, each minimum-weight edge
linking separate components in the forest must
be in some minimum spanning tree for G
(Theorem 23.1; see also Corollary 23.2).
Such an edge is said to be safe.
- What this means is that all we have to do is
choose a safe edge and evaluate the resulting
subproblem! This gets rid of the inner
minimization over edges in the recurrence;
moreover, as the
choice of a safe edge always leads to some
minimum spanning tree, we can also get rid of
the outer minimization over spanning-forests.
- Generic MST algorithm (page 563):
GENERIC-MST(G = (V,E,w))
F = (V, E_F) with E_F = {}
for (i = 1; i <= |V| - 1; i++)
Find an edge e in E that is safe for F
E_F = E_F u {e}
return F
- Under appropriate selections of safe edges,
this generic algorithm underlies both
of the classic minimum spanning tree algorithms
considered below.
- Kruskal's algorithm
- Start with trivial forest of G that just includes the
vertices in V; add edges greedily, minimum-weight
first, growing a minimum spanning forest that eventually
coalesces into a minimum spanning tree for G.
- Algorithm:
MST-KRUSKAL(G, w)
A = {}
for each vertex v in V do
MAKE-SET(v)
sort the edges in E in nondecreasing order by weight
for each edge (u,v) in E, taken in nondecreasing order by weight do
if FIND-SET(u) <> FIND-SET(v) then
A = A u {(u,v)}
UNION(u,v)
return(A)
- Note that this a direct implementation of
the generic MST algorithm above.
- Running time =
|V|(T(MAKE-SET)) + 2|E|(T(FIND-SET)) + |E|(T(UNION)) + SORT(|E|)
= |V|(T(MAKE-SET)) + 2|E|(T(FIND-SET)) + |E|(T(UNION)) + |E| log |E|
= |V|(T(MAKE-SET)) + 2|E|(T(FIND-SET)) + |E|(T(UNION)) + |E| log |V|
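The pseudocode above can be fleshed out in Python (a sketch; a minimal union-find stands in for the MAKE-SET / FIND-SET / UNION routines, with none of the heuristics considered later in the course):

```python
# Kruskal's algorithm sketch: add edges greedily, minimum-weight first,
# skipping any edge whose endpoints are already in the same component.
def kruskal(vertices, edges):
    """edges is a list of (weight, u, v) triples; returns the MST edge set."""
    parent = {v: v for v in vertices}          # MAKE-SET for every vertex

    def find(x):                               # FIND-SET (no heuristics)
        while parent[x] != x:
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):              # nondecreasing weight order
        ru, rv = find(u), find(v)
        if ru != rv:                           # distinct components: safe edge
            mst.append((w, u, v))
            parent[ru] = rv                    # UNION
    return mst

edges = [(1, 'a', 'b'), (4, 'b', 'c'), (3, 'a', 'c'), (2, 'c', 'd')]
mst = kruskal('abcd', edges)
print(mst)
```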
- Prim's algorithm
- Start with a single vertex in G; add edges greedily,
minimum-weight first, growing minimum spanning tree
for successively larger subsets of V until you
have a minimum spanning tree for G.
- Algorithm:
MST-PRIM(G,w,r)
for each u in V do
key[u] = INFTY
pi[u] = nil
key[r] = 0
Q = V
while Q is not empty do
u = EXTRACT-MIN(Q)
for each v in adj[u] do
if v in Q and w(u,v) < key[v] then
pi[v] = u
key[v] = w(u,v)
- Note that in this algorithm (unlike
Kruskal's), the spanning forest has
exactly one non-trivial component with
edges at any point during processing.
- Running time =
T(BUILD) + |V|(T(EXTRACT-MIN)) +
|E|(T(DECREASE-KEY))
- Note that Kruskal's and Prim's algorithms are, in a
broad sense, converses of each other -- by adding
minimum-weight edges,
Kruskal's algorithm coalesces a (initially trivial)
spanning forest for the given graph G into a spanning tree
for G and Prim's algorithm extends spanning tree for
successively larger subgraphs of G until it has a spanning
tree for G.
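Prim's algorithm can be sketched in Python (a sketch; since Python's heapq priority queue has no DECREASE-KEY, this variant pushes duplicate entries and lazily skips stale ones, which gives the same O(|E| log |V|) behavior):

```python
import heapq

# Prim's algorithm sketch: grow a single tree outwards from a root,
# always taking the minimum-weight edge leaving the tree.
def prim(adj, root):
    """adj maps each vertex to a list of (weight, neighbour) pairs;
    returns the total weight of a minimum spanning tree."""
    in_tree, total = {root}, 0
    heap = list(adj[root])
    heapq.heapify(heap)                    # BUILD
    while heap:
        w, v = heapq.heappop(heap)         # EXTRACT-MIN
        if v in in_tree:
            continue                       # stale entry: skip, no DECREASE-KEY
        in_tree.add(v)
        total += w
        for edge in adj[v]:
            if edge[1] not in in_tree:
                heapq.heappush(heap, edge)
    return total

adj = {'a': [(1, 'b'), (3, 'c')], 'b': [(1, 'a'), (4, 'c')],
       'c': [(3, 'a'), (4, 'b'), (2, 'd')], 'd': [(2, 'c')]}
print(prim(adj, 'a'))
```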
There is usually a quicker way of deriving a greedy
algorithm than developing and subsequently "pruning"
a dynamic programming algorithm (see Section 16.2);
however, the examples given here are useful as they
show the intimate relationship between dynamic
programming and greedy algorithms.
The relationship between CST, DP, Divide-and-Conquer, and
Greedy algorithms:

          Optimal Substructure AND
         Independent Subproblems?
              /            \
         YES /              \ NO
            /                \
       Repeated              CST
      Subproblems?
        /      \
    NO /        \ YES
      /          \
   Greedy        DP
   Choice?
    /    \
YES/      \ NO
  /        \
Greedy     D&C
Monday, March 3
- Midterm Exam Notes
I've finished making up the upcoming midterm exam. This exam
will be closed-book and has a total of 60 marks (note that I
have tried to make the number of marks for a question roughly
equivalent to the number of minutes it should take you to do
it). The current layout is as follows:
- Asymptotic notation (3 parts) (9 marks)
- Asymptotic worst-case time complexity (3 parts)
(9 marks)
- Recurrences (2 parts) (12 marks)
- Algorithm design techniques (2 questions, 4 parts)
(30 marks)
Things that will not be covered are the substitution
method, detailed derivation of "raw" worst-case time
complexity functions, or very complex algorithms like the DP
algorithm for matrix-chain parenthesization. However, all else
is fair game. The questions on the first two assignments
are guides to what I have in mind; you should also find the
questions on previous CS 3711 exams of some help.
I hope the above helps, and I wish you all the best of luck
with this midterm.
Tuesday, March 4 (Lecture #14)
[Section 16.2; Class Notes]
- Algorithm Design Techniques (Cont'd)
- Greedy Algorithms (Cont'd)
- The differences that make related versions of a
problem amenable to
greedy algorithms instead of just dynamic
programming algorithms can be very subtle, and
do not appear to be (at this point in time,
anyway) either systematic or comprehensible.
- Example: Versions of the Knapsack problem
that allow greedy and dynamic programming
algorithms.
- Exploiting Data Structures
- The techniques we've seen so far use properties
of solution spaces to organize efficient searches
of these spaces; in many cases, we have lowered
time complexities from exponential to low-order
polynomial.
- Lower bound on time complexity of any algorithm for
a problem is typically linear (must at least read
all the input). To get as close as possible to this
bound, need to choose data structures that provide
best time complexities for operations required by
an algorithm.
- Example: Dynamic sets
- Variety of data structures for implementing
dynamic sets:
(all entries are worst-case time complexities)

Data Structure         | INSERT   | SEARCH   | DELETE (key) | DELETE (pos) | MIN      | Generic Sorting
-----------------------+----------+----------+--------------+--------------+----------+----------------
Array (Unsorted)       | O(n)     | O(n)     | O(n)         | O(n)         | O(n)     | O(n^2)
Linked list (Unsorted) | O(1)     | O(n)     | O(n)         | O(1)         | O(n)     | O(n^2)
Array (Sorted)         | O(n)     | O(log n) | O(n)         | O(n)         | O(1)     | O(n^2)
Linked list (Sorted)   | O(n)     | O(n)     | O(n)         | O(1)         | O(1)     | O(n^2)
Search tree            | O(n)     | O(n)     | O(n)         | O(1)         | O(n)     | O(n^2)
Balanced search tree   | O(log n) | O(log n) | O(log n)     | O(log n)     | O(log n) | O(n log n)
Binary heap            | O(log n) | O(n)     | O(n)         | O(log n)     | O(1)     | O(n log n)
- Binary search trees
- Structured such that for any node x with left
subtree L and right subtree R,
key(l) <= key(x) < key(r), where key(y)
is the key labelling node y and l and r
are any nodes in L and R, respectively.
- This property has nice consequences:
- Minimum key at leftmost node in tree.
- Maximum key at rightmost node in tree.
- Search for a particular key is very
easy!
- Time complexity of operations tied to
tree height, i.e., number of levels of
nodes in tree; though height can be as
low as log n for n nodes, can be n
in worst case => all operations require
O(n) time!
- Balanced binary search trees, e.g.,
AVL trees, red-black trees
- Employ relatively sophisticated insertion
and deletion operations that automatically
rebalance tree and guarantee O(log n)
height => all operations require O(log n)
time!
- To obtain good time complexities for all set
operations can require fairly intricate
algorithmic machinery, e.g.,
balanced search trees. However, if only
require good times for a subset of these
operations, can do much better, e.g.,
heaps.
- (Binary) Heaps
- Structured such that for any node x
in the subtree with subtrees L and R,
either key(l), key(r) <= key(x) (max-heap)
or key(l), key(r) >= key(x) (min-heap).
- Note that this is a weakened version of the
binary search tree property!
- Store keys in array such that no node
pointers needed to implement tree; under
this scheme, heap is nearly complete
binary tree with height O(log n).
- This property has some nice consequences:
- Minimum key at root node in tree
(min-heap).
- Maximum key at root node in tree
(max-heap).
- This doesn't seem like much compared to
binary search trees, as there is no
O(log n) search algorithm for heaps;
however, the weakened property does allow
for O(log n) insertion and pos-deletion
operations!
- Trades off time complexity of several
operations for ludicrously simple and
efficient algorithms for the remaining
operations.
- Insert(key) operation
- Intuition: Add new key to end of
heap (last leaf on bottom level)
and "trickle up" until parent
is less (min-heap) / greater
(max-heap).
- Implemented as INCREASE-KEY in
textbook.
- Requires time proportional to height
of tree => O(log n) time!
- Delete(pos) operation
- Intuition: Replace key at pos with
last key in heap (last leaf on bottom
level) and "trickle down" until both
children's keys are less (max-heap) /
greater (min-heap) than that key.
- Implemented as HEAPIFY in textbook.
- Requires time proportional to height
of tree => O(log n) time!
- As heap property is weaker version of
binary search tree property, cannot
exploit heap property to do fast search on
heaps; however, as weaker properties are
easier to maintain during changes to a
data structure, gives simple and efficient
algorithms for insert and delete on
heaps!
- BUILD-MAX-HEAP algorithm
- Takes unordered list and does in situ
"trickle down" operation / HEAPIFY
moving back from position n/2 to
1 to make a heap.
- Naive analysis predicts O(n log n)
running time; however, is actually
O(n)!
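BUILD-MAX-HEAP can be sketched in Python on a 0-indexed array (a sketch; function names are mine, with heapify doing the "trickle down" and the build loop running back from position n//2 - 1 to the root):

```python
# BUILD-MAX-HEAP sketch: in-place "trickle down" from the last
# internal node back to the root.
def heapify(a, i, n):
    while True:
        l, r, largest = 2 * i + 1, 2 * i + 2, i
        if l < n and a[l] > a[largest]:
            largest = l
        if r < n and a[r] > a[largest]:
            largest = r
        if largest == i:
            return                           # heap property restored
        a[i], a[largest] = a[largest], a[i]  # trickle the key down one level
        i = largest

def build_max_heap(a):
    for i in range(len(a) // 2 - 1, -1, -1):
        heapify(a, i, len(a))

a = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
build_max_heap(a)
print(a[0])  # maximum key now at the root
```

Though the loop does O(n) HEAPIFY calls of O(log n) worst-case cost each, most calls act near the bottom of the tree, which is why the total is O(n) rather than O(n log n).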
- Priority queues
- Require subset of dynamic set
operations, e.g.,
insert(key), min(), extract-min().
- Used in many greedy algorithms,
e.g., Prim's algorithm
for minimum spanning trees.
- There are several lessons encoded in the table above:
- To obtain the best overall set of running
times for a set of operations relative to
an abstract data type, a complex underlying
data structure may be required, e.g.,
dynamic sets and balanced search trees.
- When one is dealing with an algorithm that
only invokes a subset of the operations
relative to an abstract data type, the best
overall data structure implementing that
data type may not be appropriate -- rather,
one looks for a data structure that gives the
best running times for the required subset of
operations, e.g., generic sorting and
binary heaps.
Thursday, March 6
(Lecture #15) Tuesday, March 11
[Section 22.1; Class Notes]
- Algorithm Design Techniques (Cont'd)
- Exploiting Data Structures (Cont'd)
- Example: Disjoint sets
- Instead of maintaining a dynamic set,
consider maintaining a dynamic partition
of a (fixed) set, i.e., a
collection of disjoint sets whose union is
a particular set.
- View set elements as pointers into sets.
- Operations:
- Make-set(x): Create set with element x.
- Find-set(x): Return the representative
element of the set containing x.
- Union(x,y): Create the union-set of the
sets containing x and y.
- Useful in a variety of algorithms, e.g.,
Kruskal's algorithm for minimum spanning trees,
determining the connected components of an
undirected graph.
- Variety of data structures for implementing
disjoint sets:
(all entries are worst-case time complexities)

Data Structure   | MAKE-SET | FIND-SET | UNION
-----------------+----------+----------+------
List (naive)     | O(1)     | O(n)     | O(1)
List (rep-ptr)   | O(1)     | O(1)     | O(n)
Forest (naive)   | O(1)     | O(n)     | O(1)
Forest (rep-ptr) | O(1)     | O(1)     | O(n)
Forest (UR + PC) | O(1)     | O(n)     | O(1)
- In tree representation of sets, nodes are
set elements and we only need to travel
from any node to the root => store tree as
set of node-parent relations!
- Consider forest augmented by union-by-rank (UR)
and path compression (PC) heuristics.
- Implement dynamic rebalancing of trees
in forest; effectively collapse them
to O(1) depth over a sequence of
disjoint set operations.
- Individual operation time complexities
not improved, but can show that
any sequence of m operations
on a base set of n elements can be done in
O(m alpha(n)) time, where alpha(n) <= 4
for any application =>
Heuristic-augmented forest allows
constant-time disjoint set
operations!
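The heuristic-augmented forest can be sketched in Python (a sketch; the class and method names are mine):

```python
# Disjoint-set forest with the union-by-rank (UR) and
# path-compression (PC) heuristics.
class DisjointSets:
    def __init__(self, elements):
        self.parent = {x: x for x in elements}   # MAKE-SET for each element
        self.rank = {x: 0 for x in elements}

    def find(self, x):                           # FIND-SET with PC
        if self.parent[x] != x:
            # point every node on the path directly at the root
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def union(self, x, y):                       # UNION by rank
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx                     # hang shallower under deeper
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1

ds = DisjointSets(range(5))
ds.union(0, 1)
ds.union(3, 4)
print(ds.find(0) == ds.find(1), ds.find(0) == ds.find(3))
```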
- Example: Exploiting data structures to
solve the Minimum Spanning Tree problem
- Running time of Kruskal's algorithm
- Running time =
|V|(T(MAKE-SET)) + 2|E|(T(FIND-SET)) + |E|(T(UNION)) + SORT(|E|)
= |V|(T(MAKE-SET)) + 2|E|(T(FIND-SET)) + |E|(T(UNION)) + |E| log |E|
= |V|(T(MAKE-SET)) + 2|E|(T(FIND-SET)) + |E|(T(UNION)) + |E| log |V|
- Running time relative to various data
structures for disjoint sets:
(all entries are worst-case time complexities)

Data Structure   | MAKE-SET | FIND-SET | UNION | Kruskal's Algorithm
-----------------+----------+----------+-------+--------------------
List (naive)     | O(1)     | O(n)     | O(1)  | O(|E||V|)
List (rep-ptr)   | O(1)     | O(1)     | O(n)  | O(|E||V|)
Forest (naive)   | O(1)     | O(n)     | O(1)  | O(|E||V|)
Forest (rep-ptr) | O(1)     | O(1)     | O(n)  | O(|E||V|)
Forest (UR + PC) | O(1)     | O(n)     | O(1)  | O(|E| log |V|)
- Running time of Prim's algorithm
- Running time =
T(BUILD) + |V|(T(EXTRACT-MIN)) +
|E|(T(DECREASE-KEY))
- Running time relative to various data
structures for priority queues:
(all entries are worst-case time complexities)

Data Structure         | BUILD      | EXTRACT-MIN | DECREASE-KEY | Prim's Algorithm
-----------------------+------------+-------------+--------------+-----------------
Array (Unsorted)       | O(n)       | O(n)        | O(n)         | O(|E||V|)
Linked list (Unsorted) | O(n)       | O(n)        | O(n)         | O(|E||V|)
Array (Sorted)         | O(n log n) | O(n)        | O(n)         | O(|E||V|)
Linked list (Sorted)   | O(n log n) | O(1)        | O(n)         | O(|E||V|)
Search tree            | O(n)       | O(n)        | O(n)         | O(|E||V|)
Balanced search tree   | O(n)       | O(log n)    | O(log n)     | O(|E| log |V|)
Binary heap            | O(n)       | O(log n)    | O(log n)     | O(|E| log |V|)
- In general, an efficient algorithm is a collaboration
between an appropriate algorithm design technique
and one or more data structures whose operation-sets
are optimally efficient for the subset of operations
required by the algorithm.
- Need good knowledge of both algorithm design
techniques and data structures to create efficient
algorithms.
- There is actually a third area of expertise needed by
an algorithm designer -- namely, a knowledge of
the best-known algorithms for standard problems.
After all, who wants to reinvent the wheel?
- Example: New problems arising at Bob's Trucking
Company.
- Such knowledge tends to be domain-specific. Let's
concentrate on the domain of graphs and standard
operations on graphs.
- Graph algorithms
- Why use graphs?
- Graphs are the natural way to represent any situation
in which there are entities and relationships
between entities.
- Representations of graphs (Section 23.1)
- Two basic choices:
- Adjacency list
- Adjacency matrix
- Adjacency list more compact for sparse graphs,
i.e., graphs such that |E| << |V|(|V| - 1)/2;
however, as they store only one bit per adjacency,
adjacency matrices better for non-sparse graphs.
- Common operations on graphs:
- next-adj(v): Return next vertex in enumeration
of vertices adjacent to v.
- adj(v, v'): Return TRUE if v is adjacent to v'
in G and FALSE otherwise.
- Representations differ in terms of the efficiency of
these operations.
(all entries are worst-case time complexities)

Data Structure   | NEXT-ADJ | ADJ
-----------------+----------+-------
Adjacency List   | O(1)     | O(|V|)
Adjacency Matrix | O(|V|)   | O(1)
- As with any data structure, need to optimize graph
representation to algorithm in terms of number of
type of operations used.
Thursday, March 13 (Lecture #16)
[Section 22.2; Class Notes]
- Graph algorithms (Cont'd)
- Graph traversal
- Graph traversal useful in two ways:
- Visit all nodes in a graph.
- Impose annotation / structure on the vertices
and/or edges of a graph that has subsequent
uses.
- Two types of graph traversal: breadth-first search
(BFS) and depth-first search (DFS).
- Breadth-first search on general graphs
- Intuition:
Exploration of shells of vertices at successively
larger distances from a selected point.
- Basic BFS algorithm:
BFS(G, s)
for each vertex u in G - {s} do
color[u] = white
color[s] = gray
Q = {}
ENQUEUE(Q, s)
while Q is not empty do
u = DEQUEUE(Q)
for each v in adj[u] do
if color[v] = white then
color[v] = gray
ENQUEUE(Q,v)
color[u] = black
- Interpretation of colors:
- black = fully explored (node + children)
- gray = partially explored (node only)
- white = unexplored
- Visualize as ink spreading from a central point
outwards along edges until it has reached all
vertices in a graph.
- Analysis of running time
- Running time = |Q|(T(adj) + O(1)) + |E|(O(1))
= |V|(T(adj) + O(1)) + |E|(O(1))
- Adjacency list: O(|V| + |E|) time
- Adjacency matrix: O(|V|^2 + |E|) time
- Though |E| <= |V|(|V| - 1)/2 = O(|V|^2),
often keep |E|-term in expressions to
show running time relative to sparse
graphs.
- Did not analyze algorithm in terms of
rigorously-derived T(n) function --
worked less formally and directly in terms
of big-Oh behavior of individual algorithm
components.
- Did not analyze algorithm by nested control
structure -- had to reason about behavior
of algorithm as a whole to derive certain
quantities, e.g., total length of
queue over execution of algorithm.
- Full BFS algorithm (page 532):
BFS(G, s)
for each vertex u in G - {s} do
color[u] = white
** d[u] = INFTY
** pi[u] = NIL
color[s] = gray
** d[s] = 0
** pi[s] = NIL
Q = {}
ENQUEUE(Q, s)
while Q is not empty do
u = DEQUEUE(Q)
for each v in adj[u] do
if color[v] = white then
color[v] = gray
** d[v] = d[u] + 1
** pi[v] = u
ENQUEUE(Q,v)
color[u] = black
- What are the pi and d arrays for?
- BFS defines a search tree based on all
vertices of G that is rooted at the start
vertex s; the pi-array defines the
node-parent relations in this tree.
- This search tree is structured such that
the length of the path from s to u in this
tree is the length of the shortest path
(in terms of # edges) between s and u
in G; the d-array contains these
shortest-path length values.
- Intuition behind proof of correctness for
d-values: If there were a
shorter path from s to v via some
vertex u, during exploration of children
of u, v would have previously been
colored gray!
- Can use BFS to derive shortest paths
between a specified vertex s and all other
vertices in a graph.
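The full BFS algorithm can be sketched in Python (a sketch following the pseudocode above; the function name and sample graph are mine):

```python
from collections import deque

# Full BFS sketch with colors, d (shortest #-of-edges distances),
# and pi (search-tree parents).
def bfs(adj, s):
    """adj maps each vertex to a list of its neighbours."""
    color = {u: 'white' for u in adj}        # white = unexplored
    d = {u: float('inf') for u in adj}
    pi = {u: None for u in adj}
    color[s], d[s] = 'gray', 0
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if color[v] == 'white':
                color[v] = 'gray'            # partially explored
                d[v] = d[u] + 1              # one shell further out
                pi[v] = u
                q.append(v)
        color[u] = 'black'                   # fully explored
    return d, pi

adj = {'s': ['a', 'b'], 'a': ['s', 'c'], 'b': ['s', 'c'], 'c': ['a', 'b']}
d, pi = bfs(adj, 's')
print(d)
```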
Tuesday, March 18 (Lecture #17)
[Section 22.3; Class Notes]
- Graph algorithms (Cont'd)
- Graph traversal (Cont'd)
- What are the pi, d, and f arrays for?
- Instead of a search tree, we now have a
search forest; the pi-array defines
the node-parent relations in this
forest.
- The d- and f-values indicate discovery and
finishing times of vertex-processing
during depth-first search.
- Informal argument for why d[u] < d[v] < f[v] < f[u]
if u is an ancestor of v in a depth-first search
tree; formal proof given
in Parenthesis Theorem (Theorem 22.7, pages 543, 545)
and Corollary 22.8 (page 545).
- Classification of edges relative to a DFS-derived
search forest (tree / back / forward / cross edges).
- Can classify edges as tree,
back, or forward / cross during depth-first
search as a function of new-vertex color; can
then differentiate forward and cross edges
using vertex discovery times => constant
time edge-classification test!!
- Application: Testing graph acyclicity
- Do depth-first search, and if there are no back
edges, graph is acyclic! (Lemma 22.11).
Thursday, March 20 (Lecture #18)
[Sections 22.4 and 24.1]
- Graph algorithms (Cont'd)
- Graph traversal (Cont'd)
- Depth-first search on general graphs (Cont'd)
- Application: Topological sorting
- Useful in deriving event-orderings that are
consistent with a given set (graph) of
pairwise event-orderings.
- Can be done very simply using a slightly modified
depth-first search -- yet another use of vertex
finishing times!
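The modified depth-first search can be sketched in Python (a sketch; function names and the sample DAG are mine): as each vertex finishes, it is prepended to the output list, so the list ends up in decreasing order of finishing time.

```python
# Topological sort sketch: DFS on a DAG, prepending each vertex
# to the output as it finishes.
def topological_sort(adj):
    color = {u: 'white' for u in adj}
    order = []

    def visit(u):
        color[u] = 'gray'
        for v in adj[u]:
            if color[v] == 'white':
                visit(v)
        color[u] = 'black'
        order.insert(0, u)      # finished last goes before all earlier finishers

    for u in adj:
        if color[u] == 'white':
            visit(u)
    return order

adj = {'shirt': ['tie'], 'tie': ['jacket'], 'pants': ['shoes', 'jacket'],
       'jacket': [], 'shoes': []}
order = topological_sort(adj)
print(order)
```

Any edge (u,v) in the DAG ends up with u before v in the returned ordering.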
- Minimum Spanning Trees
- Note that Prim's algorithm is somewhat like BFS;
both are expanding a frontier outwards from
an initial vertex in the graph,
but while BFS expands this frontier uniformly
by distance from the start point, the expansion
by Prim is relative to the smallest-weight
edge extending into the frontier.
- Shortest Paths in Edge-Weighted Graphs
- Has many applications depending on how you interpret
the edge-weights, e.g., distances, costs.
- Can solve by applying edge-expansion and
ordinary BFS; however, this yields an exponential
time algorithm. Can we do better?
- Two types of problems:
- Single-source shortest paths
- All-pairs shortest paths
- Can further break down algorithms based on whether
or not they detect negative-weight cycles. Such
cycles are problematic because they render
shortest-paths estimates meaningless, e.g.,
can make length of path arbitrarily short by
appropriate number of traversals around the cycle.
- Properties of shortest paths:
- Triangle inequality: d(s,v) <= d(s,u) + w(u,v)
- Shortest path has at most |V| - 1 edges,
i.e., can eliminate negative-, zero-,
and positive-weight cycles from shortest paths.
- Shortest paths satisfy the optimal substructure
property, i.e., given a shortest path
from s to v and a vertex u on this path, the
subpaths s->u and u->v are also shortest paths;
this means that there are many nice algorithmic
possibilities, i.e., DP, D&C, Greedy.
- The Relax operation
- Algorithm (pages 585-586):
Initialize-Single-Source(G, s)
for each vertex v in V do
d[v] = INFTY
pi[v] = nil
d[s] = 0
Relax(u, v, w)
if (d[v] > d[u] + w(u,v)) then
d[v] = d[u] + w(u,v)
pi[v] = u
- The relax operation is trivial; however,
sequences of relaxations can be quite powerful.
- All algorithms we consider for the next 1 1/2
lectures will essentially differ only in the
the order in which edges are relaxed and the
number of times each edge is relaxed.
Tuesday, March 25 (Lecture #19)
[Sections 24.1 and 24.3]
- Graph algorithms (Cont'd)
- Shortest Paths in Edge-Weighted Graphs (Cont'd)
- Single-source shortest paths algorithms
- Dijkstra's algorithm
- This algorithm essentially applies relaxations in
a systematic matter to grow a set of vertices with
known shortest-path distances outwards from the
source vertex to encompass the whole graph. In
particular, relaxations are applied to vertices
(or rather, the adjacencies of those vertices)
as they are added to this set and can be seen as
extending out from the frontier of this region into
the remainder of the graph.
- Algorithm:
Dijkstra(G, w, s)
Initialize-Single-Source(G, s)
S = {}
Q = V[G]
while Q is not empty do
u = EXTRACT-MIN(Q)
S = S u {u}
for each vertex v in adj[u] do
Relax(u, v, w)
- Analysis of running time
- Running time
= T(INITIALIZE) + |V|(T(EXTRACT-MIN)) + |E|(T(RELAX))
= T(INITIALIZE) + |V|(T(EXTRACT-MIN)) + |E|(T(DECREASE-KEY))
- Runs in O(|E| log |V|) time relative to a
binary heap.
- Cannot detect negative cycles, and will produce
erroneous results if one or more such cycles
are present.
- Doesn't Dijkstra's algorithm look an awful lot
like Prim's algorithm for minimum
spanning trees? It should!
Try writing the code for the initialization and
relaxation procedures into Dijkstra's
algorithm, eliminating the S-lines (which are
not being used), and then comparing with Prim's
algorithm -- oddly enough, you'll see that
relaxation is a generalization of the key-value
adjustment done by Prim on lines 9 and 11,
e.g.,
Dijkstra-Rewrite(G, w, s)
for each u in V do
d[u] = INFTY
pi[u] = nil
d[s] = 0
Q = V
while Q not empty do
u = EXTRACT-MIN(Q)
for each v in adj[u] do
if d[u] + w(u,v) < d[v] then
d[v] = d[u] + w(u, v)
pi[v] = u
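Dijkstra's algorithm can be sketched in Python (a sketch; since Python's heapq has no DECREASE-KEY, relaxation is done lazily by pushing improved (distance, vertex) pairs and skipping stale entries):

```python
import heapq

# Dijkstra's algorithm sketch: grow the set of settled vertices
# outwards from s, relaxing the adjacencies of each settled vertex.
def dijkstra(adj, s):
    """adj maps each vertex to a list of (neighbour, weight) pairs;
    all weights are assumed non-negative."""
    d = {u: float('inf') for u in adj}
    d[s] = 0
    heap = [(0, s)]
    done = set()
    while heap:
        du, u = heapq.heappop(heap)        # EXTRACT-MIN
        if u in done:
            continue                       # stale heap entry
        done.add(u)
        for v, w in adj[u]:
            if du + w < d[v]:              # Relax(u, v, w)
                d[v] = du + w
                heapq.heappush(heap, (d[v], v))
    return d

adj = {'s': [('a', 1), ('b', 4)], 'a': [('b', 2), ('c', 6)],
       'b': [('c', 3)], 'c': []}
print(dijkstra(adj, 's'))
```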
- The Bellman-Ford algorithm
- This algorithm essentially adopts the philosophy that
if you invoke relaxation often enough, all will be
well. Oddly enough, it turns out that if you relax
each edge in the graph |V| - 1 times, the d-values
associated with all vertices are guaranteed to settle
to the actual shortest-distance values. This
works because each round of relaxations
relaxes one edge further along a shortest path
and each shortest path has at most
|V| - 1 edges.
- Algorithm:
Bellman-Ford(G, w, s)
Initialize-Single-Source(G,s)
for i = 1 to |V| - 1 do
for each edge (u,v) in E do
Relax(u, v, w)
for each edge (u,v) in E do
if d[v] > d[u] + w(u,v) then
return FALSE
return(TRUE)
- Running time = O(|V||E|)
- This algorithm does (|V| - 1)|E| relaxations, as compared to
the |E| relaxations done by Dijkstra's algorithm,
and hence runs much slower. However, it has the
advantage of being able to detect negative-weight
cycles.
- The intuition behind this is almost the same as
for the Bellman-Ford algorithm in general as
given above -- in addition to propagating correct
shortest distances along a path, the
invoked relaxations propagate
the negative summed weights around negative-weight
cycles and dump these negative weights onto
regular paths, where the resulting screwup in the
relation between d-values of adjacent vertices can
be detected by a simple scan of all edges
(lines 5-7 of the algorithm)
- Shouldn't Dijkstra's algorithm be able to detect
negative-weight cycles by the same argument? I
suspect not, as Dijkstra's algorithm probably does
not invoke enough relaxations
to guarantee that the influences
of negative-weight cycles will necessarily be
dumped onto all (or even any) paths --
however, I could be wrong in this.
- Note that the running times given for both Dijkstra's
algorithm and the Bellman-Ford algorithm assume
that the input graph is represented as an adjacency
list -- if the graph is instead represented as an
adjacency matrix, the running time of both
algorithms becomes O(|V|^3).
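The Bellman-Ford algorithm can be sketched in Python (a sketch; the edge-list representation and sample graph are mine): |V| - 1 rounds of relaxing every edge, followed by one extra scan to detect negative-weight cycles.

```python
# Bellman-Ford sketch: relax every edge |V| - 1 times, then check
# whether any edge can still be relaxed (=> negative-weight cycle).
def bellman_ford(vertices, edges, s):
    """edges is a list of (u, v, w) triples; returns (d, no_neg_cycle)."""
    d = {u: float('inf') for u in vertices}
    d[s] = 0
    for _ in range(len(vertices) - 1):     # each round settles one more
        for u, v, w in edges:              # edge on every shortest path
            if d[u] + w < d[v]:            # Relax(u, v, w)
                d[v] = d[u] + w
    for u, v, w in edges:                  # still relaxable => neg. cycle
        if d[u] + w < d[v]:
            return d, False
    return d, True

edges = [('s', 'a', 4), ('s', 'b', 2), ('b', 'a', -1), ('a', 'c', 2)]
d, ok = bellman_ford('sabc', edges, 's')
print(d['c'], ok)
```

Note the negative edge weight, which Dijkstra's algorithm could not be trusted with.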
Thursday, March 27 (Lecture #20)
[Sections 25.1-25.2; Class Notes]
- Graph algorithms (Cont'd)
- Shortest Paths in Edge-Weighted Graphs (Cont'd)
- The Floyd-Warshall algorithm
- Running time = O(|V|^3)
- Note that both of these algorithms are technically
operating on a 3-D matrix of size O(|V|^3); however,
as we only need
to save adjacent 2-D "slices" of this matrix,
we actually use O(|V|^2) space.
- Note that the running times given for the three
algorithms above assume that the input graph is
represented as an adjacency matrix -- if the graph
is instead represented as an adjacency list, the
running times of each algorithm increase by a
multiplicative factor of |V|!
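The Floyd-Warshall algorithm can be sketched in Python (a sketch; the function name is mine, and the 2-D "slice" trick from above shows up as a single matrix updated in place):

```python
# Floyd-Warshall sketch: all-pairs shortest paths, allowing
# intermediate vertices 0..k in round k; only one 2-D slice is kept.
def floyd_warshall(w):
    """w is an n x n matrix; w[i][j] = edge weight or inf, w[i][i] = 0."""
    n = len(w)
    d = [row[:] for row in w]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:   # route through k is shorter
                    d[i][j] = d[i][k] + d[k][j]
    return d

INF = float('inf')
w = [[0, 3, INF],
     [INF, 0, 1],
     [2, INF, 0]]
d = floyd_warshall(w)
print(d[0][2])
```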
Consider some examples of how to solve problems using standard
graph algorithms.
Example: Solving Problem 1(c), Assignment #1 with
a MST subroutine (one call, instance / solution
transformation required):
PROB1C(G, w, B)
for each edge e in E do
w'(e) = B - w(e)
T = MST(G, w')
return(T)
Example: Solving Problem 1(d), Assignment #1 with
a BFS subroutine (O(|V|) calls, no instance / solution
transformation required):
PROB1D(G, S)
for each vertex v in S do
SP(v) = BFS(G, v)
return(union of all SP(v))
Example: Solving the single-source shortest paths
problem with a BFS subroutine (one call, instance /
solution transformation required):
SS-SP(G, w, s)
V' = V
E' = {}
for each edge (x,y) in E do
add vertices {v_xy1,..., v_xy(w(x,y) - 1)} to V'
add edges (x,v_xy1),
(v_xyi,v_xy(i + 1)) for 1 <= i <= w(x,y) - 2,
(v_xy(w(x,y) - 1),y) to E'
SP(s) = BFS(G', s)
return(SP(s))
Note that as long as all transformations involved and the
invoked subroutine run in polynomial time, the whole
process runs in polynomial time. For example, the first
two examples above yield polynomial-time algorithms but
the third example yields an exponential-time algorithm.
Such a process can be viewed as relating problems in the sense
that if A reduces to B, B is at least as computationally
difficult as A -- indeed, it
seems to rank problems by relative
computational difficulty. Remember this -- it will come
in handy later.
Using the techniques we've talked about so far in this course,
we can often derive a polynomial-time algorithm for a problem
of interest and hence prove that this problem is solvable in
polynomial time.
However, it is also often the case that we run across problems
for which we cannot find a polynomial time algorithm. Are we
just not thinking hard enough or is it really the case that
there isn't a polynomial-time algorithm for that problem?
That is, how can we prove that a problem is not
solvable in polynomial time?
Tuesday, April 1 (Lecture #21)
[Sections 34.1-34.2; Class Notes]
Example: Class EXPTIME is the set of all decision
problems that have exponential-time algorithms.
Decision versions of all problems we've looked at so far
are in EXPTIME.
P is properly included in EXPTIME.
Example: Class NP is the set of all decision
problems that have nondeterministic polynomial-time
algorithms, i.e., problems such that candidate
solutions can be verified in polynomial time.
Decision versions of all problems we've looked at so far
are in NP.
Example: VERTEX COVER is in NP courtesy of the
O(|E||V'|) time verification algorithm that determines
if every edge in G has an endpoint in the given candidate
vertex cover V'.
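The VERTEX COVER verification algorithm can be sketched in Python (a sketch; the function name and sample instance are mine): for a candidate cover V' of size at most k, check every edge for an endpoint in V', which takes O(|E||V'|) time.

```python
# Polynomial-time verification sketch for VERTEX COVER: does the
# candidate set cover every edge, and is it small enough?
def verify_vertex_cover(edges, cover, k):
    if len(cover) > k:
        return False
    return all(u in cover or v in cover for (u, v) in edges)

edges = [(1, 2), (2, 3), (3, 4)]
print(verify_vertex_cover(edges, {2, 3}, 2))   # {2, 3} touches every edge
print(verify_vertex_cover(edges, {1, 4}, 2))   # edge (2, 3) is uncovered
```

Fast verification of candidate solutions is exactly what membership in NP requires; finding a small cover in the first place is the hard part.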
Example: KNAPSACK is in NP courtesy of the
O(|U'|) time verification algorithm that determines if
the summed size of the items in candidate solution U' is
<= B and the summed value of the items in U' is >= k.
Example: STEINER TREE IN GRAPHS is in NP courtesy
of the polynomial time verification algorithm that
determines if the specified set of candidate solution
tree edges has
size <= k and forms a tree in G that connects all
vertices in V'.
P is certainly included in NP; however, it is not known
if this inclusion is proper (hence, the "P = NP?"
conjecture). Still, if EXPTIME-complete problems are
not useful to us, NP-complete problems might be.
Now, all we need is one EXPTIME- or NP-hard problem
to start the ball rolling ...
Thursday, April 3 (Lecture #22)
[Sections 34.3-34.5; Class Notes]
- Proving Polynomial-Time Unsolvability (Cont'd)
- One strategy for obtaining an initial C-hard problem X
for class C is to create a generic reduction from any
problem in C to X. How might one do this, though?
- Such a generic reduction is relatively easy to construct
relative to a class C of decision problems relative to
some type of language-recognition machine.
- Equivalence of decision problems and language
acceptance problems (map decision problem to
language of bitstrings consisting of
instance-encodings of all instances for which
answer is "yes").
- All problems in a class C of decision problems
relative to a language-recognition-machine M are
reducible to the following problem (which is also
typically itself in class C):
M-Acceptance
Input: Machine m of type M, input x.
Question: Does m accept, i.e.,
produce output "yes", on
input x?
- Example: P is the set of all decision problems
whose associated languages of "yes"-instances can be
recognized by a polynomial-time deterministic Turing
machine (DTM).
- Example: EXPTIME is the set of all decision
problems whose associated languages of "yes"-instances
can be recognized by an exponential-time deterministic
Turing machine (EDTM).
- Hence, we have one polynomial-time unsolvable problem --
EDTM-ACCEPTANCE! However, there are many problems of
interest for which one cannot construct reductions from
EDTM-ACCEPTANCE. Perhaps these problems are not
EXPTIME-hard; need something a bit less difficult,
computationally-speaking.
- Example: NP is the set of all decision problems
whose associated languages of "yes"-instances can be
recognized by a polynomial-time nondeterministic Turing
machine (NDTM).
- Nondeterministic TM may have arbitrary-choice
conditional statements; machine computes "yes" on
an input if there is some set of choices at
these arbitrary-choice conditionals that yields
an accepting computation for the given string.
- Arbitrary-choice conditionals = way of looking at all
possible solutions for a given instance within
an NDTM! Hence, nicely encodes notion of
polynomial-time verification.
- NDTM ACCEPTANCE is NP-complete!
- NDTM ACCEPTANCE may appear as unnatural a problem as EDTM
ACCEPTANCE at first glance; however, it is not.
- Two routes to useful NP-complete problems:
- Historical route
- NDTM ACCEPTANCE many-one reduces to CNF SATISFIABILITY
(Cook (1971))
- CNF SATISFIABILITY many-one reduces to 3-CNF
SATISFIABILITY (Karp (1972))
- The real genius of Karp (1972) was to show reductions from
CNF SATISFIABILITY and 3-CNF SATISFIABILITY to 18
problems of interest drawn from graph theory, scheduling,
and network design. This initial list of 18 has since
blossomed to several thousand (see the Appendix of
Garey and Johnson (1979) for a small but nonetheless
significant list of known NP-hard problems).
- Note that one of Karp's reductions (MINIMUM COVER to
STEINER TREE IN GRAPHS) is wrong (Question 2(a),
Assignment 5); however, with a slight modification,
the general idea works (Question 2(b),
Assignment 5).
- Stresses the need for proofs of correctness for
reductions.
- Textbook route
- NDTM ACCEPTANCE many-one reduces to CIRCUIT
SATISFIABILITY (Lemma 34.6)
Circuit Satisfiability
Input: A circuit C.
Question: Is there a set of input-values for
C that
makes C output true on its output line?
- CIRCUIT SATISFIABILITY many-one reduces to GENERAL
SATISFIABILITY (Theorem 34.9)
General Satisfiability
Input: A combinational logic formula F.
Question: Is there a satisfying assignment
for F?
- GENERAL SATISFIABILITY many-one reduces to 3-CNF
SATISFIABILITY (Theorem 34.10)
- By transitivity of reductions, all problems above are
NP-hard. This thus makes for the second (and usually
far easier) strategy for proving a problem X C-hard:
- Strategy #1: Give a generic reduction from every
problem Y in C to X.
- Strategy #2: Give a reduction from known C-hard
problem Y to X.
- Example: 3-CNF SATISFIABILITY many-one reduces to CLIQUE
(Theorem 34.11)
- Example: 3-CNF SATISFIABILITY many-one reduces to SUBSET SUM
(Theorem 34.15)
- Example: CLIQUE many-one reduces to VERTEX COVER
(Lemma 3.1, Garey and Johnson (1979))
- Relies on a nice graph-theoretic characterization
of cliques and vertex covers on the same set of
vertices.
- Define the complement Gc = (V,Ec) of a graph
G = (V,E) as the graph on V with edge-set
Ec = {(u,v) | (u,v) is not in E}.
- Consider vertices not in a vertex cover V' for
graph G = (V,E); none of these vertices are
connected by an edge in E (else, one of them
would have to be in the vertex cover). Hence,
there is an edge between each pair of these
vertices in Gc, and indeed these vertices form a
clique in Gc. Hence, V' is a vertex cover in G iff
V - V' is a clique in Gc!
- Given an instance (G,k) of CLIQUE, the reduction
constructs an instance (Gc,|V| - k) of VERTEX COVER.
This reduction is correct by the property noted
above.
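The reduction above is short enough to write out directly. The following Python sketch (interface of my own choosing) maps a CLIQUE instance (G, k) to the VERTEX COVER instance (Gc, |V| - k) by constructing the complement edge-set Ec:

```python
from itertools import combinations

def clique_to_vertex_cover(V, E, k):
    """Transform a CLIQUE instance (G = (V, E), k) into the
    equivalent VERTEX COVER instance (Gc = (V, Ec), |V| - k)."""
    edges = {frozenset(e) for e in E}
    # Ec contains exactly those vertex-pairs that are NOT edges of G.
    Ec = [tuple(sorted(p)) for p in combinations(V, 2)
          if frozenset(p) not in edges]
    return V, Ec, len(V) - k
```

By the characterization noted above, G has a clique of size >= k iff Gc has a vertex cover of size <= |V| - k, and the transformation clearly runs in polynomial (here, quadratic) time.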
- As VERTEX COVER many-one reduces to STEINER TREE IN
GRAPHS, SUBSET SUM many-one reduces to KNAPSACK (see
previous lecture), and both VERTEX COVER and SUBSET SUM
are NP-hard, STEINER TREE IN GRAPHS and KNAPSACK are
NP-hard! Which means that these problems are not
solvable in polynomial time!
... Doesn't it? ...
- We don't know whether or not P is properly
contained in NP. Hence, there must always be a
qualifying clause attached to an interpretation of an
NP-hardness result -- namely, an NP-hard problem X is
not solvable in polynomial time unless P = NP.
- To this day, it is not known if P = NP or not -- however,
the best
evidence that P != NP is that no-one has yet found a
polynomial-time algorithm for an NP-hard problem.
- All this is fine and good; however, the inconvenient
fact remains that even if a problem is NP-hard, we
still need to be able to solve it.
- Are there alternatives to exponential-time
optimal-solution algorithms for NP-hard problems? Yes,
there are! To see why, we need to understand
exactly what an NP-hardness result means:
if a problem is NP-hard
then there is no deterministic algorithm
that gives optimal solutions
for all instances of the problem
in polynomial time
unless P = NP.
- Note, however, that there may still be usable
algorithms that violate one or more of the
italicized clauses above, e.g.,
polynomial-time randomized optimal-solution
algorithms, polynomial-time deterministic
approximate-solution algorithms.
- There are many such options; we will look at several
of them in our remaining lectures.
Monday, April 7
- Final Exam Notes
I'm in the process of making up our final exam.
The exam will be closed-book.
It will be 120 minutes long and will have a total of 120 marks
(this is not coincidental; I have tried to make the number
of marks for a question equivalent to the number of minutes
it should take you to do it). There
may still be some minor adjustments of marks allotted to
particular questions, but the current layout is as follows:
- Asymptotic notation / Asymptotic worst-case
time complexity (30 marks)
- Algorithm Design Techniques, e.g.,
CST, dynamic programming, greedy algorithms
(14 marks)
- Graph algorithms (16 marks)
- Algorithm design (40 marks)
- Proving polynomial-time
unsolvability (20 marks)
Things that will not be covered are
detailed derivations of "raw" worst-case time
complexity functions.
However, all else is fair game. You should
look over and make sure you understand all assignment answers
as well as those for the midterm; in particular, anything that
people had problems with on the midterm is just crying
out to be asked again.
The various midterm exams and last year's final exam
may also be of help.
I wish you all the best of luck with both this final exam and
all of your other final exams.
Tuesday, April 8 (Lecture #23)
[Section 35.1; Class Notes]
- Dealing with Polynomial-Time Unsolvability (Cont'd)
- Polynomial-time randomized algorithms
- Algorithms that are allowed to make random choices; in
conjunction with assumption of various distributions
(typically uniform) on the space of inputs, can
derive algorithms with an expected probability of
producing a correct (optimal) answer or an expected
running time.
- Two types of algorithms:
- Monte Carlo: Running time fixed,
expected probability of producing a correct
(optimal) answer (you don't know if you'll
win).
- Las Vegas: Expected running time,
produces correct (optimal) answer (you don't
know when you'll win).
- Typically used to give faster algorithms for problems
already known to be solvable in low-order polynomial
time, e.g., an O(|V| + |E|) expected time
Las Vegas algorithm for deriving minimum spanning
trees (Section 10.3, Motwani (Motwani and
Raghavan (1995)).
- Also useful for certain problems not known to be
in P, e.g., the Miller-Rabin Monte Carlo
algorithm for primality testing (Section 31.8).
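As an illustration of the Monte Carlo style, here is a standard Miller-Rabin test in Python (a sketch, not the exact pseudocode of Section 31.8): its running time is fixed by the number of trials s, a "composite" answer is always correct, and a "probably prime" answer errs with probability at most 4^(-s).

```python
import random

def miller_rabin(n, s=20):
    """Return False if n is certainly composite,
    True if n is probably prime (error probability <= 4^(-s))."""
    if n < 2:
        return False
    if n in (2, 3):
        return True
    if n % 2 == 0:
        return False
    # Write n - 1 as 2^r * d with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(s):
        a = random.randrange(2, n - 1)   # random base for this trial
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True
```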
- Polynomial-time bounded-cost approximation algorithms
- Example: A 2-approximation algorithm for
KNAPSACK (Garey and Johnson (1979), page 135;
Section 16.2).
- Without loss of generality, assume that each
item in U has size <= B, i.e., each
item can individually fit in the knapsack.
APPROX-KNAPSACK(U,s,v,B)
Order items in U in decreasing ratio of value/size, i.e.,
v(u1)/s(u1) >= v(u2)/s(u2) >= ... >= v(un)/s(un)
U' = {}
for i = 1 to n do
if size(U') + s(ui) <= B then
U' = U' u {ui}
max = 1
for i = 2 to n do
if v(u_max) < v(ui) then
max = i
if (value(U') > v(u_max))
return(U')
else
return({u_max})
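A runnable Python version of APPROX-KNAPSACK above (the dict-based interface is my own; the algorithm is as given). It greedily packs items in decreasing value/size ratio, then returns whichever is better: the greedy solution or the single most valuable item. This comparison is what guarantees a solution of value at least half the optimum.

```python
def approx_knapsack(items, s, v, B):
    """items: list of item names; s, v: dicts giving each item's size
    and value; B: knapsack capacity. Assumes s[u] <= B for every u.
    Returns a subset of items with value >= half the optimal value."""
    # Greedy phase: consider items in decreasing value/size ratio.
    order = sorted(items, key=lambda u: v[u] / s[u], reverse=True)
    U, size = [], 0
    for u in order:
        if size + s[u] <= B:
            U.append(u)
            size += s[u]
    # Compare against the single most valuable item.
    best = max(items, key=lambda u: v[u])
    if sum(v[u] for u in U) > v[best]:
        return U
    return [best]
```

The second phase matters: on an instance with one small high-ratio item and one large high-value item, the greedy phase alone can do arbitrarily badly, but the best single item then salvages the bound.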
- Very few problems have additive approximation
algorithms.
- Example: If P != NP then there is no
polynomial time approximation algorithm with
an additive polynomial factor p(|I|) for KNAPSACK
(Theorem 6.6, Garey and Johnson (1979)).
- Given an instance I of KNAPSACK, create a new
instance I' such that v'(ui) =
(p(|I|) + 1)v(ui). It is obvious that C_O(I') =
(p(|I|) + 1)C_O(I). As all solution values
for I' are multiples of p(|I|) + 1 and
solution values are integers, C_O(I') - C_A(I') <= p(|I|)
forces C_O(I') - C_A(I') = 0. As any solution for I'
is also a solution for I, the approximation
algorithm actually produces an optimal solution
for KNAPSACK in polynomial time, which (as
KNAPSACK is NP-complete) implies that P = NP.
- Few problems have FPTAS or PTAS -- KNAPSACK is
one of them (Garey and Johnson (1979),
pages 135-137).
- Use appropriately-defined reductions and complexity
classes to show that certain types of approximation
algorithms do not exist for certain problems,
e.g., use of MAX SNP-hardness relative to
polynomial-time L-reducibility to show that
VERTEX COVER and STEINER TREE IN GRAPHS do not
have PTAS unless P = NP (see Chapter 5,
Wareham (1993) and references).
- Fixed-parameter algorithms
(Downey and Fellows (1999), Fellows (2001))
- An algorithm whose non-polynomial running time
terms are expressed in terms of aspects of the
input that are of very small size in practice.
- Such algorithms are of interest, as many
problems that are polynomial-time unsolvable in
general occur in restricted versions in practice.
- Example: 3-D Robot Motion Planning
3-D Robot Motion Planning (3DRMP)
Input: A jointed robot R, a set of obstacles
O embedded in some 3-D environment E,
and initial and final positions p_I
and p_F of R in E.
Question: Is there a sequence of robot
motions that move R from p_I to p_F
without colliding with any obstacles
in O?
- Is PSPACE-hard for arbitrary robots and
environments. However, robots are seldom
arbitrary -- for example, the number of
joints in a robot arm <= 7 and the number of
joints in a robot hand <= 20.
- Is there a non-polynomial-time algorithm for
3DRMP whose non-polynomial terms are expressed
as functions of the number of joints in
the robot?
- Two key notions:
- A parameterized (decision) problem has
instances of the form (x,k) where x is the
main part and k is the parameter.
- A parameterized problem is fixed-parameter
tractable if it has an algorithm whose
running time is O(f(k)|x|^c) where f() is an
arbitrary function and c is some constant
independent of k.
- Example: (k)-VERTEX COVER is the parameterized
version of VERTEX COVER in which the size k of the
vertex cover is the parameter.
- Example: (k)-CLIQUE is the parameterized
version of CLIQUE in which the size k of the clique
is the parameter.
- Example: (d)-3DRMP is the parameterized
version of 3DRMP in which d, the number of joints
in the robot, is the parameter.
- Show fixed-parameter tractability of a parameterized
problem by giving a fixed-parameter algorithm for
that problem.
- Example: (k)-VERTEX COVER is fixed-parameter
tractable.
FP-VERTEX-COVER(G,k)
L = {({},G)}
VC = {}
for i = 1 to k do
for each item (S',G') in L do
select arbitrary edge e = (u,v) in E'
if G' - {u} has no edges then
VC = VC u {S' u {u}}
else
L = L u {(S' u {u}, G' - {u})}
if G' - {v} has no edges then
VC = VC u {S' u {v}}
else
L = L u {(S' u {v}, G' - {v})}
return(VC)
- Implicitly creates binary search tree of
depth k and spends O(|V| + |E|) time
processing each node in that tree; hence,
algorithm runs in O(2^k(|V| + |E|)) time.
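The same bounded search tree can be written recursively in Python (a sketch with an edge-list interface of my own choosing; the branching logic is as in the pseudocode above). At each node it picks an arbitrary edge (u,v) and branches on putting u or v into the cover, so the search tree has depth at most k.

```python
def fp_vertex_cover(edges, k):
    """Return a vertex cover of size <= k for the graph with the given
    edge list, or None if no such cover exists.
    Runs in O(2^k * |E|) time."""
    if not edges:
        return set()        # nothing left to cover
    if k == 0:
        return None         # edges remain but budget is exhausted
    u, v = edges[0]
    # Branch 1: put u in the cover; drop all edges incident on u.
    cover = fp_vertex_cover([e for e in edges if u not in e], k - 1)
    if cover is not None:
        return cover | {u}
    # Branch 2: put v in the cover; drop all edges incident on v.
    cover = fp_vertex_cover([e for e in edges if v not in e], k - 1)
    if cover is not None:
        return cover | {v}
    return None
```

The correctness hinges on the observation that any vertex cover must contain at least one endpoint of every edge, so one of the two branches always leads to a solution if one exists.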
- Show that parameterized problems are not
fixed-parameter tractable by showing these
problems complete or hard for the appropriate
classes of problems that are either known or
strongly conjectured to properly include FPT,
the class of all fixed-parameter tractable
parameterized problems.
- Example: (k)-CLIQUE is W[1]-complete and
hence is not fixed-parameter tractable unless
FPT = W[1].
- Example: (d)-3DRMP is W[SAT]-hard and
hence is not fixed-parameter tractable unless
FPT = W[SAT] (Cesati and Wareham (1995)).
- Parameters can be combinations of aspects, so single
fixed-parameter tractability or intractability
results relative to particular parameters are only
spurs to further research ...
The Future of Algorithm Analysis:
Experimental Algorithmics (Moret (2001))
- Worst-case time complexity has served us well; however,
there are undeniable problems:
- Asymptotic behavior may only matter at absurdly large input
sizes.
- Worst-case behavior may only hold for a very small
and infrequently-encountered set of inputs.
- Absurdly large constants may be hidden in the
big-Oh notation.
- Deriving bounds on running time is very difficult for
complex algorithms.
- Machine model underlying asymptotic worst-case
complexity is inadequate.
- Need to balance theoretical analysis with implementation
of and well-designed experiments on the behavior of
algorithms => experimental algorithmics!
- Example: Experimental study of MST algorithms.
- Examined a variety of algorithms across a variety of
platforms (computers / operating systems /
compilers).
- Accounted for platform-dependent effects
by comparing algorithm on each platform against
a simple linear-time algorithm with the
same memory-access patterns, e.g., graph
traversal.
- Prim's algorithm with heap priority queue
performed best => when algorithms get sufficiently
complex, theoretical logarithmic-factor improvements
do not matter in practice.
- Implementation of Prim's algorithm sped up 10x over
course of experiments.
- Benefits of experimental algorithmics:
- Better assessment of relative utility of algorithms.
- Better implementations of algorithms.
- Improved theories for assessing algorithm efficiency ...
... Which brings us back, tired but hopefully wiser, to where
we started in the first lecture of this course ...
To travel is to go far.
To go far is to return.
References
Unless otherwise stated, all references above are to Cormen
et al. (2001).
- Brassard, G. and Bratley, P. (1996) Fundamentals of
Algorithmics. Prentice-Hall; Englewood Cliffs, NJ.
[Abbreviated above as B&B]
-
Cesati, M. and Wareham, H.T. (1995) "Parameterized Complexity
Analysis in Robot Motion Planning." In Proceedings of the 25th IEEE
International Conference on Systems, Man, and Cybernetics: Volume 1.
IEEE Press; Los Alamitos, CA. 880-885.
(PostScript/6 pages)
- Cook, S. (1971) The complexity of theorem proving procedures.
In Proceedings of the Third Annual ACM Symposium on Theory of
Computing. ACM Press; New York. 151-158.
- Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C. (2001)
Introduction to Algorithms. The MIT Press; Cambridge, MA.
[Abbreviated above as CLRS]
- Downey, R.G. and Fellows, M.R. (1999) Parameterized
Complexity. Springer-Verlag; Berlin.
- Fellows, M.R. (2001) "Parameterized Complexity: New Developments
and Research Frontiers". Manuscript.
(PostScript/22 pages)
- Garey, M.R. and Johnson, D.S. (1979) Computers and
Intractability: A Guide to the Theory of NP-Completeness.
W.H. Freeman; San Francisco, CA.
- Karp, R.M. (1972) Reducibility among combinatorial problems. In
R.E. Miller and J.W. Thatcher (eds.) Complexity of Computer
Computations. Plenum Press. 85-103.
- Moret, B.M.E. (2001) "Towards a discipline of experimental
algorithmics." To appear, DIMACS Monograph in Discrete
Mathematics and Theoretical Computer Science. AMS Press.
(PostScript/13 pages)
- Motwani, R. and Raghavan, P. (1995) Randomized Algorithms.
Cambridge University Press.
- Wareham, H.T. (1993) On the Computational Complexity of
Inferring Evolutionary Trees. M.Sc. thesis. Technical Report
no. 9301, Department of Computer Science, Memorial University of
Newfoundland, March 1993.
(Abstract (text); Document (PostScript/101 pages))
Created: January 7, 2003
Last Modified: April 7, 2003