ON THE COMPUTATIONAL COMPLEXITY OF INFERRING
EVOLUTIONARY TREES
Harold Todd Wareham
ABSTRACT
The process of reconstructing evolutionary trees can be viewed
formally as an optimization problem. Recently, decision problems
associated with the most commonly used approaches to
reconstructing such trees have been shown to be NP-complete
[Day87,DJS86,DS86,DS87,GF82,Kri88,KM86]. In this thesis, a
framework is established that incorporates all such problems
studied to date. Within this framework, the NP-completeness
results for decision problems are extended by applying theorems
from [CT91,Gas86,GKR92,JVV86,KST89,Kre88,Sel91] to derive bounds
on the computational complexity of several functions associated
with each of these problems, namely
# evaluation functions, which return the cost of the optimal
tree(s),
# solution functions, which return an optimal tree,
# spanning functions, which return the number of optimal
trees,
# enumeration functions, which systematically enumerate all
optimal trees, and
# random-selection functions, which return a randomly-
selected member of the set of optimal trees.
Where applicable, bounds are also presented for the versions of
these functions that are restricted to trees of a given cost or
of cost less than or greater than a given limit. Based in part
on these results and theorems from [BH90,GJ79,KMB81,Kre88],
bounds are derived on how closely polynomial-time algorithms can
approximate optimal trees. In particular, it is shown using the
recent results of [ALMSS92] that no phylogenetic inference
optimal-cost solution problem examined in this thesis has a
polynomial-time approximation scheme unless P = NP.
REFERENCES
[ALMSS92] Arora, S., Lund, C., Motwani, R., Sudan, M.,
and Szegedy, M. "Proof Verification and
Intractability of Approximation Problems".
Manuscript, 1992.
[BH90] Boppana, R. and Halldorsson, M.M.
"Approximating Maximum Independent Sets by
Excluding Subgraphs". In J.R. Gilbert and
R. Karlsson (eds.) SWAT 90. Lecture Notes
in Computer Science no. 447, Springer-Verlag,
Berlin, 1990. 13-25.
[CT91] Chen, Z-.Z. and Toda, S. "On the Complexity of
Computing Optimal Solutions". Manuscript, 1991.
[Day87] Day, W.H.E. "Computational Complexity of
Inferring Phylogenies from Dissimilarity
Matrices". Bulletin of Mathematical Biology,
49(4), 461-467, 1987.
[DJS86] Day, W.H.E., Johnson, D.S., and Sankoff, D.
"The Computational Complexity of Inferring
Rooted Phylogenies by Parsimony". Mathematical
Biosciences, 81, 33-42, 1986.
[DS86] Day, W.H.E. and Sankoff, D. "Computational
Complexity of Inferring Phylogenies by
Compatibility". Systematic Zoology, 35(2),
224-229, 1986.
[DS87] Day, W.H.E. and Sankoff, D. "Computational
Complexity of Inferring Phylogenies from
Chromosome Inversion Data". Journal of
Theoretical Biology, 124, 213-218, 1987.
[GJ79] Garey, M.R. and Johnson, D.S. COMPUTERS AND
INTRACTABILITY. W. H. Freeman, New York, 1979.
[Gas86] Gasarch, W.I. "The Complexity of Optimization
Functions". Technical Report no. 1652,
University of Maryland, Department of Computer
Science, 1986.
[GKR92] Gasarch, W.I., Krentel, M.W., and Rappoport, K.J.
"OptP as the Normal Behavior of NP-Complete
Problems". Manuscript, 1992. To appear in
Mathematical Systems Theory.
[GF82] Graham, R.L. and Foulds, L.R. "Unlikelihood
That Minimal Phylogenies for a Realistic
Biological Study Can Be Constructed in
Reasonable Computational Time". Mathematical
Biosciences, 60, 133-142, 1982.
[JVV86] Jerrum, M.R., Valiant, L.G., and Vazirani, V.V.
"Random Generation of Combinatorial Structures
from a Uniform Distribution". Theoretical
Computer Science, 43, 169-188, 1986.
[KST89] Kobler, J., Schoning, U., and Toran, J. "On
Counting and Approximation". Acta Informatica,
26, 363-379, 1989.
[KMB81] Kou, L., Markowsky, G., and Berman, L. "A Fast
Algorithm for Steiner Trees". Acta Informatica, 15,
141-145, 1981.
[Kre88] Krentel, M.W. "The Complexity of Optimization
Problems". Journal of Computer and System
Sciences, 36(3), 490-509, 1988.
[Kri88] Krivanek, M. "The Complexity of Ultrametric
Partitions on Graphs". Information Processing
Letters, 27, 265-270, 1988.
[KM86] Krivanek, M. and Moravek, J. "NP-Hard Problems
in Hierarchical Tree Clustering". Acta
Informatica, 23, 311-323, 1986.
[Sel91] Selman, A.L. "A Taxonomy of Complexity Classes
of Functions". Technical report, State
University of New York at Buffalo, Department
of Computer Science, 1991. To appear in Journal
of Computer and System Sciences. An abbreviated
version appeared in Bulletin of the European
Association for Theoretical Computer Science,
45, 114-130, 1991.